Class 49: Relevance & Importance Of Evidence In Models
First basics of models and the Deadly Sin. Also the superiority of relevance over the misleading idea of independence. WARNING for those reading the email version! The text below might appear to be gibberish. If so, it means the LaTeX did not render in the emails. I’m working on this. Meanwhile, please click on the headline and read the post on the site itself. Thank you.
Video
Links: YouTube * Twitter – X * Rumble * Bitchute * Class Page * Jaynes Book * Uncertainty
HOMEWORK: Given below; see end of lecture.
Lecture
This is an excerpt from Chapter 8 of Uncertainty.
As covered in Chapter 4, an X that is added to a model which, in the presence of the other premises (other "Xs"), does not change the probability of Y is irrelevant or not probative. The data X which are used to "fit" a model are of course themselves premises, e.g. X_{1,1} = "Observed the value 112.2 mm Hg" (for the first premise, say of systolic blood pressure, first observation from some collection). The importance of each premise, given the presence of the other premises, is judged by how much it changes the probability of Y. If an X does not result in extreme probabilities of Y (i.e. 0 or 1), this X is not necessarily causal, though an injurious, flabbergasting tradition has developed (predominantly in the "soft" sciences) which says or assumes it is.
For example, if Pr(Y|X_1) = Pr(Y|X_1X_2) then X_2 is irrelevant in the presence of X_1, even if Pr(Y|X_2) is something other than the unit interval. That is, X_2 may be separately probative to Y, but it adds no information about Y that is not already in X_1. There are thus two kinds of relevance: in-model, which is rather a measure of importance, i.e. how much a premise changes our understanding of Y; and out-model, whether the premise is even needed. A third is a variant of the first: sample relevance.
Suppose Y itself takes different states (like temperature) and that Pr(Y_a|X_1) = Pr(Y_a|X_1X_2) but Pr(Y_b|X_1) ≠ Pr(Y_b|X_1X_2). X_2 in the presence of X_1—the condition which must always be stated—is then relevant to Y; or, better, relevant only when Y is Y_b.
Suppose Pr(Y|X_1) = Pr(Y|X_1X_2) + ϵ, with the obvious constraints on ϵ. Then X_2 in the presence of X_1 is relevant. Whether the difference ϵ makes any difference to any decision is not a question probability can answer. "Practically irrelevant" is not irrelevant. Irrelevance is a logical condition. The practice of modeling is the art of selecting those X (premises) which are relevant, in the presence of the other premises, to Y. Invariably, some new premise will add "only" ϵ to the understanding of Y. Whether this is enough to "make a difference" is a question only the modeler and whoever is going to use the model can answer. The only "test" for relevance is thus any change in the conditional probability of Y.
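To make the irrelevance condition concrete, here is a minimal sketch in Python. The numbers and the propositions are invented for illustration (they are not from the book): a toy joint distribution is built so that Y depends on X_1 alone, so conditioning additionally on X_2 changes nothing, even though X_2 by itself is probative.

```python
# Toy joint distribution Pr(Y, X1, X2), keyed by (y, x1, x2).
# Constructed (by assumption) so that Y depends on X1 only.
joint = {}
p_x1 = {True: 0.6, False: 0.4}
p_x2 = {True: 0.5, False: 0.5}
p_y_given_x1 = {True: 0.9, False: 0.2}   # Y depends on X1 alone
for x1 in (True, False):
    for x2 in (True, False):
        for y in (True, False):
            py = p_y_given_x1[x1] if y else 1 - p_y_given_x1[x1]
            joint[(y, x1, x2)] = py * p_x1[x1] * p_x2[x2]

def pr(y, **given):
    """Pr(Y = y | given), computed by summing over the joint table."""
    num = sum(p for (yy, x1, x2), p in joint.items()
              if yy == y and all({'x1': x1, 'x2': x2}[k] == v
                                 for k, v in given.items()))
    den = sum(p for (yy, x1, x2), p in joint.items()
              if all({'x1': x1, 'x2': x2}[k] == v
                     for k, v in given.items()))
    return num / den

print(pr(True, x1=True))           # Pr(Y|X1) = 0.9
print(pr(True, x1=True, x2=True))  # Pr(Y|X1 X2) = 0.9: X2 irrelevant given X1
print(pr(True, x2=True))           # Pr(Y|X2) = 0.62: X2 alone is probative
```

The last line shows the out-model point: X_2 by itself changes our knowledge of Y, yet adds nothing once X_1 is in hand.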
Relevance, as we see in the next Chapter, is how models should be judged before verification of their predictions arrives. Assessing relevance is hard work—but who said modeling had to be easy? That modeling is now far too easy is a major problem; because anybody can do it, everybody thinks they're good at it. Supposing Y is simple (yes or no, true or false), and given a list of premises, the relevance of each X_i—its subscript indicates it is variable—is assessed by holding the other variable X_j at some fixed level and then varying X_i. For example, to assess the relevance of X_1, which can take the values a_1 and a_2, compute

Pr(Y | X_1 = a_1, X_2 = x_2, …, X_p = x_p, W),
where W are those premises which are fixed (deductions, assumptions, etc.), and

Pr(Y | X_1 = a_2, X_2 = x_2, …, X_p = x_p, W).
The difference between these two probabilities is the in-model relevance of X_1 given the values the other X take. The out-model relevance is assessed by next computing

Pr(Y | X_2 = x_2, …, X_p = x_p, W),
and comparing that to the model which retains X_1. Note that all the other X have kept their values. Sample relevance is computed by calculating the same probability but with the addition (or subtraction) of a new "data point." Irrelevance is:

Pr(Y | D_{n+1} X W) = Pr(Y | D_n X W),

where D_n stands for the first n data observations.
For instance, suppose we have n observations and on the (n+1)th the probability Y is true remains unchanged. Then this new data point has added no new information, in the presence of the first n. Of course, these measures may be used to hunt for those data points which are most relevant, or rather most important, and for those which are irrelevant (given the others). Those familiar with classical parametric methods will see the similarities; this approach is superior because all measures are stated directly and with respect to the proposition of interest Y.
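A hedged sketch of sample relevance, using Laplace's rule of succession as a stand-in predictive model (a standard uniform-prior result, chosen for simplicity; the book does not prescribe this particular machinery). The data are invented. The question is just whether the probability Y is true changes when one more observation is added.

```python
# Predictive probability the next observation is true, given s trues
# in n trials, under a uniform prior: (s + 1) / (n + 2)  (Laplace's rule).
def pred_prob(successes, n):
    return (successes + 1) / (n + 2)

data = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]   # invented: n = 10 observations, 7 trues
p_before = pred_prob(sum(data), len(data))           # 8/12 ≈ 0.667
p_after = pred_prob(sum(data) + 1, len(data) + 1)    # one more true: 9/13 ≈ 0.692

# The probability changed, so the new point is (sample-)relevant in the
# presence of the first n; had p_after equaled p_before, the point would
# have added no information.
print(p_before, p_after)
```

Scanning all candidate points this way is how one would hunt for the most important observations, as the paragraph above describes.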
I should highlight I am not here trying to develop a set of procedures per se, only defining the philosophically relevant constituents of probability models. We want to know what it means to be a probability model—any probability model, and not just one for some stated purpose. Readers interested in working on new problems will discover lots of fertile ground here, though.
It should by now be obvious that each of these probabilities is also a prediction. Each says "Here is the probability of Y should all these other things hold." So not only probabilities, but all predictions are conditional, too. Models of this form are also the basis of how statistical methods should work. All attention is centered on asking how X influences our knowledge of Y—and, in rare cases, how X causes or determines Y.
Relevance is made more difficult when Y is allowed to vary, but the underlying idea is the same. Except for the X of interest, which is varied or removed from the model, fix the other Xs, compute the probabilities of the Ys, and see what changes to these probabilities happen. Relevance is when there are changes, and irrelevance when not. This is obviously going to be a lot of work for complex Ys and Xs, but nothing else gives a fairer and more complete picture of the uncertainty inherent in the problem. And, again, who said it had to be easy?
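The varying-Y case can be sketched the same way. The distributions below are invented for illustration: a three-state Y, with the other premises held fixed, and the X of interest toggled between two values; relevance is read off state by state.

```python
# Invented conditional distributions Pr(Y = state | X, W) for a
# three-state Y, with all other premises W held fixed.
pr_y = {
    'x_on':  {'cold': 0.2, 'mild': 0.5, 'hot': 0.3},
    'x_off': {'cold': 0.2, 'mild': 0.6, 'hot': 0.2},
}

# Change in probability of each Y-state when X is toggled.
changes = {s: pr_y['x_on'][s] - pr_y['x_off'][s]
           for s in ('cold', 'mild', 'hot')}
print(changes)
# X is irrelevant to Y = 'cold' (no change) but relevant to 'mild' and
# 'hot'—mirroring the Y_a / Y_b situation described earlier.
```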
Subscribe or donate to support this site and its wholly independent host using credit card click here. Or use the paid subscription at Substack. Cash App: $WilliamMBriggs. For Zelle, use my email: matt@wmbriggs.com, and please include yours so I know who to thank. BUY ME A COFFEE.



The "Materiality" of evidence to a proposition has a long and distinguished history, Professor. (I know you know... Bacon really did a wondrous thing with Novum Organum). As do terms like "probative value"... it's almost as if courts are where propositions get built on probabilistic evidence for an impartial panel and stacked together - and upon which the panel votes their conscience as to what level of evidence has been met by the proponent.
(On intermediate questions we let some supposedly learned guy or gal in a black robe sort the arguments from competent "arguers" for each side's position and on questions about a piece of evidence's relevance - which includes "materiality" - or "probative value" to the proposition for which it's being offered.)
It's almost as if these issues have been important for as long as people have existed in large societies. ;-)
(Polya's stuff on the various ways evidence changes our perception of a proposition's "plausibility" is really amazing.)