# Video

*Announcement: In order to preserve sanity, I am not going to include any technical proofs in the Video lectures from now on. As important, indeed crucial, as these are, there are at most 10 guys in the world who would be interested, and 9 of them don't even know they should be. Because everybody thinks they have probability already all figured out. Which we have been proving they do not. Proofs will still appear in the written lectures, and all questions about them will be answered.*

Links:

Bitchute (often a day or so behind, for whatever reason)

**HOMEWORK:** (A) How many TOTAL groups can you get from groups of 5 from the numbers 1 through 70 AND one group of 1 from the numbers 1 through 25? (B) Which gives a higher TOTAL number of groups? Adding 19 numbers to the first batch (so we have 1 through 99), or we keep the 70 but choose groups of 6 instead of 5?

# Lecture

Last week's homework was recalling Pr(M_6|E) = 1/6, and we did Pr(M_6|E + "fair") = 1/6 (a circular argument!). Now give us Pr(M_6|E + "unfair") = ?

First, if "fair" means anything, it means this: Pr(M_6|E + Pr(M_6|E)) = Pr(M_6|E) *by definition*, a circular argument, or it means some impossibly perfect symmetry in the device/die and its workings. Did you watch the coin flip video? (Blog, Substack). Now what is Pr(M_6|E + "unfair") = ?

There is no answer! It depends on what you bring to the word UNFAIR. There is no definite meaning. Each of you will have a different tacit definition in mind, *each of which changes the probability*.

There are only two lessons in this entire course. Just two. Last week I said one, but I meant two. Because two is more than one. The lessons are:

ALL probability about the proposition of interest Y is conditional on assuming the evidence X, i.e. Pr(Y|X). Change the X, you change the probability. Change the minutest fraction of X that is relevant to Y, you change the probability. Add a data point, change the probability. Make a new emphasis, change the probability. ANY change necessitates a change in the probability (keeping in mind relevance).

The ontology is not the epistemology. Probability is not real. The uncertainty we have *in* a thing is not *the thing itself*, which has no uncertainty in itself! Forgetting this leads to the Deadly Sin of Reification and accounts for the large (non-DIE) errors in science.

Some will think "unfair" means Briggs must have weighted the die so that it always comes up 6. So the probability is 1. Others will recall I do amateur magic, and that I would have figured people would choose 6, so I did 1 instead. So the probability is 0. Still others will say they have no idea, except that one of the states must still obtain. We still assume E! So the probability is in [0,1] -- with strict bounds. And on and on, depending on what you mean by "unfair".

Change the evidence, change the probability!

Now we also learn to count. But we do a whirlwind tour of counting. Counting is one of the most complicated mathematical subjects that exist, and goes by the name combinatorics.

You are also welcome to download (free!) my book *Breaking the Law of Averages*, and read Chapter 3 on counting.

Briefly, though, we learn that the number of ways to arrange n things (where the order matters) is n!, which is n x (n-1) x (n-2) x ... x 1. Because of calculus (gamma functions), 0! = 1. We learn that the number of ways to arrange k out of n, where the order still matters, is n!/(n - k)!. And we learn that the number of ways to choose groups of k from n, where the order does not matter, is called "n choose k", which equals n!/[k! x (n - k)!].
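If you want to check these counts for yourself, here is a minimal sketch using Python's standard library. The values n = 5 and k = 2 are arbitrary picks for illustration, not from the lecture. Each formula is compared against a brute-force listing: all orderings of n things is n!, ordered arrangements of k out of n is n!/(n - k)!, and unordered groups is "n choose k".

```python
import math
from itertools import combinations, permutations

n, k = 5, 2  # arbitrary illustrative values

# Orderings of all n things: n!
assert math.factorial(n) == len(list(permutations(range(n))))

# Ordered arrangements of k out of n: n!/(n-k)!
ordered = math.factorial(n) // math.factorial(n - k)
assert ordered == math.perm(n, k) == len(list(permutations(range(n), k)))

# Unordered groups of k from n ("n choose k"): n!/[k! x (n-k)!]
unordered = math.factorial(n) // (math.factorial(k) * math.factorial(n - k))
assert unordered == math.comb(n, k) == len(list(combinations(range(n), k)))

print(math.factorial(n), ordered, unordered)  # 120 20 10
```

The brute-force listings only work for small n; for anything lottery-sized, stick with `math.comb` and `math.perm` (available in Python 3.8 and later).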

We really only need these simple facts, and these three probability rules, and we can derive nearly all the probability we'll ever need. That's how much we have done, though it might not seem like it.

Pr(AB|C) = Pr(A|BC)Pr(B|C) = Pr(B|AC)Pr(A|C). Bayes's theorem.

Pr(A|B) + Pr(not-A|B) = 1.

Pr(A_i|B) = 1/n, if B says there are n states only one of which must obtain (and says nothing more).

That is it. From there we can really go to town, and do. Next time we start on Jaynes Chapter 3, which brings these rules together in an elegant way.
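To see the three rules working together, here is a rough sketch on the six-sided device. The propositions A ("an even number shows") and B ("a number above 3 shows") are hypothetical examples of mine, and the helper `pr` is an assumption of the sketch: it conditions by counting only the states consistent with the given evidence, each state weighted 1/n per the third rule.

```python
from fractions import Fraction

states = range(1, 7)  # E: six states, only one of which must obtain
n = len(states)

def pr(event, given=lambda s: True):
    """Pr(event | given, E), by counting states under the 1/n assignment."""
    allowed = [s for s in states if given(s)]
    return Fraction(sum(1 for s in allowed if event(s)), len(allowed))

# Hypothetical propositions, chosen only for illustration.
A = lambda s: s % 2 == 0      # "an even number shows"
B = lambda s: s > 3           # "a number above 3 shows"
AB = lambda s: A(s) and B(s)
not_A = lambda s: not A(s)

# Third rule: Pr(A_i|B) = 1/n for each state.
assert all(pr(lambda s, i=i: s == i) == Fraction(1, n) for i in states)

# First rule: Pr(AB|C) = Pr(A|BC)Pr(B|C) = Pr(B|AC)Pr(A|C).
assert pr(AB) == pr(A, given=B) * pr(B) == pr(B, given=A) * pr(A)

# Second rule: Pr(A|B) + Pr(not-A|B) = 1.
assert pr(A, given=B) + pr(not_A, given=B) == 1

print(pr(AB))  # 1/3
```

Using `Fraction` keeps every probability exact, so the equalities hold without floating-point fuzz.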

Below is David Stove's proof, which allows us to get that third point.

*This is an excerpt from Chapter 4 of* Uncertainty. *All the references have been removed. Like last week, I have not attempted to LaTeX it here, because Substack does not support inline equations. If you want to read this in a more elegant way, PLEASE SEE THE BLOG. CASUAL READERS CAN SKIP THIS.*

Now Stove's attempt, in my notation and somewhat shortened. The statistical syllogism is deduced from the symmetry of logical constants in this example. Given H = "Just two of Abe, Bob, and Charles are black", the probability of B = "Abe is black", relying on the statistical syllogism, is 2/3. Let T be any tautology, a necessary truth. Then Pr(HB|T) = Pr(H|T)Pr(B|TH). Rearranging, and because logically TH is equivalent to H, we have Pr(B|H) = Pr(HB|T) / Pr(H|T).

H is logically equivalent to

B_1B_2B^c_3 or B_1B^c_2B_3 or B^c_1B_2B_3,

where B_1 = "Abe is black", B^c_3 = "Charles is not black", and so forth. And that means

Pr(B_1|H) = (Pr((B_1B_2B^c_3 or B_1B^c_2B_3 or B^c_1B_2B_3)B_1|T)) / (Pr(B_1B_2B^c_3 or B_1B^c_2B_3 or B^c_1B_2B_3|T)).

Distributing B_1 in the numerator gives

Pr(B_1|H) = (Pr(B_1B_2B^c_3|T) + Pr(B_1B^c_2B_3|T)) / (Pr(B_1B_2B^c_3|T) + Pr(B_1B^c_2B_3|T) + Pr(B^c_1B_2B_3|T)).

because B^c_1B_2B_3B_1 is impossible. Here is Stove's big move. He states

Pr(B_1B_2B^c_3|T) = Pr(B_1B^c_2B_3|T) = Pr(B^c_1B_2B_3|T);

but *also*

0 < Pr(B_1B_2B^c_3|T) < 1.

Thus because of the symmetry of individual constants, the statistical syllogism is deduced. The 2/3 probability follows from the labels, here the names, being "exchangeable" with respect to T.
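The last arithmetic step can be spelled out: if the three terms share one common value p, and p is strictly between 0 and 1, the ratio is 2p/3p and the p cancels. Here is a small sketch of that step. The tuple encoding of (B_1, B_2, B_3) and the placeholder value p = 1/8 are my assumptions for illustration; any p with 0 < p < 1 gives the same answer, which is the point.

```python
from fractions import Fraction
from itertools import product

# Each constituent is a triple of truth values for (B_1, B_2, B_3).
# H = "just two are black": keep the triples with exactly two True entries.
constituents = [c for c in product([True, False], repeat=3) if sum(c) == 2]

# Stove's symmetry step: all three constituents get one common value p,
# with 0 < p < 1. The value 1/8 is an arbitrary placeholder.
p = Fraction(1, 8)

numerator = sum(p for c in constituents if c[0])  # constituents where B_1 holds
denominator = sum(p for c in constituents)        # every constituent of H

pr_b1_given_h = numerator / denominator
print(pr_b1_given_h)  # 2/3
```

If p were allowed to be 0, the denominator would vanish and the ratio would be undefined, which is why the strict inequality matters.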


Briggs, I've been a pilot since 1973 and can assure you that the airport indications (ICAO, IATA) are FOUR symbols, not three. The first symbol is a letter which gives the country code. Mainland US is K, Canada is C, and so on. The remaining three can be any combination of letters and numbers. And some smaller airports only use two.

> gamma functions

No need for the gamma function in this case. We can extend the definition of n! to n = 0 using (n+1)! = n! * (n + 1): setting n = 0 gives 1! = 0! * 1, and 0! = 1 immediately follows.