My friends, as arcane as all this seems, we will need this in the Restoration. We need to restore Reality in all things, including Science.
Video
Links:
Bitchute (often a day or so behind, for whatever reason)
HOMEWORK: With B = M red balls, N total balls, and so N - M white balls, and with R_i = red drawn on the i-th draw, and L = "At least one red in the k previous draws", what is Pr(R_{k+1}|BL)?
Lecture
I still have the idea I am rushing. I’m peeling off less and less per lecture, forgoing most of the math on the assumption that those who want it can look it up in the text or written lecture. Our concern is not how to do proofs anyway, which I view as purely mechanical. My goal is to teach you the why, not the how of the math. In a physical setting, I’d make you do the math. Here, I can’t. I can only hope you see the point of it.
Even so, I still think I’m cramming a lot into a short period of time. I don’t have the idea I’m conveying the deeper ideas. Yet if you’re following, then it’s okay. Let me know.
Let’s recall what we did! We deduced, from first principles, that probability can be math; we deduced its functional form. We deduced it can be represented by numbers, and we discovered, via more deduction, what these numbers can be. With that, we deduced our first formal model, the hypergeometric model.
The model’s premises are (using Jaynes’s notation) B = “A device will and must record one of two states, R or W, success or failure, on or off, red or white, or whatever you like as long as it is dichotomous, and the only thing we know is that there are N of these records, M of which are R/success/on/red. And we will read off these states n at a time (draw balls from a bag).” From that B we deduce N - M are W/failure/off/white. See that “only”? If I could come through the screen and bang you over the head with it, I would. Failure to remember it leads to the Deadly Sin of Reification.
“But how do we know this model B applies to this real-life thing I have in front of me?” How indeed! That is an entirely different and separate and not-the-same-at-all question as “What is the probability the i-th record is R/success/on/red given we have observed n = r + w previous states?” (for any n, r, and w combo given the obvious constraints, e.g. r <= M). We haven’t come close to learning how to judge how this model works with real things. We have come to the knowledge of how to answer questions like this, though.
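Here is a sketch of the kind of answer such questions have (the function name and the toy numbers are mine, not from the lecture): for the next draw, once r reds and w whites have been seen, the probability deduces to the proportion of reds remaining in the bag.

```python
def pr_next_red(N, M, r, w):
    """Pr(next state is R | B, and r reds + w whites already seen).

    For draws without replacement this is just the proportion of
    reds left in the bag: (M - r) / (N - r - w).
    """
    n = r + w
    assert 0 <= r <= M and 0 <= w <= N - M and n < N
    return (M - r) / (N - n)

# Toy urn: N = 10 balls, M = 4 red; after seeing 1 red and 2 whites,
# 3 reds remain among 7 balls:
print(pr_next_red(10, 4, 1, 2))  # 3/7, about 0.4286
```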
What should at this point be perfectly clear is that everything has been deduced. We are working with logic. There is no subjectivity in any of this, except in the formation of B and in the choice of propositions of interest (POI), such as A = “Asparagus will cost X next week.” What is the probability of A given B? There is none at all, unless we add the tacit, or rather implicit, premise that A is contingent, in which case Pr(A|B) is in [0,1].
That’s a silly POI, isn’t it? But we can do it. As they used to say, it’s a free country. Well, things change.
Most people would pick a POI like A = “The state is R”, with n = 0 (no draws yet observed). Or whatever. This A sounds like something B is probative of, as indeed it is, because it speaks of the same things B speaks of.
Trivial point? Well, we shall see in time.
Now I like to write Pr(A|B) because everything is laid out in plain English, there is no ambiguity, which makes making mistakes more difficult. But it’s a pain in the keister to do math with, because it’s clunky. So we use shorthand, which is both dandy and fine—as long as we remember it is shorthand! Alas, carried away in their enthusiasm for mathematics (easy to do), some see life emerge from the symbols. The math becomes Reality. People begin to say “The probability of A” and not “The probability of A given B”. The Deadly Sin of Reification has happened. You might not believe me. Yet. You will.
(If you are at Substack you might want to surf over to my main site where the math is pretty. There is no current way to do inline LaTeX in Substack.)
In any case, the hypergeometric says:

Pr(r R’s in n draws | B) = (M choose r)(N - M choose n - r) / (N choose n).
We multiply and divide the right hand side by 1/N, and then let N → ∞, M → ∞ with M/N → f. You can read Jaynes Chapter 3 for the math (or just do it; it’s easy). Afterwards we get

Pr(r R’s in n draws | B) = (n choose r) f^r (1 - f)^(n - r),
which is called the binomial or binomial “distribution”. A distribution simply gives you the probability of everything that can happen, given B. Given B. Given B. Given B. Get it? If you’re like most, you’ll forget it.
Just as many forget that this binomial is only an approximation to the hypergeometric distribution. Approximation. Approximation. A mighty fine one, too, when N is large next to the number of draws n: say, at least 10 or 20 times larger. It stinks below that. Don’t use it.
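To see both the convergence and the warning in numbers, here is a quick sketch in Python (the function names and test values are my own choices, not from the lecture): hold f = M/N fixed and let N grow, and watch the hypergeometric probability settle onto the binomial.

```python
from math import comb

def hypergeom_pmf(r, n, N, M):
    """Pr(r R's in n draws | B): drawing without replacement
    from N balls, M of which are red."""
    return comb(M, r) * comb(N - M, n - r) / comb(N, n)

def binom_pmf(r, n, f):
    """The large-N limit with f = M/N held fixed."""
    return comb(n, r) * f**r * (1 - f)**(n - r)

n, r, f = 5, 2, 0.4
for N in (10, 100, 10_000):   # M/N = 0.4 in every case
    M = int(f * N)
    print(f"N={N:>6}: hypergeometric={hypergeom_pmf(r, n, N, M):.4f}, "
          f"binomial={binom_pmf(r, n, f):.4f}")
# N=10 gives 0.4762 vs 0.3456: the approximation stinks.
# N=10,000 gives agreement to three decimal places.
```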
Maybe you see the looming problem, the little item responsible for so much damaged thinking in science. That f. Just what is it?
Well, it’s the limit of M/N as N → ∞! Uh huh. That’s the math. But what is it? Can you really be so bold as to think you can assign probabilities to infinite numbers of items? Infinite. Do you know how big that is? I bet you do not.
We are too glib with infinity. It’s too damned easy to write. Let’s think about it. You know things like 10^2 = 100 and 10^3 = 1000. These are dinky numbers. To write bigger ones we use notation like tetration, which is iterated exponentiation. One way to write them is (^3)10 = 10^10^10 = 10^(10,000,000,000) (a tower of three 10s; the 3 is supposed to be a superscript in front of the 10, not behind it, but I cannot discover how Substack does superscripts, if it does them at all). That’s a mighty number. Know how far from infinity it is? An infinite distance.
Let a = (^3)10, then compute (^a)10. Can you imagine it? I cannot. Know how far from infinity it is? An infinite distance. Then let b = (^a)10 and compute (^b)10. Easy to write! Impossible to understand. And that doesn’t even come close to infinity.
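If you doubt how fast this runs away, here is a toy calculation (my sketch, not part of the lecture): a tower of two 10s fits on one line, while a tower of three 10s is already too big to write down at all.

```python
# Tetration gets out of hand immediately.
two_tower = 10 ** 10          # (^2)10 = 10,000,000,000
print(f"(^2)10 = {two_tower:,}")

# (^3)10 = 10**(10**10) has 10**10 + 1 digits -- roughly ten
# gigabytes just to print -- so we only report its size:
digits_in_three_tower = 10 ** 10 + 1
print(f"(^3)10 has {digits_in_three_tower:,} digits")
```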
Yet not only do we say we can assign probabilities to these kinds of numbers, we say we can do it for an infinite-sized urn. And not only that, we do it for infinities that are infinitely larger than these puny counting infinities. One marvels at our impudence.
We will meet these topics again and again.
Subscribe or donate to support this site and its wholly independent host using credit card click here. Or use the paid subscription at Substack. Cash App: $WilliamMBriggs. For Zelle, use my email: matt@wmbriggs.com, and please include yours so I know who to thank.
B is the following:
M = number of red balls
N = total number of balls
k = number of draws before draw k+1
L = at least one red ball drawn in the previous k draws
R_{k+1} = red ball on draw k+1
P(no red balls in k draws|B) = (N-M choose k)/(N choose k)
= (N-M)(N-M-1)…(N-M-k+1) / [N(N-1)…(N-k+1)]
P(L|B) = 1 - P(no red balls in k draws|B) = 1 - (N-M)(N-M-1)…(N-M-k+1) / [N(N-1)…(N-k+1)]
Now Bayes gives P(R_{k+1}|LB) = P(L|R_{k+1}B) P(R_{k+1}|B) / P(L|B)
We know P(L|B) from the above, and P(R_{k+1}|B) = M/N by symmetry: with no other information, every ball is as likely as any other to show on draw k+1.
But P(L|R_{k+1}B) does not equal P(L|B): if draw k+1 is known to be red, the first k draws come from the remaining N-1 balls, which still hold all N-M whites, so P(no red in k draws|R_{k+1}B) = (N-M choose k)/(N-1 choose k), which is bigger than (N-M choose k)/(N choose k). Putting it together,
P(R_{k+1}|LB) = (M/N)[1 - (N-M choose k)/(N-1 choose k)] / [1 - (N-M choose k)/(N choose k)],
which is a touch less than M/N: knowing at least one red is already out of the bag makes a red on draw k+1 slightly less likely.
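As a check (a brute-force sketch; the function name and the small values N = 6, M = 2, k = 3 are arbitrary choices), enumerating every equally likely ordering of the balls confirms the closed form, and shows the answer sits below M/N:

```python
from math import comb
from fractions import Fraction
from itertools import permutations

def pr_red_next_given_L(N, M, k):
    """Exact Pr(R_{k+1} | LB) by brute force: list every distinct
    (equally likely) ordering of M reds (1) and N-M whites (0),
    keep those with at least one red in the first k draws, and
    count how often draw k+1 is red."""
    balls = [1] * M + [0] * (N - M)
    kept = [p for p in set(permutations(balls)) if any(p[:k])]
    return Fraction(sum(p[k] for p in kept), len(kept))

N, M, k = 6, 2, 3
print(pr_red_next_given_L(N, M, k))               # 1/4 by brute force

# Closed form from the Bayes derivation above:
num = Fraction(M, N) * (1 - Fraction(comb(N - M, k), comb(N - 1, k)))
den = 1 - Fraction(comb(N - M, k), comb(N, k))
print(num / den)                                  # 1/4 again; M/N = 1/3
```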
Great stuff William! Love every second. It complements my 5 decades of misunderstanding (frequentist) probability well. HAHA. Honestly, I had a solid year of graduate-level probability theory and not once heard about any of this.