Class 28: Randomization Redux (Randomization…

William M Briggs

Nov 11, 2024

Jaynes’s book (first part): https://bayes.wustl.edu/etj/prob/book.pdf


9 Comments

Paul Fischer

Nov 11, 2024

Come on Briggs! You already proved Splenetic Fever may be controlled by Broccoli.

BROCCOLI REDUCES THE RISK OF SPLENETIC FEVER! THE USE OF INDUCTION AND FALSIFIABILITY IN STATISTICS AND MODEL SELECTION William M. Briggs

Oh my I bet the Medical School loved you for this paper.


Paul Fischer

Nov 11, 2024

Great use of Bayes — nicely done! I see where you're coming from now, and I appreciate your perspective. Back in the day, we insisted on using randomization. The idea was that by randomizing, we'd get a sample that represented the population we were studying. But as I think about it now, what we were really doing was a form of control. So whether it's control or representation — potato, potahto — it got us where we needed to be.


PE Bird

Nov 11, 2024

Similar to Adderall, Profital requires 2 ll's.


Paul Fischer

Nov 15, 2024

Here's my proof for the necessity of random sampling.

Forgive the use of frequentist arguments, William, but I had to use them.

I set out to prove that a stratified sample must use random sampling within each stratum to ensure it reflects the characteristics of the underlying population.

Definitions

Stratified Sampling: The population is divided into distinct subgroups or strata based on shared attributes (e.g., age, income, gender). A sample is then drawn from each stratum.

Random Sampling within Strata: Within each stratum, individuals are selected using a random sampling process.

I will show that if the selection within strata is not random, the sample can be biased, and the resulting stratified sample may not represent the population accurately.

Notation

Let P be the entire population.

Let P1,P2,…,Pk be the strata, with Pi representing the i-th stratum.

Let Si be the sample drawn from stratum Pi.

Let S be the combined stratified sample S=S1∪S2∪⋯∪Sk.

Let A be an attribute of interest (e.g., income level).

Let Pr(A∣Pi) be the proportion of individuals with attribute A in stratum Pi.

Let Pr(A∣Si) be the proportion of individuals with attribute A in the sample Si.

Proof

Random Sampling within a Stratum:

When we use random sampling within a stratum Pi each individual has an equal chance of being selected. The expected proportion of individuals with attribute A in the sample Si will match the proportion in the population stratum Pi: E[Pr(A∣Si)]=Pr(A∣Pi).

Thus, when the sample sizes are large enough, Pr(A∣Si) will converge to Pr(A∣Pi) by the law of large numbers.
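The expectation claim above can be checked exactly on a toy case. The following is a minimal sketch, assuming a made-up stratum of ten people (three with attribute A) and simple random sampling without replacement; the names `stratum`, `sample_size`, and `proportion` are illustrative and not part of the original argument.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical stratum: 1 = has attribute A, 0 = does not.
stratum = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]   # Pr(A|Pi) = 3/10
sample_size = 4

def proportion(values):
    """Exact share of individuals with attribute A."""
    return Fraction(sum(values), len(values))

# Simple random sampling without replacement makes every subset of
# size `sample_size` equally likely, so the expectation is a plain
# average of the sample proportion over all possible samples.
samples = list(combinations(stratum, sample_size))
expected = sum(proportion(s) for s in samples) / len(samples)

print(expected, proportion(stratum))
```

Averaging over every possible sample, rather than simulating draws, keeps the check exact: the two printed fractions agree.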

Non-Random Sampling within a Stratum:

Suppose instead that the sampling within a stratum is non-random (e.g., I select only women).

This introduces selection bias. The sample Si overrepresents women, so Pr(A∣Si)≠Pr(A∣Pi) whenever attribute A is distributed differently among women than in the stratum as a whole.

The issue is that non-random sampling changes the expected attribute distribution in Si. The sample proportion Pr(A∣Si) will be biased, leading to: E[Pr(A∣Si)]≠Pr(A∣Pi).
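A small worked example of the women-only selection rule may help. This is a sketch with invented counts, in which attribute A happens to be rarer among women than among men; the record layout and the names `stratum`, `women_only`, and `proportion_with_a` are assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical stratum of 100 people: (sex, has_attribute_A) records.
stratum = (
    [("F", True)] * 10 + [("F", False)] * 40 +
    [("M", True)] * 30 + [("M", False)] * 20
)

def proportion_with_a(people):
    """Exact share of the given people who have attribute A."""
    return Fraction(sum(1 for _, has_a in people if has_a), len(people))

# Non-random rule: only women are eligible for the sample.
women_only = [person for person in stratum if person[0] == "F"]

stratum_rate = proportion_with_a(stratum)    # 40 of 100
women_rate = proportion_with_a(women_only)   # 10 of 50
print(stratum_rate, women_rate)
```

The gap between the two rates exists only because the invented data tie A to sex. If A were spread identically across men and women, the women-only rule would not shift the proportion.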

Impact on the Combined Stratified Sample:

For the combined stratified sample S, the overall proportion of attribute A is: Pr(A∣S) = ∑_{i=1}^{k} Pr(A∣Si)·ni/n, where ni is the size of Si and n is the total sample size.

If Pr(A∣Si)≠Pr(A∣Pi) due to non-random sampling, then Pr(A∣S) will generally not match the population proportion: Pr(A∣S) ≠ ∑_{i=1}^{k} Pr(A∣Pi)·Ni/N, where Ni is the size of stratum Pi in the population and N is the total population size.

Conclusion:

If the sampling within each stratum is not random, then Pr(A∣Si) is biased for each i, and the combined sample S will not accurately represent the population in terms of attribute A.

Therefore, to obtain an unbiased and representative stratified sample, we must use random sampling within each stratum Pi. This ensures that E[Pr(A∣Si)]=Pr(A∣Pi) and, when each stratum's share of the sample equals its share of the population (ni/n = Ni/N), that E[Pr(A∣S)] = ∑_{i=1}^{k} Pr(A∣Pi)·Ni/N, which is the population proportion.
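The role of the two sets of weights, ni/n for the sample and Ni/N for the population, can be seen in a short calculation. This is a sketch with invented stratum sizes and within-sample proportions; the dictionary keys and function names are hypothetical.

```python
from fractions import Fraction

# Hypothetical strata: population size N_i, sample size n_i, and the
# within-sample proportion Pr(A|S_i) observed in each stratum.
strata = [
    {"N": 6000, "n": 100, "p_sample": Fraction(1, 5)},
    {"N": 4000, "n": 100, "p_sample": Fraction(1, 2)},
]

def pooled_proportion(strata):
    """Pr(A|S): weight each stratum by its share of the sample, n_i/n."""
    n = sum(s["n"] for s in strata)
    return sum(s["p_sample"] * Fraction(s["n"], n) for s in strata)

def population_weighted(strata):
    """Weight each stratum by its share of the population, N_i/N."""
    N = sum(s["N"] for s in strata)
    return sum(s["p_sample"] * Fraction(s["N"], N) for s in strata)

pooled = pooled_proportion(strata)
weighted = population_weighted(strata)
print(pooled, weighted)
```

Here the strata are sampled equally (100 each) although they differ in size, so the pooled figure and the population-weighted figure disagree. They coincide when the allocation is proportional, that is, when ni/n = Ni/N for every stratum.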


Paul Fischer

Nov 15, 2024

William, I went through your proof and summarized it as follows:

This proof attempts to demonstrate that randomization in a sample does not alter our knowledge of an individual's hidden attribute, assuming the setup described.

The Argument

1. Notation:

o Ai: An individual having attribute i.

o Di: The distribution or description of the actual sample of individuals with attribute i.

o R: Information about the randomization mechanism.

o Gi: Indicator of an individual being in Group i (e.g., Group 1 or Group 2 after coin flip).

2. Initial Claim:

Pr(Ai∣Di)=pi

This states the probability of an individual having attribute i given the sample information is pi, based on the sample itself.

3. Randomization:

o The randomization process does not depend on individual attributes but assigns individuals to groups with a certain probability (coin flip 50-50 split).

o The claim is that the probability a person has attribute 1 given they are in Group 1 and under randomization is: Pr(A1∣G1D1R)

o The main claim is that this is equivalent to: Pr(A1∣D1)

4. Proof Outline:

o We have the identity: Pr(AG∣DR)=Pr(A∣GDR)Pr(G∣DR)=Pr(G∣ADR)Pr(A∣DR)

o Solve for Pr(A∣GDR):

Pr(A∣GDR)=Pr(G∣ADR)Pr(A∣DR)/Pr(G∣DR)

o Argue that Pr(G∣ADR)=Pr(G∣DR) because the randomization does not depend on the attribute A.

o Thus: Pr(A∣GDR)=Pr(A∣DR)

o We conclude Pr(A∣GDR)=Pr(A∣D) because R does not affect the probability of attribute A. QED.
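The algebra in the outline can be checked numerically. The sketch below assumes a made-up value for Pr(A∣D) and a fair coin for group assignment, and it builds in the premise that the coin ignores the attribute, so it illustrates the identity rather than proving the premise. The names `p`, `coin`, `joint`, and `pr_a_given_g` are hypothetical.

```python
from fractions import Fraction
from itertools import product

p = Fraction(3, 10)     # hypothetical Pr(A|D)
coin = Fraction(1, 2)   # Pr(G=1|DR): a fair coin that ignores A

# Joint probability of (a, g) given D and R. Because the coin does not
# look at the attribute, the joint is the product of the two marginals.
joint = {}
for a, g in product([0, 1], repeat=2):
    pr_a = p if a == 1 else 1 - p
    pr_g = coin if g == 1 else 1 - coin
    joint[(a, g)] = pr_a * pr_g

def pr_a_given_g(g):
    """Pr(A=1|G=g, D, R) computed from the joint table."""
    return joint[(1, g)] / (joint[(0, g)] + joint[(1, g)])

print(pr_a_given_g(1), pr_a_given_g(0), p)
```

Learning which group the coin assigned leaves the probability of the attribute where it started, for either group.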

Is the Proof Legitimate?

The proof's reasoning is sound under the conditional independence assumption that the randomization mechanism R does not influence the attribute A directly. Here are some key aspects that support this:

1. Conditional Independence:

o The crucial step in the proof relies on Pr(G∣ADR)=Pr(G∣DR), which asserts that randomization (coin flip in this situation) is independent of the attribute A once you know the randomization mechanism R. Obviously true.

o This is obvious if the randomization is done in ignorance of the attributes.

2. Interpretation of R:

o R is not the actual group assignment but the mechanism by which individuals are randomized. This distinction is critical because R represents our knowledge about the procedure, not the outcome.

o This allows the simplification Pr(A∣DR)=Pr(A∣D) because knowledge about the randomization process R does not alter the probability of A.

3. Conclusion Validity:

o The final conclusion Pr(A∣GDR)=Pr(A∣D) logically follows if the randomization mechanism does not influence the underlying probability of attribute A, which is a core premise in experimental design.


Reply (1)

Leon Voß

Dec 5, 2024

You summarized it? You write a lot like GPT


Paul Fischer

Nov 12, 2024

Sorry about the post about Profital. I thought I had it in the correct place. Obviously not. So, you read my post about this randomization business, and I do appreciate the use of Bayes, but I still feel we are talking about two very different things. In the context I was referring to last time we corresponded, the task at hand was to obtain a representative sample. I don't care about attributes; I care about representing the population. So in that context, what we were doing is exactly what you presented here. Okay, you don't buy it. I'll prove it to you. Off to the chalkboard! This could take a while; I'm rusty.


Terry Oldberg

Nov 11, 2024

The measure of an event is not necessarily a probability (https://search.yahoo.com/search?p=christensen+and+reichert&fr=yfp-t&fr2=p%3Afp%2Cm%3Asb&fp=1), but Jaynes assumes it to be a probability in his book!


Reply (1)

Paul Fischer

Nov 11, 2024

Nor should it be.


Science Is Not The Answer
