27 Comments

Not knowing a damned thing about statistical analysis (despite your best efforts, Mr. SGT Briggs), I'm gonna go with a simple "study bad." Looks like they did lots of data manipulation (not distinguishing U from N in the initial analysis, smoothing, modeling, etc.), so I imagine it would be difficult to conclude anything with certainty. Thus, my "study bad" conclusion.


When I was choosing a career all those years ago I wanted to be a statistician but I had too much personality so I became a tax accountant.

Just kidding.

“Any kid that had any kind of malady was excluded.“

And

“From all this the authors conclude that there is a higher risk of malady…”

Call me stupid, but this seems rather problematic, as they say in corporate-drone speak these days.

I’m going with a choice you did not give: “useless waste of time and (presumably) taxpayer money”.


This is a key problem, yes.


I'm neither a trained statistician nor a scientist, but the number of steps, the confounding involved, and the uneven sample sizes make it scream "mystery meat study" to me. The U sample is too small, we don't know enough about how many got sick in either group, a lot of modeling was used, and so on. It just seems like they tried to make something out of nothing.

The scientific equivalent of ultra-processed food, somehow getting from pig carcass to Spam can.


That no one was sick is, I think, key.


I was reading the sniffles and stuff as being weak symptoms but I suppose that's not being sick, that's just life. Sickness is transient, sniffles are eternal.


I'm always concerned about medical studies because there are just too many uncontrolled variables in the complex design of the human organism and its heredity to reach definitive conclusions.


Sorry, Briggs, but I didn't make it past the 8th bullet point. I expect to receive only partial credit for showing my work, if any credit.

For #1, you have a possibly-unrepresentative sample by looking at just the healthy kids (selection bias).

For #2, you measure a surrogate marker of a thing, rather than measuring the real thing.

For #3, a small sample size of U destroys hopes for precision in the estimate (too much wiggle-room; see the sketch after these points).

For #4, non-random "re-sampling" without explaining the who, what, when, where, why, how.

For #7, my understanding is that the regression was a "two-block time-series" with a join-point at P (checking for a slope change after a critical event), which could be too sensitive to people's baselines.

For #8, the multiple regression model made is one of a thousand different ones that could get made.
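To put numbers on the wiggle-room in #3, here is a minimal Python sketch. Only the group sizes 19 and 79 come from the study paraphrase; the ~30% marker rate is an invented assumption purely for illustration.

```python
# Minimal sketch of the "wiggle-room" in #3: exact (Clopper-Pearson) 95%
# confidence intervals for a simple binomial proportion at the study's two
# sample sizes. The ~30% rate is invented purely for illustration.
from scipy.stats import beta

def clopper_pearson(k, n, alpha=0.05):
    """Exact confidence interval for a proportion from k successes in n."""
    lo = beta.ppf(alpha / 2, k, n - k + 1) if k > 0 else 0.0
    hi = beta.ppf(1 - alpha / 2, k + 1, n - k) if k < n else 1.0
    return lo, hi

for n in (19, 79):            # group U vs group N
    k = round(0.3 * n)        # pretend ~30% showed the marker
    lo, hi = clopper_pearson(k, n)
    print(f"n={n:2d}: {k}/{n} -> 95% CI ({lo:.2f}, {hi:.2f}), width {hi - lo:.2f}")
```

The n = 19 interval comes out roughly twice as wide as the n = 79 one, and both are wide: that is the wiggle-room.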


All good points, of which I think #2 is the most important. As we'll see.


I got a research induced headache reading through this. :)


Good thing I didn't have anything important scheduled for the next few hours!


"From all this the authors conclude that there is a higher risk of malady from lower levels of B caused by the bad thing, regardless of number of procedures."

This is about the modification of the body: children are more likely to suffer the bad thing if they do not take in the synthetic, unnatural thing, which modifies their blood (their body) in such a way that maybe some of the modified children are statistically less likely to suffer B.

Is the modification of the body by the unnatural thing (drug, procedure, whatever) permanent? That's important for the unknowable long-term effects.

Let me give a concrete example: children diagnosed with leukemia. The unnatural thing in this case protects some of the kids with a potential for being diagnosed with leukemia. But let's say that when they reach age 30 they develop brain cancer as a consequence of U. So the medical and political intervention in the health decisions of people may save some lives now, at the expense of losing lives later. We cannot know whether a number of 30-year-olds with brain cancer from this treatment will produce more suffering for other people (their children, perhaps?) or more economic damage in general. Maybe it's beneficial in economic terms that those adults die then, because the ratio of the estimated economic loss of losing them young over losing them at 30 favors the intervention. This can only be guesstimated later, after the fact. Which is no way to make policy now.

With these data alone and these conclusions, I don't think I would recommend the U treatment. Not even in the case of a contagious disease. I would encourage people to make their own decisions and bear the consequences, good or bad. Anything else is Communistic, in my opinion.

If the numbers were more skewed in favor of the U intervention to prevent the bad thing, and if the bad thing were nasty or very deadly, and a lot of people were susceptible, then I would, perhaps, start singing "Soyuz nerushimyy respublik svobodnykh" and recommend intervention U to people. But if the numbers were very favorable, and assuming no error or cheating, then most people, without compulsion, nudge, bribe, or threat, would naturally use the unnatural intervention to alter their children's bodies. Secure and big benefits are very persuasive and the best stimulus to action.


This is a good analysis about U. It doesn't quite fit here, though, and for reasons you couldn't know. But very nice analysis just the same.

What you couldn't know is the essence of U. When you hear, you will not be happy.


I will not be happy but I'll manage to learn something new. I'm waiting to read more about this challenge.


I occasionally give a talk in which I illustrate how hopeless it is to extract anything meaningful (a binomial proportion, for example) from a sample size of 100. And this study even divides the sample in two, and transcends from observables to functions of differences of observables. No, thanks.

Regarding study design, however, the most serious misstep is step 1.

In the diagram, it seems to me that the whole idea of a slope change is due to a single data point. At the 99% confidence level (remember, there are fewer than 100 data points), the confidence interval should be huge.
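On the single-data-point worry, a minimal sketch. The data are synthetic and the join-point at x = 50, the noise level, and the size of the stray point are all assumptions, since we don't have the real series; the point is only how fragile a two-segment fit can be.

```python
# Sketch: a two-segment ("joinpoint") fit whose apparent slope change hinges
# on a single observation. All data are synthetic; the join-point location
# and noise level are invented.
import numpy as np

rng = np.random.default_rng(1)
x = np.arange(100, dtype=float)
y = 5.0 + rng.normal(0, 0.5, size=100)   # a truly flat series
y[-1] += 8.0                             # one stray point at the end

def segment_slopes(x, y, P=50):
    """Least-squares slope on each side of the join-point P."""
    left = np.polyfit(x[x < P], y[x < P], 1)[0]
    right = np.polyfit(x[x >= P], y[x >= P], 1)[0]
    return round(left, 4), round(right, 4)

print("with the stray point:", segment_slopes(x, y))
print("stray point removed: ", segment_slopes(x[:-1], y[:-1]))
```

The "slope change after P" appears and disappears with that one observation.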


This kind of thing is like putting lipstick on a pig and telling people to marry the pig. The undercover pig is merely an ideological preference of politicians playing doctor. In the best-case scenario, they only seek money.


That change in slope is another key. Why there? Very odd.


Before looking at other commenters or doing any original research:

The first problem is that the analysis does not seem to have been pre-designed. They looked at the data, then decided how to model it, then tweaked the model based on the results, then seemed to tweak it again based on those results. A good analysis would determine its steps and parameters before looking at the data.

Second is what I call 'model stacking', which has the issue Briggs likes to bring up of not 'carrying forward the uncertainty'. They build models based on other models. Briggs doesn't present any error bars, but I presume that with realistic error bars available we would find that all results are within the range of measurement error. (See the sketch after this list.)

Third, my personal pet peeve in medicine is recommending interventions based not on patient outcomes but on a lab-value proxy. None of the participants in the study got whatever condition B was supposed to be a proxy for. Presumably the larger number of N suggests that N is less likely to get the condition, since they were both 0, but nobody really knows. Is 0 greater than 0?

Fourth, since no one got the condition, whatever else we conclude, we have to conclude that the study did not enroll enough participants. It is 'underpowered'.

Fifth, the lack of homogeneity and seemingly spontaneous decisions about doing blood draws calls everything into question.

I don't see this study as providing any useful information to anyone.
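On the model-stacking point, one way to carry the uncertainty forward is to bootstrap the whole pipeline instead of treating stage 1's output as fixed data. A minimal sketch with invented numbers; x, b, and z merely stand in for the study's variables, and this is not a claim about how the authors actually modeled anything.

```python
# Sketch of carrying uncertainty forward: stage 2 regresses on stage 1's
# fitted values, so a naive stage-2 standard error treats those fitted
# values as fixed data. Bootstrapping the WHOLE pipeline does not.
# All data invented.
import numpy as np

rng = np.random.default_rng(0)
n = 98                                        # roughly the study's size
x = rng.normal(size=n)
b = 2.0 + 1.0 * x + rng.normal(0, 2.0, n)     # noisy blood marker
z = 0.5 * b + rng.normal(0, 2.0, n)           # outcome tied to the marker

def stacked_slope(x, b, z):
    """Stage 1: fit b ~ x. Stage 2: fit z ~ fitted(b). Return stage-2 slope."""
    bhat = np.polyval(np.polyfit(x, b, 1), x)
    return np.polyfit(bhat, z, 1)[0]

point = stacked_slope(x, b, z)
boot = []
for _ in range(2000):
    i = rng.integers(0, n, n)                 # resample cases, refit BOTH stages
    boot.append(stacked_slope(x[i], b[i], z[i]))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"stacked slope {point:.2f}, pipeline bootstrap 95% CI ({lo:.2f}, {hi:.2f})")
```

The bootstrap interval reflects the estimation error in stage 1 that a naive stage-2 standard error simply ignores.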


That's a Bayes' theorem problem. It is the best route to go when you have a prior hypothesis.

P(A|B) = P(A) * P(B|A) / P(B)

Which tells us: how often A happens given that B happens, written P(A|B),

When we know: how often B happens given that A happens, written P(B|A)

and how likely A is on its own, written P(A)

and how likely B is on its own, written P(B)

Now it's a matter of interrogating the problem and plugging the values in.
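For instance, a minimal plug-in sketch. None of these rates come from the study; every number is made up to show the arithmetic.

```python
# Bayes' theorem plug-in with invented numbers:
#   A = "child develops the malady", B = "low level of marker B".
p_a = 0.01            # P(A): base rate of the malady (made up)
p_b_given_a = 0.60    # P(B|A): low marker given malady (made up)
p_b = 0.20            # P(B): overall rate of low marker (made up)

p_a_given_b = p_a * p_b_given_a / p_b
print(f"P(A|B) = {p_a_given_b:.3f}")   # 0.030 -- still a small risk
```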


Any study which does not produce results showing clear, strong correlations for preselected end points is at best worthless, and more commonly is an exercise in manipulation.

Drawing any sort of curve through the shotgun blast data graph you presented is ridiculous.

I suspect this study was conceived to lend the fading credibility of "science" to something repulsive.


According to the information presented in the study paraphrase, I would assert that the study's quality can be deemed subpar or below average, yet not entirely unsatisfactory:

Matching the two groups U and N based on essential demographic variables, such as age, weight, and parents' age, enhances the internal validity when comparing these groups.

However, vital information about the participants remains absent, such as the specific number of individuals in each group who displayed anomalous blood findings or inflammation. On quantity, too, the analysis could be affected by the substantial difference in sample sizes between group U (only 19 children) and group N (79 children).

Acquiring a breakdown of the groups would have significantly enhanced the potential for thorough analysis. Reasons for excluding some kids and selectively re-testing only some are unclear.

Utilizing a regression model for data analysis, while accounting for possible confounding factors such as procedures and blood markers, is statistically valid. However, the lack of disclosure regarding the raw data hinders the ability to confirm the findings through external validation.

The extensive dispersion and inadequate measures of fit for the regression models raise questions about whether the models effectively captured the relationship between the variables.

Given the missing particulars from the primary data and the shortcomings of the regression models, the assertion that group U surpasses group N appears unsubstantiated.

The terms "malady" and "bad things present in the blood" are imprecisely defined, lacking any established reference standards. However, I presume that this choice was made in order to provide a broader framework for the study.

Researchers had prior knowledge of the group identities during the subsequent analysis, thereby creating a possibility for biased outcomes.

However, taking into consideration factors that may obscure the results, such as procedures and blood markers, is a scientifically rigorous approach.

Robustness is enhanced through the use of various model specifications and data subsets as part of a sensitivity analysis (a minimal sketch follows at the end of this comment).

Including divergent perspectives on the potential consequences of negative events can enhance complexity and depth of understanding.

Providing the factual narrative (while ensuring confidentiality through anonymization) enhances clarity and openness.

There are several aspects that could have been enhanced in order to improve the study's effectiveness and quality:

Furnish additional information on the criteria used to select participants and the methodology employed for sampling to ensure the equivalence of groups.

Justify any exclusions made after data collection with explicit reasoning.

To enhance the efficacy and resilience of the analyses, consider a larger sample size, particularly for the smaller group U.

Employ a pre-analysis plan to mitigate possible biases arising from knowledge of group affiliations during modeling and analysis.

Take into account supplementary variables such as socioeconomic indicators, behavioral patterns, and other relevant factors that may impact the results.

Enhance the validation of regression models through comprehensive evaluations encompassing the appropriateness of fit, analysis of the impact of outliers/leverage points, and examination of model assumptions.

Adopt more cautious interpretations considering the existing constraints and refrain from making unsupported assertions regarding the ramifications for policy.
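As one example of the sensitivity analysis mentioned above, a minimal sketch that refits the same outcome under every covariate specification and reports how the group coefficient moves. All data, variable names, and effect sizes here are invented stand-ins for the study's variables.

```python
# Sketch of a specification sensitivity check: fit the same outcome under
# several covariate sets and watch how the coefficient on 'group' moves.
# All data are invented; columns stand in for the study's variables.
import itertools
import numpy as np

rng = np.random.default_rng(2)
n = 98
group = rng.integers(0, 2, n)                    # U vs N indicator
covars = {"age": rng.normal(size=n),
          "weight": rng.normal(size=n),
          "procedures": rng.poisson(2, n).astype(float)}
y = 0.2 * group + 0.5 * covars["age"] + rng.normal(size=n)

for r in range(len(covars) + 1):
    for spec in itertools.combinations(covars, r):
        X = np.column_stack([np.ones(n), group] + [covars[c] for c in spec])
        coef = np.linalg.lstsq(X, y, rcond=None)[0][1]   # coefficient on group
        print(f"{spec or ('none',)}: group effect = {coef:+.3f}")
```

If the group effect swings noticeably across specifications, the single reported model was one choice among many, which is exactly the worry.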


Excellent analysis, as you will learn.

I could not mention what the bad things are. If I did, well, you'll see.


I have to say I am baffled by the description of this study, so I'm answering from the gut. But I did get that they are running a regression on a model, which sounds to me like a regression on a regression. I've done a lot of regressions in my career, and each regression is an opportunity to overfit or introduce artifacts into the data, so stacking a regression on a regression sounds like a procedure unlikely to produce insightful results.
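A minimal sketch of one such artifact, assuming nothing about the real study: smooth pure noise (a moving average standing in for any first-stage model), then run a naive regression on the smoothed output and count how often the slope looks "significant".

```python
# Sketch: stacking a regression on a model's output can manufacture
# "significance" from pure noise. Smooth white noise, regress the smooth
# against time with a naive t-test, and count nominal-5% rejections.
# All numbers invented.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
t = np.arange(100, dtype=float)
sxx = np.sum((t - t.mean()) ** 2)

def naive_slope_p(y):
    """p-value for slope ~ 0, pretending the points are independent."""
    slope, intercept = np.polyfit(t, y, 1)
    resid = y - (slope * t + intercept)
    se = np.sqrt(np.sum(resid ** 2) / (len(t) - 2) / sxx)
    return 2 * stats.t.sf(abs(slope / se), df=len(t) - 2)

for label, f in [("raw noise", lambda e: e),
                 ("smoothed noise", lambda e: np.convolve(e, np.ones(15) / 15,
                                                          mode="same"))]:
    rej = sum(naive_slope_p(f(rng.normal(size=100))) < 0.05 for _ in range(1000))
    print(f"{label}: rejected in {rej}/1000 nominal-5% tests")
```

The smoothing manufactures autocorrelation, and the naive second-stage test mistakes it for signal: raw noise rejects near the nominal 5%, smoothed noise rejects far more often.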


What I mean to say is my gut says this study is probably not producing useful results.


I agree this indicates they made an assumption and then expressed the assumption as a conclusion. It reminds me of the Mayo Clinic's statin decision guide, which I was looking at recently. You enter some numbers and it estimates your risk of a major acute cardiovascular event in the next ten years if you don't and if you do take a statin. I clicked through to the study upon which they built their predictive model, and the study said they couldn't treat statin therapy as a variable in the model because their data sample didn't include enough people who took statins and/or they didn't have good info on whether they took statins. How does the Mayo Clinic have the gall to use such a study as the basis for advising people on the risk reduction they will get if they take a statin? It's garbage.


This seems like a bad study. Excluding children with maladies, the very thing they allegedly wished to test for, is problematic. The statistical methods look overcomplicated and fishy, at least to my untrained eye. As well, are we certain that B is even a real problem? With no information on what B is, we cannot tell for certain.
