If All Models Only Say What They're Told To Say, Can We Learn From Them? Mask Model Example
From reader Goncalo Dias comes this excellent question:
I have just sat through your seminar on (super-)models. [LINK]
I have a quick question: if you assume some fluid mechanical properties and some extensive properties of fluids, virus particles, etc, and run a calculation about propagation, even though everything is in the assumptions+data+model, would you say that I could find out whether masks work? Would it be untangling the knowledge from the premises to the conclusion?
Rephrasing: would it in this case be said that I could learn something from models?
To re-re-reiterate, yes, all models only say what they're told to say. Which is not intrinsically good or bad.
Now even though this is so, it doesn't mean the modeler knew everything he told the model as a whole to say. Even though he told every part of the model what to say. In other words, the model may be so large and complex that the modeler cannot foresee every possible output given every possible input.
So, as you suggest, he might very well understand how the various parts of the model work together, how one stage causes changes in the next, by varying the inputs and studying the model's innards and outputs. And thus learn from the model.
Like you say, you could have a large, sophisticated model of "fluid mechanical properties and some extensive properties of fluids, virus particles", air transport, mask type and permeability, moisture content of the air and mask, viral load, state of disease (which is related to virus shedding), test used to characterize disease (false positives and negatives accounted for; Ct level in PCR tests, etc.), mask cleanliness, hours masks are worn, locations where masks are worn, how far apart people were and the population densities of these places at the various times when masks were on and off, and so on and so forth.
This model requires a ton of inputs, all of which must be assumed. The model, as said, can be run under various assumptions, and you can study its results. Let us suppose you discovered that when the moisture content rises a certain known amount, ceteris paribus, model-based mask efficacy (judged by infection rate, or eventual death, or whatever else you pick) decreases by such-and-such a level.
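Here is a minimal sketch of what that kind of experiment looks like. Everything in it is invented for illustration: the functional forms, the constants, the very idea that wetness multiplies permeability. It is an assumption machine, which is the point.

```python
# Toy "mask efficacy" model: every functional form and constant below is
# an assumption chosen for illustration only, not a measured physical law.

def infection_rate(moisture, viral_load, hours_worn, permeability):
    """Hypothetical model-based infection rate per contact.

    All inputs are assumed quantities:
      moisture      -- mask moisture content, 0 (dry) to 1 (saturated)
      viral_load    -- relative shedding of infected contacts
      hours_worn    -- hours the mask is worn between changes
      permeability  -- baseline fraction of particles the mask passes
    """
    # Assumption: a wet mask passes more particles, and wear time
    # degrades filtration.
    effective_perm = permeability * (1 + 0.8 * moisture) * (1 + 0.05 * hours_worn)
    return min(1.0, viral_load * effective_perm)

# Vary moisture, ceteris paribus, holding every other assumption fixed.
for moisture in [0.0, 0.25, 0.5, 0.75, 1.0]:
    rate = infection_rate(moisture, viral_load=0.3, hours_worn=4, permeability=0.1)
    print(f"moisture={moisture:.2f} -> model infection rate {rate:.3f}")
```

The output will dutifully show efficacy falling as moisture rises, because the `0.8 * moisture` term put it there, whether or not the modeler remembered putting it there. In a model with hundreds of such terms, he usually won't.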
Now even though the model was bound to say that, because you made it say that, it doesn't mean you understood you were making it say that. Because, again, of the complexity.
Yet this does not prove that masks work when dry. Nor does it prove masks don't work when wet. Because both of these, as it turns out, are assumptions, or directly deducible consequences of the assumptions you made.
In order to prove masks work, as you say they work, you have to duplicate, in real life, the same set of assumptions you made, and check the actual outcome (infection rate, deaths, or whatever). If these match, you have evidence your model works with this set, and this set only, of assumptions. Obviously, you have to check the full range of possible assumptions against reality to prove you have a good model.
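A sketch of that checking step, again with invented numbers (the prediction, the observation, and the tolerance are all hypothetical; the logic of the comparison is the point):

```python
# Hypothetical check of one assumption set against one real-world outcome.
model_prediction = 0.042   # model's infection rate under assumption set A (invented)
observed_rate = 0.039      # rate actually measured under conditions matching A (invented)
tolerance = 0.005          # how close counts as "matching" -- itself a judgment call

if abs(model_prediction - observed_rate) <= tolerance:
    print("Evidence the model works -- for assumption set A only.")
else:
    print("Model fails under assumption set A; revise the assumptions.")
```

Pass or fail, this says nothing about assumption sets B, C, and the rest; each has to be checked against reality in its turn.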
Even so, this is not necessarily proof you have identified all the proper causes.
There are all kinds of reasons your model could work well even though you have misunderstood the causes. For one example, some of the causal connections in your model might be downstream or upstream proxies of real causes. Or you may have identified an acausal correlation (long-time readers will recall the people-dying-by-strangulation-in-bedsheets and GDP example). And so on.
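How easy it is to manufacture an acausal correlation of the bedsheets-and-GDP kind can be seen in a few lines. The two series below are random trends with no causal connection whatever; both merely drift upward over time:

```python
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(2000, 2020)

# Two causally unrelated series that both happen to trend upward.
gdp_like = 100 + 3.0 * (years - 2000) + rng.normal(0, 2, years.size)
bedsheet_deaths = 20 + 0.5 * (years - 2000) + rng.normal(0, 1, years.size)

r = np.corrcoef(gdp_like, bedsheet_deaths)[0, 1]
print(f"correlation = {r:.2f}")  # typically 0.95 or higher, yet neither causes the other
```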
So in order to get at real causes, you also have to rule out other possible causes of the observed results. What might these be? No universal list exists. After all, if we knew all possible causes, we wouldn't have to model. We'd know the causes. Even if we think we have listed all possible causes, we might have missed some.
Getting at cause is not easy, and becomes harder---and more tedious and expensive---the more complex the situation.
This is why people like statistical shortcuts. They believe if the model passes a "test", by comparing it favorably to reality, then they don't have to bother checking further for causes. Oh, sure, everybody knows these correlational tests don't prove causation---it's a common adage after all---but everybody believes that, for them, correlation is causation.
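A toy version of the shortcut's failure mode: below, a hypothetical proxy X predicts the outcome Y beautifully, and would sail through any correlational "test", because a hidden common cause Z drives both. Intervene on X and nothing happens to Y. All variables and numbers here are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hidden common cause Z drives both the proxy X and the outcome Y.
z = rng.normal(0, 1, 1000)
x = 2 * z + rng.normal(0, 0.1, 1000)   # X: a proxy of the cause, not a cause
y = 3 * z + rng.normal(0, 0.1, 1000)   # Y: the outcome of interest

print(f"corr(X, Y) = {np.corrcoef(x, y)[0, 1]:.3f}")  # near 1: the 'test' passes

# Now intervene: force X to a new value, leaving the real cause Z untouched.
x_forced = np.full(1000, 5.0)
y_after = 3 * z + rng.normal(0, 0.1, 1000)  # Y doesn't budge: X never mattered
print(f"forced X to {x_forced.mean():.1f}: mean Y before {y.mean():.3f}, after {y_after.mean():.3f}")
```

The correlational test passes and the policy built on X fails, which is the shortcut in miniature.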