This week I really could have benefited from seeing people’s reactions. Ah, well.
HOMEWORK It is true that “If p then q entails If not-q then not-p”. Find a p and q where this works in Reality, and another p and q where it fails.
Lecture
This is Chapter 2 of Uncertainty. All the references have been removed.
“All cats are creatures understanding French,” said Alice’s father. “And some chickens are cats.”
“Wait, I know!” said Alice, chirruping. “That means that some chickens are creatures understanding French.”
“What you said is true, my dear,” said Alice’s father, his voice full of pride.
What Alice said was true in the conditional sense that given or accepting or conditional on the evidence or premises or observations “All cats are creatures understanding French and some chickens are cats” then “Some chickens are creatures understanding French” logically follows. The conclusion is conditionally true.
Of course, Charles Dodgson knew, and we all know, that there are no chats qui comprennent le français, and that being so it cannot be true that des poulets comprennent le français. Which is to say we know these propositions are false. How? Because all evidence we have of our feline friends insists none understand French. Cats are diffidence personified and refuse familiarity with any language save their own.
So the proposition “Some chickens are creatures understanding French” is both true and false. There is no contradiction. It is true based on one set of premises, false on another. Logic says so. Logic is the science or study of connections or relations between propositions, and to say an argument is true or false is to speak of the relation and not strictly of the propositions, thus when any proposition in an argument changes, the relation is liable to morph, too. The relation between Alice’s evidence and the proposition is therefore true, and the relation between our observational evidence and the proposition is therefore false.
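The relation itself can even be checked mechanically. Here is a brute-force sketch in Python (purely illustrative; the three-individual domain and the predicate names are my own) that searches every tiny “world” for one in which Alice’s premises hold but her conclusion fails, and finds none:

```python
from itertools import product

# Brute-force sketch (purely illustrative): search every tiny
# three-individual "world" for one in which Alice's premises hold
# but her conclusion fails.
n = 3
countermodels = 0
for bits in product([False, True], repeat=3 * n):
    cat, chicken, french = bits[:n], bits[n:2 * n], bits[2 * n:]
    all_cats_french = all(french[i] for i in range(n) if cat[i])
    some_chickens_are_cats = any(cat[i] and chicken[i] for i in range(n))
    some_chickens_french = any(chicken[i] and french[i] for i in range(n))
    if all_cats_french and some_chickens_are_cats and not some_chickens_french:
        countermodels += 1

print(countermodels)  # 0: no world makes the premises true and the conclusion false
```

(The empty search illustrates the relation over a toy domain; it is not itself a proof of the entailment, a point the section “Logic is Not Empirical” below makes sharp.)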
Dodgson, writing as Lewis Carroll, said his propositions were:
so related that, if the first two were true, the third would be true. (The first two are, as it happens, not strictly true in our planet. But there is nothing to hinder them from being true in some other planet, say Mars or Jupiter—in which case the third would also be true in that planet, and its inhabitants would probably engage chickens as nursery-governesses. They would thus secure a singular contingent privilege, unknown in England, namely, that they would be able, at any time when provisions ran short, to utilise the nursery-governess for the nursery-dinner!)
Since probability, the main focus of this book, is the continuation, fullness, or completion of logic, and since logic is the study of relations between propositions, therefore probability is also the study of relations between propositions. Note carefully: between. With that conclusion we are done: we have everything we need to know; this is the complete “theory” of probability and statistics. The rest is mere detail.
Language
Aristotle (again) says our knowledge of truth begins in sense experience. But not everything we know, or that is true, can be sensed, except in the weakest form of that term, by which “sensed” means the workings of our thought process, which can be felt not as muscle movement or nerve excitement but as mental images and exertions and so forth. There are three acts of the mind: apprehension, judgment, reasoning. We need to understand each, at least broadly.
Apprehension is learning the content of each argument. We first need to apprehend the nature of each word and also the grammar in which propositions are written. Ambiguities are more than possible, especially when asking questions. For example, how happy are you? On a scale of one to eleven-point-four, of course, in units of the seventh root of π (numbers make this science). There are many who find this question comprehensible. I do not. Happiness we can grasp, but arbitrarily indexing it to a number just so it can be manipulated by well-loved equations I do not follow. This question and its multitudinous cousins are responsible for a great deal of scientism and over-confidence. About these subjects, and about ambiguity, more later.
Every term, every universal, has extension. Tree is a term, individual trees are its extension. Happy is a term, individual instances, mental states of persons, are its extension. Every term also has intension (not intention) or comprehension. Intensions are those attributes or qualities that make up the notion of the term. These are important probabilistically. We have a way of speaking which indicates universality which does not follow strict rules for syllogisms, but which nevertheless conveys truth. For example, when we say “Men are taller than women” we do not imply that the shortest man is taller than the tallest woman. Instead we mean it is in the nature or essence of men to be taller than women, a truth conditional on extensive observation and the induction to the generalization. That men are taller than women is what we expect to find (I use this word in its plain English connotation). This is also to speak probabilistically: the sentence implies there is some high but unquantified chance that any given man is taller than any given woman. Stereotypes also follow these rules. Steven Goldberg notes that most stereotypes are true in this probabilistic sense, but that people’s conception of why they are true is often at fault. It is the why that is a major concern to us in this book.
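The probabilistic reading can be made vivid with a small simulation sketch in Python; the height distributions below are hypothetical, invented only for illustration:

```python
import numpy as np

# Hypothetical height distributions (cm); the numbers are invented
# for illustration only, not measured data.
rng = np.random.default_rng(1)
men = rng.normal(175, 7, size=100_000)
women = rng.normal(162, 6, size=100_000)

# "Men are taller than women", read probabilistically: the chance a
# randomly chosen man is taller than a randomly chosen woman.
print((men > women).mean())  # roughly 0.92 under these assumptions
```

A high but not certain chance, which is just what the plain-English generalization conveys.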
The second act of the mind is judgment. Our concern is with propositions, sentences which can be true or false or somewhere in between. In logic we move from one, or one set, of evidentiary propositions to a conclusion, which is another proposition. In strict logic there is the idea that the conclusion is derived or deduced, and while this happens in formal cases, in reality it is we who specify the evidentiary propositions, also called premises or evidence or data, and we who specify the conclusion or proposition of interest. This freedom is what gives rise to the fallacy of “subjective” probability, as we’ll see.
The last act of the mind is reasoning, the activity that separates men from brutes. For instance, we reason (here, a verb) that “P is not not-P” and that tautologies are truths. Tautologies have a special place in logic and in probability. Examples: “If it is raining, it is raining”, “You either have cancer or you don’t”, “All gloomy people are gloomy”, or the old classic “All bachelors are unmarried men.” These propositions are all necessarily true, given our innate knowledge of logical rules and of the words themselves, given, that is, our understanding of the nature of logic and the intension of the terms. Since tautologies are necessarily true, they add nothing to any argument. What good and what insight, after all, is there in telling a woman, “You either have cancer or you don’t”? None, of course. And nothing is added if you change the tautology to “You either have breast cancer or you don’t.” Yet that tautology is suddenly seen full of fearful importance. (Switch “breast” with “prostate” for men.) And that is because people understand the grammar to mean more than it strictly implies. Words matter.
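Why a tautology adds nothing can be shown by brute force: it is true in every row of its truth table, so it never eliminates a possibility. A minimal sketch in Python:

```python
# A tautology is true under every assignment, so learning one rules
# nothing out and adds nothing to an argument.
for p in (True, False):
    assert p or not p          # "You either have cancer or you don't"
    assert not (p and not p)   # no proposition is both P and not-P
print("True in every row: nothing eliminated, nothing learned.")
```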
Logic is Not Empirical
Much of this section is a paraphrase from David Stove’s magnum opus The Rationality of Induction, a neglected, or rather unknown, work in probability.
We cannot know all logical truths empirically. For example, there is no way to learn through observation the validity of the argument “‘James is a man and Harry is a man’ entails ‘James is a man’.” We can of course observe the maleness of either individual, but we cannot witness the entailment, that which makes the argument true. Neither can we observe that “‘X is a man and Y is a man’ entails ‘X is a man’”, because witnessing each and every X and Y is impossible. Neither is it true that “‘James is a man and Harry is a man’ entails ‘James is a man’” because “‘X is a man and Y is a man’ entails ‘X is a man’”; it is true all on its own and not because it is part of some schema or formal theory. It is not our organizations of logic that make true statements true: they are true on their own merits.
A matter of supreme importance can be teased from this. Here is a proof that we must come by knowledge that cannot be acquired solely by experience. The knowledge alluded to here is of the rules of logic, the very steps in reasoning, how we know when something is true or false.
This example is also from Stove (modified slightly). In order to know via experience the validity of (say) the schema A = “For all x, all F, all G, either ‘x is F and all F are G’ is false, or ‘x is G’ is true”, we could make observations like O_1 = “David is bald and David is a person now in this room and all persons in this room are bald.” But in order to get from O_1 to A; that is, to know A is necessarily true, we have to already know that O_2 = “O_1 confirms A”, and that is to have non-empirical logical knowledge. Or you could insist that O_2 was learned by experience, but that would require knowing some other logical knowledge, call it O_3, which somehow confirms O_2. And then there would have to be some O_4 which somehow confirms O_3, and so on. There cannot be an infinite regress—the series must stop somewhere, at a point where we just know (my guess is O_2)—so we must, are forced to, rely on induction (which is examined next chapter) to supply us with things like O_2.
This isn’t all. We can learn from observation that the following argument is invalid: “‘All men are mortal and David is mortal’ therefore ‘David is a man’” if perchance we see David is not a man (maybe he’s a puppy). And we can learn from observation the invalidity of “‘All men are mortal and Peter is mortal’ therefore ‘Peter is a man’” only if we see Peter is not a man (maybe he’s a cow). But we cannot learn the invalidity of “‘All men are mortal and X is mortal’ therefore ‘X is a man’” through observation, because we would have to measure every imaginable X, and that’s not possible. If we believe “‘All men are mortal and X is mortal’ therefore ‘X is a man’” is unsound, and it surely is, this belief can be informed by experience, but it cannot be solely from experience that we have knowledge of it.
Stove: “If an argument from P to Q is invalid, then its invalidity can be learnt from experience if, but also only if, P is true and Q is false in fact, and the conjunction P-and-not-Q, as well as being true, is observational. This has the consequence, first, that only singular judgments of invalidity can be learnt from experience; and second, that very few even of them can be so learnt.” And here’s the kicker: “If the premise P should happen to be false; or the conclusion Q should be true; or if the conjunction P-and-not-Q is not observational but entails some metaphysical proposition, or some scientific-theoretical one, or even a mere universal contingent like ‘All men are mortal’: then it will not be possible to learn, by experience, the invalidity of even this particular argument” (pp. 155–156). The key is that “scarcely any of the vast fund of knowledge of invalidity which every normal human being possesses can have been acquired from experience.”
Examples? The invalidity of the argument “Given ‘The moon is made of cheese’ therefore ‘Cats do not understand French’” cannot be learned from experience. Neither can “Given ‘Men can breathe underwater unaided’ therefore ‘The atmosphere is largely transparent to sunlight’”. In neither case can we ever observe the conjunct P-and-not-Q. Yet we know these are invalid. Why? Induction again.
We often in mathematics invoke the continuum, the infinity of numbers on the “real line”, or of different kinds of infinities. None of these are ever observed and can never be observed, yet no mathematician doubts their truth. These and various other “puzzles” are solved by induction, a highly misunderstood concept, as we’ll see.
Syllogistic Logic
Mathematical propositions are highly formal. It would not do, for example, to claim this mathematical equation is universally true: $AB = BA$. The proposition is sometimes true, for instance when $A$ and $B$ are finite natural numbers, but it is in general false when $A$ and $B$ are matrices. Mathematicians in their proofs are thus sticklers for detail. Limitations and constraints on their propositions are laid out with excruciating rigor. Indeed, rigor is a high compliment. Sometimes these efforts are beautiful; sometimes they are ponderous. Only rarely does somebody catch out an error in a proof, and when it happens it is usually not because of a miscalculation but because a constraint not thought necessary actually was. As a result of this vigorous scrutiny, people trust mathematicians when they say something is so. But we must never forget that their proofs are true in relation to the premises used. And those premises are true only because of earlier premises in the chain of the proof, and so on down to the axioms which everybody believes true conditional on their intuitions (induction). This is what makes for a necessary truth.
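The $AB = BA$ example is quick to verify; a sketch in Python using numpy:

```python
import numpy as np

# Numbers commute: AB = BA.
a, b = 3, 5
assert a * b == b * a

# Matrices, in general, do not.
A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
print(A @ B)  # [[2 1], [4 3]]
print(B @ A)  # [[3 4], [1 2]]
assert not np.array_equal(A @ B, B @ A)
```

The restriction on what $A$ and $B$ are allowed to be does all the work.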
Besides the usual mathematical expressions found in analysis, number theory, and the like, there is a formalization of logic which has various names like symbolic or mathematical logic and propositional or sentential calculus. These fields belong properly to mathematics, though they provide useful results to syllogistic logic, which is our main interest. Syllogistic or Socratic logic is meta or street logic, arguments as they are used in life and in science and statistics or in assessing the value of the more formalized logics. It is always there, the bulwark of everything else. Mathematical logics are no different than other mathematical subjects: proofs are given with meticulous assiduity paid to constraints on the symbols used, indeed to the very languages used, languages which (oft times) resemble actual speech not at all. But since we need ordinary words to have real arguments, we need to grasp the limitations, fallibilities, and the ultimate strengths of Socratic logic.
Syllogistic, two-valued, Aristotelian, plain-words logic is employed when philosophers attempt to prove the superiority of other logics, to describe the usefulness and necessity of mathematical logics, and even to explain why syllogistic logic is not to be preferred. Even in Principia Mathematica, the book which taught us (eventually) 1+1=2, Alfred North Whitehead and Bertrand Russell were obliged to use plain language to describe what their symbols meant.
Language and not mathematics is the tool we’re stuck with and which we must use to express ideas, such as certainty and uncertainty. If you disagree, write me a letter stating why. Syllogistic logic is written in ordinary language, which is always and necessarily found at the start and finish of any argument or analysis, including scientific analyses. It’s there when we tell our audience what the results mean and what we should do about them. Syllogistic or meta-logic is needed in mathematics, too, especially in the branch known as applied mathematics. This is when mathematical, which is to say purely metaphysical, ideas are applied to real life contingent, i.e. physical, processes.
Every time an equation is called in to support, say, how much weight a bridge can hold, syllogistic logic comes into play. The equations used in support of engineering have no meaning by themselves; they must be given meaning by us. The arguments which support these labellings are difficult and drawn out, but they are all examples of syllogistic logic. Since these arguments involve physical principles, i.e. contingent events, the end results are always at best conditional truths, and sometimes only probable truths. And occasionally even nonsense. I have seen equations applied to human behavior, usually in economics or social science, all mathematically spotless, which when applied to real people are gibberish. The problem is that people commit the simple fallacy, “Since the equations are mathematical truths, the objects to which they are said to apply must be those objects, and therefore the predictions and theories which gave rise to the mathematics are true.” We’ll meet these fallacies later when we discuss models.
Besides, if we opt for symbolic logic we’re apt to take perfectly understandable propositions like “Socrates is wise” and turn them into curiosities like this: $(\exists x)(Sx \& (y)(Sy \leftrightarrow y = x) \& Wx)$, an example from Ode2005, p. 191. As useful as this sort of thing is for understanding the fine shadings of mathematical logic, it is a positive bar to the clear understanding of real problems.
Syllogisms
There isn’t much point rehearsing the kinds of syllogisms, enthymemes, major and minor premises, barbara, celarent, and other staples of logic. These are all too well known, and there are many texts which do a superior job. Even high schoolers still know that given “No academics have a sense of humor and all teachers are academics” that it is conditionally true that “No teachers have a sense of humor.” No: what we have to understand is what kind of truths syllogisms give us.
As the last example showed, syllogisms can give us conditional truths. The premise—written as one single premise, cobbling the major and minor premises together, which presents no difficulty—is known not to be true in your author’s case. The proposition “No teachers have a sense of humor” is, with respect to that evidence, universally false. Thus it is false and true simultaneously, depending on which set of premises one chooses to believe or employ. But don’t forget: any set of premises includes tacit knowledge of the meaning of the words used. This is inescapable.
Some syllogisms give universal truths. Given “All men are mortal and you, the reader, are among the race of mankind” then the proposition “You are mortal” is necessarily true because the premises are known, through a chain of sound observation and argument, to be necessarily true. Don’t wait forever to make out your will.
As said in Chapter 1, in ordinary speech we’ll say the conclusions of both arguments are “true”, which can be harmless because most people take the point. But since language can be ambiguous, we have to take care to say just what we mean when speaking formally, or when discussing sensitive topics like politics. An example in line with our ultimate goal: given “Some systems of government are stultifying and all stultifying systems of government are deadly” then it is true that “Some systems of government are deadly.” The proposition is therefore true and not probable, though it carries the sound of probability because of that “some”.
Later we will learn that given “The rules of logic and mathematics and accepting this and such evidence which is probative of Y” the propositions “It is probable that Y”, “Y has a $p$% chance”, “Y might be true” and the like are themselves true. But they are only conditionally true, because although the rules of logic and mathematics are necessarily true, the evidence probative of Y will not be. It is we who specify that evidence, picking from a veritable infinite universe of evidence.
For this reason many, but not all, probability statements are conditional truths. An example: X = “Given certain evidence, the chance of Y is $p$%”. X itself, assuming the tacit premise of no miscalculation or other mistake in applying the evidence to Y, is necessarily true. But “The chance of Y is $p$%” is only conditionally true, based on that certain evidence.
One logical tidbit which is awfully useful: Any valid argument from P to Q is unchanged if a necessary truth is added to the list of premises P. That is, P to Q is identical in truth value with P & T to Q, where T is a necessary truth (this is like multiplying an algebraic equation by 1). This applies to syllogisms and probability arguments. Another: it is impossible to deduce necessarily true consequences from contingent premises. Believing the opposite is like trying to support the earth with turtles.
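The first tidbit can be checked over the four rows of the truth table. A minimal sketch in Python, with the necessary truth T represented as a column that is always True:

```python
from itertools import product

def implies(p, q):
    # the material conditional of formal logic
    return (not p) or q

T = True  # a necessary truth takes the value True in every row
for p, q in product([True, False], repeat=2):
    # adding T to the premises changes nothing, like multiplying by 1
    assert implies(p, q) == implies(p and T, q)
print("P to Q and P & T to Q agree in every row.")
```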
Informality
Because street logic is informal it is not possible to constrain the reach and type of propositions used. Anything goes. This freedom, as all freedoms do, comes at a price. Each argument must be judged on its merits; judged individually, I mean. Truth tables, proof by parallel argument, similarity to a set of symbols said to represent arguments of this or that schema, and the like are therefore not useless but are of limited applicability.
Stove defines formality in an argument as when “it employs at least one individual variable, or predicate variable, or propositional variable, and places no restriction on the values that that variable can take”. Stove claims that “few or no such things” can be found. This will be useful for us to recall when discussing the hideously complex regression models that are much in fashion in some circles. The so-called rule of transposition is an example of what formality in logic might look like. The rule is: the proposition “If p then q” entails “If not-q then not-p” for all p and for all q. This is formal in the sense that we have the propositional variables p and q for which we can substitute actual instances, but for which there are no restrictions. If Stove is right, then we should be able to find an example of formal transposition that fails.
First a common example that works: let p = “There is food” and q = “I can eat”, then “If p then q” translated is “If there is food I can eat”. By transposition, not-q = “I can’t eat” and not-p = “There is no food”, thus “If not-q then not-p” translates to “If I can’t eat, there is no food.” For an example in which formal transposition fails (this is Stove’s, too), let p = “Baby cries” and q = “We beat him”, thus “If p then q” translates to “If Baby cries then we beat him”. Heartless; but logic is a hard taskmaster. But then by transposition, not-q = “We do not beat Baby”, not-p = “He does not cry”, thus “If not-q then not-p” translates to “If we do not beat Baby then he does not cry.” And this is obviously false.
So we have found an instance of formal transposition that fails, which means logic cannot be “formal” in Stove’s sense (I do not intend to give a full proof of this here). It also means that theorems which use transposition in their proofs will have instances in which those theorems are false if restrictions are not placed on their variables—it’s the restrictions that are important. It’s actually worse than this, because transposition is logically equivalent to several other logical rules, putting those theorems in jeopardy. Those who prove theorems are, however, usually very careful detailing restrictions and, as said, those theorems found to have failed usually suffered from incomplete or improper restrictions.
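For contrast, under the mathematician’s restriction, where “if p then q” is read as the material conditional (false only when p is true and q false), transposition never fails; a four-row enumeration confirms it. Stove’s Baby counterexample arguably trades on the causal “if” of ordinary speech, which lies outside that restriction. A minimal sketch in Python:

```python
from itertools import product

def implies(p, q):
    # the restricted, material-conditional reading of "if p then q"
    return (not p) or q

for p, q in product([True, False], repeat=2):
    assert implies(p, q) == implies(not q, not p)
print("Transposition holds in all four truth-table rows.")
```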
It is Stove’s contention that all logical forms have examples where the logic is turned on its head, as with transposition, unless, as in formal mathematics and mathematical logic, restrictions are in place. As said above, it is not a universal truth that $AB = BA$. But it is when we add the restriction to (say) natural numbers. This means we have to be very careful in saying what the precise conditions and limitations of our models are.
Fallacy
Not all fallacies are what they seem. Given “All dogs have four legs and Iris has four legs” it does not follow that “Iris is a dog”, not because of some formula or schema like “‘X is F’ does not follow from ‘All F are G and X is G’”, but because it might be that Iris is a cow or some other creature with four legs. It is because we can summon evidence about the range of these alternate possibilities that the truth of the proposition is in question. Symbols in formulae and so forth are scratches on a page and take no meaning until we supply one, thus symbols or schema can’t be true or false. Only arguments can.
To avoid fallacy, we always must take the information or evidence supplied as given and concrete, as sacrosanct, even, just as we do in any mathematical problem. Therefore we accept arguendo that “All dogs have four legs”, even though by our observation we know some do not (never play fetch near a highway). We know of the four-leggedness of dogs because that is their nature, just as we know three-legged dogs are deficient or are suffering from a privation.
It is not true that “Iris is a dog” given “All dogs etc.” But that does not imply the logical negation of our proposition, i.e. it might be so that Iris is, in fact, a dog. Another way of putting it is that given “All dogs etc.” it is not false that “Iris is a dog”. The status of our proposition, conditional on the only evidence we have, is murky. Relying solely on dichotomous logic leaves us hanging. The logical status of “Iris is a dog” is not empty, though, and it would be a mistake to think it was. Here we know Iris has four legs; dogs do, too. Iris therefore might be a dog. We just don’t know for certain she is. The ambiguity drives us to probability, where we complete our understanding of the proposition. There is no fallacy in the argument unless somebody insists or implies it is certain Iris is a dog.
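The murky status of “Iris is a dog” can be displayed by enumeration. In the toy sketch below (Python; the menagerie is invented purely for illustration), every case consistent with the premises is collected and the conclusion checked in each:

```python
# Invented menagerie, purely for illustration.
legs = {"dog": 4, "cat": 4, "cow": 4, "chicken": 2}

possible = set()
for iris in legs:
    # premises: all dogs have four legs, and Iris has four legs
    if legs["dog"] == 4 and legs[iris] == 4:
        possible.add(iris == "dog")

print(possible)  # contains both True and False: the premises leave it open
```

Both True and False survive: the evidence neither proves nor refutes the proposition, and that is precisely the gap probability fills.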
A depressing but far from unknown fallacy goes: “If X then not-P, P, therefore not-P.” The variable X is a cherished theory or belief, usually a popular or faddish theory of human behavior, but also perhaps a physical theory which has powerful interests, and P any proposition about that behavior or a state the theory says cannot happen. After P is observed to occur, the theory X would appear to be in difficulty. But human ardency is infinitely malleable, especially if X is the creation of its believer. P is denied, or perhaps the No True Scotsman is invoked for P, or an R is invented such that the original argument is modified to “If X-and-R then P or not-P, R, etc.”, where R is an excuse; anyway, X survives. This will be important when we discuss falsifying models.
Another popular fallacy in studies which use statistics is the peer-review or credential fallacy, which appears in several forms, as variants of the Appeal to Authority and Genetic fallacies. The most common is in journal writing, where an author will write, “Jones (1999) showed that X”, with the implication (at least sometimes) that X is therefore true because “Jones (1999)”. The reference is offered as sufficient proof, especially if the journal in which Jones’s work appeared is prestigious. Usually authors will clump a dozen or so references as a useful summary, and this can be a move to bludgeon the reader into submission. If several authors write on a doubtful proposition, the mass of citations is often taken as proof. Hence intellectual fads have strong inertia in our publishing age. Physicians also sign their names on papers as “John Smith, M.D.” for the same reason. Members of the public, though very often academics, too, especially on political subjects, will refuse to listen to an argument of an opponent unless it first be ensconced in a “peer-reviewed” publication. These are obvious and perennial fallacies, but still unfortunately persuasive. Since they have been with us forever, it is rational to conclude (via induction, which we discuss next chapter) that they always will be.
There are many other fallacies, which will be dealt with in turn, when they are more specifically tied to certain statistical procedures, such as the epidemiologist fallacy. Formal fallacies, broken syllogisms and the like, are easy to spot, and when they are, a necessary truth has been discovered, which is the complement of the fallacy. Formal fallacies aren’t especially rare, either, and are found with increasing frequency the further one gets from mathematics. Particle physicists, say, generate few formal fallacies, but literature and social science professors are positively bursting with them. We’ll examine the more common of these in due course.
Perhaps the worst fallacy is the We-Have-To-Do-Something fallacy. Interest centers around some proposition Y, about which little is known. The suggestion will arise that some inferior, or itself fallacious, method be used to judge Y because action on Y must take place. The inferiority or fallaciousness of the method used to judge Y will then be forgotten. You may think this rare. It isn’t. An entire branch of statistics, hypothesis testing, is built around this fallacy. We come to this in time.
"We-Have-To-Do-Something fallacy." One of my favorites. No, we don't. First off there is no we, second no you don't. You are trying to co-opt permission to do your will. Mostly that attitude fulfills the form of caring but almost never the deed of caring.
If ALIVE then BREATHING
If not-ALIVE then not-BREATHING
Works (assuming you are not on a lung machine and actually brain dead)
If WOKE then STUPID
If not-WOKE then not-STUPID
Fails (alas)
What you need to find in the not-q then not-p is a q that can only be found in p.