There is nothing magical or mystical about simulations, and “randomness” has nothing to do with any of them. Let’s remove the unrealistic thinking from simulations by understanding them for what they really are.
Uncertainty & Probability Theory: The Logic of Science
Video
Links: YouTube * Twitter – X * Rumble * Bitchute * Class Page * Jaynes Book * Uncertainty
HOMEWORK: You must look up and discover a Research Shows paper and see if it conforms to the conditions given below. Have you done it yet?
Lecture
Besides this excerpt from Chapter 6 of Uncertainty, you must read the Chaitin paper “Randomness and mathematical proof”.
One thing has to be admitted outright: that no field invents more tantalizing (and marketable) terms than computer science. Machine learning, neural nets, universal approximators, genetic algorithms, artificial intelligence, fuzzy logic, and the list goes ever onward. The promises implied in these phrases are inspiring. Machines that learn! Algorithms that figure out any problem with no human intervention! That a method fails to live up to its pledges is never remembered in the rush to embrace the next.
The empiricist bias of the methods is never noticed, either. Of course, computers work with “data”, with empirical renderings of one kind or another, so that methods that are entirely empirically based are to be expected in practice. But we have already seen that some inductive reasoning extends beyond the empirical. Though many would argue the point, computers cannot do what rational minds can, for instance in induction-intellection. Recall in Chapter 3 that Groarke said induction-intellection provides the “Abstraction of necessary concepts, definitions, essences, necessary attributes, first principles, natural facts, moral principles.” Though data to start these induction-intellections comes from empirical senses, it immediately (and instantaneously) extends beyond the empirical to the universal, to what can never be empirically verified. Computers, since they cannot think in this way, cannot do this.
Induction-intellection, induction-intuition, induction-argument, induction-analogy, i.e. the bulk of inductive reasoning, lie beyond the ability of any formal algorithm. Computer methods will thus never be a panacea for creating knowledge. Induction-probability, however, is ripe for the picking, at least insofar as the kinds of propositions in which we have an interest are observable, which we understand by now isn’t always the case. The propositions of science are in large part empirical, so it is here we expect information theory and computer science to play the largest role.
One of the architects of information theory was Ray Solomonoff. His classic paper “A Formal Theory of Inductive Inference. Part I” purports to be an existence proof for probabilities for “all problems in inductive inference”, where he uses the term induction in its induction-probability sense, and where he does not appear aware that other senses of induction exist. Neither is the field to this date aware, as far as I can tell. In this paper (p. 16) he says his model—where he uses “model” in the sense of his scheme for computing probabilities and not in the sense used by statisticians—accounts for new observations in some sequence in an “optimum manner” (pp. 16–17):
By “optimum manner” it is meant that the model we are discussing is at least as good as any other model of the universe in accounting for the sequence in question. Other models may devise mechanistic explanations of the sequence in terms of the known laws of science, or they may devise empirical mechanisms that optimally approximate the behavior and observations of the man within certain limits. Most of the models that we use to explain the universe around us are based upon laws and informal stochastic relations that are the result of induction using much data that we or others have observed. The induction methods used in the present paper are meant to bypass the explicit formulation of scientific laws, and use the data of the past directly to make inductive inferences about specific future events.
It should be noted, then, that if the present model of the universe is to compete with other models of the universe that use scientific laws, then the sequence used in the present model must contain enough data of the sort that gave rise to the induction of these scientific laws.
The laws of science that have been discovered can be viewed as summaries of large amounts of empirical data about the universe. In the present context, each such law can be transformed into a method of compactly coding the empirical data that gave rise to that law.
The hope is that an automatic method to discover all scientific “laws” (a term I discuss next Chapter) is on the horizon. All we need is data of sufficient length, a computer powerful enough to hold it, and his algorithm which “automatically” applies probabilities, and all knowledge will be ours. But since “laws”, such as they are, involve understanding causality, nature, and essences, and these acts of understanding are provided by inductions of forms other than induction-probability, this is a false hope.
What is to be applauded in Solomonoff’s reasoning is his emphasis on prediction. He is not in the least interested in parameters, the obsession of statisticians, but only in what past observations have to say about future (rather, those not yet made known) data. We’ll meet Solomonoff’s probability formulation in the discussion of parameters in Chapter 8.
Related to Solomonoff’s work is the idea of algorithmic complexity, and, with it, what information scientists call “random.” Chief is the concept that we are working with a set of observed data, or a “string” of some fixed length written in some code (as the data of this sentence is written in English). We next take some model, or computer or real language, and express the string (observed data) in that language. The complexity of the string is the length of the shortest description of the string from the models under consideration. Loosely speaking, if this shortest description, conditional on the models, is no shorter than the length of the original data, the data is said to be “random”. Another way: if the data can’t be compressed by the model, the data is “random.” Randomness is thus conditional on the model or models considered and is, as is clear, a synonym of unpredictable. Chaitin (2001, p. 111) says, “There’s only one definition of random…something is random if it is algorithmically incompressible or irreducible,” and he humbly develops a measure of this which he calls “Chaitin randomness.” Although his notation does not show it, Chaitin randomness, i.e. randomness, is conditional on the model or “machine” used, as notions of uncertainty always are. Again, unpredictable.
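To make the conditionality concrete, here is a minimal sketch in Python (my illustration, not Chaitin’s construction), with an off-the-shelf compressor, zlib, standing in for “the model”: a string is “random” relative to that compressor just when it cannot be made shorter.

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed length divided by original length. A ratio at or near 1.0
    means this particular model (zlib) found no structure to exploit, i.e.
    the string is "random" relative to it."""
    return len(zlib.compress(data, 9)) / len(data)

patterned = b"0123456789" * 1_000    # highly structured, easy to compress
noisy = os.urandom(10_000)           # bytes from the operating system's entropy source

print(compression_ratio(patterned))  # far below 1.0: compressible, not "random"
print(compression_ratio(noisy))      # about 1.0 or slightly above: incompressible
```

Swap in a different compressor and the verdict can change, which is the point: “random” is always relative to the model doing the describing.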
Knowledge provided by forms of induction like induction-intellection is thus random in this sense, since there is no way to get to this knowledge using any model. So axioms are random, too; all sui generis knowledge is random in this way. Randomness in the sense used by information theory is thus related to predictability. That which cannot be predicted with certainty, i.e. deduced, from some model, i.e. some set of accepted premises, is in some sense random; and this accords with the statistical meaning of the term. It is “beyond” the model or base of existing knowledge. The digits of π, for instance, are random in this sense, because their simplest description is just to list the digits. We know not whence these digits come in any universal sense, given the premises that come with (for example) number theory. But the digits of π can be calculated to any finite expansion because algorithms exist. So π is not entirely random.
Finally, we can now see that so-called tests for randomness are misnamed. Since there is no such thing as randomness, tests for it are like tests for Bigfoot. Instead, what is tested for, and what should be acknowledged, is predictiveness. A sequence of numbers, or a string, or whatever, is more or less predictable. So what does predictable mean? That we have identified the premises which determine or which cause the sequence. Once this model is known, if it can be known (and we have seen in QM that not all models are knowable, just as what causes axioms to be true is not knowable), we can predict with certainty. Where we can predict only without certainty is where uncertainty, or “randomness”, enters.
The amusing thing about many tests for “randomness”, i.e. predictability, is that they always turn a blind eye to the premises which are known to be determinative. One such algorithm is the Mersenne Twister. Its content is not of interest; what is, is that the sequence put out by it is, knowing the content and initial conditions, perfectly known. Tests for randomness are used on a given sequence, and the sequence is said to be “random”, but only because the content of the algorithm is ignored in the test! There are also firms that will supply, for a fee, “genuinely random” numbers, perhaps created through physical or mechanical processes. But since these don’t exist, what is the customer getting? Simply a sequence which, examining only the sequence itself, does not allow certain predictions to be made of its future values. Of course, we have that sort of thing with QM. We can only predict within certain bounds, depending on the kind of experiment. And, again, the only limitation, but a big one, is that we are guaranteed not to know the causes behind the sequence. Since we can prove this by other means, there is no need for a “test” of the randomness of such sequences.
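A quick illustration, relying only on the fact that Python’s standard random module happens to be built on the Mersenne Twister: hand the algorithm its premises (the seed) and the “random” sequence is completely determined.

```python
import random

# Two generators with the same algorithm (MT19937) and the same initial
# conditions (the seed) produce exactly the same "random" sequence.
gen_a = random.Random(8675309)   # the seed here is arbitrary
gen_b = random.Random(8675309)

seq_a = [gen_a.random() for _ in range(5)]
seq_b = [gen_b.random() for _ in range(5)]

print(seq_a == seq_b)  # True: knowing the premises, nothing is unpredictable
```

Run any statistical “test for randomness” you like on seq_a; it will pass, but only because the test pretends the seed and the algorithm do not exist.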
Let me clarify that last, but most important, point. We often know what determines (i.e. how we ascertain) a necessary truth; these determinations are the basis of proof. But we can never know why, or rather, what causes these truths. Why are Peano’s or any axioms true? Why a universe (where I use that word in its philosophical sense of all there is) like this, with these fundamental properties, whatever they turn out to be? I do not claim we know that what we now call fundamental is fundamental in the same sense axioms are. I only ask why whatever is fundamental is fundamental. Answer: we have no idea, and we can have no idea. The mind of God is not ours to know. Necessary truths are the Way Things Are. And that is that.
Lastly, to clarify the clarification, here is an interesting point about “true” randomness that arose from a work by Donald Knuth. Start with these equations:
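(The pair I have in mind, consistent with the discussion that follows, are the factorial series for e and the Bailey–Borwein–Plouffe identity for π.)

$$ e \;=\; \sum_{k=0}^{\infty} \frac{1}{k!} $$

$$ \pi \;=\; \sum_{k=0}^{\infty} \frac{1}{16^{k}}\left(\frac{4}{8k+1} - \frac{2}{8k+4} - \frac{1}{8k+5} - \frac{1}{8k+6}\right) $$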
The remarkable thing about the second is that we can figure the n-th digit of π without having to compute any digit that came before. All it takes is time, just as in calculating the digits of e from the first. Now (the digits of) π and e are often said to be “random”. Since we have a formula, we cannot say that the digits of π are unknown or unpredictable. Yet there they all are: laid bare in a simple equation. I mean, it would be incorrect to say that the digits are “random” except in the sense that before we calculate them, we don’t know them. They are perfectly predictable, though it would take infinite time to get them all. But by “random” what is meant is that e and π are transcendental, meaning numbers that aren’t algebraic, which in turn means that they cannot be explicitly and completely solved for. Yet these equations solve for them in the certain sense that all the digits can be had if one is willing to wait long enough.
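For the curious, here is a short Python sketch of the digit-extraction trick, a standard implementation of the BBP identity above rather than anything of my own devising; note the trick works natively in base 16, so it returns the hexadecimal digit of π at position n without computing any digit before it.

```python
def pi_hex_digit(n: int) -> str:
    """Hexadecimal digit of pi at fractional position n (0-indexed), found
    directly from the BBP sum without computing the earlier digits."""
    def series(j: int) -> float:
        # Fractional part of sum over k of 16^(n-k) / (8k + j).
        s = 0.0
        for k in range(n + 1):
            s = (s + pow(16, n - k, 8 * k + j) / (8 * k + j)) % 1.0
        k, term = n + 1, 1.0
        while term > 1e-17:                       # rapidly vanishing tail
            term = 16.0 ** (n - k) / (8 * k + j)
            s += term
            k += 1
        return s % 1.0

    frac = (4 * series(1) - 2 * series(4) - series(5) - series(6)) % 1.0
    return "%x" % int(frac * 16)

print("".join(pi_hex_digit(i) for i in range(8)))  # 243f6a88, i.e. pi = 3.243f6a88... in hex
```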
The equations here are determinative; they tell us what the digits of e and π are, and so these transcendentals are not random in a predictive sense, since we have perfect predictability, but they are random in the sense that their origins are unknown. They don’t tell us why it’s these digits rather than some others. Nature is silent on the cause of these values. Why is π 3.141593… and not something else entirely? Answer: we do not know. It is the Way Things Are.
Here’s the thing, William: this is one of many definitions of “random”. There’s long-run frequency stability (Frequentist). Then there’s subjective uncertainty (Bayesian). And then there’s algorithmic incompressibility (Kolmogorov). And let’s not forget lack of causal explanation (Philosophy of Science). I’m sure there are many more. So what’s your flavor of the day? It seems to me this lecture is more about information theory than randomness itself.
Nassim Taleb in his book “Fooled by Randomness” talks about randomness simply being a matter of point of view. What is wholly deterministic to the person generating the numbers, for example, appears random to the person receiving them, as you point out. An example he gives is the 9/11 attack: seemingly out of nowhere to the government and public of the US, but obviously predictable to the men planning and carrying out the attack.