9 Comments
Paul Fischer's avatar

Here’s the thing William, this is one of many definitions of “random”. There’s long-run frequency stability (Frequentist). Then there’s subjective uncertainty (Bayesian). And then there’s algorithmic incompressibility (Kolmogorov). And let’s not forget lack of causal explanation (Philosophy of Science). I’m sure there are many more. So what’s your flavor of the day? It seems to me this lecture is more about information theory than randomness itself.

M Yao's avatar

Nassim Taleb in his book “Fooled by Randomness” talks about randomness simply being a matter of point of view. What is wholly deterministic to the person generating the numbers, for example, appears random to the person receiving them, as you point out. An example he gives is the 9/11 attack: seemingly out of nowhere to the government and public of the US, but obviously predictable to the men planning and carrying out the attack.

Paul Fischer's avatar

That's the point. Information theory is about who knows what when and how the information gets there.

Fabius Minarchus's avatar

P.S. Fuzzy logic is the biggest breakthrough in philosophy since Aristotle. Reality does not map neatly to symbols.

Modern day followers of Ayn Rand and Ludwig von Mises ignore this fact and indulge in silly Proof by Definition.

Meanwhile, the East got paralyzed by dwelling on where this mapping fails, and overlooked where logical thinking works for a couple thousand years. Supposedly some of them achieved Nirvana or something.

Fabius Minarchus's avatar

Quality of random number generator makes a heap big difference when doing Monte Carlo integrations. Cryptographers will happily give you a verbal spanking as well.
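A minimal sketch of why generator quality matters for Monte Carlo integration. It estimates pi by sampling points in the unit square, once with Python's built-in Mersenne Twister and once with a deliberately tiny linear congruential generator. The names `tiny_lcg`, `mersenne`, and `estimate_pi`, and the LCG constants, are made up for this illustration; the point is only that a short-period generator keeps revisiting the same few points, so extra samples add no information.

```python
import math
import random


def tiny_lcg(seed, a=5, c=3, m=256):
    """Deliberately poor generator: an LCG with only m = 256 possible states."""
    state = seed % m
    while True:
        state = (a * state + c) % m
        yield state / m


def mersenne(seed):
    """Wrap Python's built-in Mersenne Twister as an endless stream of uniforms."""
    rng = random.Random(seed)
    while True:
        yield rng.random()


def estimate_pi(uniforms, n_points):
    """Monte Carlo estimate of pi from the fraction of points inside the quarter circle.

    Also returns how many distinct (x, y) points were actually sampled.
    """
    inside = 0
    points = set()
    for _ in range(n_points):
        x, y = next(uniforms), next(uniforms)
        points.add((x, y))
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_points, len(points)


n = 100_000
good_pi, good_distinct = estimate_pi(mersenne(1), n)
bad_pi, bad_distinct = estimate_pi(tiny_lcg(1), n)

print(f"Mersenne Twister: pi ~ {good_pi:.4f} from {good_distinct} distinct points")
print(f"tiny LCG:         pi ~ {bad_pi:.4f} from {bad_distinct} distinct points")
```

With the tiny generator the estimate stops improving once the cycle repeats, no matter how large `n` gets, because the integrand is only ever evaluated at the same handful of points.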

You are on much stronger ground when you question the assumption of Gaussian residuals. The Gaussian bell curve comes from *adding* together a *large number* of *uncorrelated* signals. Fat tails and other "anomalies" happen when the signals are limited in number and/or have couplings (not additive).
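A rough simulation of the additive-versus-coupled point, as a sketch only. It assumes the "coupling" is purely multiplicative, which is one choice among many, and it uses excess kurtosis as a crude fat-tail indicator (zero for a Gaussian). The function names and the uniform(0.5, 1.5) signal distribution are illustrative assumptions.

```python
import math
import random


def excess_kurtosis(xs):
    """Sample excess kurtosis: about 0 for a Gaussian, large for fat tails."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 * m2) - 3.0


def simulate(n_signals, n_samples, seed=0):
    """Combine the same independent signals two ways: by adding and by multiplying."""
    rng = random.Random(seed)
    additive, multiplicative = [], []
    for _ in range(n_samples):
        signals = [rng.uniform(0.5, 1.5) for _ in range(n_signals)]
        additive.append(sum(signals))             # many independent terms added together
        multiplicative.append(math.prod(signals))  # assumed multiplicative coupling
    return additive, multiplicative


additive, multiplicative = simulate(n_signals=50, n_samples=20_000)
add_kurt = excess_kurtosis(additive)
mul_kurt = excess_kurtosis(multiplicative)

print(f"additive excess kurtosis:       {add_kurt:.3f}")
print(f"multiplicative excess kurtosis: {mul_kurt:.3f}")
```

The additive combination comes out close to Gaussian, while the multiplicative one is roughly lognormal with a long right tail, even though the underlying signals are identical.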

Intelligence comes to mind. Intelligence is NOT a simple addition of a huge number of factors. There is a great deal of nonlinearity. People who are good at logic problems are more likely to practice logic problems. People who are good at reading inhale more information. People who could do physics but would be mediocre physicists take on less strenuous fields. (The same holds for athletic prowess. Those who don't make the team work out far less than those who do. Those who make first string get more practice reps than the bench warmers.)

Now here is the question (which I don't have an answer for tonight): does selecting candidates for a clinical trial using an uncorrelated high quality random number generator truly guarantee that residuals should be random and Gaussian distributed? What if there are only a dozen or two significant signals other than what you are controlling for? What if one or more of them is correlated to willingness to participate in the study?
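One way to start poking at that question is a toy simulation. This is a sketch under loud assumptions, not an answer: a single hidden factor `h` drives both the outcome and the willingness to volunteer, the treatment effect is purely additive, and the logistic willingness curve is invented for illustration. `simulate_trial` and all of its parameters are hypothetical.

```python
import math
import random
import statistics


def simulate_trial(n_population=50_000, true_effect=1.0, seed=0):
    """Randomize treatment among volunteers whose willingness depends on a hidden factor."""
    rng = random.Random(seed)
    population_h, volunteer_h = [], []
    for _ in range(n_population):
        h = rng.gauss(0.0, 1.0)                    # hidden factor affecting the outcome
        population_h.append(h)
        p_join = 1.0 / (1.0 + math.exp(-2.0 * h))  # assumed: willingness rises with h
        if rng.random() < p_join:
            volunteer_h.append(h)

    treated, control = [], []
    for h in volunteer_h:
        noise = rng.gauss(0.0, 1.0)
        if rng.random() < 0.5:                     # fair random assignment among volunteers
            treated.append(h + true_effect + noise)
        else:
            control.append(h + noise)

    return {
        "population_mean_h": statistics.fmean(population_h),
        "volunteer_mean_h": statistics.fmean(volunteer_h),
        "estimated_effect": statistics.fmean(treated) - statistics.fmean(control),
    }


result = simulate_trial()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In this toy setup the random assignment recovers the additive effect among volunteers, but the volunteers are visibly not a random draw from the population. Whether that matters for residuals or P values depends on how the hidden factor interacts with the treatment, which the sketch deliberately leaves out.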

I haven't worked out the details, but it seems that that is where to attack P values. (Apologies if you have already covered this. Sometimes your notation puts me to sleep so I haven't followed all of your lessons.)

pyrrhus's avatar

My father, an experimental nuclear physicist, opined that nothing is truly random in nature, and physicists use lists of "random" numbers compiled in different ways, knowing that.....

James Bowery's avatar

"The hope is that an automatic method to discover all scientific “laws” (a term I discuss next Chapter) is on the horizon."

This is the classic error people make interpreting Solomonoff's contribution. Model generation is not model selection, despite the dreams for an Artificial Intelligence that will render our brains superfluous.

The importance of Solomonoff is not in artificial intelligence. It is in an information criterion for model selection superior to those used in statistics.

While it is most certainly the case that every decision is an act if not leap of faith based on limited and biased information "processed" by us beings of limited intelligence, it is rather mean spirited to relegate us to the tender mercies of the likes of Popper and Kuhn and their social science spawns when the world could have been a very different place if statisticians had recognized that approximating algorithmic information of datasets is superior to their unprincipled grab-bag of information criteria.

Dumpsters are reeking from the decay of aborted fetuses that may very well have become the men and women they were meant to be but for the enemy of the good that is the demand for perfection before we take our leap of faith and decide on a course of action that is good.

Please, let us make better decisions with our limited information, limited intelligence, limited time and limited resources -- and at least slow the slaughter of innocence.

https://claude.ai/share/df6fdf28-b73b-48b7-9f5b-01b265e3748c

William M Briggs's avatar

Yes. We haven’t even got to things like model selection yet. Coming soon (ish).

James Bowery's avatar

I need your assistance in selecting the data for this. I may have funding adequate to saw off a leg of the 3 legged stool upon which the social pseudosciences sit enthroned as humanity dies.

https://github.com/jabowery/HumesGuillotine

Before you commit yourself in writing to model selection there are a number of easily debunked critiques of the ALGIC for model selection that have achieved pseudo-orthodox status simply by repetition.

1) Jorma Rissanen's 1970s "MDL" paper attempted to approximate the ALGIC by using statistical rather than computational codes. Everyone therefore thinks they can dismiss the ALGIC for model selection because "it has long-ago been tried and failed." This is directly related to the aforementioned fallacy (borne of artificial intelligence dreams) of conflating model generation with model selection, but it is also related to another, deeper, fallacy that kept physics in kinematic modes of description from Aristotle until Newton's dynamical* mode of description that included velocity in the "state".

2) "The choice of Universal Turing Machine is arbitrary", which is fallacious. If you choose the UTM after you have seen the data, you are engaging in post hoc theorization. This debunks all claims that one can construct the "instruction set" of the UTM to output the data with essentially no algorithm size.

3) UTMs are impossible anyway because they require "infinite tapes". #2 and this are related to what I call NiNOR complexity -- a more rigorous finite state machine measure of complexity than is Kolmogorov Complexity. NiNOR Complexity replaces infinities with a single unknown finite number. https://jimbowery.blogspot.com/2023/10/ninor-complexity.html

4) "Kolmogorov Complexity (or, more rigorously, NiNOR complexity) is incomputable" is fallacious because it is impertinently applied only to ALGIC when it could equally be applied to any model selection criterion. As I discuss briefly in the Hume's Guillotine link, no scientist is required to prove that his data model is the simplest of all *possible* data models -- he is merely challenged to compare the complexity of his data model with all *existing* models of the same data.

5) Popper's "falsification" dogma is probably the most obvious nonsense of all when applied to the ALGIC. Let's try it: "All you have to do to construct another model of the data is append the error residuals to the executable archive data model -- so nothing is falsifiable under the ALGIC therefore the ALGIC violates Popper's falsification criterion for META-criteria for model selection!" All I can say to such numskulls is I hope their job at McDonald's doesn't pay enough to spawn more of them. No, numskull -- Popper's falsification dogma is meta-falsified by the ALGIC since appending such residuals increases the length of the algorithmic description compared to models that predicted those residuals. Now go back to your pseudo-sociologist hell-hole out of which you slithered.
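The residual argument in point 5, together with the "compare existing models" framing in point 4, can be sketched with a toy two-part code. This is only an illustration under stated assumptions: `zlib` is used as a crude, computable stand-in for description length (it is an upper bound at best, not algorithmic information), the data set is synthetic, and `description_length` and the two candidate models are hypothetical names invented here, not anything from the linked repository.

```python
import zlib

# Synthetic data produced by a simple rule (illustrative assumption).
DATA = [(3 * i + 7) % 251 for i in range(2000)]

# Two *existing* candidate models of the same data, stored as source text
# so that the size of the model itself is counted in the total.
MODEL_SOURCES = {
    "linear_mod": "lambda i: (3 * i + 7) % 251",  # predicts every value exactly
    "constant": "lambda i: 0",                     # predicts nothing useful
}


def description_length(model_source, data):
    """Two-part code length in bytes: compressed model text plus compressed residuals."""
    predict = eval(model_source)  # acceptable for a sketch; never eval untrusted text
    residuals = bytes((y - predict(i)) % 256 for i, y in enumerate(data))
    model_bytes = len(zlib.compress(model_source.encode("utf-8"), 9))
    residual_bytes = len(zlib.compress(residuals, 9))
    return model_bytes + residual_bytes


lengths = {name: description_length(src, DATA) for name, src in MODEL_SOURCES.items()}
best = min(lengths, key=lengths.get)

for name, total in lengths.items():
    print(f"{name}: {total} bytes")
print(f"shortest total description: {best}")
```

The model that fails to predict the data can always be "rescued" by appending its residuals, but those residuals have to be paid for in the total length, so it loses the comparison against the model that predicted them. No claim about the shortest *possible* description is needed, only a ranking of the candidates at hand.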

* Dynamical modes of description have been shown compatible with teleology by Maupertuis -- so it isn't quite correct to state that algorithmic modes of description are _inherently_ mechanistic -- although under Maupertuis if you have access to "final causation" you are deterministic in the infinite limit of numeric accuracy.