I have been asked to update this essay in light of the discussion that occurred on the FQXi website during the competition. I cannot possibly cover all of the comments, so I shall focus on a few that I consider interesting and accessible. For more details, it is worth reading the comment thread in its entirety.
Since Sylvia Wenmackers’ winning essay also argues for a naturalistic approach to mathematics, I thought it worthwhile to discuss how my position differs from hers. We start with that before moving on to the website comments.
Response to Sylvia Wenmackers’ Essay
The main advantage of a naturalistic view of mathematics is that it offers a simple explanation of the alleged unreasonable effectiveness of mathematics in physics. If mathematics is fundamentally about the physical world then it is no surprise that it occurs in our description of the physical world.
However, the naturalistic view is prima facie absurd because mathematical truth seems to have nothing to do with the physical world. The main task of a naturalistic account of mathematics is therefore a deflationary one: explain why mathematical theories are empirical theories in disguise and why we have been so easily misled into thinking this is not the case. Wenmackers gives a battery of arguments for this position based on four “elements”. Here, I will focus on the first and third elements, which are based on evolution by natural selection and selection bias respectively, because this is where I disagree with her. I more or less agree with her discussion of the other two elements.
The Evolutionary Argument
The evolutionary argument asserts that our cognitive abilities, and in particular our ability to do mathematics, are the result of evolution by natural selection. It is therefore no surprise that they reflect physical reality, as physical reality provided the environmental pressures that selected for those abilities.
This argument works well for basic mathematics, such as arithmetic and elementary pattern recognition. The ability to distinguish a tree that has five apples growing on it from one that has two has an obvious evolutionary advantage.
However, our main task is to explain why our most advanced and abstract theories of mathematics crop up so often in modern physics, not just the basic theories like arithmetic, and there is no conceivable evolutionary pressure towards understanding the cosmos on a large scale. Knowing the ultimate fate of the universe may well be crucial for our (very) long term survival, but cosmology operates on a much longer timescale than evolution by natural selection, so, for example, there is no immediate environmental pressure towards discovering general relativity, nor the differential geometry needed to formulate it.
On the other hand, it happens that evolution has endowed us with general curiosity and intelligence. This does have a survival advantage as, for example, a species that is able to detect patterns in predator attacks and pass that knowledge on to the next generation without waiting for genetic changes to make the knowledge innate can adapt to its environment more quickly. Such curiosity and intelligence are not the inevitable outcome of evolution, as the previous dominance of dinosaurs on this planet demonstrates, but just one possible adaptation that happened to occur in our case. Like many adaptations, it has side effects that are not immediately related to our survival, one of which is that some of us like to think about the large scale cosmos.
Since our general curiosity and intelligence are only a side effect of adaptation, so are modern physics and mathematics. Therefore, I find it puzzling to argue for the efficacy of our reasoning in these areas based on evolution. Evolution often endows us with beliefs and behaviours that are good heuristics for the cases commonly encountered, but a poor reflection of reality (consider optical illusions, for example). With general intelligence one can, with considerable effort, reason oneself out of such beliefs and behaviours. This is why I stated in my essay that mathematical intuition must be a product of general intelligence rather than a direct evolutionary adaptation.
I think that understanding how a network of intelligent beings go about organizing their knowledge is at the root of the efficacy of mathematics in physics. It should not matter whether those beings are the products of evolution by natural selection or some hypothetical artificial intelligences that we may develop in the future. For this reason, I take the existence of intelligent beings as a starting point, rather than worrying about how they got that intelligence.
Wenmackers’ selection bias argument is an attempt to deflate the idea that mathematics is unreasonably effective in physics. The idea is that, due to selection bias, we tend to remember and focus on those cases in which mathematics was successfully applied in physics, whereas the vast majority of mathematics is actually completely useless for science. Here is the argument in her own words.
Among the books in mathematical libraries, many are filled with theories for which not a single real world application has been found. We could measure the efficiency of mathematics for the natural science as follows: divide the number of pages that contain scientifically applicable results by the total number of pages produced in pure mathematics. My conjecture is that, on this definition, the efficiency is very low. In the previous section we saw that research, even in pure mathematics, is biased towards the themes of the natural sciences. If we take this into account, the effectiveness of mathematics in the natural sciences does come out as unreasonable - unreasonably low, that is.
My first response to this is to question whether the efficiency is actually all that low. After all, the vast majority of pages written by theoretical physicists might also be irrelevant to reality, and these are people who are deliberately trying to model reality. We need only consider the corpus of mechanical models of the ether from the 1800s to render this plausible, let alone the vast array of current speculative models of cosmology, particle physics, and quantum gravity. It is not obvious to me whether the proportion of applicable published mathematics is so much lower than the proportion of correct published physics, and, if it is not, then a raw page count does not say much about the applicability of mathematics in particular.
Even if the efficiency of mathematics is much lower than that of physics, it is not obvious how low an efficiency would be unreasonably low. If mathematics were produced by monkeys randomly hitting the keys of typewriters then the probability of coming up with applicable mathematics would be ridiculously small, akin to a Boltzmann brain popping into existence via a fluctuation from thermal equilibrium. In light of such a ridiculously tiny probability, an efficiency of say 0.01%, which looks small from an everyday point of view, would indicate an extremely high degree of unreasonable effectiveness. Of course, mathematicians are not typewriting monkeys, but unless one is already convinced that the development of mathematics is correlated with the development of physics by one of Wenmackers’ other arguments, then even a relatively tiny efficiency could seem extremely improbable.
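To make the order-of-magnitude comparison concrete, here is a minimal sketch with illustrative numbers of my own choosing (a 3,000-character page typed at random from a 50-symbol keyboard, and an "everyday small" efficiency of 0.01%); the point is only that the two probabilities differ by thousands of orders of magnitude.

```python
import math

# Log-probability of monkeys typing one specific page at random:
# each of the page_length characters must match, each with probability 1/alphabet.
page_length = 3000   # illustrative page size
alphabet = 50        # illustrative keyboard size
log10_monkey = -page_length * math.log10(alphabet)   # roughly -5097

# An efficiency of 0.01% that "looks small from an everyday point of view":
log10_efficiency = math.log10(1e-4)                  # -4

# On a log scale, 0.01% is astronomically larger than the monkey baseline.
assert log10_efficiency > log10_monkey
```

Against such a baseline, even a very small measured efficiency would count as an enormous degree of effectiveness.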
My second response to the selection bias argument is that mathematics is not identical to the corpus of mathematical literature laid out in a row. Some mathematical theories are considered more important than others, e.g. the core topics taught in an undergraduate mathematics degree. Therefore, mathematical theories ought to be weighted by their perceived importance when calculating the efficiency of mathematics. If you buy my network model of knowledge then the number of inbound links to a node could be used to weight its importance, as in the Google PageRank algorithm. I would conjecture that the efficiency of mathematics is much higher when weighted by perceived importance. I admit that this argument could be accused of circularity, because one of the reasons why an area of mathematics might be regarded as important is its degree of applicability. However, this just reinforces the point that mathematics is not an isolated subject, but must be considered in the context of the whole network of human knowledge.
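The weighting idea can be sketched in a few lines. The network below is entirely hypothetical (my own toy construction, not data from the essay): arrows point from a theory to the theories it draws on, a simple power-iteration PageRank supplies the importance weights, and the "efficiency" of mathematics is computed both as a raw node count and as a PageRank-weighted fraction.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank over a dict mapping node -> list of outbound links."""
    nodes = list(links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, outbound in links.items():
            if outbound:
                share = damping * rank[node] / len(outbound)
                for target in outbound:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over the whole network.
                for target in nodes:
                    new_rank[target] += damping * rank[node] / n
        rank = new_rank
    return rank

# Hypothetical mini-network: each theory links to the theories it builds on.
links = {
    "general_relativity": ["diff_geometry"],
    "diff_geometry": ["calculus"],
    "calculus": ["arithmetic"],
    "arithmetic": [],
    "obscure_theory": [],   # no inbound or outbound links
}
applicable = {"arithmetic", "calculus", "diff_geometry", "general_relativity"}

rank = pagerank(links)
raw_efficiency = len(applicable) / len(links)
weighted_efficiency = sum(rank[t] for t in applicable) / sum(rank.values())
```

In this toy example the isolated theory attracts no inbound links, so the importance-weighted efficiency comes out higher than the raw count, which is exactly the conjecture in the text.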
Responses to Comments on the FQXi Website
Other Processes in the Knowledge Network
Several commenters expressed doubts that my theory of theory building captures everything that is going on in mathematics. For example, Wenmackers commented:
Is this a correct summary of your main thesis (in Section 4)?: “First, humans studied many aspects of the world, gathering knowledge. At some point, it made sense to start studying the structure of that knowledge. (And further iterations.) This is called mathematics.”
Although I find this idea appealing (and I share your general preference for a naturalistic approach), it is not obvious to me that this captures all (or even the majority) of mathematical theories. In mathematics, we can take anything as a source of inspiration (including the world, or the structure of our knowledge thereof), but we are not restricted to studying it in that form: for instance, we may deliberately negate one of the properties in the structure that was the original inspiration, simply because we have a hunch that doing so may lead to interesting mathematics. Or do you see this differently?
There are other processes going on in the knowledge network beyond the theory-building process that I described in my essay. I did not intend to suggest otherwise. The reason why I focused on the process of replacing direct links by more abstract theories is because I think it can explain how mathematics becomes increasingly abstract, whilst maintaining its applicability. But this is clearly not the only thing that mathematicians do.
One additional process that is going on is a certain amount of free play and exploration, as also noted by Ken Wharton in his essay. Mathematical axioms may be modified or negated to see whether they lead to interesting theories. However, as I argued earlier, mathematical theories should be weighted with their perceived importance when considering their place in the corpus of human knowledge. Modified theories that are highly connected to existing theories, or to applications outside of mathematics, will ultimately be regarded as more important. It is possible that a group of pure mathematicians will end up working on a relative backwater for an academic generation or two, but this is likely to be given up if no interesting connections are forthcoming.
For my theory, it is important that these additional processes should not have a significant impact on the broad structure of the knowledge network. There should not be a process where large swaths of pure mathematicians are led to work on completely isolated areas of the network, developing a large number of internal links that raise the perceived importance of their subject, with almost no links back to the established corpus of knowledge. Personally, I think that any such process is likely to be dominated by processes that do link back strongly to existing knowledge, but this is an empirical question about how the mathematical knowledge network grows. To address it, I would need to develop concrete models, and compare them to the actual growth of mathematics.
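One family of concrete models worth comparing against is preferential attachment, in which each new result links to an existing theory with probability proportional to how well linked that theory already is. The sketch below is my own illustration, not a model developed in the essay; it simply shows the mechanism under which new work keeps linking back to the established corpus, so that a few old nodes become hubs rather than isolated clusters dominating.

```python
import random

def grow_network(steps, seed=0):
    """Preferential attachment: each new node links to one existing node,
    chosen with probability proportional to its current degree."""
    rng = random.Random(seed)
    degree = {0: 1, 1: 1}          # start with two linked nodes
    for new in range(2, steps):
        nodes = list(degree)
        weights = [degree[v] for v in nodes]
        target = rng.choices(nodes, weights=weights)[0]
        degree[new] = 1
        degree[target] += 1        # well-linked theories attract more links
    return degree

degree = grow_network(200)
hub_degree = max(degree.values())
```

Whether the actual growth of mathematics looks more like this than like the growth of isolated clusters is, as stated above, an empirical question.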
What Physical Fact Makes a Mathematical Fact True?
Perhaps the highlight of the comment thread was a discussion with Tim Maudlin. It started with the following question:
I’m not sure I understand the sense in which mathematics is supposed to be “about the physical world” as you understand it. In one sense, the truth value of any claim about the physical world depends on how the physical world is, that is, it is physically contingent. Had the physical world been different, the truth value of the claim would be different. Now take a claim about the integers, such as Goldbach’s conjecture. Do you mean to say that the truth or falsity of Goldbach’s conjecture depends on the physical world: if the physical world is one way then it is true and if it is another way it is false? What feature of the physical world could the truth or falsity of the conjecture possibly depend on?
I stated in the essay that I think mathematical theories are formal systems, but not all formal systems are mathematics. The role of physics is to help delineate which formal systems count as mathematics. Therefore, if Goldbach’s conjecture is a theorem of Peano arithmetic then I would say it is true in Peano arithmetic. If we want to ask whether it is true of the world then we have to ask whether Peano arithmetic is the most useful version of arithmetic by looking at how it fits into the knowledge network as a whole. It may be that more than one theory of arithmetic is equally useful, in which case we may say that the conjecture has indefinite truth value, or we may want to say that it is true in one context and not another if the two theories of arithmetic have fairly disjoint domains of applicability.
If the Goldbach conjecture is not provable in Peano arithmetic, but is provable in some meta-theory, then we can ask the same questions at the level of the meta-theory, i.e. is it more useful than a different meta-theory. This sort of consideration has happened in mathematics in a few cases, e.g. most mathematicians choose to work under the assumption that the axiom of choice is true, mainly, I would argue, because it leads to more useful theories.
At this point in the discussion, Maudlin accused me of being a mathematical platonist. His point is that if I accept theoremhood as my criterion of truth then I am admitting some mathematical intuitions as self-evident truths, so if my goal is to remove intuition as the arbiter of mathematical truth then I have not yet succeeded. Specifically, in order to even state what it means for something to be a theorem in a formal system, I need to accept at least some of the structure of informal logic, e.g. things like: if A is true and B is true then A AND B is also true. Why do we accept these ideas of informal logic? Primarily because they seem to be self-evidently true, but this is an appeal to unfettered mathematical intuition.
Maudlin points out that, because of this, it is difficult to avoid some form of mathematical platonism, if only about the basic ideas of logic. I am not yet prepared to accept this and would argue, along with Quine and Putnam [6, 8], that logic may be empirical (in my case, I would say that even very basic informal logic may be empirical).
I do not deny that I accept informal logic because it seems self-evident to me. However, as I argued in the essay, my intuitions come from my brain, and my brain is a physical system. Therefore, if I have strong intuitions, they must have been put there by the natural processes that led to the development of my brain. The likely culprit in this case is evolution by natural selection. Organisms that intuitively accept the laws of informal logic survive better than those that do not because those laws are true of our physical universe, so that creates a selection pressure to have them built in as intuitions.
Does this mean there are conceivable universes in which the laws of basic logic are different, e.g. in which Lewis Carroll’s modus-ponens-denying tortoise is correct? I have to admit that I have difficulty imagining it, but it does not seem totally inconceivable. In such a universe, what counts as a mathematical truth would be different. In other words, mathematical truth might be empirical because the laws of logic are.
In addition to this, there is a much more prosaic way in which mathematical truth is dependent on the physical laws. Imagine a universe in which planets do not make circular motions as time progresses, but instead travel in straight lines through a continually changing landscape. Inhabitants of such a planet would probably not measure time using a cyclical system of minutes, hours, days, etc. as we do, but instead just use a system of monotonically increasing numbers. A bit more fancifully, suppose that in this hypothetical universe, every time a collection of twelve discrete objects like sheep, rocks, apples, etc. are brought together in one place they magically disappear into nothingness. Inhabitants of this world would use clock arithmetic, technically known as mod 12 arithmetic, to describe collections of discrete objects. Their view of how to measure time versus how to measure collections of discrete objects would be precisely the reverse of ours. At least one of the senses in which \(12+1=13\) in our universe is not true in theirs, and I would say that this is a sense in which mathematical truth depends on the laws of physics.
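The contrast between the two arithmetics can be stated in a couple of lines. This is a minimal sketch of the thought experiment above: in our universe, combining collections of discrete objects obeys ordinary addition, while in the hypothetical universe every collection of twelve objects annihilates, so object counting works mod 12.

```python
def our_count(a, b):
    """Combining collections of objects in our universe: ordinary addition."""
    return a + b

def their_count(a, b):
    """Combining collections in the hypothetical universe: any complete
    dozen vanishes, so counting is arithmetic mod 12 (clock arithmetic)."""
    return (a + b) % 12

assert our_count(12, 1) == 13    # the sense in which 12 + 1 = 13 here
assert their_count(12, 1) == 1   # ...fails there: twelve sheep have vanished
```

The same operation, "put these two collections together and count", is modelled by different formal systems depending on the physics of the objects being counted.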
It is fair to say that Maudlin was not impressed by this example, but I take it deadly seriously. What counts as mathematics and what counts as mathematical truth are, in my view, pragmatically dependent on how our mathematical theories fit into the structure of human knowledge. If the empirical facts change then so does the structure of this network. The meaning of numbers, in particular, is dependent on how collections of discrete objects behave in our universe, and if you change that then you change what makes a given theory of number useful, and hence true in the pragmatic sense. It is this that makes the theory of number a hub in the network of human knowledge and this is what philosophers ought to be studying if they want to understand the meaning of mathematics. The usual considerations in the foundations of mathematics, such as deriving arithmetic from set theory, though still well-connected to other areas of mathematics, are comparative backwaters. If we really want to understand what mathematics is about, we ought to get our heads out of the formal logic textbooks and look out at the physical world.
Why are There Regularities at All?
Sophia Magnusdottir points out that my approach does not address why there are regularities in nature to begin with.
In a nutshell what you seem to be saying is that one can try to understand knowledge discovery with a mathematical model as well. I agree that one can do this, though we can debate whether the one you propose is correct. But that doesn’t explain why many of the observations that we have lend themselves to mathematical description. Why do we find ourselves in a universe that does have so many regularities? (And regularities within regularities?) That really is the puzzling aspect of the “efficiency of mathematics in the natural science”. I don’t see that you address it at all.
There are two relevant kinds of regularities here: the regularities described by our most abstract mathematical theories on the one hand, and the regularities of nature on the other. On the face of it, these two types of regularity have little to do with one another. The fact that the regularities described by our most abstract mathematical theories so often show up in physics is what I take to be the “unreasonable” effectiveness of mathematics in the physical sciences.
What I have tried to do is to argue that these two types of regularity are more closely connected than we normally suppose. They both ultimately describe regularities, within regularities, ..., within the natural world. I have not even tried to address the question of why there are regularities in nature in the first place. Instead, I have taken their existence as my starting point. If we live in a universe with regularities, the process of knowledge growth is such that what we call mathematics will naturally show up in physics. This answers what I take to be the problem of “unreasonable” effectiveness.
Of course, one can try to go further by asking why there are any regularities in the first place. I do not think that anyone has provided a compelling answer to this, and I suspect that it is one of those questions that just leads to an infinite regress of further “why” questions.
For example, the Mathematical Universe Hypothesis may seem, superficially, to explain the existence of regularities. If our universe literally is a mathematical structure, and mathematical structures describe regularities, then there will necessarily be regularities in nature. However, one can then ask why our universe is a mathematical structure, which is just the same question in a different guise.
If we are to take the results of science seriously, the idea that our universe is sufficiently regular to make science reliable has to be assumed. There is no proof of this, despite several centuries of debate on the problem of induction. Although this is an interesting issue, I doubt that the problem can ever be resolved in an uncontroversial way and it seems, to me at least, to be a different and far more difficult problem than the “unreasonable” effectiveness of mathematics in physics. If my ideas are correct then at least there is now only one type of regularity that needs to be explained.
Elegance or Efficiency?
Alexy Burov made the following point.
Wigner’s wonder about the relation of physics and mathematics is not just about the fact that there are some mathematical forms describing laws of nature. He is fascinated by something more: that these forms are both elegant, while covering a wide range of parameters, and extremely precise. I do not see anything in your paper which relates to that amazing and highly important fact about the relation of physics and mathematics.
I take the key issue here to be that I have not explained why the mathematics used in physics is “elegant”. After all, if we had a bunch of different laws covering different parameter ranges then we could always put them together into a single structure by inserting a lot of “if” clauses into our laws of physics. We can also make them arbitrarily precise by adding lots of special cases in this way. Presumably though, the result of this would be judged “inelegant”.
To be honest, I have a great deal of trouble understanding what mathematicians and physicists mean by “elegance” (hence the scare quotes). For this reason, I have emphasized that the mathematics in modern physics is “abstract” and “advanced” rather than “elegant”.
A more precise definition of elegance is needed to make any progress on this issue. One concrete suggestion is that perhaps elegance refers to the fact that the fundamental laws of physics are few in number, so they can be written on a t-shirt. It is tempting to draw the analogy with algorithmic information here, i.e. the length of the shortest computer program that will generate a given output. Perhaps the laws of physics are viewed as elegant because they have low algorithmic information. We get out of them far more than we put in.
So, perhaps what we call “elegance” really means the smallest possible set of laws that encapsulates the largest number of phenomena. If so, then what we need to explain is why the process of scientific discovery would tend to produce laws with low algorithmic information. The idea that scientists are trying to optimize algorithmic information directly is a logician’s parody of a complex social process. Instead, we need to determine whether the processes going on in the knowledge network would tend to reduce the algorithmic information content of the largest hubs in the network. In this I am encouraged by the fact that many scale-free networks exhibit the “small world” phenomenon, in which the number of links in a path connecting two randomly chosen nodes is small. If this is true of the knowledge network then it means that the hubs must be powerful enough to derive the empirical phenomena in a relatively small number of steps. The average path length between two nodes might be taken as a measure of the efficiency with which our knowledge is encoded in the network, or, if you prefer, its “elegance”.
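The proposed measure is easy to compute on toy networks. The two networks below are hypothetical illustrations of my own (not from the essay): a "hub" network in which one abstract theory links directly to every phenomenon, versus a chain in which each result only links to its neighbour. Average shortest-path length, found by breadth-first search, is lower for the hub network, which on this measure makes it the more "elegant" encoding.

```python
from collections import deque

def average_path_length(graph):
    """Mean shortest-path length over all ordered pairs of connected nodes,
    computed by breadth-first search from each node."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        for target, d in dist.items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs

# One central theory linked to four phenomena (undirected adjacency lists):
hub = {"theory": ["p1", "p2", "p3", "p4"],
       "p1": ["theory"], "p2": ["theory"], "p3": ["theory"], "p4": ["theory"]}
# Five results linked only in a chain:
chain = {"p1": ["p2"], "p2": ["p1", "p3"], "p3": ["p2", "p4"],
         "p4": ["p3", "p5"], "p5": ["p4"]}

assert average_path_length(hub) < average_path_length(chain)
```

On five nodes the difference is modest (1.6 versus 2.0 links on average), but it grows with network size: a hub keeps every pair within two steps, while a chain's average path length grows linearly.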
Now, of course, this may be completely unrelated to what everyone else means by the word “elegance”, as applied to mathematics and physics. If so, a more precise definition, or an analysis into more primitive concepts, is needed before we can address the problem. Once we have that, I suspect the problem might not look so intractable.