We have seen two essential components of computer systems that can scale up to the Big Math level: efficient and expressive theorem proving systems and systems for organizing mathematical knowledge in a modular fashion. We recall the five basic aspects of mathematics mentioned in the introduction to this paper:
- Inference::
-
the acquisition of new knowledge from what is already known.
- Computation::
-
the algorithmic transformation of representations of mathematical objects into more readily comprehensible forms.
- Concretization::
-
the creation of static, concrete data pertaining to mathematical objects and structures that can be readily stored, queried, and shared.
- Narration::
-
the human-oriented description of mathematical developments in natural language.
- Organization::
-
the modular structuring of mathematical knowledge.
These aspects—their existence and importance to mathematics—should be rather uncontroversial. In order to help the reader understand their tight interrelationships, Figure 1 arranges them in a convenient representation in three dimensions: we locate the organizational aspect at the center and the other four aspects at the corners of a tetrahedron, since the latter are all consumers and producers of the mathematical knowledge represented by the former. A four-dimensional representation might be more accurate but less intuitive. We note that the names of the aspects are all derived from verbs describing mathematical activity: (i) inferring mathematical knowledge, (ii) computing representations of mathematical objects, (iii) concretizing mathematical objects and structures, (iv) narrating how mathematical results are produced, and (v) organizing mathematical knowledge.
Below we look at each aspect in turn using the CFSG and related efforts as guiding case studies and survey existing solutions with respect to the tetrapod structure from Figure 1.
Inference
We have already seen an important form of machine-supported mathematical inference: deduction via machine-verified proof. There are other forms: automated theorem provers can prove simple theorems by systematically searching for calculus-level proofs (usually for variants of first-order logic), model generators construct examples and counterexamples, and satisfiability solvers check for Boolean satisfiability. All of these can be used to systematically explore the space of mathematical knowledge and can thus constitute a horizontal complement to human facilities.
Other forms of inference yield plausible conclusions instead of provable facts: abduction (conjecture formation from best explanations) and induction (conjecture formation from examples). Machine-supported abduction and induction have been studied much less than machine-supported deduction, at least for producing formal mathematics. However, there is now a conference series [1] that studies the use of machine learning techniques in theorem proving.
One of the main problems with Big Inference for mathematics is that inference systems are (naturally and legitimately) specialized to a particular logic. For instance, interactive proof assistants like Coq and HOL Light [23] have very expressive languages, whereas automated proof search is possible only for simpler logics in which the combinatorial explosion of the proof space can be controlled. This makes inference systems very difficult to interoperate, and thus all their libraries are silos of formal mathematical knowledge, leading to duplicated work and missed synergies—in analogy to the OBB, we could conceptualize this as a one-system barrier of formal systems. There is some work on logic- and library-level interoperability—we speak of logical pluralism—using metalogical frameworks (logics to represent logics and their relations) [26, 41, 45]. We contend that this is an important prerequisite for organizing mathematical knowledge in Big Math.
Computation
Computer scientists have a very broad view of what computation is. Be it \(\beta \)-reduction (in the case of the lambda calculus), transitions and operations on a tape (for Turing machines), or rewrites in some symbolic language, all of these are somehow quite removed from what a mathematician thinks when “computing.” Here, for the sake of simplicity and familiarity, we will largely be concerned with symbolic computation, e.g., manipulation of expressions containing symbols that represent mathematical objects. Of course, there are also many flavors of numeric computation such as in scientific computing, simulation/modeling, statistics, and machine learning.
In principle, mathematical computation can be performed by inference, e.g., by building a constructive proof that the sum \(2+2\) exists. But this is not how humans do it—they are wonderfully flexible in switching between the computational and the inferential aspects. Current inference-based systems in wide use have not achieved this flexibility, although systems like Coq, Agda [38, 39], and Idris [24] are making inroads. In any case, computation via inference is intractable (even basic arithmetic ends up in an unexpected complexity class)—somewhat dual to how inference via computation in decision procedures has only limited success.
Instead, the most powerful computation systems are totally separate from inference: computer algebra systems like Maple, Mathematica, SageMath, and GAP can tackle computations that are many orders of magnitude larger than what humans can do—often in mere milliseconds.
But these systems face the same interoperability problems as inference systems do, open standards for object representation like OpenMath [40] and MathML [3] notwithstanding. Just to name a trivial but symptomatic example, a particular dihedral group is called \(D_4\) in Sage and \(D_8\) in GAP due to differing conventions in the respective communities. More mathematically involved and therefore more difficult to fix is that most of the implementations of special functions in computer algebra systems differ in the underlying branch cuts [12]. Inference during computation would enable some of these problems to be fixed, but this has been sacrificed in favor of computational efficiency. Another source of complexity is that today’s most feature-rich symbolic computation systems are both closed-source and commercial, which makes integrating them into a system of trustable tools challenging. That having been said, the kinds of effort devoted to the development of those systems is significantly greater than what can currently be achieved in academia, where code contributions are undervalued compared to the publication of research papers. Furthermore, obtaining funding for sustainable development of large software is difficult.
Lastly, there is the question of acceptability of certain computations in proofs. In part, this derives from the difficulty in determining whether a program written in a mainstream programming language is actually correct, at least to the same level of rigor that other parts of mathematics are subjected to. Some of this problem can be alleviated by the use of more modern programming languages that have well-understood operational and denotational semantics, thus putting them on a sound mathematical footing. Nevertheless, it will remain that any result that requires thousands of lines of code or hours of computation (or more) that cannot be verified by humans is likely to be doubted unless explicit steps are taken to ensure its correctness. The Flyspeck project [22]—a computer verification of Thomas Hales’s proof of the Kepler conjecture [20] in the HOL Light and Isabelle proof assistants of comparable magnitude to the OOT proof effort—provides an interesting case study, since it includes the verification of many complex computations.
Concretization
If we look at the CFSG, we see that that result already contains an instance of concretization: the collection of the 26 sporadic groups, which are concrete mathematical objects that can be represented, for example, in terms of matrices of numbers. Even more importantly, many of the insights that led to the CFSG were reached only by constructing particular groups, which were tabulated (as parts of journal articles and lists that were passed around). We also see this in other Big Math projects, e.g., Langland’s program of trying to relate Galois groups in algebraic number theory to automorphic forms and the representation theory of algebraic groups over local fields and adeles. This is supported by LMFDB, which contains about 80 tables with representations of mathematical objects ranging from elliptic curves to Hecke fields—almost a terabyte of data in all. These are used, for instance, to find patterns in the objects and their properties and to support or refute conjectures. Other well-known examples are OEIS, with over \(300\,000\) integer sequences, and the Small Groups Library [47], with more than 450 million groups of small order. See [4] for a work-in-progress survey of math databases.
Unfortunately, such math databases are typically not integrated into systems for mathematical inference, narration, or knowledge organization and are only weakly integrated into systems for computation. Usually these databases supply a human-oriented web interface. If they also offer application programming interface (API) access to the underlying database, they do so only for database-level interaction, where an elliptic curve might be a quadruple of numbers encoded as decimal strings to work around size restrictions of the underlying database engine. What we would need instead for an integration in a Big Math system is an API that supplies access to the mathematical objects, such as the representations of elliptic curves as they are treated in organization, computation, inference, and narration systems; see [51] for a discussion.
As usual, there are exceptions. The GAP Small Groups Library system and LMFDB are notable examples: the former is deeply integrated into the GAP computer algebra system, and the latter includes almost a thousand narrative descriptions of the mathematical concepts underlying the tabulated objects.
Narration
Consider Figure 2, which shows an intermediate result in the OOT in the Coq inference system (foreground) and its corresponding narrative representation (background). Even though great care has been taken to make the Coq text human-readable, i.e., short and suggestive of traditional mathematical notation, there is still a significant language barrier to all but members of the OOT development team.
Indeed, mathematical tradition is completely different from its representation in inference, computation, and concretization systems. Knowledge and proofs are presented in documents like journal articles, preprints, monographs, and talks for human consumption. While rigor and correctness are important concerns, the main emphasis is on the efficient communication of ideas, insights, intuitions, and inherent connections to colleagues and students. As a consequence, more than half of the text of a typical mathematical document consists of introductions, motivations, recaps, remarks, outlooks, conclusions, and references. Even though this packaging of mathematical knowledge into documents leads to extensive overhead and duplication, it seems to be an efficient way of dealing with the OBB and thus a necessary cost for scholarly communication.
In current proof assistants like Coq, the narration aspect is undersupported even though tools like
have revolutionized mathematical writing. Source comments in the input files are possible in virtually all inference and computation systems, but they are not primary objects for the system and are thus used much less than in narrative representations. Knuth’s literate programming idea [25] has yet to take root in mathematics, although it is worthwhile noting that one of the earliest inference systems, Automath [9], had extensive features for narration. The main modern exception among inference systems is Isabelle: it supports the inclusion of marked-up text and programs, and it turns the underlying mathematical library into a deeply integrated document management system, which allows recursive nesting of narrative, inference, and computation [49].
Some computational communities such as parts of statistics, especially users of the software environment R, make use of some features of literate programming. Jupyter notebooks as used with SageMath as well as the document interfaces of Maple and Mathematica are also somewhat literate, although the simple fact that they do not interoperate smoothly with
hampers their adoption as methods of conveying large amounts of knowledge narratively.
In any case, inference and computation systems are notoriously bad for expressing the vague ideas and underspecified concepts that are characteristic of early phases in the development of mathematical theories and proofs, a task at which narration excels. Therefore, the Flyspeck project [22] used a
book [21] that reorganized the original proof to orchestrate and track the formal proof via cross-references to the identifiers of all formal definitions, lemmas, and theorems. Incidentally, the ongoing effort to establish a second-generation proof of the CFSG has a similar book, consisting of seven volumes already published and five additional volumes that will be published in the future.Footnote 7
Organization
In the discussion and survey of the four corners of the tetrapod from Figure 1, we have seen that all four aspects share and are based on the representation of knowledge, which we call the mathematical ontology,Footnote 8 and that they can interoperate effectively through this ontology. And we have seen that a modular, redundancy-minimizing organization of the ontology is crucial for getting a handle on the inherent complexity of mathematical knowledge.
Most inference and computation systems feature some kind of modularity features to organize their libraries. For inference systems, this was pioneered in the IMPS system [17] in the form of theories and theory interpretations and has been continued, for example, in the Isabelle system (type classes and locales). In systems like Coq and Lean [36] that feature dependent record types, theories and their morphisms can be encoded inside the base logic. Computation systems feature similar concepts. Finally, the MMT system [35, 43] systematically combines modular representation with a metalogical framework, in which the logics and logic-morphisms can be represented, yielding a foundation-independent (bring-your-own-logic) framework for mathematical knowledge representation that can be used to establish system interoperability.