1 Introduction

Automated theorem provers for Euclidean geometry often use numerical models (i.e. diagrams) for heuristic reasoning, e.g. for conjecturing subgoals, pruning branches, checking non-degeneracy conditions, and selecting auxiliary constructions. However, modern solvers rely on diagrams that are either supplied manually [7, 24] or generated automatically via methods that are severely limited in scope [12]. Motivated by the IMO Grand Challenge, an ongoing effort to build an AI that can win a gold medal at the International Mathematical Olympiad (IMO), we present a method for expressing and solving olympiad-level systems of geometric constraints.

Historically, algebraic methods are the most complete and performant for automated geometry diagram construction but suffer from degenerate solutions and, in the numerical case, non-convexity. These methods are restricted to relatively simple geometric configurations, as poor local minima arise when the number of parameters is large. Moreover, degenerate solutions manifest as poor distributions for the vertices of geometric objects (e.g. a nonsensical triangle) as well as intersections of objects at more than one point (e.g. lines and circles, circles and circles).

Fig. 1. An example GMBL program and corresponding diagram generated by the GMB for IMO 2010 Problem 2.

We constructed a domain-specific language (DSL), the Geometry Model-Building Language (GMBL), to express geometry problems in a form whose semantics induce tractable numerical optimization problems. The GMBL includes a set of commands with which users introduce geometric objects and constraints between these objects. There is a direct interpretation from these commands to the parameterization of geometric objects, the computation of geometric quantities from existing ones, and additional numerical constraints. The GMBL employs root selector declarations to disambiguate multiple-solution problems, reparameterizations both to reduce the number of parameters and to increase uniformity in model variance, and joint distributions for geometric objects that are susceptible to degeneracy (i.e. triangles and polygons). Our DSL treats points, lines, and circles as first-class citizens, and the language can be easily extended to support additional high-level features in terms of these primitives.

We provide an implementation of our method, the Geometry Model Builder (GMB), that compiles GMBL programs into TensorFlow computation graphs [1] and generates models via off-the-shelf, gradient-based optimization. Figure 2 gives an overview of this implementation. Experimentally, we find that the GMBL sufficiently reduces the parameter space and mitigates degeneracy to make our target geometry amenable to numerical optimization. We tested our method on all IMO geometry problems since 2000 (\(n=39\)), of which 36 can be expressed as GMBL programs. Using default parameters, the GMB finds a single model for 94% of these 36 problems in an average of 27.07 seconds. Of the problems for which our program found a model and the goal of the problem could be stated in our DSL, the goal held in the final model 86% of the time.

All code is available on GitHub (see Footnote 1), and users can use it to write GMBL programs and generate diagrams. Our program can be run either as a command-line tool for integration with theorem provers or as a locally hosted web server.

2 Background

Here we provide an overview of olympiad-level geometry problem statements, as well as several challenges presented by the associated constraint problems.

2.1 Olympiad-Level Geometry Problem Statements

IMO geometry problems are stated as a sequential introduction of potentially-constrained geometric objects, as well as additional constraints between entities. Such constraints can take one of two forms: (1) geometric constraints describe the relative position of geometric entities (e.g. two lines are parallel) while (2) dimensional constraints enforce specific numerical values (e.g. angle, radius). Lastly, problems end with a goal (or set of goals) typically in the form of geometric or dimensional constraints. The following is an example from IMO 2009:

[figure a: the statement of IMO 2009 Problem 2, referred to below as (IMO 2009 P2)]

This problem introduces ten named geometric objects and has a single goal.

Note that this class of problems does not admit a mathematical description but rather is defined empirically (i.e. as those problems selected for olympiads). The overwhelming majority of these problems are of a particular type: plane geometry problems that can be expressed as problems in nonlinear real arithmetic (NRA). However, while NRA is technically decidable, olympiad problems tend to be littered with order constraints and complex constructions (e.g. mixtilinear incenter) and to be well beyond the capability of existing algebraic methods. On the other hand, they are selected to admit elegant, human-comprehensible proofs. It is this class of problems that the GMBL was designed to express; although exceptions are rare, a given olympiad geometry problem is not guaranteed to be of this type and is therefore not necessarily expressible in the GMBL.

Fig. 2. An overview of our method. Our program takes as input a GMBL program and translates it to a set of real-valued parameters and differentiable losses in the form of a static computation graph. We then apply gradient-based optimization to obtain numerical models and display them as diagrams.

2.2 Challenge: Globally Coupled Constraints

A naïve approach to generate models would incrementally instantiate objects via their immediate constraints. For (IMO 2009 P2), this would work as follows:

  1. Sample points A, B, and C.

  2. Compute O as the circumcenter of \(\varDelta ABC\).

  3. Sample P and Q on the segments CA and AB, respectively.

  4. Compute K, L, and M as the midpoints of BP, CQ, and PQ, respectively.

  5. Compute \(\varGamma \) as the circle defined by K, L, and M.

Immediately we see a problem: there is no guarantee that PQ is tangent to \(\varGamma \) in the final model. Indeed, the constraints of (IMO 2009 P2) are quite globally coupled. The choice of P partially determines the circle \(\varGamma \) to which PQ must be tangent, and not every choice of \(\varDelta ABC\) even admits a pair P and Q satisfying this constraint. This is an example of the frequent non-constructive nature of IMO geometry problems. When there is no obvious reparameterization to avoid downstream effects, all constraints must be considered simultaneously rather than incrementally or as a set of smaller local optimization problems.
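The failure of the naïve approach can be observed numerically. The following is a minimal sketch in plain Python, not the GMB implementation: the helper names, the sampling ranges, and the choice of residual (the gap between the radius of \(\varGamma \) and the distance from its center to the line PQ) are all assumptions made for illustration.

```python
import math
import random

def circumcircle(a, b, c):
    """Return (center, radius) of the circle through three non-collinear points."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return (ux, uy), math.hypot(ax - ux, ay - uy)

def lerp(p, q, t):
    """Point at fraction t of the way from p to q."""
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))

def dist_point_line(x, p, q):
    """Distance from point x to the line through p and q."""
    cross = (q[0] - p[0]) * (x[1] - p[1]) - (q[1] - p[1]) * (x[0] - p[0])
    return abs(cross) / math.hypot(q[0] - p[0], q[1] - p[1])

def naive_construction(rng):
    """Run steps 1-5 and return (O, tangency residual)."""
    A, B, C = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(3)]  # step 1
    O, _ = circumcircle(A, B, C)                                            # step 2
    P = lerp(C, A, rng.uniform(0.1, 0.9))                                   # step 3
    Q = lerp(A, B, rng.uniform(0.1, 0.9))
    K, L, M = lerp(B, P, 0.5), lerp(C, Q, 0.5), lerp(P, Q, 0.5)             # step 4
    center, radius = circumcircle(K, L, M)                                  # step 5
    # PQ is tangent to the circle exactly when this residual is zero.
    return O, abs(dist_point_line(center, P, Q) - radius)

rng = random.Random(0)
residuals = [naive_construction(rng)[1] for _ in range(20)]
print(max(residuals))
```

Because nothing in steps 1 to 5 refers to the tangency condition, the residual is generically nonzero, which is why the constraint has to enter the optimization itself.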

2.3 Challenge: Root Resolution

Even in the constructive case, local optimization is not necessarily sufficient given that multiple solutions can exist for algebraic constraints. More specifically, two circles or a circle and a line intersect at up to two distinct points, and in a problem that specifies each distinct intersection point, the correct root to assign is generally not locally deducible. Without global information, this can lead to poor initializations becoming trapped in local minima. The GMBL accounts for this by including a set of explicit root selectors as described in Section 3.3. These root selectors provide global information for selecting the appropriate point from a set of multiple solutions to a system of equations.
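To make the ambiguity concrete, the sketch below intersects a line with a circle by solving the underlying quadratic. It is an illustration in plain Python with a hypothetical function name, not code from the GMB; the point is that the algebra alone returns both roots and gives no reason to prefer either.

```python
import math

def inter_line_circle(p, q, center, radius):
    """Return the 0, 1, or 2 points where the line through p and q meets a circle."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    fx, fy = p[0] - center[0], p[1] - center[1]
    # Substitute p + t * (q - p) into |x - center|^2 = radius^2: a t^2 + b t + c = 0.
    a = dx * dx + dy * dy
    b = 2.0 * (fx * dx + fy * dy)
    c = fx * fx + fy * fy - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return []  # the line misses the circle
    if disc == 0:
        ts = [-b / (2.0 * a)]  # tangent: a single repeated root
    else:
        s = math.sqrt(disc)
        ts = [(-b - s) / (2.0 * a), (-b + s) / (2.0 * a)]
    return [(p[0] + t * dx, p[1] + t * dy) for t in ts]

roots = inter_line_circle((-2.0, 0.0), (2.0, 0.0), (0.0, 0.0), 1.0)
print(roots)  # two candidate points; which one a problem means is not locally deducible
```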

3 Methods

In this section we present the GMBL and GMB in detail. In our presentation, we make use of the following notation and definitions:

  • The type of a geometric object can be one of (1) point, (2) line, or (3) circle. We denote the type of a real-valued number as number.

  • We use \(\texttt {<>}\) to denote an instance of a type.

  • A name is a string value that refers to a geometric object.

3.1 GMBL: Overview

The GMBL is a DSL for expressing olympiad-level geometry problems that losslessly induces a numerical optimization problem. It consists of four commands, each of which has a direct interpretation regarding the accumulation of (1) real-valued parameters and (2) differentiable losses in terms of these parameters:

  1. param: assigns a name to a new geometric object parameterized either by a default or optionally supplied parameterization

  2. define: assigns a name to an object computed in terms of existing ones

  3. assert: imposes an additional constraint (i.e. differentiable loss value)

  4. eval: evaluates a given constraint in the final model(s)

Table 1 provides a summary of their usage. The GMBL includes an extensible library of functions and predicates with which commands are written. Notably, this library includes a notion of root selection to explicitly resolve the selection of roots to systems of equations with multiple solutions.

3.2 GMBL: Commands

In the following, we describe in more detail the usage of each command and their roles in constructing a tractable numerical optimization problem.

param accepts as arguments a string, a type, and an optional parameterization. This introduces a geometric object that is parameterized either by the default parameterization for \(\texttt {<type>}\) or by the supplied method. Each primitive geometric type has the following default parameterization:

  • point: parameterized by its x- and y-coordinates

  • line: parameterized by two points that define the line

  • circle: parameterized by its center and radius

Optional parameterizations embody our method’s use of reparameterization to decrease the number of parameters and increase model diversity. For example, consider a point C on the line \(\overleftrightarrow {AB}\) that is subject to additional constraints. Rather than optimizing over the x- and y-coordinates of C, we can express C in terms of a single value z that scales C’s placement on the line \(\overleftrightarrow {AB}\).
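The reparameterization in this example can be sketched as follows. The function name and the affine form \(C = A + z(B - A)\) are assumptions chosen for illustration; the source states only that a single value z scales C's placement on the line.

```python
def point_on_line(a, b, z):
    """Place C on line AB using one scalar: C = A + z * (B - A)."""
    return (a[0] + z * (b[0] - a[0]), a[1] + z * (b[1] - a[1]))

def collinearity_residual(a, b, c):
    """Cross product of (B - A) and (C - A); zero exactly when A, B, C are collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

A, B = (0.0, 0.0), (2.0, 2.0)
C = point_on_line(A, B, 0.25)
print(C, collinearity_residual(A, B, C))
```

Under this form, collinearity holds by construction for every z, so no loss term is needed for it and the optimizer sees one parameter instead of two.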

In addition to the standard usage of param outlined above, the GMBL includes an important variant of this command to introduce sets of points that form triangles and polygons. This variant accepts as arguments (1) a list of point names, and (2) a required parameterization (see Table 1). This joint parameterization of triangles and polygons further prevents degeneracy. For example, to initialize a triangle \(\varDelta ABC\), we can sample the vertices from normal distributions with means at distinct thirds of the unit circle. This method minimizes the sampling of triangles with extreme angle values and allows for explicit control over the distribution of acute vs. obtuse triangles by adjusting the standard deviations. Appendix C includes a list of all available parameterizations (see Footnote 2).
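One possible reading of this triangle initialization is sketched below. It assumes that each vertex lies on the unit circle and that its angular position is drawn from a normal distribution centered at 0, \(2\pi /3\), or \(4\pi /3\); the function names and the default standard deviation are hypothetical, and the GMB may sample differently.

```python
import math
import random

def sample_triangle(rng, sigma=0.2):
    """Sample three vertices whose angular positions are normal around thirds of the circle."""
    vertices = []
    for k in range(3):
        theta = rng.gauss(2.0 * math.pi * k / 3.0, sigma)  # sigma is an assumed default
        vertices.append((math.cos(theta), math.sin(theta)))
    return vertices

def interior_angles(tri):
    """Interior angles (radians) at each vertex of a triangle."""
    angles = []
    for i in range(3):
        v, u, w = tri[i], tri[(i + 1) % 3], tri[(i + 2) % 3]
        ux, uy = u[0] - v[0], u[1] - v[1]
        wx, wy = w[0] - v[0], w[1] - v[1]
        cos_t = (ux * wx + uy * wy) / (math.hypot(ux, uy) * math.hypot(wx, wy))
        angles.append(math.acos(max(-1.0, min(1.0, cos_t))))
    return angles

rng = random.Random(0)
triangles = [sample_triangle(rng) for _ in range(200)]
smallest = min(min(interior_angles(t)) for t in triangles)
print(math.degrees(smallest))
```

A larger sigma spreads the vertices further from the equilateral configuration, which is the knob the text describes for trading off acute against obtuse triangles.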

Table 1. An overview of usage for the four commands.

define accepts as arguments a string, a type, and a value that is one of \(\texttt {<point>}\), \(\texttt {<line>}\), or \(\texttt {<circle>}\). This command serves as a basic assignment operator and is useful for caching commonly used values. The functions described in Section 3.3 are used to construct \(\texttt {<value>}\) from existing geometric objects.

assert accepts a single predicate and imposes it as an additional constraint on the system. This is achieved by translating the predicate to a set of algebraic values and registering them as losses. This command does not introduce any new geometric objects and can only refer to those already introduced by param or define. Notably, dimensional constraints and negations are always enforced via assert. Detail on supported predicates is presented in Section 3.3.

eval, like assert, accepts a single predicate and therefore does not introduce any new geometric objects. However, unlike assert, the corresponding algebraic values are evaluated and returned with the final model rather than registered as losses and enforced via optimization. This command is most useful for those interested in integrating the GMBL with theorem provers.
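The difference between assert and eval can be illustrated with a small sketch. Everything here is hypothetical: the class, the predicate constructors, and the squared-residual loss are stand-ins chosen to show the idea that asserted predicates become loss terms while evaluated predicates are only measured in the final model.

```python
import math

def para(a, b, c, d):
    """Residual that is zero when line ab is parallel to line cd."""
    def residual(m):
        ux, uy = m[b][0] - m[a][0], m[b][1] - m[a][1]
        vx, vy = m[d][0] - m[c][0], m[d][1] - m[c][1]
        return ux * vy - uy * vx
    return residual

def dist_eq(a, b, value):
    """Residual for the dimensional constraint |ab| = value."""
    def residual(m):
        return math.hypot(m[b][0] - m[a][0], m[b][1] - m[a][1]) - value
    return residual

class ConstraintSystem:
    def __init__(self):
        self.losses = []  # predicates registered by assert: enforced by optimization
        self.checks = []  # predicates registered by eval: only measured afterwards

    def assert_(self, residual):
        self.losses.append(residual)

    def eval_(self, residual):
        self.checks.append(residual)

    def total_loss(self, model):
        return sum(r(model) ** 2 for r in self.losses)

    def evaluate(self, model):
        return [r(model) for r in self.checks]

system = ConstraintSystem()
system.assert_(para("A", "B", "C", "D"))
system.eval_(dist_eq("A", "D", 2.0))

model = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (0.0, 1.0), "D": (2.0, 1.0)}
print(system.total_loss(model), system.evaluate(model))
```

In this toy model the asserted parallelism contributes zero loss, while the evaluated distance constraint is simply reported with a nonzero residual; nothing pushes the model to satisfy it.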

3.3 GMBL: Functions and Predicates

The second component of our DSL is a set of functions and predicates for constructing arguments to the commands outlined above. Functions construct new geometric objects and numerical values whereas predicates describe relationships between them. Our DSL includes high-level abstractions for common geometric concepts in olympiad geometry (e.g. excircle, isotomic conjugate).

Functions in the GMBL employ a notion of root selectors to address the “multiple solutions problem” described in Section 2.3. In plane geometry, this problem typically manifests with multiple candidate point solutions, such as the intersection between a line and a circle. Root selectors control for this by allowing users to specify the appropriate point for functions with multiple solutions. Figure 3 demonstrates their usage in the functions inter-lc (intersection of a line and circle) and inter-cc (intersection of two circles).
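A root selector can be thought of as a function that picks one point from the candidate roots using information from elsewhere in the problem. The sketch below computes both intersections of two circles and applies two illustrative selectors; the selector names and behaviors are assumptions for this example and are not taken from the list in the appendices.

```python
import math

def inter_cc(c1, r1, c2, r2):
    """Return the intersection points of two circles (empty if they do not meet)."""
    dx, dy = c2[0] - c1[0], c2[1] - c1[1]
    d = math.hypot(dx, dy)
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        return []
    a = (r1 * r1 - r2 * r2 + d * d) / (2.0 * d)  # distance from c1 to the radical line
    h = math.sqrt(max(r1 * r1 - a * a, 0.0))     # half the length of the common chord
    mx, my = c1[0] + a * dx / d, c1[1] + a * dy / d
    ox, oy = -dy * h / d, dx * h / d
    return [(mx + ox, my + oy), (mx - ox, my - oy)]

def closer_to(ref):
    """Hypothetical selector: pick the root nearest a reference point."""
    return lambda roots: min(roots, key=lambda r: math.hypot(r[0] - ref[0], r[1] - ref[1]))

def different_from(known):
    """Hypothetical selector: pick the root farthest from an already-named point."""
    return lambda roots: max(roots, key=lambda r: math.hypot(r[0] - known[0], r[1] - known[1]))

roots = inter_cc((0.0, 0.0), 1.0, (1.0, 0.0), 1.0)
upper = closer_to((0.0, 2.0))(roots)
lower = different_from(upper)(roots)
print(upper, lower)
```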

Importantly, arguments to predicates and functions can be specified with functions rather than named geometric objects. For a list of supported functions, predicates, and root selectors, refer to Appendices A, B, and C, respectively.

Fig. 3. An example usage of root selectors to resolve the intersections of lines and circles, and circles and circles.

3.4 Auxiliary Losses

The optimization problem encoded by a GMBL program includes three additional loss values. Foremost, for every instance of a circle intersecting a line or other circle, we impose a loss value that ensures the two geometric objects indeed intersect. The final two losses, albeit opposing, are intended to minimize global degeneracy. We impose one loss that minimizes the mean of all point norms to prevent objects from drifting exceptionally far apart and a second to enforce a sufficient distance between points to maintain distinctness.
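The three auxiliary losses might take forms like the following. This is a sketch under stated assumptions: the hinge-style penalties, the min_dist threshold, and the function names are illustrative choices, since the text describes the purpose of each loss but not its exact formula.

```python
import math
from itertools import combinations

def intersection_loss(p, q, center, radius):
    """Zero when the line through p and q meets the circle; grows with the gap otherwise."""
    cross = (q[0] - p[0]) * (center[1] - p[1]) - (q[1] - p[1]) * (center[0] - p[0])
    gap = abs(cross) / math.hypot(q[0] - p[0], q[1] - p[1]) - radius
    return max(0.0, gap) ** 2

def mean_norm_loss(points):
    """Mean distance of all points from the origin; discourages far-flung objects."""
    return sum(math.hypot(x, y) for x, y in points) / len(points)

def distinctness_loss(points, min_dist=0.1):
    """Penalize every pair of points closer than min_dist (an assumed threshold)."""
    total = 0.0
    for a, b in combinations(points, 2):
        d = math.hypot(a[0] - b[0], a[1] - b[1])
        total += max(0.0, min_dist - d) ** 2
    return total

print(intersection_loss((-2.0, 3.0), (2.0, 3.0), (0.0, 0.0), 1.0))
print(mean_norm_loss([(3.0, 4.0), (0.0, 0.0)]))
print(distinctness_loss([(0.0, 0.0), (0.0, 0.0), (1.0, 1.0)]))
```

The last two terms pull in opposite directions, one contracting the diagram and the other spreading points out, so their relative weights matter in practice.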

3.5 Implementation

We built the GMB, an open-source implementation that compiles GMBL programs to optimization problems and generates models. The GMB takes as input a GMBL program and processes each command in sequence to accumulate real-valued parameters and differentiable losses in a TensorFlow computation graph. After registering auxiliary losses, we apply off-the-shelf gradient-based local optimization to produce models of the constraint system. In summary, to generate N numerical models, our optimization procedure works as follows:

  1. Construct computation graph by sequentially processing commands.

  2. Register auxiliary losses.

  3. Sample sets of initial parameter values and rank via loss value.

  4. Choose (next) best initialization and optimize via gradient descent.

  5. Repeat (4) until obtaining N models or the maximum number of tries is reached.

Our program accepts as arguments (1) the number of models desired (\(\mathrm {default} = 1\)), (2) the number of initializations to sample (\(\mathrm {default} = 10\)), and (3) the maximum number of optimization tries (\(\mathrm {default} = 3\)). Our program also accepts the standard suite of parameters for training a TensorFlow model, including an initial learning rate (\(\mathrm {default} = 0.1\)), a decay rate (\(\mathrm {default} = 0.7\)), the maximum number of iterations (\(\mathrm {default} = 5000\)), and an epsilon value (\(\mathrm {default} = 0.001\)) to determine stopping criteria.
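Steps 3 to 5 of the procedure, with the default arguments listed above, can be sketched in plain Python on a toy problem. This is not the GMB's TensorFlow code: the finite-difference gradient, the exponential decay schedule with a hypothetical decay_steps parameter, and the stopping rule (loss below epsilon) are all assumptions, and the toy loss simply asks for a point at unit distance from both (0, 0) and (1, 0).

```python
import math
import random

def numeric_grad(loss, params, h=1e-6):
    """Central-difference gradient (a stand-in for automatic differentiation)."""
    grad = []
    for i in range(len(params)):
        up = params[:i] + [params[i] + h] + params[i + 1:]
        dn = params[:i] + [params[i] - h] + params[i + 1:]
        grad.append((loss(up) - loss(dn)) / (2.0 * h))
    return grad

def optimize(loss, params, lr=0.1, decay=0.7, decay_steps=1000, max_iters=5000, eps=1e-3):
    """Gradient descent with a decaying learning rate; succeed once loss < eps."""
    for step in range(max_iters):
        if loss(params) < eps:
            return params, True
        rate = lr * decay ** (step / decay_steps)  # assumed decay schedule
        grad = numeric_grad(loss, params)
        params = [p - rate * g for p, g in zip(params, grad)]
    return params, loss(params) < eps

def generate_models(loss, sample_init, rng, n_models=1, n_inits=10, max_tries=3):
    """Sample initializations, rank them by loss, and optimize the best ones in turn."""
    inits = sorted((sample_init(rng) for _ in range(n_inits)), key=loss)
    models = []
    for init in inits[:max_tries]:
        params, ok = optimize(loss, init)
        if ok:
            models.append(params)
        if len(models) >= n_models:
            break
    return models

def toy_loss(params):
    x, y = params
    return (math.hypot(x, y) - 1.0) ** 2 + (math.hypot(x - 1.0, y) - 1.0) ** 2

models = generate_models(
    toy_loss, lambda rng: [rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0)], random.Random(0)
)
print(models)
```

Ranking the sampled initializations before optimizing means the expensive descent is spent on the most promising starting points first, and a successful run stops early instead of exhausting all tries.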

Table 2. An evaluation of our method’s ability to generate a single model for each of the 36 IMO problems encoded in our DSL. For each problem, 10 sets of initial parameters were sampled, of which our program optimized up to three. All data shown are the average of three trials. The first row demonstrates results using default parameters (\(\epsilon = 0.001\), \(\mathrm {learning\,rate} = 0.1\), \(\mathrm {\#\,iterations} = 5,000\)).

4 Results

In this section, we present an evaluation of our method’s proficiency in three areas of expressing and solving olympiad-level geometry problems:

  1. Expressing olympiad-level geometry problems as GMBL programs.

  2. Generating models for these programs.

  3. Preserving truths (up to tolerance) that are not directly optimized for.

Table 2 contains a summary of our results.

Our evaluation considers all 39 IMO geometry problems since 2000. Of these 39 problems, 36 can be expressed in our DSL. Those that we cannot encode involve variable numbers of geometric objects. For 32 of these 36 problems, we can express the goals as eval commands in the corresponding GMBL programs. The goals of the remaining four problems are not expressible in our DSL, e.g. our DSL cannot express goals of the form “Find all possible values of \(\angle ABC\).”

To evaluate (2) and (3), we conducted three trials in which we ran our program on each of the 36 encodings with varying sets of arguments. With default arguments, our program generated a single model for (on average) 94% of these problems. Our program ran for an average of 27.07 seconds for each problem but there is a stark difference between time to success and time to failure (14.72 vs 223.51 seconds) as failure entails completing all optimization attempts whereas successful generation of a model terminates the program. We achieve similar success rates with more forgiving training arguments or a higher tolerance.

For use in automated theorem proving, it is essential that models generated by our tool not only satisfy the constraint problem up to tolerance but also any other truths that follow from the set of input constraints. The most immediate example of such a truth is the goal of a problem statement. Therefore, we used the goals of IMO geometry problems as a proxy for this ability by only checking the satisfaction of the goal in the final model (i.e. with an eval statement) rather than directly optimizing for it. In our experiments, we considered such a goal satisfied if it held up to \(\epsilon * 10\) as it is reasonable to expect slightly higher floating-point error without explicit optimization. Using default parameters, the goal held up to tolerance in 86% of problems for which we found a model and could express the goal. This rate was similar across all other sets of arguments.
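The goal check described here amounts to comparing each evaluated residual against a relaxed tolerance. A minimal sketch follows, with a hypothetical function name and a slack argument standing in for the factor of 10; with the default \(\epsilon = 0.001\) the threshold is 0.01.

```python
def goal_holds(residuals, eps=0.001, slack=10.0):
    """Treat a goal as satisfied when every residual is within eps * slack of zero."""
    return all(abs(r) <= eps * slack for r in residuals)

print(goal_holds([0.004]))   # within the relaxed tolerance of 0.01
print(goal_holds([0.02]))    # outside it
```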

5 Future Work

Here we discuss various opportunities for improvement of our method.

Firstly, improvements could be made to our method of numerical optimization. While TensorFlow offers a convenient way of caching terms via a static computation graph and optimizing directly over this representation, there is no explicit support for constrained optimization. Because of this, arbitrary weights have to be assigned to each loss value. Though rare, this can result in false positives and negatives for the satisfaction of a constraint. Using an explicit constrained-optimization method (e.g. SLSQP) would enable the separation of soft constraints (e.g. maximizing the distance between points) and hard constraints (e.g. those enforced by assert), removing the need for arbitrary weights.

Secondly, cognitive overhead could be reduced as users are currently required to determine degrees of freedom; it would be far easier to write problem statements using only declarations of geometric objects and constraints between them, e.g. using only assert. This could be accomplished by treating our DSL as a low-level “instruction set” to which a higher-level language could be compiled. The main challenge of such a compiler would be appropriately identifying opportunities to reduce the degrees of freedom. To achieve this, the compiler would require a decision procedure for line and circle membership.

Lastly, we could improve our current treatment of distinctness. To prevent degenerate solutions, our method optimizes for object distinctness and rejects models with duplicates. However, there is the occasional problem for which a local optimum encodes two provably distinct points as equal up to floating-point tolerance. There are many techniques that could be applied to this problem (e.g. annealing), though we do not consider them here as the issue is rare.

6 Related Work

Though many techniques for mechanized geometry diagram construction have been introduced over the decades, no method, to the best of our knowledge, can produce models for more than a negligible fraction of olympiad problems. There exist many systems, built primarily for educational purposes, for interactively generating diagrams using ruler-and-compass constructions, e.g. GCLC [13], GeoGebra [11], Geometer’s Sketchpad [20], and Cinderella [19]. There are also non-interactive methods for deriving such constructions, e.g. GeoView [2] and program synthesis [9, 12]. However, as discussed in Section 2.2, very few olympiad problems can be described in such a form. Alternatively, Penrose is an early-stage system for translating mathematical descriptions to diagrams that relies on constrained numerical optimization and therefore does not suffer from this expressivity limitation [25]. However, this system lacks support for constraints with multiple roots, e.g. intersecting circles. There are more classical methods that similarly depart from constructive geometry. MMP/Geometer [8] translates the problem to a set of algebraic equations and uses numerical optimization (e.g. BFGS), while GEOTHER [22, 23] first translates a predicate specification into polynomial equations, decomposes this system into representative triangular sets, and obtains solutions for each set numerically. Neither of these programs is available to evaluate, though we did test similar approaches using modern libraries (specifically: sympy [17] and scipy [21]), and both numerical and symbolic methods would almost always time out on relatively simple olympiad problems.

Generating models for systems of geometric constraints is also a challenge in computer-aided design (CAD) for engineering diagram drawing. Recent efforts focus on graph-based synthetic methods, a subset of techniques concerned with ruler-and-compass constructions [3, 5, 6, 10, 14, 16, 18]. Most relevant to our method are Bettig and Shah’s “solution selectors” which, similar to root selectors in the GMBL, allow users to specify the configuration of a CAD model [4]. However, these solution selectors are purpose-built and do not generalize.

7 Conclusion

It is standard in geometry theorem proving (GTP) to rely on diagrams for heuristic reasoning, but the scale of automatic diagram construction is limited. To enable efforts to build a solver for IMO geometry problems, we developed a method for building diagrams for olympiad-level geometry problems. Our method is based on the GMBL, a DSL for expressing geometry problems that induces (usually) tractable numerical optimization problems. The GMBL includes a set of commands that have a direct interpretation for accumulating real-valued parameters and differentiable losses. Arguments to these commands are constructed with a library of functions and predicates that includes notions of root selection, joint distributions, and reparameterizations to minimize degeneracy and the number of parameters. We implemented our approach in an open-source tool that translates GMBL programs to diagrams. Using this program, we evaluated our method on all IMO geometry problems since 2000. Our implementation reliably produces models; moreover, known truths that are not directly optimized for typically hold up to tolerance. By handling configurations of this complexity, our system clears a roadblock in GTP and provides a critical tool for those undertaking the IMO Grand Challenge.