Measuring the Readability of Geometric Proofs: The Area Method Case

Quaresma, Pedro; Graziani, Pierluigi

doi:10.1007/s10817-022-09652-0

Open access
Published: 10 January 2023

Volume 67, article number 5, (2023)

Journal of Automated Reasoning

This article has been updated

Abstract

Using an approach inspired by our modernisation of Lemoine’s Geometrography, this paper proposes a new readability criterion for formal proofs produced by automated theorem provers for geometry. We analyse two criteria to measure the readability of a proof: the criterion given by Chou et al. and the one given by Wiedijk. After discussing the limitations of these two criteria, we introduce a novel approach, which provides a new criterion. We conclude by discussing some future work.

1 Introduction

An important feature of a text is its readability. Readability is the ease with which a reader can understand a written text. The readability of a text is determined by many factors and plays an important role in many areas of interest. For example, it might depend both on the content of the text, i.e., the complexity of the vocabulary and syntax, and on the layout of the text, e.g. its typographical aspects. Readability has to be distinguished from legibility, which is the ease with which a reader can recognise individual characters in a text.

In order to quantify the readability of a text, various formulas have been defined [5]. In this paper we will deal with the readability of geometric proofs produced by automatic theorem provers. A potential approach that has been followed in the past is to take formulas that were developed for applications to non-scientific texts and apply them to mathematical texts [13]. However, this sort of approach has not been extended in a proper way to measure the readability of proofs produced by automatic theorem provers and much more work needs to be done^{Footnote 1} [6, 12, 19, 22, 26].

A mathematical text is composed of many elements: descriptions in natural language, formulas and diagrams; thus, it is much more difficult to quantify its readability through formulas than in the case of regular text. Even more complex is the problem of the readability of mathematical proofs produced by automatic provers, which are often presented in a form that can only be read by experts.

In this paper, we will introduce both a language to formulate readability criteria for formal proofs produced by automated theorem provers for geometry, based on the area method [2, 10] (see Appendix 1), and a novel criterion based on our modernisation of Lemoine’s Geometrography [14, 22, 24]. We will show how this new criterion is consistent with the results of the other already existing criteria, but that it is also more general and expressive compared to the others.

The proposed language allows for an easy formulation of new readability criteria as well as for an easy implementation of those criteria in repositories such as the Thousand of Geometric problems for geometric Theorem Provers (TGTP),^{Footnote 2} thereby collecting data relevant to the further development of the area of automated theorem proving in geometry. The data will help strengthen the use of automatic tools not only in research but also in applications such as mathematics education, where the use of automated deduction is already making its way [7, 23]. Therefore, as in the Automath project, the formulation of a readability criterion will allow the definition of a threshold below which “people will start using them (the proofs produced by automated theorem provers^{Footnote 3}) for serious work” (see §2.2).

Overview of the paper. The paper is organised as follows: first, in §2, the known readability criteria will be discussed. In §3, Lemoine’s Geometrography, its modernisation and a formal language employed to study readability of formal proofs produced by automated theorem provers for geometry, based on the area method, will be analysed. In §4, a new readability criterion that uses Geometrography will be presented, providing also some examples of its application. In §5 conclusions are drawn and future work will be discussed.

2 Criteria of Readability (by Experts)

To the best of our knowledge there are two precise proposals to measure the readability of a proof. The first one is that proposed by Chou et al. [1, p. 452], while the second is that proposed by Freek Wiedijk and is known as the de Bruijn factor [4, 27].

2.1 Maxt-Lems Criterion

Chou et al. [1, p. 452] proposed a way to measure how difficult it is to read a formal proof, obtained by using an automated theorem prover for geometry (GATP) implementing the area method. The Maxt-Lems (ML) criterion considered the following pair (maxt, lems), where:

maxt is the number of terms of the maximal polynomial occurring in the machine’s proof. Thus, maxt measures the number of computations needed in the proof;
lems is the number of elimination lemmas used to eliminate points from geometric quantities. In other words, lems indicates the number of deduction steps in the proof.

Using those two elements and analysing all the proofs done by their GATP, they managed to determine an indicative threshold for readability. According to [1, p. 452] a formal proof, which employs the area method, is considered readable if one of the following conditions holds:

the maximal term in the proof is less than or equal to 5;
the number of deduction steps of the proof is less than or equal to 10;
the maximal term in the proof is less than or equal to 10 and the deduction steps are less than or equal to 20.
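The three conditions above can be read as a single disjunction over the pair (maxt, lems). The following is a minimal sketch of that check; the function name `is_readable_ml` and the sample values are illustrative assumptions, not part of Chou et al.'s work.

```python
def is_readable_ml(maxt: int, lems: int) -> bool:
    """Return True if at least one of the three Maxt-Lems conditions holds.

    maxt: number of terms of the maximal polynomial in the proof.
    lems: number of elimination lemmas (deduction steps) in the proof.
    """
    return maxt <= 5 or lems <= 10 or (maxt <= 10 and lems <= 20)


# Illustrative (made-up) pairs, one per line.
print(is_readable_ml(1, 3))    # True: both individual conditions hold
print(is_readable_ml(8, 15))   # True: only the combined condition holds
print(is_readable_ml(12, 25))  # False: no condition holds
```

Note that the combined third condition only matters for proofs with $5 < \textrm{maxt}\le 10$ and $10 < \textrm{lems}\le 20$, since any proof satisfying one of the first two conditions is already accepted.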

It is interesting that according to their corpus:^{Footnote 4} $66.9\%$ of the proofs have $\textrm{maxt}\le 5$, $42.6\%$ have $\textrm{lems} \le 10$ and $73.2\%$ have $\textrm{lems}\le 20$.

Let us consider, for example, the Thousand of Geometric problems for geometric Theorem Provers (TGTP) repository, specifically problem GEO0001, Ceva’s Theorem.

Theorem 1

(Ceva’s Theorem) Let $\Delta ABC$ be a triangle and P be any point in the plane. Let $D=AP\cap CB$, $E=BP\cap AC$, and $F=CP\cap AB$. Show that: $\frac{\overline{AF}}{\overline{FB}} \times \frac{\overline{BD}}{\overline{DC}} \times \frac{\overline{CE}}{\overline{EA}} = 1$. P should not be in the lines parallel to AC, AB and BC and passing through B, C and A respectively [10].

With respect to the ML criterion, considering the proof made by the Geometry Constructions LaTeX Converter (GCLC) [11] GATP (see Appendix 1), the values are: $\textrm{maxt} = 1$ and $\textrm{lems} = 3$. Therefore, this would be considered a readable proof.

2.2 The de Bruijn Factor

The Automath project had the goal of developing a system that would allow entire mathematical theories to be written in such a precise fashion that verification of the correctness of theorems in such theories could be carried out by formal (mechanical) operations applied directly to the text [4]. This was a first effort in the direction of the Formalisation of Mathematics that is now pursued by researchers working in systems like Coq, Isabelle and Mizar.^{Footnote 5}

In “A Survey of Project Automath”, de Bruijn introduced the loss factor between the size of an ordinary mathematical exposition and its full formal translation inside a computer. The loss factor expresses what someone loses, in terms of shortness, when translating informal mathematics into Automath. Wiedijk developed the concept and called it the de Bruijn factor. The de Bruijn factor was developed for a situation where a proof is entered in a computer in full detail in such a way that the computer can check its correctness, e.g., when an existing informal mathematical text is taken and translated into a computer representation (using a system like Automath). So the de Bruijn factor measures how efficient a system is [27]. Wiedijk noted that non-meaningful questions about formatting could affect the calculation of the loss factor, for example: if indentation is performed employing the tab key, then such indentation can be eight times smaller compared to situations in which the indentation is done using the space key; also the TeX macro name for the ‘$\Leftrightarrow$’ symbol uses 15 characters, while an encoding like “$<=>$” uses only 3. To smooth out the effect of such formatting choices, Wiedijk proposed to compress the files before calculating the ratios of their sizes. Wiedijk calls the ratio of the uncompressed file sizes the apparent de Bruijn factor, and the ratio of the compressed file sizes the intrinsic de Bruijn factor [27].
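The two ratios described above can be sketched in a few lines. This is only an illustration: the use of gzip as the compressor, the function name `de_bruijn_factors` and the sample byte strings are assumptions made here, not details fixed by the text.

```python
import gzip


def de_bruijn_factors(informal: bytes, formal: bytes) -> tuple[float, float]:
    """Return (apparent, intrinsic) de Bruijn factors.

    apparent:  ratio of the uncompressed sizes (formal / informal).
    intrinsic: ratio of the compressed sizes (formal / informal).
    gzip is an assumed choice of compressor for this sketch.
    """
    apparent = len(formal) / len(informal)
    intrinsic = len(gzip.compress(formal)) / len(gzip.compress(informal))
    return apparent, intrinsic


# Made-up texts: the same "formal" content indented with tabs or with spaces.
informal = b"The product of the three ratios equals one.\n"
formal_tabs = b"\teliminate point; simplify\n" * 20
formal_spaces = b"        eliminate point; simplify\n" * 20

print(de_bruijn_factors(informal, formal_tabs))
print(de_bruijn_factors(informal, formal_spaces))
```

The apparent factor grows when tabs are replaced by spaces, while compression absorbs most of that purely typographical difference, which is the motivation given for the intrinsic factor.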

We claim that the de Bruijn factor can be used, in a broader sense, to measure the efficiency of an automated theorem prover and a given axiomatisation. Whenever an informal proof is known for a given theorem, it can be compared with the formal proof produced by the automated theorem prover, using a specified axiomatisation. This is particularly true in geometry, where a given informal geometric proof can be compared with a formal proof, also geometric, produced by a geometric automated theorem prover.

Using Ceva’s Theorem again as an example, the readability of its formal proof with respect to the de Bruijn factor can be calculated^{Footnote 6} (see Table 1).

Table 1 Ceva’s Theorem Proof by GCLC Area Method GATP, de Bruijn factor

Wiedijk also introduced the de Bruijn threshold, i.e., a limit below which “the people will start using them (Automath like system) for serious work”. We will consider the value of 2 as a readability threshold. Further studies are needed in order to establish a readability threshold for automated proofs, using the de Bruijn factor. Moreover, a broader comparison between formal proofs and informal proofs is needed.

Considering the quotient of the size of the compressed formal proof (area method) and the size of the informal proof, the de Bruijn factor of Ceva’s Theorem is 1.09. It would therefore be sensible to consider the GCLC area method proof readable.

2.3 ML and de Bruijn Factor’s Limits

Analysing the previous criteria, we can note a first limit for both the ML criterion and the de Bruijn factor: they assume that readability by experts is being considered, i.e., by a geometer who is expert in the language of the prover that produces the proof.

A second limit emerges when the following [22] classification of formal geometric proofs produced by GATPs is taken into consideration:^{Footnote 7}

1. no readable proof, only a proved/not proved output;
2. non-synthetic proof (i.e., a proof without a corresponding geometric description, e.g. algebraic methods);
3. semi-synthetic proof with a corresponding prover’s language rendering;
4. (semi-)synthetic proof with a corresponding natural language rendering;
5. (semi-)synthetic proof with a corresponding natural language and visual rendering.

Relating the ML criterion with this classification, we can note that such criterion only allows the definition of a threshold for semi-synthetic proofs that employ the area method (level 3). The direct applicability of the ML criterion to other synthetic methods, e.g. full-angle methods or the deductive database method [3, 28], would be possible, considering the number of deduction steps of the proofs and adapting the condition regarding the maximal term in the proofs.

The de Bruijn factor can be used directly in all levels above 1, although it is more meaningful on levels greater than or equal to 2.3. Considering GCLC and its integrated GATPs based on the area method, Wu’s method and the Gröbner Basis method [9], it is possible to calculate the readability of the proofs developed using the different GATPs. It is indeed possible to imagine, extrapolating from the results with the area method, that all those proofs would be readable, and this would hold even though the de Bruijn factor requires informal proofs to be provided.^{Footnote 8}

The two criteria analysed are very different: the first is very specific while the second is very generic, although both require readability by experts. We can therefore ask ourselves whether it is possible to define a new criterion which does not require readability by experts, which is also more natural and expressive than the previous ones, and which can be generalised to various proof methods.

3 Looking for a More Natural Readability Criterion

The new criterion that we want to propose is based on our modernisation of Lemoine’s Geometrography [14, 22, 24]. We will begin by explaining what Geometrography is and what its modernisation consists of.

Geometrography, “alias the art of geometric constructions”, aims at providing a tool: (i) to designate every geometric construction by a symbol that manifests its simplicity and exactitude;^{Footnote 9} (ii) to teach the simplest way to execute an assigned construction; (iii) to discuss a known solution to a problem and possibly replace it with a better solution; (iv) to compare different solutions for a problem, by deciding which is the most exact and the simplest solution from the point of view of Geometrography [14, 15, 16, 17, 20, 22, 24].

3.1 Classical Geometrography

In Lemoine’s Geometrography two coefficients are defined to measure the relative difficulty to perform some geometric constructions. The approach is applied to ruler and compass geometry, i.e., geometric constructions made only with the help of a ruler and a compass. Considering the modifications proposed by Mackay [16], the following Ruler and Compass constructions and the corresponding coefficients can be analysed.

To place the edge of the ruler in coincidence with one point: $R_1$
To place the edge of the ruler in coincidence with two points: $2R_1$
To draw a straight line: $R_2$
To put one point of the compasses on a determinate point: $C_1$
To put the points of the compasses on two determinate points: $2C_1$
To describe a circle: $C_2$

A given construction is then measured by the number of times those elementary steps are used. Such a construction is expressed by the equation:

$$\begin{aligned} l_1R_1+l_2R_2+m_1C_1+m_2C_2 \end{aligned}$$

where $l_i$ and $m_j$ are coefficients denoting the number of times any particular operation is performed. The number $(l_1+l_2+m_1+m_2)$ is called the coefficient of simplicity (cs) of the construction, and it denotes the total number of operations performed. The number $(l_1+m_1)$ is called the coefficient of exactitude (ce) of the construction, and it denotes the number of preparatory operations on which the exactitude of the construction (made with the help of physical, inaccurate, tools) depends [16, 17].
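As a small illustration of these two coefficients, the sketch below stores the four counts and derives cs and ce from them. The class name `Construction` and the two sample constructions are assumptions made for this example; the costs $2R_1 + R_2$ and $2C_1 + C_2$ follow directly from the list of elementary operations above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construction:
    """Counts of elementary ruler-and-compass operations."""
    l1: int  # number of R1 operations (placing the ruler on a point)
    l2: int  # number of R2 operations (drawing a straight line)
    m1: int  # number of C1 operations (placing a compass point)
    m2: int  # number of C2 operations (describing a circle)

    @property
    def simplicity(self) -> int:
        """Coefficient of simplicity: total number of operations."""
        return self.l1 + self.l2 + self.m1 + self.m2

    @property
    def exactitude(self) -> int:
        """Coefficient of exactitude: number of preparatory operations."""
        return self.l1 + self.m1


# Line through two given points: 2R1 + R2.
line = Construction(l1=2, l2=1, m1=0, m2=0)
# Circle with the compass points set on two given points: 2C1 + C2.
circle = Construction(l1=0, l2=0, m1=2, m2=1)

print(line.simplicity, line.exactitude)      # 3 2
print(circle.simplicity, circle.exactitude)  # 3 2
```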

3.2 Geometrography in Dynamic Geometry

Classical Geometrography applies to geometric constructions made with the help of a ruler and a compass. Its modernisation, proposed in [22, 24], uses the tools of dynamic geometry systems (DGS). In [22] it was shown how to modernise Geometrography using GCLC; in [24] the generality of the approach is shown, using GeoGebra [8].

Two kinds of operation are considered: defining a point anywhere in the plane, denoted D, and defining an object using other objects, denoted C. With these, the following values for the GCLC basic constructions are obtained:

point – fix a point in the plane: D
line – uses two points: 2C
circle – uses two points: 2C
intersec – uses two lines: 2C
intersec – uses four points: 4C
intersec2 – uses a circle and a circle or line: 2C
midpoint – uses two points: 2C
med – uses two points: 2C
bis – uses three points: 3C
perp – uses a point and a line: 2C
foot – uses a point and a line: 2C
parallel – uses a point and a line: 2C
onsegment – uses two points: 2C
online – uses two points: 2C
oncircle – uses two points: 2C

In the modernisation (extrapolation) of Geometrography to the “tools” of dynamic geometry systems, the coefficient of exactitude loses its meaning: the constructions are executed by the DGS, so they are exact. The coefficient of simplicity, however, remains useful, since it can be used to classify constructions by levels of simplicity. A new dimension can also be added, the coefficient of freedom (cf), given by the degrees of freedom of the geometric objects, e.g. “a point in a line” has one degree of freedom, a point in the plane has two degrees of freedom, etc. This new coefficient gives a value to the dynamism of the geometric construction. The degrees of freedom are measured against the point definitions: the point definition defines a point with two degrees of freedom, while the onsegment, online and oncircle constructions define points with one degree of freedom. For the GCLC constructions contained in TGTP an average value of simplicity ($\textrm{CS}_\textrm{gcl}$) of 20.8 was obtained. Using the k-means clustering function implemented in the statistics package of Octave,^{Footnote 10} three classes of geometric constructions describing an increasing level of complexity were defined: simple constructions, $1\le \textrm{CS}_\textrm{gcl}\le 18$; average complexity constructions, $18 < \textrm{CS}_\textrm{gcl}\le 28$; complex constructions, $\textrm{CS}_\textrm{gcl}> 28$.

TGTP contains 71 simple constructions, 81 average complexity constructions and 28 complex constructions.

For example, TGTP problem GEO0369, “In triangle $\Delta ABC$, let F be the midpoint of the side BC, and D and E the feet of the altitudes on AB and AC, respectively. FG is perpendicular to DE at G. Show that G is the midpoint of DE”, has a geometric construction with coefficient of simplicity 19 (see Fig. 2), so it is an average complexity construction. The value of 6 for its coefficient of freedom is given by the fact that only the three points A, B, and C are free in the plane, while all the other points are completely determined by the construction.
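The counting scheme for dynamic geometry constructions can be sketched as follows. The cost and freedom tables are transcribed from the text above; the key names `intersec_lines` and `intersec_points` (used to tell the two intersec variants apart), the function names and the sample command list are hypothetical and do not reproduce any actual TGTP file or the gclcGeometrography.bash script.

```python
from collections import Counter

# Number of C operations per GCLC command, as listed in the text.
# The two "intersec" variants get distinct, hypothetical keys here.
C_COST = {
    "line": 2, "circle": 2, "intersec_lines": 2, "intersec_points": 4,
    "intersec2": 2, "midpoint": 2, "med": 2, "bis": 3, "perp": 2,
    "foot": 2, "parallel": 2, "onsegment": 2, "online": 2, "oncircle": 2,
}
# Degrees of freedom contributed by point-defining commands.
FREEDOM = {"point": 2, "onsegment": 1, "online": 1, "oncircle": 1}


def geometrography(commands):
    """Return (D count, C count, coefficient of simplicity, coefficient of freedom)."""
    counts = Counter(commands)
    d = counts["point"]
    c = sum(C_COST[name] * n for name, n in counts.items() if name != "point")
    cf = sum(FREEDOM.get(name, 0) * n for name, n in counts.items())
    return d, c, d + c, cf


def complexity_class(cs):
    """Classify a construction using the k-means thresholds quoted in the text."""
    if cs <= 18:
        return "simple"
    if cs <= 28:
        return "average"
    return "complex"


# Made-up construction: four free points, three lines, three intersections.
commands = ["point"] * 4 + ["line"] * 3 + ["intersec_points"] * 3
d, c, cs, cf = geometrography(commands)
print(f"{d}D + {c}C, CS = {cs}, CF = {cf}, class = {complexity_class(cs)}")
# 4D + 18C, CS = 22, CF = 8, class = average
```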

3.3 Geometrography in Automatic Theorem Proving

The same approach can be (again) extrapolated to take into consideration synthetic geometric proofs, i.e., proofs based on a geometric axiomatic theory, using geometric inference rules.

Considering the proofs produced by the GATP GCLC, implementing the area method [9, 10],^{Footnote 11} the coefficient of simplicity for all the axioms and lemmas of the theory can be calculated.

Apart from the geometric constructions on which the proof is based (with coefficient of simplicity $n\textbf{Cnst}$), there are other steps to be considered.

(Elementary) Algebraic Simplification: $\textbf{AS}$
(Elementary) Geometric Simplification: $\textbf{GS}$
Application of the Area Method Lemma n: $\textbf{AML}_{\textbf{n}}$

A given proof can thus be measured against the number of those steps.^{Footnote 12} Such a proof is expressed by the equation:

$$\begin{aligned} n_1\textbf{Cnst}+ n_2\times \textbf{AS}+ n_3\times \textbf{GS}+ \sum _{j=l_1}^{l_k}\textbf{AML}_{\textbf{j}} \end{aligned}$$

where $n_1$ is the coefficient of simplicity of the geometric construction, $n_2$ is the number of algebraic simplifications and $n_3$ is the number of geometric simplifications.

The coefficient of simplicity for the proof would be:

$$\begin{aligned} {\textrm{CS}_\textrm{proof}} = n_1 + n_2 + n_3 + \sum _{j=l_1}^{l_k}{\textrm{CS}_\textrm{proof}}(\textbf{AML}_{\textbf{j}}) \end{aligned}$$

The coefficient of freedom has no meaning in this setting.

Each lemma of the area method, $\textbf{AML}_{\textbf{j}}$, has a corresponding simplicity coefficient; the term $\sum _{j=l_1}^{l_k}{\textrm{CS}_\textrm{proof}}(\textbf{AML}_{\textbf{j}})$ is the sum of those values over all the lemmas used in the proof. To make this possible, the coefficient of simplicity of each lemma of the area method was calculated [21].

For example, the proof of Lemma 9 will have the following coefficient of simplicity, ${\textrm{CS}_\textrm{proof}}({\textbf{AML}_{\textbf{9}}})=74$.

Lemma 1

($\textbf{AML}_{\textbf{9}}$) Let R be a point on the line PQ. Then for any two points A and B it holds that $\mathcal {S}_{RAB}=\frac{\overline{PR}}{\overline{PQ}}\mathcal {S}_{QAB}+\frac{\overline{RQ}}{\overline{PQ}}\mathcal {S}_{PAB}$.

The following is a shorter version of its proof with the elementary algebraic and geometric simplifications condensed (the expanded version can be seen in [21]).

$$\begin{aligned} \textrm{CS}_\textrm{gcl}= & {} 22 = 4\textbf{D}+ 18\textbf{C}\\ \textrm{CF}_\textrm{gcl}= & {} 8 \end{aligned}$$

$s=\mathcal {S}_{ABPQ}$, initial construction;
$1\times \textbf{GS}$, areas of triangles with the same orientation, $\mathcal {S}_{RAB}=s-\mathcal {S}_{ARQ}-\mathcal {S}_{BPR}$;
$1\times \textbf{AML}_{\textbf{14}}$, lemma 14, $\frac{\overline{PR}}{\overline{PQ}}=r$ $(\textbf{AML}_{\textbf{14}}=8)$;
$1\times \textbf{AML}_{\textbf{5}}$, lemma 5, $\frac{\mathcal {S}_{ARQ}}{\mathcal {S}_{APQ}} = \frac{\overline{RQ}}{\overline{PQ}}$ $(\textbf{AML}_{\textbf{5}}=18)$;
$1\times \textbf{GS}$, segments with the same orientation, $\frac{\overline{RQ}}{\overline{PQ}} = \frac{\overline{PQ}-\overline{PR}}{\overline{PQ}}$;
$2\times \textbf{AS}$, algebraic simplifications, $\frac{\overline{PQ}-\overline{PR}}{\overline{PQ}} = (1-r)$ and $\mathcal {S}_{ARQ}=(1-r)\mathcal {S}_{APQ}$;
$1\times \textbf{AML}_{\textbf{5}}$, lemma 5, $\frac{\mathcal {S}_{BPR}}{\mathcal {S}_{BPQ}} = \frac{\overline{PR}}{\overline{PQ}}$ $(\textbf{AML}_{\textbf{5}}=11)$;
$2\times \textbf{AS}$, algebraic simplifications, $\mathcal {S}_{BPR}=r \mathcal {S}_{BPQ}$ and $\mathcal {S}_{RAB} = s - (1-r)\mathcal {S}_{APQ}-r \mathcal {S}_{BPQ}$;
$2\times \textbf{GS}$, areas of triangles with the same orientation, $\mathcal {S}_{RAB} = s - (1-r)(s-\mathcal {S}_{PAB}) - r(s-\mathcal {S}_{QAB})$;
$7\times \textbf{AS}$, algebraic simplification, $\mathcal {S}_{RAB} = s -s +r s + \mathcal {S}_{PAB} - r \mathcal {S}_{PAB} -r s + r \mathcal {S}_{QAB}$, $\mathcal {S}_{RAB} = r \mathcal {S}_{QAB} +(1-r)\mathcal {S}_{PAB}$ and $\mathcal {S}_{RAB} = \frac{\overline{PR}}{\overline{PQ}}\mathcal {S}_{QAB}+\frac{\overline{RQ}}{\overline{PQ}}\mathcal {S}_{PAB}$.

Geometrography for the demonstration: $4\textbf{D}+ 18\textbf{C}+ 4\textbf{GS}+ 11\textbf{AS}+ 1\textbf{AML}_{\textbf{14}} + 2\textbf{AML}_{\textbf{5}}$

$$\begin{aligned} \textbf{AML}_{\textbf{9}} \left\{ \begin{array}{rcl} \textrm{CS}_\textrm{proof}&{} = &{} 74 = 22 + 4 + 11 + 8 + (18+11)\\ \textrm{CS}_\textrm{gcl}&{} = &{} 22 \end{array} \right. \end{aligned}$$

with $\textbf{AML}_{\textbf{14}}=8$ and $\textbf{AML}_{\textbf{5}}=18$ (first application) and $\textbf{AML}_{\textbf{5}}=11$ (second application).

It is considered that, from the second application of a lemma onward, its proof is accepted, so, only its adaptation to the new configuration is needed, i.e., the pattern matching of the lemma configuration to a new setting. For that reason, in any second, third, etc. application of a lemma, only the $\textrm{CS}_\textrm{gcl}$ coefficient values are considered.
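The arithmetic behind ${\textrm{CS}_\textrm{proof}}({\textbf{AML}_{\textbf{9}}})=74$, including the reduced cost of a repeated lemma application, can be sketched as follows. The tables hold only the values quoted in the text; the names `FIRST_USE`, `REPEAT_USE`, `lemma_costs` and `cs_proof` are hypothetical, and the repeat cost of lemma 14 is left out because the text does not state it.

```python
# Cost of the first application of a lemma (values quoted in the text).
FIRST_USE = {"AML14": 8, "AML5": 18}
# Cost of any later application; only the value for lemma 5 is given.
REPEAT_USE = {"AML5": 11}


def lemma_costs(applications):
    """Cost of each lemma application, in order of appearance in the proof."""
    seen = set()
    costs = []
    for name in applications:
        table = REPEAT_USE if name in seen else FIRST_USE
        costs.append(table[name])
        seen.add(name)
    return costs


def cs_proof(cs_construction, n_as, n_gs, applications):
    """CS_proof = n1 + n2 + n3 + sum of the lemma application costs."""
    return cs_construction + n_as + n_gs + sum(lemma_costs(applications))


# Proof of AML9: construction 22, 11 AS steps, 4 GS steps,
# lemma 14 once and lemma 5 twice.
applications = ["AML14", "AML5", "AML5"]
print(lemma_costs(applications))             # [8, 18, 11]
print(cs_proof(22, 11, 4, applications))     # 74
```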

Given that a mathematical proof is a sequence of steps, in addition to the coefficient of simplicity, it would be useful to have other coefficients: e.g., the total number of steps in the proof; the value of the most difficult step in the proof; the number of different steps of high difficulty in the proof; the number of different types of steps (lemmas) in the proof; a proof script; a numerical description of the proof; and a corresponding line chart or proof trace.

Therefore, to fully characterise a formal synthetic proof produced by a GATP, we can define and consider the following coefficients:

${\textrm{CS}_\textrm{proof}}$, the simplicity coefficient (as above), it gives the simplicity coefficient for the overall proof;
${\textrm{CT}_\textrm{proof}}$, the total number of steps in the proof;
${\textrm{CS}_\textrm{proofmax}}$, the highest simplicity coefficient of the lemmas/definitions applications, it gives the simplicity coefficient for the most difficult step of the proof;
${\textrm{CD}_\textrm{typeproof}}$, the number of different types of lemmas used in the proof;
${\textrm{CD}_\textrm{highproof}}$, the number of different steps of high difficulty in the proof;
The proof script, as defined above;
The corresponding line chart or proof trace in tikz format.^{Footnote 13}

It is important to note that to obtain the coefficient ${\textrm{CD}_\textrm{highproof}}$ (hp) the area method lemmas implemented in the GATP GCLC were analysed, and, using the k-means clustering function implemented in the statistics package of Octave, divided into three categories: low difficulty ($hp < 284$), medium difficulty ($284 \le hp < 1848$) and high difficulty ($hp \ge 1848$).

Using the defined coefficients above, we have the following values for the proof of $\textbf{AML}_{\textbf{9}}$:

$$\begin{aligned} \textbf{AML}_{\textbf{9}} \left\{ \begin{array}{rcl} \textrm{CS}_\textrm{proof}&{} = &{} 74 = 22 + 4 + 11 + 8 + (18+11)\\ \textrm{CS}_\textrm{gcl}&{} = &{} 22\\ \textrm{CT}_\textrm{proof}&{} = &{} 19\\ \textrm{CS}_\textrm{proofmax}&{} = &{} 18 \\ \textrm{CD}_\textrm{typeproof}&{} = &{} 2\\ \textrm{CD}_\textrm{highproof}&{} = &{} 0\\ \end{array} \right. \end{aligned}$$

The GATP GCLC implementation of the area method [9, 10] is able to produce proof scripts. Using the command prooflevel it is possible to have control over the level of detail of the proof script. Two programs^{Footnote 14} were implemented to calculate the Geometrography of the proofs. The Geometrography of the construction is calculated by a bash script, gclcGeometrography.bash, that analyses the GCLC geometric construction (not considering all the rendering commands). The Geometrography of the proof script (minus the geometric construction) is calculated by csproof, a parser that analyses the proof script, counting the algebraic steps and the geometric steps in sequence and also the lemmas and definitions of the area method with the respective coefficient of simplicity.

Running the program csproof on an arbitrary geometric proof produces: a CSV file^{Footnote 15} with the values regarding the Geometrographic Readability Coefficient of Proofs (see Sect. 4); a file with the coefficient of simplicity of the geometric construction; and a file with a line chart, a graphical representation of the proof done by the GATP GCLC.

To better understand some details, let us consider Ceva’s theorem again (see Theorem 1). Using the GATP GCLC with the full level of detail, the proof script of Ceva’s theorem has all the details explained and fills two pages, or almost three pages if the notes about the non-degeneracy conditions and about the proof itself are taken into consideration (see Appendix 1). The line chart is shown in Fig. 3. In it, each sequence of algebraic, or geometric, simplifications is condensed into a single step (for a more condensed view of the graph).

Therefore, the Geometrography of Ceva’s Theorem Proof is the following: $4\textbf{D}+18\textbf{C}+23\textbf{AS}+3\textbf{AML}_{\textbf{1}}+3\textbf{AML}_{\textbf{8}}+3\textbf{AML}_{\textbf{10}}$.

$$\begin{aligned} \mathbf {Ceva's\ Theorem}\left\{ \begin{array}{rcl} \textrm{CS}_\textrm{proof}&{} = &{} 220\\ \textrm{CS}_\textrm{gcl}&{} = &{} 22\\ \textrm{CT}_\textrm{proof}&{} = &{} 32\\ \textrm{CS}_\textrm{proofmax}&{} = &{} 84 \\ \textrm{CD}_\textrm{typeproof}&{} = &{} 3\\ \textrm{CD}_\textrm{highproof}&{} = &{} 0\\ \end{array} \right. \end{aligned}$$

4 A Geometrographic Criterion

It is interesting to note how the Geometrographic coefficients highlight many salient aspects of the proof, aspects that could be used to analyse the readability of such proofs. Furthermore, it is interesting to stress how the proof trace constitutes a sort of electroencephalogram of the machine while proving the theorem. Just as an electroencephalogram can be useful for measuring a brain’s electrical activity, the line chart helps to understand some features of the proof by looking at its trace.

Applying the Geometrography to the area method proofs contained in the repository TGTP, using the GATP GCLC with the full level of detail, and using the geometrographic coefficients we can argue in favour of the following new readability coefficient:

Geometrographic Readability Coefficient of Proofs (GRCP)

$$\begin{aligned} GRCP = ((\textrm{CS}_\textrm{proof}- \textrm{CT}_\textrm{proof}) \times (\textrm{CD}_\textrm{highproof}+ \textrm{CD}_\textrm{typeproof})) \end{aligned}$$

This coefficient relates four quantities: the simplicity coefficient of the proof, the total number of steps in the proof, the number of different steps with high-difficulty in the proof, the number of different lemmas used in the proof.

The first factor, $(\textrm{CS}_\textrm{proof}- \textrm{CT}_\textrm{proof})$, gives an approximation to the overall coefficient of simplicity of the non-trivial steps in the proof. Note that $\textrm{CT}_\textrm{proof}$ counts the number of steps rather than the coefficient of simplicity of each step. By contrast, in $\textrm{CS}_\textrm{proof}$, it is the coefficient of simplicity that counts. Each trivial step has a coefficient of simplicity equal to one, and the coefficients of simplicity for non-trivial steps, such as the construction and the lemmas, are much greater than one. In the light of this, it can be concluded that the difference between $\textrm{CS}_\textrm{proof}$ and $\textrm{CT}_\textrm{proof}$ emphasises the complexity of the proof, disregarding its length.

The second factor, $(\textrm{CD}_\textrm{highproof}+ \textrm{CD}_\textrm{typeproof})$, gives an account of the difficult steps. These are steps that potentially make the proof much harder to follow, since the normal flow of the proof is interrupted to jump to the proof of the lemma and resumes only after the lemma’s proof is completed. Adding the number of high-difficulty steps to the number of different lemmas used in the proof gives a multiplying factor for the overall complexity of the proof. A final note about this second factor: a high-difficulty step is always a lemma application, so it is counted twice; nevertheless, we felt that the high-difficulty nature of the lemma is a sufficient reason for this double counting.

Multiplying these two factors, the approximation of the overall simplicity coefficient and the number of difficult steps (both elements that we believe characterise the readability of a proof), we obtain a readability coefficient of a proof.
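The definition above translates directly into a few lines of code. The following is a minimal sketch, not the authors' implementation: the function name `grcp` and its parameter names are chosen here for illustration and simply mirror the four coefficients in the formula. The example call uses the values reported in this section for GEO0001 (Ceva’s theorem).

```python
def grcp(cs_proof: int, ct_proof: int, cd_highproof: int, cd_typeproof: int) -> int:
    """Geometrographic Readability Coefficient of Proofs (GRCP).

    Hypothetical helper: the name and signature are assumptions made for
    this sketch; only the formula itself comes from the text.
    """
    if min(cs_proof, ct_proof, cd_highproof, cd_typeproof) < 0:
        raise ValueError("all coefficients must be non-negative")
    # First factor: simplicity coefficient minus the number of steps.
    complexity = cs_proof - ct_proof
    # Second factor: high-difficulty steps plus different lemmas used.
    difficulty = cd_highproof + cd_typeproof
    return complexity * difficulty


# Coefficients reported in this section for GEO0001 (Ceva's theorem).
print(grcp(cs_proof=220, ct_proof=32, cd_highproof=0, cd_typeproof=3))  # 564
```

Keeping the two factors as separate local variables makes it easy to inspect which of them drives a large GRCP value for a given proof.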

Therefore, considering 71 theorems and their area method proofs from the TGTP repository, and using, again, the k-means clustering function from Octave, the proofs can be divided into the following classes of Geometrographic readability:^{Footnote 16} readable (high-readability), $GRCP \le 48000$; medium-readability, $48000 < GRCP \le 135000$; low-readability, $GRCP > 135000$.

The GRCP for the proof of GEO0001, Ceva’s theorem, is $\textrm{GRCP}_\textrm{GEO0001} = (220 - 32) \times (0 + 3)= 564 \le 48000$, so it is a readable (high-readability) proof.
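The three readability classes can be expressed as a simple threshold test. The sketch below is illustrative only: the function name `readability_class` and the constant names are assumptions, while the two thresholds are the (rounded) values given above.

```python
# Class boundaries as stated in the text (rounded values, see Footnote 16).
HIGH_MAX = 48_000     # GRCP <= 48000: readable (high-readability)
MEDIUM_MAX = 135_000  # 48000 < GRCP <= 135000: medium-readability


def readability_class(grcp_value: int) -> str:
    """Map a GRCP value to its readability class (hypothetical helper)."""
    if grcp_value <= HIGH_MAX:
        return "high-readability"
    if grcp_value <= MEDIUM_MAX:
        return "medium-readability"
    return "low-readability"


# GRCP of GEO0001 (Ceva's theorem) as computed above.
print(readability_class(564))  # high-readability
```

Because the boundaries come from a clustering of 71 proofs, they would presumably need to be recomputed if the set of proofs, or the prover, changed.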

GRCP Medium-readability Example.

TGTP problem GEO0021:

Theorem 2

(Circumcenter of a Triangle) The circumcenter of a triangle can be found as the intersection of the three perpendicular bisectors.

has the following values for the different coefficients.

$$\begin{aligned} \texttt{GEO0021} \left\{ \begin{array}{rcl} \textrm{CS}_\textrm{proof}&{} = &{} 8554 \\ \textrm{CS}_\textrm{gcl}&{} = &{} 11\\ \textrm{CT}_\textrm{proof}&{} = &{} 591\\ \textrm{CS}_\textrm{proofmax}&{} = &{} 2807 \\ \textrm{CD}_\textrm{typeproof}&{} = &{} 13\\ \textrm{CD}_\textrm{highproof}&{} = &{} 3\\ \end{array} \right. \\ 48000< GRCP = 127408 \le 135000\end{aligned}$$

By the GRCP criterion, this is a medium-readability problem. It has 13 different lemmas, 3 high-difficulty steps, and a long proof with a significant difference between $\textrm{CS}_\textrm{proof}$ and the number of steps of the proof (see Figs. 4 and 6).
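For completeness, the GRCP value of GEO0021 can be recovered step by step from the coefficients listed above, following the definition of the GRCP:

$$\begin{aligned} \textrm{GRCP}_\textrm{GEO0021} &= (\textrm{CS}_\textrm{proof} - \textrm{CT}_\textrm{proof}) \times (\textrm{CD}_\textrm{highproof} + \textrm{CD}_\textrm{typeproof}) \\ &= (8554 - 591) \times (3 + 13) \\ &= 7963 \times 16 = 127408 \end{aligned}$$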

4.1 GRCP low-readability Example.

TGTP problem GEO0020:

Theorem 3

(Distance of a line containing the centroid to the vertices) Given a triangle ABC and a point X, the sum of the distances of the line XG, where G is the centroid of ABC, to the two vertices of the triangle situated on the same side of the line is equal to the distance of the line from the third vertex.

has the following values for the different coefficients:

$$\begin{aligned} \texttt{GEO0020} \left\{ \begin{array}{rcl} \textrm{CS}_\textrm{proof}&{} = &{} 19989 \\ \textrm{CS}_\textrm{gcl}&{} = &{} 26\\ \textrm{CT}_\textrm{proof}&{} = &{} 4119\\ \textrm{CS}_\textrm{proofmax}&{} = &{} 2807 \\ \textrm{CD}_\textrm{typeproof}&{} = &{} 13\\ \textrm{CD}_\textrm{highproof}&{} = &{} 4\\ \end{array} \right. \end{aligned}$$

$GRCP = 269790 > 135000$, so this is a low-readability problem. It has 13 different lemmas, 4 high-difficulty steps, and a long proof with a very high value of overall complexity (see Figs. 5 and 7).
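The same step-by-step computation for GEO0020, using the coefficients listed above, gives:

$$\begin{aligned} \textrm{GRCP}_\textrm{GEO0020} &= (19989 - 4119) \times (4 + 13) \\ &= 15870 \times 17 = 269790 \end{aligned}$$

Set against the values listed for GEO0021, $(8554 - 591) \times (3 + 13) = 7963 \times 16$, the second factor is almost unchanged (17 versus 16), whereas the first factor roughly doubles (15870 versus 7963). For this pair of examples, then, the change of class appears to be driven mainly by the first factor, $(\textrm{CS}_\textrm{proof} - \textrm{CT}_\textrm{proof})$.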

4.2 Comparing the Different Criteria

The Geometrographic Readability Coefficient of Proofs criterion takes into consideration all the significant aspects of a formal proof: its overall difficulty, its number of steps, the number of difficult steps and the number of different lemmas that must be applied. The other criteria consider fewer aspects. The de Bruijn criterion, given its different goal, takes into consideration only the size of the proof, and it needs an informal proof to compare with. The ML criterion considers the number of different lemmas applied and uses the number of terms of the maximal polynomial as an approximation to the complexity of the proof.

As in the ML criterion, the number of lemmas in the proof is considered in the GRCP criterion: in the GRCP criterion as a multiplicative factor, in the ML criterion as one of the conditions for readability. In the ML criterion the number of terms in the maximal polynomial is also considered but, as its authors remarked, this measures the number of computations needed in the proof, not its readability. It is only weakly related to the number of steps in the proof: it approximates the number of steps needed to reduce the long polynomials occurring in the proof to a simple expression.

Regardless of this comparison of criteria, we want to emphasise that the Geometrographic view proposed in this paper has a more general scope. Although the GRCP criterion is a reasonable proposal, the elementary quality of the Geometrographic approach, through the analysis of various coefficients of the proofs, the proof scripts and the proof traces, makes it possible to have a language or a tool that can be used by non-experts to formulate other criteria, weaker or stronger than the one we propose. The contribution of this paper is therefore not only a Geometrographic criterion, but a Geometrographic approach to the problem of measuring the readability of formal proofs in automated deduction in geometry, an approach that offers an environment in which to analyse the proofs in detail by proposing and testing readability criteria. To the best of our knowledge, it is the first time that the community has access to such a general tool to formulate and to study the readability of formal proofs in automated deduction in geometry. It is also interesting to note that our criterion offers a classification of proofs that is in line, when the fundamental points are considered, with the classifications given by the other two criteria, i.e., proofs that are classified as difficult to read according to the new criterion are also classified as difficult to read by the others, and the same applies to proofs that are easy to read (Table 2).

Finally, we have to say that none of the criteria discussed here has been empirically validated through tests submitted to students, experts, etc. Nevertheless, the great advantage of our approach is that it allows criteria to be formulated that can be implemented in repositories such as TGTP and evaluated experimentally in a very simple way.

Table 2 Comparison of the three criteria

5 Conclusions

In this paper we have analysed the problem of measuring the readability of formal proofs in automated deduction in geometry. We have introduced two known criteria and highlighted some of their limitations. We have then introduced a third criterion that seems to overcome the problem of readability by experts, therefore being more natural than the previous ones, and that seems to be easily generalised. One possible generalisation is the formulation of weaker, or stronger, criteria using the proposed language. Another possible avenue is the generalisation of our approach to other GATPs (e.g. the GATPs integrated in JGEx, namely the area method, the full-angle method and the deductive database method [3, 28, 29], or ArgoCLP, a coherent logic prover [25]) and to any other ATP that has a proof script based on axioms, lemma applications and, possibly, elementary steps (algebraic, geometric, etc.). It is a matter of calculating the coefficients of simplicity for the axioms and lemmas of the base theory under consideration.

As we pointed out, the great advantage of our approach is that it allows criteria to be formulated that can be implemented in repositories such as TGTP and evaluated experimentally. For this reason, an important piece of future work that we are planning is an experiment, to be submitted to mathematicians, computer scientists, educationalists and students, providing an adequate empirical test for our Geometrographic criterion.

Change history

13 March 2023
Missing Open Access funding information has been added in the Funding Note.

Notes

1. We will not consider here the problem of the readability of geometric proofs from the Mathematics Education point of view. We will address some issues related to that context in the conclusions.
2. http://hilbert.mat.uc.pt/TGTP/index.php.
3. In the original sentence, “Automath like Systems”.
4. They considered 478 machine solved geometry problems.
5. https://coq.inria.fr/, https://isabelle.in.tum.de/, http://mizar.org/.
6. We used the proof found in https://artofproblemsolving.com/wiki/index.php/Ceva’s_theorem as source for the informal proof.
7. GATPs can be of two major types: algebraic, where the proof, if it exists, is done by resorting to algebraic reasoning (e.g. Gröbner bases); and geometric (synthetic), where the proof, if it exists, is done by resorting to a set of axioms and inference rules of geometry, without the use of coordinates. Semi-synthetic methods, e.g. the area method, also use the axioms of a field of characteristic different from 2.
8. Wu’s method and the Gröbner basis method are both algebraic methods; from the geometric point of view their proofs are unreadable (level 2).
9. Exactitude, or the lack of it, in this context refers to the possible inaccuracy introduced by physical tools such as ruler and compass.
10. GNU Octave, version 6.1.1, package octave-statistics, function kmeans https://octave.sourceforge.io/statistics/function/kmeans.html.
11. The proofs developed by GATPs based on the Area Method are formal proofs. The method itself was formalised, and proved sound, using the Coq proof assistant. The GATP developed by J. Narboux, as a Coq tactic, can have its proofs verified by Coq. The GCLC area method does not explicitly have that possibility, but it would be a matter of developing a filter from the GCLC language to the Coq language (see Appendix 1).
12. By elementary algebraic simplification we mean the basic algebraic operations: addition, subtraction, multiplication, division, and their properties of commutativity, associativity and distributivity. By elementary geometric simplification we mean the direct application of the definition of the area method quantities. We call them trivial steps.
13. https://ftp.eq.uc.pt/software/TeX/graphics/pgf/base/doc/pgfmanual.pdf.
14. The open source codes are available in the GitHub project https://github.com/GeoTiles/Geometrography/tree/master/GeometrographyProofs.
15. Comma Separated Values format.
16. The actual values were rounded for better readability.
17. Narboux, J.: Formalization of the area method. Coq user contribution (2009). http://dpt-info.u-strasbg.fr/~narboux/area_method.html.

References

1. Chou, S.C., Gao, X.S., Zhang, J.Z.: Machine Proofs in Geometry. World Scientific, Singapore (1994)
2. Chou, S.C., Gao, X.S., Zhang, J.Z.: Automated generation of readable proofs with geometric invariants, I. Multiple and shortest proof generation. J. Automat. Reason. 17, 325–347 (1996). https://doi.org/10.1007/BF00283134
3. Chou, S.C., Gao, X.S., Zhang, J.Z.: Automated generation of readable proofs with geometric invariants, II. Theorem proving with full-angles. J. Automat. Reason. 17(3), 349–370 (1996). https://doi.org/10.1007/BF00283134
4. de Bruijn, N.: Selected Papers on Automath, Studies in Logic and the Foundations of Mathematics, vol. 133, chap. A survey of the project Automath, pp. 141–161. Elsevier, Amsterdam (1994). https://doi.org/10.1016/S0049-237X(08)70203-9
5. DuBay, W.H. (ed.): The Classic Readability Studies. Impact Information (2006)
6. Gao, H., Li, J., Cheng, J.: Measuring interestingness of theorems in automated theorem finding by forward reasoning based on strong relevant logic. In: 2019 IEEE International Conference on Energy Internet (ICEI), pp. 356–361 (2019). https://doi.org/10.1109/ICEI.2019.00069
7. Hanna, G., Reid, D., de Villiers, M. (eds.): Proof Technology in Mathematics Research and Teaching. Springer, Berlin (2019)
8. Hohenwarter, M.: Geogebra - a software system for dynamic geometry and algebra in the plane. Master’s thesis, University of Salzburg, Austria (2002)
9. Janičić, P.: GCLC - A tool for constructive Euclidean geometry and more than that. In: A. Iglesias, N. Takayama (eds.) Mathematical Software - ICMS 2006, Lecture Notes in Computer Science, vol. 4151, pp. 58–73. Springer (2006). https://doi.org/10.1007/11832225_6
10. Janičić, P., Narboux, J., Quaresma, P.: The area method: a recapitulation. J. Automat. Reason. 48(4), 489–532 (2012). https://doi.org/10.1007/s10817-010-9209-7
11. Janičić, P., Quaresma, P.: System description: GCLCprover + GeoThms. In: U. Furbach, N. Shankar (eds.) Automated Reasoning, Lecture Notes in Computer Science, vol. 4130, pp. 145–150. Springer (2006). https://doi.org/10.1007/11814771_13
12. Jiang, J., Zhang, J.: A review and prospect of readable machine proofs for geometry theorems. J. Syst. Sci. Complexity 25(4), 802–820 (2012). https://doi.org/10.1007/s11424-012-2048-3
13. Johnson, D.A.: The readability of mathematics books. Math. Teach. 50(2), 105–110 (1957). https://doi.org/10.5951/MT.50.2.0105
14. Lemoine, É.: Géométrographie ou Art des constructions géométriques. No. 18 in Scientia, Série Physico-Mathématique. C. Naud, Éditeur, Paris (1902). http://catalogue.bnf.fr/ark:/12148/cb36049032t
15. Loria, G.: La geometrografia e le sue trasfigurazioni. Period. Mat. 3(6), 114–122 (1908)
16. Mackay, J.S.: The geometrography of Euclid’s problems. Proc. Edinb. Math. Soc. 12, 2–16 (1893). https://doi.org/10.1017/S0013091500001565
17. Merikoski, J.K., Tossavainen, T.: Two approaches to geometrography. J. Geom. Graph. 13(1), 15–28 (2010)
18. Narboux, J.: A decision procedure for geometry in Coq. Lect. Notes Comput. Sci. 3223, 225–240 (2004). https://doi.org/10.1007/b100400
19. Pak, K., Schubert, A.: The impact of proof steps sequence on proof readability - experimental setting. In: Workshop and Work in Progress Papers at CICM 2016 (2016)
20. Pinheiro, V.A.: Geometrografia 1. Gráfica Editora Bahiense (1974)
21. Quaresma, P., Graziani, P.: The geometrography’s simplicity coefficient for the axioms and lemma of the area method. Technical Report TR 2021-001, Center for Informatics and Systems of the University of Coimbra (2021)
22. Quaresma, P., Santos, V., Graziani, P., Baeta, N.: Taxonomy of geometric problems. J. Symbol. Comput. 97, 31–55 (2020). https://doi.org/10.1016/j.jsc.2018.12.004
23. Richard, P., Vélez, M., Vaerenbergh, S.V. (eds.): Mathematics Education in the Age of Artificial Intelligence, Mathematics Education in the Digital Era, vol. 17. Springer (2022). https://doi.org/10.1007/978-3-030-86909-0
24. Santos, V., Baeta, N., Quaresma, P.: Geometrography in dynamic geometry. Int. J. Technol. Math. Educ. 26(2), 89–96 (2019). https://doi.org/10.1564/tme_v26.2.06
25. Stojanović, S., Pavlović, V., Janičić, P.: A coherent logic based geometry theorem prover capable of producing formal and readable proofs. In: P. Schreck, J. Narboux, J. Richter-Gebert (eds.) Automated Deduction in Geometry, Lecture Notes in Computer Science, vol. 6877, pp. 201–220. Springer, Berlin (2011). https://doi.org/10.1007/978-3-642-25070-5_12
26. Wang, K., Su, Z.: Automated geometry theorem proving for human-readable proofs. In: Proceedings of the 24th International Conference on Artificial Intelligence, IJCAI’15, pp. 1193–1199. AAAI (2015). http://dl.acm.org/citation.cfm?id=2832249.2832414
27. Wiedijk, F.: The de Bruijn factor. Poster at International Conference on Theorem Proving in Higher Order Logics (TPHOL2000) (2000). Portland, Oregon, USA, 14–18 August 2000
28. Ye, Z., Chou, S.C., Gao, X.S.: Visually dynamic presentation of proofs in plane geometry: Part 2. Automated generation of visually dynamic presentations with the full-angle method and the deductive database method. J. Automat. Reason. 45(3), 243–266 (2010). https://doi.org/10.1007/s10817-009-9163-4
29. Ye, Z., Chou, S.C., Gao, X.S.: An introduction to Java geometry expert. In: T. Sturm, C. Zengler (eds.) Automated Deduction in Geometry, Lecture Notes in Computer Science, vol. 6301, pp. 189–195. Springer, Berlin (2011). https://doi.org/10.1007/978-3-642-21046-4_10

Acknowledgements

The authors wish to thank Salvatore Florio, Victor Pambuccian, and Mirko Tagliaferri for their comments on an earlier draft of this work.

Funding

Open access funding provided by FCT|FCCN (b-on).

Author information

Authors and Affiliations

Departamento de Matemática da FCTUC, Largo D. Dinis, 3000−143 Coimbra, Portugal
Pedro Quaresma
Department of Pure and Applied Sciences, University of Urbino, Via Sant’Andrea, 34, Urbino, PU, 61029, Italy
Pierluigi Graziani

Corresponding author

Correspondence to Pedro Quaresma.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The first author was partially supported by FCT - Foundation for Science and Technology, I.P., within the scope of the project CISUC - UID/CEC/00326/2020 and by European Social Fund, through the Regional Operational Program Centro 2020. The second author was partially supported by the Italian Ministry of Education, University and Research through the PRIN 2017 project “The Manifest Image and the Scientific Image” prot. 2017ZNWW7F_004.

Appendices

Appendix

A Ceva’s Theorem, GCLC Area Method Proof

The area method for Euclidean constructive geometry was proposed by Chou, Gao and Zhang in the early 1990s [2]. The method can efficiently prove many non-trivial geometry theorems and is one of the most interesting and most successful methods for automated theorem proving in geometry. In [10] a variant of the original axiom system was presented; based on that axiomatisation, all the lemmas needed by the method were formally proved and the soundness of the method was established, using the Coq proof assistant [18].^{Footnote 17}

The GCLC implementation of the area method is able to produce formal proofs. If the highest level of detail is chosen, prooflevel 7, it would be possible, once an appropriate filter has been built, to formally verify those proofs using a proof assistant, e.g. Coq. The LaTeX proof scripts that GCLC produces (by default, at prooflevel 2) are a natural language rendering, to be read by mathematicians.

The area method axiomatic system for Euclidean plane geometry (within first order logic with equality) has just one primitive type of geometrical objects: points. Variables can also range over a field $(F, +,\cdot , 0, 1)$, where F is any field of characteristic different from 2. The axioms of the theory of fields used in GCLC area method proofs are standard.

The proof of Ceva’s theorem presented below is a LaTeX proof script produced by GCLC, at prooflevel 7, edited to include the GRCP values.

GCLC Prover Output for conjecture “cevaGEO0001”, Area method used

$$\begin{aligned}
& \text{Q.E.D}\\
& \text{NDG conditions are:}\\
& S_{BPA}\ne S_{CPA} \text{ i.e., lines } BC \text{ and } PA \text{ are not parallel (construction based assumption)}\\
& S_{APB}\ne S_{CPB} \text{ i.e., lines } AC \text{ and } PB \text{ are not parallel (construction based assumption)}\\
& S_{APC}\ne S_{BPC} \text{ i.e., lines } AB \text{ and } PC \text{ are not parallel (construction based assumption)}\\
& P_{FBF}\ne 0 \text{ i.e., points } F \text{ and } B \text{ are not identical (conjecture based assumption)}\\
& P_{DCD}\ne 0 \text{ i.e., points } D \text{ and } C \text{ are not identical (conjecture based assumption)}\\
& P_{EAE}\ne 0 \text{ i.e., points } E \text{ and } A \text{ are not identical (conjecture based assumption)}\\
& \text{Number of elimination proof steps: 3}\\
& \text{Number of geometric proof steps: 6}\\
& \text{Number of algebraic proof steps: 23}\\
& \text{Total number of proof steps: 32}\\
& \text{Time spent by the prover: 0.001 seconds}
\end{aligned}$$
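The step counts at the end of this output are where the value $\textrm{CT}_\textrm{proof} = 32$ used in the GRCP of GEO0001 can be read off. As an illustration, the sketch below extracts those counts from the summary lines and checks that the three partial counts add up to the reported total. It assumes the summary is available as plain text lines worded exactly as shown above; the function name `step_counts` and the regular expression are hypothetical and not part of GCLC.

```python
import re

# Summary lines copied from the end of the GCLC output above.
SUMMARY = """\
Number of elimination proof steps: 3
Number of geometric proof steps: 6
Number of algebraic proof steps: 23
Total number of proof steps: 32
"""

# Matches either "Number of <kind> proof steps: N" or the total line.
PATTERN = re.compile(r"^(?:Number of (\w+)|Total number of) proof steps: (\d+)$")


def step_counts(text: str) -> dict[str, int]:
    """Return the step counts per kind, with the total under the key 'total'."""
    counts: dict[str, int] = {}
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            kind = match.group(1) or "total"
            counts[kind] = int(match.group(2))
    return counts


counts = step_counts(SUMMARY)
partial_sum = sum(value for kind, value in counts.items() if kind != "total")
print(counts["total"], partial_sum == counts["total"])  # 32 True
```

Here 3 + 6 + 23 = 32, which agrees with the total reported by the prover.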

Enlarged Proof Traces

TGTP: GEO0021 See Fig. 6.

TGTP: GEO0020 See Fig. 7.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Quaresma, P., Graziani, P. Measuring the Readability of Geometric Proofs: The Area Method Case. J Autom Reasoning 67, 5 (2023). https://doi.org/10.1007/s10817-022-09652-0

Received: 20 January 2021
Accepted: 26 September 2022
Published: 10 January 2023
DOI: https://doi.org/10.1007/s10817-022-09652-0
