Tense in Mathematical English

Inglis, Matthew; Strauss, Jacob

doi:10.1007/s10516-024-09707-4

Research Note
Open access
Published: 08 May 2024

Volume 34, article number 3, (2024)

Global Philosophy

Abstract

Many authors have commented on the relative frequency of the present tense—and the relative infrequency of the past tense—in mathematical writing. However, none (to our knowledge) have provided an estimate for the size of this effect or explored how universal it is. In this short note we report an analysis of corpora of mathematical and day-to-day English. We conclude that the present-to-past ratio of tenses is at least 3:1 in mathematical English, compared to approximately 5:7 in day-to-day English. Further, we show that this tendency to favour the present tense is almost universally present in written mathematics.

Several authors have observed that mathematics is typically written using the present tense (Ganesalingam 2013; Pimm 2005). For instance, Solomon and O’Neill (1998) studied an extract from the personal notebook of mathematician William Rowan Hamilton. They noticed that Hamilton’s personal remarks tended to be written in the past tense, whereas the mathematical content was relayed “in the timeless present” (p. 216). Ganesalingam (2013) attributed this to mathematicians’ beliefs about the philosophy of mathematics, and suggested that it is almost universal:

Working mathematicians treat mathematical objects as if they were Platonic ideals, timeless objects existing independently of the physical world. The interactions and properties of these objects are also seen as frozen, timeless quantities. Mathematical assertions are therefore almost always written as generic (gnomic) sentences in the present simple tense. (p. 21).

Some mathematics educators have noted that the tendency to discuss mathematics in the present tense could have implications for how students see mathematics. In particular, Pimm (2005) argued this style may cause students to view mathematics as being a-temporal rather than historical and contingent. Obscuring the role of human agency in mathematics in this manner may, it is suggested, make it “harder for students to construe a present or future role for themselves” within the discipline (Morgan 2016, p. 138).

While the relative infrequency of the past tense in written mathematics has often been commented upon (Burton and Morgan 2000; Ganesalingam 2013; Morgan 2016; Solomon and O’Neill 1998), to our knowledge no one has, to date, produced an estimate of the size of this effect or investigated if it is genuinely universal across mathematics. Our goal in this short note is to provide relevant evidence on these points.

To this end, we used Python (version 3.8.12) and the Natural Language Toolkit (NLTK, version 3.6.7) to perform part-of-speech (POS) tagging (Bird et al. 2009). POS tagging is a process by which the words in corpora are categorised as being particular parts of speech, depending on the definition of the word and its context. For instance, consider the following two sentences, both taken from the Lancaster-Oslo/Bergen (LOB) corpus (Johansson et al. 1978): (i) “Most stories of Miss Nightingale begin and end with her work in the Crimea”; (ii) “if you are 65 (60 for a woman) you work for an employer and are paid more than £9 in any week you will (unless you are contracted out) pay graduated contributions”. In the first sentence “work” can be tagged as a noun, whereas the surrounding context in the second sentence allows the same word to be tagged as a verb.

We used Punkt, a pre-trained model, to tokenize (separate into sentences) our corpora. We then used the default NLTK tagger (Averaged Perceptron Tagger) to tag the tokenized texts according to the Penn Treebank tag set (Taylor et al. 2003). The code used for this analysis can be found at https://github.com/jda5/tense-analysis. Importantly, the Averaged Perceptron Tagger does not achieve perfect accuracy in its tagging: Banga and Mehndiratta (2017) assessed its accuracy to be 88.7%.
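The tokenising and tagging steps described above can be sketched roughly as follows. This is a minimal illustration and not the code from the linked repository: the function names `tag_text` and `count_tags` are hypothetical, and the sketch assumes that NLTK is installed together with its `punkt` and `averaged_perceptron_tagger` resources (the resource names used by NLTK 3.6.x; later releases may name them differently).

```python
from collections import Counter


def tag_text(text):
    """Split text into sentences, then words, and POS-tag each sentence.

    Assumption: NLTK and its 'punkt' and 'averaged_perceptron_tagger'
    resources are available (for example via nltk.download).
    """
    import nltk  # imported lazily so that count_tags works without NLTK

    tagged = []
    for sentence in nltk.sent_tokenize(text):   # Punkt sentence splitter
        tokens = nltk.word_tokenize(sentence)   # split the sentence into words
        tagged.extend(nltk.pos_tag(tokens))     # Penn Treebank tags
    return tagged


def count_tags(tagged):
    """Count how often each POS tag occurs in a list of (word, tag) pairs."""
    return Counter(tag for _word, tag in tagged)
```

Under these assumptions, `count_tags(tag_text(document))` would give the raw tag counts for one document. Because the tagger is imperfect, the counts for any single text should be read as estimates.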

The two corpora we used were:

A corpus created from academic mathematics papers uploaded to the ArXiv. The ArXiv is a preprint server that is widely used by mathematicians to self-archive their academic output. The corpus was created by downloading raw TeX sources and stripping out TeX commands and inline mathematics (so that formulae, diagrams and so on were removed). We included all 5087 papers from the first four months of 2009 that could be successfully processed. This left a corpus of approximately 27 million words (for a full description of how this corpus was created, see Mejía-Ramos et al. 2019).
The Brown and LOB corpora (Kucera and Francis 1967; Johansson et al. 1978) of general English. The Brown corpus contains 500 text samples, each of 2000 words, of representative American English. The LOB corpus is a British English version of the Brown corpus, created using identical sampling principles. We combined the two corpora to create a single 2 million word corpus.

The frequencies of different verb forms (base, past tense, gerund or present participle, past participle, non-third person singular present, third person singular present, modal verb) were calculated, and then normalised, yielding a frequency per 1000 words for each form. These are shown in Table 1.
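The normalisation step amounts to a simple rescaling of raw tag counts. The sketch below illustrates it using the standard Penn Treebank tags for the seven verb forms listed above; the function name `per_thousand` and the counts in the usage note are invented for illustration and are not taken from either corpus.

```python
from collections import Counter

# Penn Treebank tags: base, past tense, gerund/present participle,
# past participle, non-3rd person singular present,
# 3rd person singular present, modal.
VERB_TAGS = ("VB", "VBD", "VBG", "VBN", "VBP", "VBZ", "MD")


def per_thousand(tag_counts, total_words):
    """Return the frequency per 1000 words of each verb tag."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return {
        tag: 1000 * tag_counts.get(tag, 0) / total_words
        for tag in VERB_TAGS
    }
```

For example, with the made-up counts `Counter({"VBZ": 30, "VBD": 5, "NN": 200})` in a 2000-word text, `per_thousand` gives 15.0 for `VBZ` and 2.5 for `VBD`.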

Table 1 Verb form frequencies per 1000 words in the ArXiv and Brown/LOB corpora

Next we calculated the total number of past-tense and present-tense verbs (with these terms operationalised as shown in Table 1). To be conservative we omitted verb base forms from this calculation as, although these forms are usually in the present tense (“I like philosophy”), they can be part of phrases in other tenses (“I used to like philosophy”). We found that the ratio of present to past tense differed significantly between the corpora, \(\chi^2(1) = 146820.25\), \(p < .001\). The present:past ratio was approximately 3:1 in the ArXiv corpus and 5:7 in the Brown/LOB corpus. An even more conservative analysis would be to ignore participles and gerunds, and simply compare the frequency of past tense verbs between the corpora. As shown in Table 1, the Brown/LOB corpus had nearly ten times as many past tense verbs (VBD tags) per 1000 words as the ArXiv corpus.
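A comparison of this kind is a chi-square test of independence on a 2×2 table (corpus by tense). The following sketch shows the calculation using only the Python standard library. It is an assumption that the test was computed without a continuity correction, the function name `chi_square_2x2` is hypothetical, and the counts in the usage note are invented rather than taken from the corpora.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity
    correction) for the 2x2 table [[a, b], [c, d]].

    Rows could be corpora and columns present/past verb counts.
    Returns (chi2, p_value).
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # For 1 degree of freedom the upper tail probability of the
    # chi-square distribution is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

For the invented table `chi_square_2x2(30, 10, 20, 40)`, the statistic is 100/6 ≈ 16.67 and the p-value is below .001. With corpora of millions of words, even small differences in proportion yield very large statistics, so the ratios themselves are the more informative summary.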

Next we explored the level of agreement/disagreement between mathematical authors concerning the use of tense. For each of the 5087 papers in our corpus we calculated the frequency with which verbs were expressed in the present and past tense. For 5004 articles (98.4%), the frequency of present tense verbs was higher. This is perhaps surprising given the large differences in style that can be seen across different mathematical fields. In just 26 articles (0.5%) was the frequency of past tense verbs higher. An inspection of these 26 articles revealed that 2 focused on the history of mathematics, 3 were written in a language other than English (but with English abstracts), and 15 had been replaced by a withdrawal notice (which explained that an error had been found in the original manuscript). The remaining 57 papers were made up of 53 empty articles (i.e. the papers had been withdrawn by the authors without being replaced by a withdrawal notice) and 4 articles where the withdrawal notice contained the same number of past- and present-tense verb forms.
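The per-paper comparison can be illustrated with a short sketch. The grouping of tags into present and past below is an assumption made for illustration (present: VBG, VBP, VBZ; past: VBD, VBN; base forms and modals excluded), and may differ from the operationalisation in Table 1; the function names `classify_paper` and `summarise` are hypothetical.

```python
from collections import Counter

# Assumed grouping of Penn Treebank tags; not confirmed by the source.
PRESENT_TAGS = {"VBG", "VBP", "VBZ"}
PAST_TAGS = {"VBD", "VBN"}


def classify_paper(tag_counts):
    """Label one paper by whichever tense group is more frequent."""
    present = sum(tag_counts.get(tag, 0) for tag in PRESENT_TAGS)
    past = sum(tag_counts.get(tag, 0) for tag in PAST_TAGS)
    if present > past:
        return "present"
    if past > present:
        return "past"
    return "tied"  # includes empty papers, where both counts are zero


def summarise(papers):
    """Count how many papers fall into each category."""
    return Counter(classify_paper(counts) for counts in papers)
```

Note that an empty article has zero verbs of either kind and so lands in the tied category, which matches how the empty and equal-count articles are grouped together above.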

In sum, in line with Ganesalingam’s (2013) assertion, we found that mathematicians share an almost universal preference for writing mathematics using the present tense.

References

Banga R, Mehndiratta P (2017) Tagging efficiency analysis on part of speech taggers. In: 2017 international conference on information technology (ICIT), IEEE, pp 264–267
Bird S, Klein E, Loper E (2009) Natural language processing with Python: analyzing text with the natural language toolkit. O’Reilly Media Inc, Sebastopol, CA
Burton L, Morgan C (2000) Mathematicians writing. J Res Math Educ 31:429–453
Ganesalingam M (2013) The language of mathematics. Springer, Heidelberg, Germany
Johansson S, Leech G, Goodluck H (1978) Manual of information to accompany the Lancaster-Oslo/Bergen Corpus of British English, for use with digital computers. Department of English, University of Oslo, Oslo, Norway
Kucera H, Francis W (1967) Computational analysis of present-day American English. Brown University Press, Providence, RI
Mejía-Ramos JP, Alcock L, Lew K, Rago P, Sangwin C, Inglis M (2019) Using corpus linguistics to investigate mathematical explanation. In: Fischer E (ed) Methodological advances in experimental philosophy. Bloomsbury, London, UK, pp 239–264
Morgan C (2016) Studying the role of human agency in school mathematics. Res Math Educ 18(2):120–141
Pimm D (2005) Drawing on the image in mathematics and art. In: Sinclair N, Pimm D, Higginson W (eds) Mathematics and the aesthetic: new approaches to an ancient affinity. Springer, New York, pp 160–189
Solomon Y, O’Neill J (1998) Mathematics and narrative. Lang Educ 12:210–221
Taylor A, Marcus M, Santorini B (2003) The Penn treebank: an overview. In: Abeillé A (ed) Treebanks. Springer, Dordrecht, pp 5–22

Author information

Authors and Affiliations

Centre for Mathematical Cognition, Loughborough University, Loughborough, UK
Matthew Inglis & Jacob Strauss

Corresponding author

Correspondence to Matthew Inglis.

Ethics declarations

Conflict of interest

The authors declare that there are no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Inglis, M., Strauss, J. Tense in Mathematical English. Global Philosophy 34, 3 (2024). https://doi.org/10.1007/s10516-024-09707-4

Received: 30 July 2021
Accepted: 17 April 2024
Published: 08 May 2024
DOI: https://doi.org/10.1007/s10516-024-09707-4
