1 Introduction

The concept of proof and the respective activity of proving have received growing attention from mathematics education researchers in the past few decades, and this attention is reflected in the upsurge of publications on didactical, epistemological, cognitive, and other aspects of proof, proving, and related concepts such as argumentation and reasoning-and-proving. The high level of research attention to proof and proving, which has spanned all levels of schooling (including the elementary) and relevant areas of research at the university level (including teacher education), has been justified from multiple perspectives. For example, from a philosophical/curricular perspective, the pivotal role of proof in disciplinary experts’ work, to the extent that proof has been called ‘the soul of mathematics’ (Schoenfeld, 2009), necessitates that mathematics curricula that have integrity should offer students authentic encounters with proof from the beginning of their mathematical education (Bruner, 1960; A. Stylianides, 2016). Relatedly, from a pedagogical and epistemological perspective, mathematically and cognitively appropriate learning opportunities for students to engage with proof can afford students a basis for deep learning: assertions and new knowledge get accepted because they make sense mathematically rather than on the authority of the teacher or the textbook (Ball & Bass, 2003).

Unlike prior reviews of mathematics education research in this area that were conducted by us and others over the past couple of decades (Harel & Sowder, 2007; Mariotti, 2006; Mariotti et al., 2018; Sommerhoff & Ufer, 2019; Stylianides et al., 2016, 2017, 2022), this one is not guided by an explicit proof-related theoretical framework or by pre-identified proof-related themes emerging from prior research. Rather, as we explain in the following section, we use as an organizing framework for the review Cohen et al.’s (2003) triadic conceptualization of instruction. This choice reflects our attempt to address a gap in the existing reviews of mathematics education research on proof and proving, namely, the lack of a bird’s eye view of the position of this body of research relative to the various elements of the instructional triangle (student–teacher–content), which has attracted considerable attention from educational researchers internationally and across many mathematics education topics (Goodchild & Sriraman, 2012). By utilizing this framework to organize our review, we explicitly investigate whether and to what extent the didactical/instructional relationship is addressed in the recent body of research that focused on proof and proving.

To conclude, with this review, we attempt to offer an instruction-focused synopsis of the current state-of-the-art of mathematics education research on proof and proving at both the school and university levels and to identify current and emerging trends in the field by addressing the following research question: How are recent mathematics education studies on proof and proving mapped across Cohen et al.’s (2003) triadic conceptualization of instruction, and what themes emerge from this mapping?

2 Elaboration on the scope of the review and review procedure

In three recent reviews of research on proof and proving published as handbook chapters, we documented the state-of-the-art in this area from complementary angles. In Stylianides et al. (2016), we followed a systematic approach to identify themes amongst relevant papers focusing on proof and argumentation and that were published in the proceedings of the International Group for the Psychology of Mathematics Education (PME), covering the period 2005–2015. Before us, Mariotti (2006) had reviewed themes emerging from proof-related papers published in the PME proceedings covering an earlier period, while Mariotti et al. (2018) did the same but for proof-related papers published in the proceedings of the Congress of European Research in Mathematics Education (CERME) over a twenty-year period. Sommerhoff and Ufer’s (2019) review also focused on PME proceedings, this time covering the period 2010–2014, with particular attention to argumentation and proof studies in secondary and tertiary education. Sommerhoff and Ufer used pre-determined categories, derived from prior research, to identify the extent to which different prerequisites and goals of argumentation and proving processes were investigated within PME reports in their sample.

In Stylianides et al. (2017), we conducted a narrative review of proof-related research, published mostly in refereed journals prior to 2016, from the perspectives of proving as problem solving, proving as convincing, and proving as a socially embedded activity. Before us, Harel and Sowder (2007) had conducted another narrative review covering a similarly broad body of published research related to proof but using their notion of ‘proof schemes’ as a theoretical framework.

Finally, in Stylianides et al. (2022), we conducted yet another narrative review, which was not restricted to a specific time period, in order to define and exemplify authentic classroom mathematical activity in the area of proving. The conclusions and directions for future research of the current and our previous reviews were determined by each review’s particular focus or theoretical framework and period of interest. For complementarity, and to address the gap that we identified earlier in the existing reviews of mathematics education research on proof and proving, we defined the scope of the current review differently.

Specifically, we offer a synopsis of the state-of-the-art in the area of proof and proving through a review of relevant papers in English that were published (including ‘online first’) between 1/1/2018 and 1/6/2022 in the mathematics education research journals in Table 1, using as an organizing framework Cohen et al.’s (2003) notion of instructional triangle. We restricted our review to the journals that were graded as A*, A, or B by the expert panel convened by the Education Committee of the European Mathematical Society (EMS) and the Executive Committee of the European Society for Research in Mathematics Education (ERME) (Toerner & Arzarello, 2012). In making the decision to restrict our review to those journals, we were guided by a need for a systematic and practical procedure rather than elitism. We acknowledge that important research on proof and proving is published also in general education research journals, in conference proceedings, in books and book chapters, and in other languages that we did not consider in the review. Indeed, some of our own (including non-English) research has been published in outlets that we reluctantly excluded from this particular review but, as we explained previously, we addressed in other reviews.

Table 1 The journals included in the review and respective grade based on the grading reported in Toerner and Arzarello (2012, p. 53)

While we explained in detail elsewhere our own perspective on the meanings of proof and proving (e.g., Stylianides, 2007; Stylianides et al., 2017), for the purposes of this review we followed an inclusive approach and considered all papers that we obtained from searching the journals in Table 1 and that included proof, proving, or argumentation in at least one of their title/abstract/keywords. To organize the review, we used Cohen et al.’s (2003) triadic conceptualization of instruction according to which instruction “consists of interactions among teachers and students around content, in environments” (p. 122). This conceptualization draws attention not only to the main actors of the teaching and learning process—the teacher (T) and the students (S)—and the content (C) around which the actors’ work is organized (in this case, content related to proof and proving in mathematics), but also to the relationships among the actors and the content.

There is no suggestion that the merits of a paper depend on whether it addresses a single vertex of the instructional triangle (T, S, C) or the relationships between two or all three vertices (SC, ST, TC, STC). However, we consider it important for us as a field to get a sense of the relative attention that mathematics education researchers have paid to the various elements of instruction in the area of proof and proving, and of whether there are any differences between levels of education.

Furthermore, there is no suggestion that the use of the (particular version of the) instructional triangle framework is unproblematic. We share Goodchild and Sriraman’s (2012) view that, “[i]n many ways the ‘simple’ representation of didactical systems depicted in the didactic [instructional] triangle is argued to be inadequate” (p. 584), and indeed these inadequacies have motivated several extensions of and modifications to the framework to accommodate various research aims or theoretical perspectives (some of these extensions or modifications are discussed in the papers of the special issue edited by Goodchild and Sriraman, 2012). Yet, as Goodchild and Sriraman observed further, “all papers [in their special issue] confirm the central position of mathematics, learner and teacher in researching and theorizing teaching–learning processes in mathematics classrooms” (p. 584). On this basis, we judged that the ‘simple’ (in Goodchild & Sriraman’s, 2012 terms) version of the instructional triangle was the most appropriate for a review of a broad body of research, such as ours, that considers a multitude of papers that do not share a common theoretical perspective or research aim.

In conducting the analysis, we first categorized each of the papers we identified as relevant for this review according to whether the paper’s (main) focus was on S, T, C, SC, ST, TC, or STC. As in Sommerhoff and Ufer’s (2019) review, the notion of ‘focus’ is important for understanding our coding process. For example, consider a paper investigating students’ views of argumentation or proof in geometry in the context of classroom discussions. If the teacher, although present in those discussions and potentially influencing the issues discussed by the students, did not feature prominently in the analysis or the research questions addressed in the paper, the paper would be coded as SC. Similarly, the paper would be coded as TC if it focused on the teachers’ views while the students’ role was kept in the background of the reported research; this would still be the case even if the discussions could not have happened without the students’ involvement. If the teacher’s role and the students’ role were both explicitly investigated in the paper, then the paper would be coded as STC. If neither the students nor the teacher were at the crux of the research questions of the paper, which instead focused only on aspects of argumentation or proof in geometry, then the paper would be coded as C. A paper that investigated teacher-student interactions, with the mathematical content (e.g., argumentation and proof in geometry) appearing not to play a particular role in the reported research, would be coded as ST. Accordingly, papers focusing only on the students or the teacher would be coded respectively as S or T.
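The coding rule described above can be read as a mapping from the vertices that a paper treats as a main focus to one of the seven codes. The following Python sketch is only an illustration of that reading, not part of the review procedure: the function name `code_focus` and its three boolean parameters are hypothetical, and deciding whether a vertex is a "main focus" remains a human judgment that the code does not capture.

```python
def code_focus(students: bool, teacher: bool, content: bool) -> str:
    """Return the instructional-triangle code for a paper.

    Each argument says whether that vertex is a main focus of the paper
    (hypothetical inputs; in practice this is decided by the human coders).
    """
    # Concatenate the focused vertices in the fixed order S, T, C so that
    # the result is one of: S, T, C, ST, SC, TC, STC.
    code = (
        ("S" if students else "")
        + ("T" if teacher else "")
        + ("C" if content else "")
    )
    if not code:
        raise ValueError("a paper must focus on at least one vertex")
    return code


# The geometry-discussion example from the text:
# teacher present but in the background, students' views in focus.
example_code = code_focus(students=True, teacher=False, content=True)
print(example_code)
```

The same paper would map to TC if the teachers' views were in focus instead, and to STC if both roles were explicitly investigated.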

The focus of a paper was also further identified in accordance with the educational system of reference. Papers that examined (preservice) teachers as learners were categorized as S, as their role in the respective educational system (i.e., university) is that of a student, whereas those that examined them as teachers of mathematics were categorized as T, as their role in the respective educational system (i.e., school) is that of a teacher. Similarly, papers that examined mathematicians in their roles as university instructors were categorized as T. Also, we recorded the educational level of each paper: school, university, both (school and university), or discipline (in case the paper was not concerned with a particular level but rather was about the discipline in general). Teacher education papers with preservice teachers were classified as university level.

As expected, all papers made some sort of theoretical and/or methodological contribution. Some papers were more oriented towards (focused on) building ground, that is, they mostly contributed to research knowledge by further developing or implementing existing theoretical ideas, methods, or findings. Other papers were more oriented towards (focused on) breaking ground, that is, they were more explicitly written with the aim of developing or implementing new theoretical ideas or methods. We further coded the ‘breaking ground papers’ according to whether they explicitly made a theoretical and/or a methodological contribution (cf. Hanna & Knipping, 2020). We clarify that, for papers coming from the same group of authors making the same kind of methodological or theoretical contribution, only the chronologically first publication in our list was recorded as breaking ground; we viewed follow-up publications as strengthening or elaborating on that contribution and thus recorded them as building ground.

Following the journal search described earlier, we obtained an initial set of 126 potentially relevant papers. After screening the papers’ abstracts and, frequently, their full version as well, we excluded papers that were book reviews, science focused, or corrections/comments on other papers. This left us with a final set of 103 papers that we provide in Supplementary Materials. The coding procedure was as follows: two of us considered all categorizations separately and, subsequently, we resolved all disagreements through discussion (although inter-rater agreement was not a major concern for this review, our initial agreement was ‘good’ to ‘excellent,’ Kappa = 0.718). To meet the restrictions on how many of the 103 papers we could reference in this review, as well as which subset of these papers we would annotate in the bibliography of the review, we aimed for a representation of the themes that emerged from the review, but also a representation of authors and some of the trends that emerged recently during key conferences in the field (not only CERME and PME, but also the International Congresses on Mathematical Education [ICME]). The identification of themes and the selection of papers we used to illustrate them were inevitably influenced by our subjective judgment but, at the same time, were guided by our knowledge of the field and the findings of our review.
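For readers less familiar with the agreement statistic reported above, the following Python sketch shows how a kappa value can be computed for two coders. It rests on assumptions: the text does not state which variant of kappa was used, so the sketch assumes Cohen's unweighted kappa for two raters, and the function name `cohens_kappa` and the two short lists of codes are made up for illustration (they are not the review's data and do not reproduce the reported value of 0.718).

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's unweighted kappa for two raters coding the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same code by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per code.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use a single code")
    return (observed - expected) / (1 - expected)


# Hypothetical codes for four papers (illustrative only).
coder_1 = ["SC", "SC", "STC", "C"]
coder_2 = ["SC", "STC", "STC", "C"]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 3))
```

Kappa corrects the raw proportion of agreement for the agreement expected by chance, which is why it is lower than the raw agreement in this toy example.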

3 Findings and discussion

Tables 2 and 3 summarize the findings of our analysis of the 103 papers. Specifically, the tables show, respectively, the distribution of these papers by level (school, university, school and university, or discipline) and contribution (breaking ground or building ground) across the seven possible combinations of the instructional triangle categories (S, T, C, SC, ST, TC, STC). There are several broad observations to draw from the tables:

1) Almost half of the papers in the sample (49 out of 103) fell in the SC category. This finding accords with the observation that the field thus far has produced a substantial number of frameworks and research findings related to students’ engagement with proving, especially students’ difficulties with proving tasks (Mariotti et al., 2018; Stylianides et al., 2017).

2) About a quarter of the papers in the sample (23 out of 103) fell in the STC category. This reflects a current emphasis in the literature (Mariotti et al., 2018) on viewing instructional practice in the area of proof and proving in a rather holistic way, considering all actors involved (students and teacher) as well as their interactions with the content. Although our review did not cover the pre-2018 period, we believe this is an emerging trend in the literature, partly as a response to calls for more classroom-based studies (Mariotti, 2006; Mariotti et al., 2018) and research on classroom-based interventions (G. Stylianides & A. Stylianides, 2017; G. Stylianides et al., 2017) to better understand the teacher’s role in facilitating students’ interactions with proving (even that of teachers in the shadow education system; Moutsios-Rentzos & Plyta, 2019). Moreover, this trend echoes related efforts in the broader field of mathematics education that address emerging complex mathematics teaching and learning environments (Moutsios-Rentzos et al., 2017).

3) There were only a few papers (ranging from 0 to 4) in the categories that did not include C, namely, the S, T, and ST categories. This shows that recent research in the area of proof and proving rarely considered students or teachers in isolation from the mathematical content within which proof and proving occur. This is especially true in the case of teachers, who seem to be a focus of investigations predominantly when content is involved (never on their own and only once with students). It suggests further that mathematics education research on proof and proving has a strong disciplinary identity (Mariotti et al., 2018), which potentially differentiates it from other mathematics education research strands that treat (for good reasons) the content more as a research context than as a core part of the research.

4) More papers focused on the university level as compared to the school level (53 vs. 37, respectively), with 9 papers pertaining to both. The remaining 4 papers related to the discipline and were all in the C category. Although the higher representation of the university papers in the sample may relate to our decision to allocate to this level teacher education papers with preservice teachers, it is also indicative of the broad educational focus of the researchers in the field as reflected, for example, in the contributions in the proof-related working groups of the recent ICME and CERME congresses.

5) The contribution of 20 of the 103 papers was deemed to be breaking ground, with 10 of those papers making an explicit theoretical contribution, 7 a methodological contribution, and 3 both. The representation of these 20 papers was uneven across the seven instructional triangle categories. More than half of the papers in the C category made such a contribution (8 out of 13); about one in five in the STC category (5 out of 23); and about one in ten in the SC category (5 out of 49). The remaining 2 papers that made such a contribution belonged to the S and ST categories.
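The counts reported in the observations above can be cross-checked with simple arithmetic. The short Python sketch below is offered only as a reader's aid: the numbers are copied from the text, while the variable names and the helper `share` are hypothetical and not part of the review.

```python
# Counts as reported in the observations above (N = 103 papers).
TOTAL = 103
by_level = {"university": 53, "school": 37, "both": 9, "discipline": 4}
breaking_by_kind = {"theoretical": 10, "methodological": 7, "both": 3}
# (breaking ground papers, all papers) for the categories with reported totals.
breaking_by_category = {"C": (8, 13), "STC": (5, 23), "SC": (5, 49)}
# The text reports the S and ST categories together for the remaining papers.
remaining_breaking_s_and_st = 2


def share(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100 * part / whole


# The level counts should add up to the full sample.
levels_ok = sum(by_level.values()) == TOTAL
# The breaking ground papers should add up the same way by kind and by category.
breaking_total = sum(breaking_by_kind.values())
categories_ok = (
    sum(breaking for breaking, _ in breaking_by_category.values())
    + remaining_breaking_s_and_st
    == breaking_total
)

for category, (breaking, papers) in breaking_by_category.items():
    print(f"{category}: {breaking}/{papers} = {share(breaking, papers):.1f}%")
```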

Table 2 Distribution of papers in the sample (N = 103) by level according to their instructional triangle focus
Table 3 Distribution of papers in the sample (N = 103) by contribution (breaking ground or building ground) according to their instructional triangle focus

The above key findings are diagrammatically summarized in Fig. 1, where the instructional triangle is used as both an organizational and a communicational scheme of the perspective employed in this review and its results.

Fig. 1 A diagrammatic summary of the findings of this review

Next, we discuss and illustrate the four instructional triangle categories with the most papers in them (SC, STC, C, and TC) to deepen understanding of the kind of research that has been conducted in relation to each of them. In our discussion we consider, as appropriate, papers that we deemed to make a breaking ground contribution. The discussion of the papers of each category is organized through themes that we inductively (and to some extent, subjectively) identified and, subsequently, conceptually linked with the broader mathematics education research including research in the field’s major conferences. Given that all four categories include C in them, we begin with the C category and move next to the SC and TC categories before we discuss the STC category last.

3.1 Content

Almost half of the 13 papers that addressed primarily issues of proof content, and were thus included in this category, focused on the university level or were about the discipline more generally. A couple of these papers discussed the role of computer-based tools, such as digital proof assistants (Hanna & Yan, 2021), for facilitating proof construction (see, also, Hanna et al., 2019). While the main concern in these papers was on the affordances of the technology, especially the ways it can mediate access to proof content, implications for mathematics instruction were interwoven with the discussion as a justification for the need for researchers to further explore these affordances.

A few other papers explored theoretical aspects of proof, such as its explanatory and convincing functions (Lockwood et al., 2020), its relationship with the notions of evidence and derivation (Aberdein, 2019), and its association (tension) with the notions of freedom and enforcement (Nickel, 2019). These papers made breaking ground contributions by deepening the field’s understanding of existing theoretical constructs relevant to proof and their interrelationships. Implications for mathematics instruction are more immediate in some of them than in others. Relatedly, Mejía-Ramos and Weber’s (2020) paper, which we deemed to have also made a (methodological) contribution, problematized the methods researchers use to gain insights into mathematical practice and called for exercising caution in how such insights ultimately inform mathematics instruction. The relationship between disciplinary and school mathematics practice, particularly the conditions under which the latter may be characterized as “authentic” in its reflection of the former, was further discussed in A. Stylianides et al. (2022) (not in the sample of 13 papers in this category).

Finally, only two of the papers in this category related exclusively to school mathematics, and both explored cultural dimensions of the place of proof in the secondary school curriculum. One of them reported a cross-country comparative textbook analysis (Bergwall, 2021) extending previous textbook-analysis work in the broader area of proof reviewed in G. Stylianides et al. (2017) and called for in Mariotti et al. (2018). The other paper, by Shinno et al. (2018), which we deemed to have also made a breaking ground (methodological) contribution, elaborated an epistemological model for understanding what constitutes proof in the curricula of different countries.

3.2 Student-Content

As can be seen in Table 2, the SC papers spanned educational levels, with 28 papers focusing on the university level, 17 on the school level, and 4 papers on both school and university. Yet some common themes emerge when looking across all 49 papers with a student-content focus.

A main theme relates to papers focusing on the role and use of examples in proving, including the use of counterexamples (see also Boero, 2022). Several of these papers derived from a special issue in The Journal of Mathematical Behavior (Knuth et al., 2019) that addressed students’ use of examples in proving and, in particular, ways in which example-based reasoning can be productive for students in the proving process. Knuth et al. (2019) discussed example-based reasoning as an important object of study and explored how examples can play a foundational role not only in the development and exploration of conjectures, but also in the development of proofs for the conjectures. Aricha-Metzer and Zaslavsky (2019) conducted individual interviews with secondary and undergraduate students (majoring in mathematics or mathematics-related subjects) on tasks that called for conjecturing and proving. They found a relatively strong tendency among students to use examples generically, which they associated with a productive use of examples in terms of developing a proof or a sound justification that can lead to a proof. Other studies involved an exploration of how inservice teachers interpret, understand, and use generic examples in their proving activities (Dogan & Williams-Pierce, 2021), a comparison between mathematicians’ and students’ example use (Lynch & Lockwood, 2019), and an exploration of structural aspects of generic examples to understand what makes them potentially opaque for learners in the domain of multiplication (Rø & Arnesen, 2020). Lew and Zazkis (2019) examined undergraduate mathematics students’ at-home work on prove-or-disprove tasks to better understand the interactions between example/counterexample generation activities as students try to prove the truth or falsity of mathematical claims.

Another main theme relates to papers that investigated participants’ understandings and conceptions of proof. Brown (2018) and Antonini (2019) explored issues of conviction in the context of indirect proofs. Brown employed a comparative selection of tasks to investigate undergraduate mathematics students’ views about the convincingness of indirect versus direct proofs. She found that the form of indirect proof (contraposition or contradiction) may make a difference for students and that students’ views may be more nuanced than previously considered. Antonini explored the issue of the intuitive acceptance of proof by contradiction with secondary and university students in the domain of geometry. He observed, through analysis of task-based interviews, that students can produce indirect argumentation (by starting with the assumption that a claim is false) as a compromise between proof by contradiction and the need for a more evident argument. Other studies in this area included Weber et al.’s (2022) investigation of how mathematicians use proofs to obtain conviction and certainty, Stewart and Thomas’s (2019) examination of undergraduate students’ perspectives on proof in linear algebra, Campbell et al.’s (2021) comparison of secondary students’ written and oral representations of mathematical arguments to investigate whether the two modalities portray similar understandings, Yopp et al.’s (2020) investigation of secondary students’ conceptions about the validity of a direct argument after they participated in instruction that addressed issues of constructing and critiquing arguments for a general claim, and Blanton et al.’s (2022) examination of how young learners (Grades K-1) come to construct viable mathematical arguments to justify generalizations about even and odd numbers. Finally, Davies et al. (2020) used comparative judgement to investigate undergraduate students’ proof comprehension based on students’ summaries of a given proof. This is one of the papers we identified as making also a breaking ground (methodological) contribution: it shows that comparative judgement can help produce valid and reliable assessments of the quality of students’ proof summaries, thereby offering a new way to assess proof comprehension.

Few studies in this category explored student-content issues in technological environments. One of these studies was Fujita et al.’s (2019) exploration of how learners can be supported to overcome logical circularity during their proof construction process in the context of a web-based proof learning support system. Their data derived from three strategically selected cases involving secondary students and prospective elementary school teachers. Although the computer-based environment was enough for some students to progress in their thinking, for others teacher intervention was necessary for them to begin to realize when their proof fell into logical circularity. In a different study, Cirillo and Hummer (2021) used Smartpen technology (i.e., Livescribe pens) to audio-record secondary school students’ explanations of their thinking and capture their pen strokes as they worked through geometry proof problems. Also, some of the students in Antonini’s (2019) study we mentioned earlier worked in a Dynamic Geometry Environment.

A handful of studies examined students’ proof writing or reading of proof scripts. Azrou and Khelladi (2019) conducted a case study of undergraduate students’ proof writing to understand why students’ efforts to write a proof can result in a disorganized, unclear draft. Dawkins and Zazkis (2021) used moment-by-moment think-aloud protocols to understand novice and experienced undergraduate students’ processes of reading mathematical proof. Finally, Ahmadpour et al. (2019), in a study that we identified as also making a breaking ground (theoretical) contribution, presented a model for describing the growth of students’ understandings when reading a proof; the model comprises two paths relating to the semantic and syntactic levels.

A small number of studies in this category involved analyses of prospective teachers’ proof constructions and evaluations (e.g., Ko & Rose, 2018; Yee et al., 2018). Finally, we found very few studies in the student-content category focusing on students’ affective aspects, which accords with the observation that few studies in the field focus on these aspects (e.g., Moutsios-Rentzos & Kalozoumi-Paizi, 2017). One exception was Ayalon et al.’s (2022) in-depth qualitative study that considered how secondary students’ emotions during an argumentative discourse relate to their learning of real-life functional situations.

3.3 Teacher-Content

The papers in this category addressed the interaction between teacher and proof content, and they tended to focus on the secondary school level in the domain of geometry. A smaller number of papers viewed (research) mathematicians in the role of the teacher and were thus classified under the university level. In almost all the papers, student learning or perceptions were also considered, but we deemed these to be more part of the context and less an object of study. The focus in this category is consistent with Mariotti et al.’s (2018) observation of relatively recent attention in CERME papers to the teacher (versus the student) and with their call for more studies on specific teacher competencies required for designing and managing didactic situations concerning argumentation and proof.

Some school-level papers explored teachers’ proof-related content knowledge or perspectives on proof-related content. For example, Karpuz and Atasoy (2020) explored Turkish secondary-school mathematics teachers’ content knowledge of the logical structure of proof in geometry; they reported teachers’ challenges about how to link information from geometrical figures to established knowledge, or how to avoid sweeping generalizations based on geometrical figures. In another study at the secondary school level in geometry, Aaron and Herbst (2019) examined American high-school teachers’ perspectives on the interplay between conjecturing and proving; they found that teachers favored the separation of these two activities, perceiving them as having different goals and requiring different resources. Considering these findings against those of another study that examined how a research mathematician conjectured and proved (Fernández-León et al., 2021), one appreciates the need for further enhancement of the teacher–content relationship at the school level through, for example, targeted professional development (e.g., Kazemi et al., 2021).

Another area of attention was teachers’ interactions with proof-related tasks (Ayalon & Hershkowitz, 2018; Baldinger & Lai, 2019; Rogers & Kosko, 2019). For example, Ayalon and Hershkowitz (2018) examined Israeli secondary school teachers’ choices of textbook tasks that they thought had potential to encourage proof-related activity in the classroom. Rogers and Kosko (2019) examined similar issues but from the perspective of teachers as task designers and with a scope that spanned the school and university levels: they asked their participants—elementary school teachers and university mathematics instructors—to design tasks that could support students’ ability to create and critique mathematical arguments. Teachers’ task choices or task design decisions offered useful insights into the knowledge, values, and dispositions underpinning their proof-related instruction.

3.4 Student-Teacher-Content

The papers in this category are characterized by their attempts to provide a more holistic approach to teaching and learning interactions by including both the main educational protagonists (teachers and students) and the respective content. Moreover, in STC, we identified the highest ratio of papers classified as making a breaking ground contribution, while the three papers that were considered to explicitly make both theoretical and methodological contributions were all in this category; this is in line with our aforementioned hypothesis that the STC category signifies an emerging trend in the field and is consistent with relevant calls for more research in this area (Stylianides & Stylianides, 2017; Stylianides et al., 2017). The STC papers may be grouped into four themes, according to which the rest of this section is structured.

First, almost half of the papers in the STC category discussed approaches to teaching proof at the school or university level. Some papers focused on the teacher’s role in supporting students’ learning at school (e.g., Komatsu & Jones, 2022; Zhuang & Conner, 2022), while others investigated ways of supporting preservice (secondary or primary) teachers’ learning or teaching skills (e.g., Buchbinder & McCrone, 2020; Zambak & Magiera, 2020). A few papers concentrated on university teaching practices related to proof (e.g., independent proof reading; Pinto & Karsenty, 2018). Stylianides and Stylianides (2022) is an example of a paper that purposefully considered both school and university by discussing a coherent approach to introducing proof to secondary students and preservice elementary teachers. Their argument for implementing a coherent approach to introducing proof across the different educational protagonists adds to the field’s theoretical perspective on the teaching of proof.

Second, some papers concentrated on proof norms and criteria, employing diverse perspectives. For example, Dimmel and Herbst (2020) utilized storyboard representations of instructional situations to investigate teachers’ expectations of students who present their work at the board. Other researchers employed a broad perspective to explore the potential convergences and divergences in proof norms (e.g., by including lectures, textbooks, home assignments, and feedback; Pinto & Karsenty, 2020) or in proof criteria (e.g., by including school students, university students, and mathematicians; Sommerhoff & Ufer, 2019).

Third, a few papers appeared to focus explicitly on investigating novel theoretical frameworks and/or methodological approaches to challenge current perspectives, and were thereby coded as making breaking ground contributions. One of those papers was by Gabel and Dreyfus (2020), who discussed the links between rhetoric and proving in the context of the university teaching of proof, thus making a theoretical and methodological contribution to the field about the notion of proof and proving. In addition, Tabach et al. (2020) made a methodological and theoretical contribution by networking two approaches to link the mathematical progress of individuals, small groups, and the whole class.

Finally, the linguistic and the discursive aspects of proof and proving appeared to also attract the interest of various researchers. For example, Kontorovich (2021) employed a commognitive framework to investigate a topologist’s feedback on her students’ proofs, thus methodologically contributing to the field. This interest is also evident in recent conferences (e.g., Boero & Turiano, 2023; Moutsios-Rentzos, 2022; a joint session of the argumentation and proof and the language thematic working groups in CERME 12).

4 Concluding remarks

In this review, we aimed to provide a complement to existing synopses of the current state of the art of mathematics education research on proof and proving by using a triadic framework of instruction (Cohen et al., 2003) to conceptually and methodologically organize our review. We also identified the educational level of the papers included in the review, showing that, at least recently, the field appears to be relatively balanced between the university level (which includes teacher education) and the school level. Before we further discuss the findings from our review, we acknowledge limitations of our approach.

Although our decision to use the ‘simple’ version of the instructional triangle framework was deemed appropriate for the scope of this review (to provide a bird’s eye, instructionally focused view of this area of research), in line with Goodchild and Sriraman’s (2012) commentary on the instructional triangle, we acknowledge that using other, more elaborated versions of the framework might have illuminated different aspects of the papers we reviewed. The fact that we did not consider papers published in conference proceedings is another limitation even though, as we mentioned previously, some past reviews focused exclusively on conference proceedings (Mariotti, 2006; Mariotti et al., 2018; Sommerhoff & Ufer, 2019; Stylianides et al., 2016). We acknowledge further that the picture we obtained from our review might have differed had we been more inclusive not only in terms of conference proceedings, but also in terms of the journals and the publication period we covered in the review. To mitigate the effects of these limitations, we considered broader literature in our discussion including papers (published in conference proceedings and other outlets) that did not meet the strict parameters of our search criteria.

Notwithstanding these limitations, we argue that the employed approach served its purpose by revealing specific trends in the recent proof and proving literature with respect to instruction. First, the role of Content (C) is central and multifaceted in the recently published work in this field: almost all the papers (98/103) included C among their main foci, including a good number of papers (13/103) that had C as their only focus. At the same time, the papers focusing only on C appeared to constitute a notable category in the sense that they included the highest ratio of papers making a breaking ground contribution to the field. Moreover, C was significant enough for some papers to transcend educational levels and to concentrate on the discipline itself (the papers focused on the discipline were all assigned to C). The significant role of C in our review was not unexpected, as the notion of proof has a strong disciplinary identity (Mariotti et al., 2018); proof is at the crux of the ontology and epistemology of mathematics (e.g., Ernest, 2018) and has been called ‘the soul of mathematics’ (Schoenfeld, 2009), while it has been described as playing “a distinctive role” within the mathematical community, “separat[ing] mathematics from the empirical sciences” (Hoyles, 1997, p. 7). This multifaceted, special status of proof within the discipline of mathematics suggests that one might obtain a different picture if one conducted a similar instructionally focused review of other areas in mathematics education research for which content is viewed (for legitimate reasons) more as a context for the research than as a core part of it.

Another trend in the recent proof and proving literature that was revealed by our review relates to two C-including categories that appeared to play an important role in the area of research that we considered in this paper: Student-Content (SC) and Student-Teacher-Content (STC). SC was the largest category, although it did not include a high ratio of papers with a breaking ground contribution, while STC had the second highest ratio of such papers. Thus, it may be argued that SC attracts the strongest interest from researchers, who appear to mainly implement and investigate existing ideas and methods (studies more oriented towards building ground), whereas STC along with C seem to be the main areas where new methodological and theoretical perspectives appear (studies more oriented towards breaking ground). An alternative interpretation of this finding is that STC or C are better suited for such theoretical or methodological breaking ground contributions than the other categories of the framework. Conducting a similar review of other mathematics education research areas and comparing the findings with ours might cast light on this issue.

Although our review covered too short a time period to allow conclusions about evolutionary trends in research on proof and proving, we presume that the STC category, in which about a quarter of the papers in our sample fell, constitutes the focus of a growing body of research in the field, with more researchers attempting to simultaneously address and include instructional aspects in their studies. This rather holistic way of studying instructional practice might reflect a response to Mariotti’s (2006) call in an early review for investigations into the teaching and learning of proof and proving that are not divorced from the reality of the classroom, as well as the calls in more recent reviews by Stylianides et al. (2017) and Mariotti et al. (2018) for more research on classroom-based interventions in this area. The publication of an edited volume specifically devoted to classroom-based interventions in the area of proof (Stylianides & Stylianides, 2017) might have also helped put this research area in the spotlight. We note further that some themes that may not be dominant in terms of volume of assigned papers appear to transcend the identified categories of instructional focus: notably, the role of technology (within the learning environment and/or research methodology), linguistic and discursive aspects, proof validation criteria, and affective aspects. We have found that these themes and the holistic, systemic focus (as signified by the STC category) also seem to be gaining presence in major conferences in our field and, hence, we posit that they may constitute potential areas of interest for future reviews of research on proof and proving.