1 Rationale

Two leitmotivs emerge from our reading of the mathematics education literature on proof. One is concerned with how fundamental proofs are to mathematics and how paramount it is to engage students in proving throughout their mathematics education (e.g., Stylianides et al., 2017). This leitmotiv is reflected in some curricula (e.g., NCTM, 2000 for the US; Department for Education, 2013 for the UK; Leikin & Livne, 2015 for Israel), which cannot be taken for granted considering the marginalization of the activity in the curricula of other countries (e.g., Hanna, 2000; Knox & Kontorovich, 2022). The other leitmotiv pertains to how challenging proof is for newcomers (e.g., Harel & Sowder, 2007). This finding comes from many studies, including those involving high achievers in school and future mathematics majors (e.g., Kontorovich & Greenwood, 2023; Stylianides & Stylianides, 2022; Weber, 2010). The two leitmotivs feed into long-standing research on the complexity of transitioning to proof-based mathematics. In this study, we introduce and explore a didactical innovation to support students in this transition.

Philosophers, mathematicians, and mathematics educators have been considering proof from different perspectives. In this article, we adhere to the social perspective, viewing proof as a human endeavor that is cognizant of the discipline and the community that practices it (e.g., Stylianides et al., 2017). In the words of Balacheff (2008, p. 502),

The issue of truth and validity cannot be settled in the same way in everyday life, in law, in politics, in philosophy, in medicine, in physics or in mathematics. One does not mobilize the same rules and criteria for decision-making in every context in which one is involved.

The notion of rule is common in the area of proof (e.g., Lakatos, 1976), and it is central to our research. We ground the notion in the commognitive framework in Section 3. For now, it is sufficient to associate the rules of proof (RoPs hereafter) with the canons that proofs are expected to abide by, such as logical inference and generally accepted conventions of proof-writing (e.g., Durand-Guerrier et al., 2012; Selden & Selden, 2015). To be explicit, we focus on the rules that the written outcomes of a proving activity are expected to follow, while acknowledging that the process of proof-seeking does not necessarily follow the same rules.

Kitcher (1984) notes that conventional rules of a mathematical practice may not be visible to mathematicians:

Ideas about how one does mathematics may simply be included in early training without any formal acknowledgment of their nature or defense of their merits. It is usually at times of a great change that metamathematical views are focused clearly, in response to critical questioning. The metamathematics of a practice is most evident when the practice is under siege. (p. 163)

Mathematics education research echoes this view by arguing that some aspects of proof often remain tacit in instruction (e.g., Dreyfus, 1999; Lin et al., 2012; Selden, 2021; Tsamir et al., 2009). Accordingly, the lion’s share of proof research can be construed as students grappling with RoPs that have not been presented to them as explicitly as they could be (e.g., Knuth, 2002; Moore, 1994; Weber, 2002). This grappling has been extensively studied in the context of transition from secondary to tertiary mathematics—a time of a great change, where many familiar mathematical practices are put under siege (e.g., Di Martino et al., 2022; Gueudet, 2008).

Some scholars argue that the shift to proof-based mathematics requires a fundamental change in the rules underpinning germane mathematical activities, such as establishing and justifying the validity of mathematical statements (e.g., Stylianides & Stylianides, 2022). Kjeldsen and Blomhøj (2012) maintain that developing appropriate rules for these activities is indispensable for mathematics learning. Similarly, Sfard (2008) posits that modifying the rules that govern students’ mathematical discourses constitutes an educational goal. Hence, it seems only reasonable to propose that a direct engagement with RoPs may be of didactical value for newcomers to proof-based mathematics. We investigate this proposal in the study at hand.

This study is not the first one to make a proposal of this sort (e.g., Hanna & Jahnke, 1996). Stylianides and Stylianides (2022) analyzed an instructional sequence, one of the milestones of which was introducing learners to an operationally functional conceptualization of proof, that is, a set of criteria that a classroom community can use to inform their engagement with proof. The sequence was found to facilitate secondary-school students’ and prospective teachers’ sensemaking of proof. Accordingly, Stylianides and Stylianides call on instructors to be clear about proof criteria “so as to practically guide [learners’] engagement with proof” (p. 17).

Based on the above, this study aims to characterize RoPs that students develop after having been explicitly introduced to them as part of the transition to proof-based mathematics. Specifically, we focus on how students formulate, explain, justify, and implement the rules when making their first proving steps.

2 Literature background

This section revisits previous findings through the lens of RoPs (Section 2.1) and discusses the potency of scriptwriting to study proof learning (Section 2.2).

2.1 Students’ challenges with RoPs

A “solid finding” in mathematics education is that “many students rely on validation by means of one or several examples to support general statements, [and] this phenomenon is persistent in the sense that many students continue to do so even after explicit instruction about the nature of mathematical proof” (ECEMS, 2011, p. 50). This finding is inseparable from broader research on students’ struggle to use examples to prove and disprove universal and existential statements (e.g., Buchbinder & Zaslavsky, 2019). Another occasionally reported issue pertains to logical circularity. For instance, Pinto and Cooper (2019) shared a case of a real analysis student who relied on a corollary of the theorem he was attempting to prove. Selden and Selden (1987) offer a comprehensive list of students’ issues with proof, including making invalid inferences and beginning with the conclusion to arrive at a true statement.
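To illustrate why validation by examples cannot establish a universal statement, consider a classical case from number theory. It is an illustration added here and is not drawn from the cited studies. The expression \(n^{2}+n+41\) yields a prime for every integer \(0\le n\le 39\), and yet the universal claim that it is always prime fails at \(n=40\):

\[
40^{2}+40+41 = 1681 = 41^{2}.
\]

Forty confirming examples are thus overturned by a single counterexample.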

Research often refers to the abovementioned issues as systematic errors and misconceptions since they have to do with logic and proof validity (e.g., Weber, 2002). This framing is consistent with the cognitive perspective that dominates proof research (Stylianides et al., 2017). From the social viewpoint (Stylianides et al., 2017), these issues constitute deviations from the rules of logic practiced in professional mathematics communities. Hereafter, we refer to these rules as necessary since the rules of logic impose them.

Abiding by these rules requires additional rules that are less universal and require a certain community to agree on, i.e., establish a convention. For instance, Stylianides (2007) argues that in a mathematics classroom, a proof uses accepted statements and employs forms of reasoning and expression that are known to or within the conceptual reach of the students. Hence, classroom proofs vary depending on what each of them renders “accepted,” “known,” and “within students’ reach.” RoPs are needed to organize mathematical statements, forms of reasoning, and expression into fully-fledged proofs. In terms of Yackel and Cobb (1996), convention-based RoPs can be conceived as sociomathematical norms that determine “what counts as an acceptable mathematical explanation and justification” (p. 461) in a specific community. Dreyfus (1999) reflects on undergraduates grappling with RoPs of this sort. Specifically, he discusses students’ explanations and proofs that do not “go back” enough and are not “deep” enough to be considered fully fledged proofs. In this study, we invited students to discuss RoPs of their choice, and we were interested in seeing whether explicit communication about them may support students’ learning to prove.

2.2 Investigating proof learning with scriptwriting

Scriptwriting tasks present learners with a mathematical conflict and ask them to resolve it by composing a conversation between fictional characters (Zazkis et al., 2013). Research has argued that these dialogical tasks allow scriptwriters to not only showcase the knowledge they developed from resolving the conflict but also raise issues that usually remain unarticulated in a traditional problem-solution format (e.g., Zazkis & Herbst, 2018). Scripts provide a window into their writers’ thinking by attending to how fictitious characters explore ideas, clarify gaps, and raise and address perceived points of difficulty (e.g., Zazkis, 2014).

Gholamazad (2007) reports on one of the first projects to employ scriptwriting in the area of proof. Importantly to our study, Gholamazad drew on Sfard’s (2008) commognitive framework, according to which thinking constitutes an “individual version of interpersonal communication” (p. 81). Building on the same framework, Brown (2018) argued that by making students’ envisioned interactions public, scriptwriting enables students to make their proof-related thinking visible and fosters reflection. Methodologically speaking, scriptwriting has been acknowledged for providing opportunities to observe students’ ways of seeing a mathematical proof and understand what the students render important (e.g., Zazkis et al., 2013).

In the area of proof, scriptwriting has been mostly used to explore proof validation and sensemaking (e.g., Brown, 2018; Koichu & Zazkis, 2013). In all the scriptwriting studies that we came across, the task designers were the ones to prompt their participants with proof attempts, which a priori made these attempts extraneous to scriptwriters. In turn, we wanted to mobilize the scriptwriting format more openly to create a space for students to account for their own transition to proof-based mathematics.

3 Commognitive framing

We turn to the commognitive framework (Sfard, 2008) for theoretical foundations and conceptual tools. Commognition has been acknowledged for illuminating critical points in students’ mathematics journeys where the “rules of the game” change significantly, like in the transition to proof-based mathematics. Commognition construes mathematics as a discourse, associating its learning with an individual “becoming a participant in certain distinct activities” (Sfard, 2001, p. 23). One’s starting to abide by the rules of the target discourse is an example of such an activity. Next, we elaborate on RoPs (Section 3.1), introduce a commognitive perspective on proof-based mathematics (Section 3.2), and review modes of participation in a mathematical discourse (Section 3.3).

3.1 Rules of proof

Proofs concern mathematical statements that can be rendered as either valid or not “according to well-defined rules” (Sfard, 2008, p. 224). The mathematics community has been defining and revisiting these rules throughout history (e.g., Kleiner, 1991). Notwithstanding, Sfard (2008) argues that “for today’s mathematicians, the only admissible type of substantiation [of mathematical statements] consists in manipulation on narratives, and it is thus purely intradiscursive” (p. 232). We concur with this argument in the case of a literate mathematical discourse and associate RoPs with intradiscursive principles that underpin written proofs.

RoPs constitute a metadiscursive construct, prescribing what a written proof looks like. Sfard (2008) argues that such metarules “are the result of custom-sanctioned associations rather than a matter of externally imposed necessity [which] does not mean there are no reasons for their existence” (pp. 206–207). Sfard claims that even the most “objective” and commonly endorsed discursive rules that appear to be fully governed by inevitability and logical necessity are products of human choices that survived the test of time. In our case, this claim can be associated with modus ponens, modus tollens, predicate calculus, et cetera, i.e., RoPs that survived the selection of the mathematics community over time due to their communicational usefulness and effectiveness. The contingency of other RoPs might be more evident. For instance, a disciplinary tradition appears as the only reason for today’s mathematicians to mark the end of a proof with “Q.E.D.” or the Halmos symbol.

3.2 Three types of proof-related discourses

In the context of proof learning and teaching, we distinguish between three types of discourses. In the first type, \({D}_{1}\), mathematical statements are endorsed without explicitly discussing proof and proving. The endorsement occurs through the implementation of conventional procedures that are viewed not only as producing new narratives about mathematical objects but also as warranting the narratives’ validity (e.g., differentiation rules generate derivatives and ensure the resulting functions are derivatives of the original functions). In the discourses of the second type, \({D}_{2}\), statement generation and proving are separate from each other. \({D}_{2}\) discourses are proof-based versions of \({D}_{1}\) since most statements that are valid in \({D}_{1}\) remain valid, but the demonstration of their validity is expected to be different. This is where RoPs are purposefully enacted to demonstrate the (in)validity of a statement. Lastly, \({D}_{3}\) refers to a metadiscourse of \({D}_{2}\), i.e., a discourse in which the main objects are rules of \({D}_{2}\). Such metadiscourses revolve around how statements in a specific \({D}_{2}\) relate to each other and why a particular narrative is valid when another is not. On the \({D}_{3}\)-level, RoPs are endorsed (cf. Sfard, 2008) through narratives that capture the rules in words (i.e., rule-narratives are generated). Within rule-narratives, we distinguish between guiding formulations, which offer a direction by describing what a proof should do or look like, and restricting formulations, which prescribe what to avoid in a proof.

The distinction between \({D}_{1}\), \({D}_{2}\), and \({D}_{3}\) is idealized, and the borders between the three are usually blurred in the educational context. In our view, the potential of this typology is in its capability to account for newcomers’ often-reported struggles with \({D}_{2}\) in non-deficit terms. Indeed, one’s violation of a particular RoP can be examined through the lens of a rule that lingered from one type of discourse to another (e.g., an application of inductive reasoning, which is valid in \({D}_{1}\), to \({D}_{2}\) where deduction is expected). In such cases, rising to the level of \({D}_{3}\) appears necessary for a teacher to communicate the rules of \({D}_{2}\) as a target discourse. In a similar vein, student engagement in a metadiscourse provides an opportunity to formulate, elaborate, and substantiate RoPs that they are expected to implement in \({D}_{2}\). The relations between RoPs learned and taught are the focus of our study.

Like any discourse, \({D}_{3}\) is characterized by a special set of keywords (e.g., claim, validity, example, inference). This terminology is avoidable in \({D}_{2}\), where RoPs are only enacted. Accordingly, students’ capability to participate in \({D}_{3}\) appears impossible without the teacher offering an appropriate vocabulary and modeling how RoPs can be captured in words.

3.3 Ritualistic and outcome-oriented participation in a discourse

With respect to students entering a new discourse, commognition distinguishes between ritualistic and outcome-oriented participation (Sfard, 2008; see Footnote 1). One’s participation is said to be ritualistic when it is motivated by social reasons (e.g., to please the teacher). In such cases, outcomes are of little interest to the learner, and following the demonstrated rules is the gist of the activity. In outcome-oriented participation, discursive rules are implemented deliberately to grow new mathematical “truths.” Lavie et al. (2019) introduce the notion of deritualization to capture students' shifting from a ritualistic to an outcome-oriented participation in a mathematical discourse.

Lavie et al. (2019) highlight that pure rituals and outcome-oriented participation rarely feature in mathematics classrooms. With a focus on discursive routines (i.e., patterned actions), the researchers introduce a set of markers to distinguish between rituals and outcome-oriented performances. Looking ahead, four of Lavie et al.’s markers turned out to be relevant to our study: applicability—the expansion of the set of situations in which one performs a certain action, agentivity—the growth in one’s autonomous activity, objectification—a property of narratives that one generates as being about mathematical objects rather than people who act on these objects, and substantiability—one’s capability not only to articulate the already performed action but also to reason its implementation. Heyd-Metzuyanim et al. (2022) added canonicality to the set to capture the mathematical validity of the employed actions. In the last section, we draw on these characteristics to discuss RoPs that our students enacted and endorsed.

4 Method

Our participants come from a program entitled “Mathematics extension and acceleration” in a large New Zealand university. The program is intended for mathematically motivated and academically inclined students in their final year of high school. Typically, the students are 17 years old. As part of the program, they take a special course that gives academic credit for a bachelor’s degree in mathematics or engineering. The course is proof-based, and it covers selected topics in calculus, set theory, and graph theory. Proofs play a marginal role in New Zealand schools (Knox & Kontorovich, 2022); therefore, the first lessons of the course are dedicated to proof.

This study is part of a larger developmental project, a co-learning partnership (Wagner, 1997) that educational researchers and university teachers formed to support students’ transition to university mathematics (Kontorovich et al., 2023). In this study, we collaborated with Patrick, a mathematician by training and a highly acknowledged teacher with about a decade of experience in university mathematics instruction. As part of the project, Patrick revised his usual proof teaching to bring RoPs to the forefront of his first three lessons (50 minutes each). In the first lesson, he introduced proof as “an argument that mathematicians use to show that something is true.” Then, he presented three mathematical statements and led a whole-class discussion about what he dubbed “broken proofs.” In the following lessons, he illustrated how some properties of real numbers can be used to derive additional properties (e.g., he used distributivity to prove “\(-a=-1\cdot a\)”); he referred to these illustrations as “good proofs.”
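For illustration, a proof of the property \(-a=-1\cdot a\) could run along the following lines. This is a sketch rather than a record of Patrick’s presentation, and it assumes that \(1\cdot a=a\), \(0\cdot a=0\), and the uniqueness of additive inverses are already available:

\[
\begin{aligned}
a+(-1)\cdot a &= 1\cdot a+(-1)\cdot a \\
&= \bigl(1+(-1)\bigr)\cdot a && \text{(distributivity)} \\
&= 0\cdot a \\
&= 0.
\end{aligned}
\]

Since adding \((-1)\cdot a\) to \(a\) gives \(0\), the number \((-1)\cdot a\) is an additive inverse of \(a\); by the uniqueness of additive inverses, \(-a=-1\cdot a\).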

Due to space limitations, we summarize RoPs discussed in the classroom in Table 1 (see Footnote 2). The rules were presented in seven episodes where Patrick generated narratives to present, explain, and substantiate them. The last four rules pertain to how proofs are expected to appear in writing, and we refer to them as the rules of proof layout. The table emerged from the thematic analysis of Patrick’s rule-narratives (Braun & Clarke, 2006). Given the students’ unfamiliarity with proof from their school studies, we expected the classroom rules to act as the main point of reference (“precedent-search-space” in Lavie et al., 2019) for students to lean on in the proof-centered assignments that followed.

Table 1 RoPs that Patrick emphasized in the classroom

A scriptwriting task was developed to provide students with opportunities to engage with RoPs (see Kontorovich & Bartlett, 2021 for details on task design). Overall, this was a collaborative process, fueled by an intertwining of didactical and research considerations. Figure 1 presents the eventually developed task. Patrick embedded an example of a fictitious dialogue in the task due to students’ unfamiliarity with scriptwriting. The task was part of an individual homework assignment together with proof-requiring problems that are more typical to transition-to-proof courses (see Fig. 2). The students had ten days to submit the assignment.

Fig. 1 Scriptwriting task

Fig. 2 Proof-requiring problems from the homework assignment

The analysis started with an overview of the collected 71 submissions to develop a general impression of whether students’ scripts addressed RoPs. This process converged into 58 scripts. Other scripts were excluded since they focused on difficulties and confusions one can experience in the proving process rather than on the rules written proofs should abide by. Most scripts involved a Friend character who shared an infelicitous proof attempt and a Student character who critiqued it. The critiquing utterances became our primary source of students’ rule-narratives.

The first question underpinning the analysis was: How do the students’ RoPs (endorsed and enacted) compare to those discussed in the classroom? In the first step, we mapped every RoP in each student’s script to the rules Patrick emphasized in the classroom. Then, we focused on the students’ rule-narratives, comparing them with what Patrick did or said during the lesson. This systematic comparison led to initial categories for the similarities and differences between the students’ and Patrick’s rule-narratives. For instance, we immediately noticed that some students’ rule-narratives involve the terms used by Patrick, whereas other narratives appear to extend and elaborate on his formulations. The initial categories were iteratively split, merged, structured, and renamed until the final categorization presented in Section 5.1 was obtained.

The second question was: How do the RoPs students endorsed in their scripts compare with the enactment of those rules in their proofs? This question confined the data corpus to 46 submissions where the RoPs students discussed in their scripts were also relevant to Problems 1–3. This analysis immediately drew attention to consistencies and gaps between the rules endorsed and enacted. For instance, a student could endorse some RoP in a script and then violate it in their proof. We scrutinized instances of this sort to develop categories capturing different aspects of the relationship between the RoPs students endorsed and enacted. The eventual categories are the focus of Section 5.2.

5 Findings

This section is divided into two parts: Section 5.1 focuses on the relationship between RoPs discussed in the classroom and those featured in students’ submissions; Section 5.2 contrasts the RoPs students endorsed in their scripts with those enacted in the proofs. We use excerpts from the scripts to present the findings. We have deliberately maintained a sense of breadth in our choice of excerpts to reflect different issues raised by the students.

5.1 Students’ vs. Patrick’s RoPs

RoPs discussed in the classroom were consistently enacted in 11 (out of 71) submissions. Table 2 overviews the number of submissions where students breached RoPs in at least one of their proofs. Most of the breaches pertained to convention-based rules (e.g., finishing a proof with □), instances where the students used variables without defining them, and algebraic mistakes. Only six students attempted to endorse universal statements with examples.

Table 2 Overview of the students’ following classroom RoPs

The students aimed to prove valid statements and reject the false ones in all but seven proofs. In these seven instances, the students attempted to endorse universal statements with examples, used the target statements within their own proofs, and provided insufficient evidence to reach the target conclusion. In five of these proof attempts, multiple rules were breached, and, overall, these instances contained more rule violations than other submissions. Accordingly, we suggest that choosing the wrong direction in prove-or-disprove problems predisposed the students towards breaching necessary RoPs, and towards breaching more of them.

Out of 58 analyzed scripts, 42 endorsed a single RoP and 16 endorsed two rules. Generally speaking, all but two rules the students addressed echoed those discussed in the classroom. The right column of Table 2 shows that some rules were discussed more frequently than the others, while some rules remained beyond the students’ attention. Next, we contrast the students’ and Patrick’s rules on a discursive level. We capture the contrast in terms of similar, contextual, extended, and new rules. We introduce these categories through illustrative examples, followed by a description of each type of rule.

Table 3 presents examples of rule-narratives the students endorsed in their scripts. The table illustrates that these rule formulations overlap to a significant extent with those generated by Patrick. This is evident in such words and expressions as “start off,” “true,” “citing examples,” and “original claim.” We refer to such students’ rules as similar due to the terminological overlap with Patrick’s rule formulations. Similar rules were identified in 19 (out of 58) scripts.

Table 3 Examples of similar rule-narratives

Two features of similar rules are noteworthy: First, while lecture captures and classroom notes were accessible to the students, none of the students recited Patrick’s exact formulations in their scripts. Thereby, students’ rule-narratives can be viewed as individualized versions of the introduced RoPs as part of their emerging \({D}_{3}\). Second, the students generated their rule-narratives in the context of statements and proofs that were different from those discussed in the classroom. In this way, the students’ appeal to classroom RoPs in a different mathematical context marks an expansion of the rules’ original scope of applicability, and may be construed as a marker of deritualization.

Other scripts contained rule-narratives that were more distant from those generated by Patrick. For example, in the classroom, Patrick explained, “You want to tell people what your variables are like.” Through the voice of his Student character, S11 spelled out that “If you represent rational \(x\) and \(y\) to be \(a/b\) and \(c/d\), you should define what \(a\), \(b\), \(c\), \(d\) could or couldn’t be.” Accordingly, while Patrick offered a metamathematical version of the rule calling students to define variables they introduce in their proofs, S11’s rule-narrative contextualized the rule to match the specific proof attempt. In a similar vein, S27’s Student character suggested that “You need to consider the full range for your \(x\) and \(y\) values. It is important to consider all possible values for a claim.” We construe this rule-narrative as a contextualized version of the same metamathematical rule due to the specificity of \(x\) and \(y\). We refer to students’ rule-narratives as contextual when they involved the terms used by Patrick, except for a few that were specified to fit a particular proof attempt in the focus of the script (e.g., “\(a\), \(b\), \(c\), \(d\)” and “\(x\) and \(y\) values” instead of “variables”). Overall, contextual rules emerged from 20 scripts where students’ rules presented contextualized versions of Patrick’s \({D}_{3}\)-level rules.
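A hypothetical illustration of what S11’s contextualized rule asks for, not taken from any script, is the familiar argument that the sum of two rational numbers is rational. The variables are declared before they are used: let \(x=a/b\) and \(y=c/d\), where \(a\), \(b\), \(c\), \(d\) are integers with \(b\ne 0\) and \(d\ne 0\). Then

\[
x+y=\frac{a}{b}+\frac{c}{d}=\frac{ad+bc}{bd},
\]

where \(ad+bc\) and \(bd\) are integers and \(bd\ne 0\), so \(x+y\) is rational. Without the stated conditions on \(b\) and \(d\), the final fraction need not be defined.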

In the classroom, Patrick emphasized that “proof steps should be explained,” in the sense that verbal explanations should accompany formulas and symbolic statements. As if elaborating on this line of thought, the Student character of S2 argued that “when you are trying to disprove a statement by using a counterexample that requires a proof itself, it must be proved, no matter how clear it is.” This rule-narrative offers combined guidance on disproving via a counterexample and ensuring its “counterexemplarity” is shown explicitly. This rule-narrative is not very far from suggesting that disproving is a type of proving, and thus, it is expected to abide by many of its rules. Regarding the same rule, S34’s Student character explained that “if you aren’t using much words, your proving isn’t very clear. A person with no understanding of the question but some of the concept should understand a proof. Explaining your steps and summing up what you did at the end will make it much better.” In this narrative, S34 operationalizes the vagueness of a proof not being “very clear” by specifying an intended proofreader. The guiding suggestion of summing up the proof was also not raised in the classroom. These examples illustrate what we term extended rules, that is, rules that either combine different classroom rules into a single narrative or reiterate them by adding detail and explanation. Extended rules were present in 23 students’ scripts.

Two new rules were identified in nine scripts. The first one revolved around the invalidity of “a circular argument,” i.e., the use of the yet-to-be-endorsed statement in the statement’s proof. This rule featured in eight scripts, maintaining that “proving a claim using what is required to be proved is clearly invalid” (S9) and “you’re not actually proving anything [through circularity]. That argument contains no evidence that is distinct from its conclusion” (S32). Patrick emphasized the importance of including true statements in the proof, but students were the ones to identify circularity as a special sub-category and elaborate on why it is problematic. The second new rule appeared in S45’s script (Figure 3), which challenged a classroom rule. While Patrick highlighted that every proof step needs to be explained, the student illustrates that an explanation can be followed by a request for another explanation, resulting in an endless chain. Then, the student concludes that some “proofs” (probably “statements” were meant) must be taken “for granted.”

Fig. 3 A student’s fictitious dialogue

5.2 Endorsed vs. enacted RoPs

Overall, in their assignments, the students engaged in \({D}_{2}\) and \({D}_{3}\) versions of a numeric discourse. In Problems 1–3, 70 students only enacted RoPs, operating in \({D}_{2}\) (see Footnote 3). However, in the scripts, RoPs were endorsed, explained, and substantiated, featuring \({D}_{3}\)-level communication. In 29 fictitious dialogues (out of 46 in the focus of this analysis), the students accompanied the discussion of infelicitous proof attempts with proofs that were positioned as valid. Accordingly, in these scripts, the students operated in \({D}_{2}\) and \({D}_{3}\) by formulating and implementing the same RoP.

Next, we characterize the relationship between the RoPs students endorsed in the 46 scripts and the enactment of these rules in their proofs, either in the solutions to Problems 1–3 or as part of their script: (i) conventional rule formulation—consistent enactment, (ii) conventional rule formulation—inaccurate enactment, (iii) radical rule formulation—conventional enactment, (iv) overgeneralized rule formulation—conventional enactment, (v) restricted rule endorsed—guiding rule enacted. The characteristics (ii)–(v) do not constitute disjoint categories since more than one of them applied in many submissions. We use examples and excerpts to explain these characteristics.

In their script, S7 argued that “it is better to use fact to prove more facts rather than the other way around: starting with a claim and deriving truth.” Consistent with this rule-narrative, in the proof-requiring problems, S7 systematically inferred the target statement from the logical chains. In this way, S7’s work is illustrative of a larger set of 38 submissions, where students discussed a RoP presented in the classroom and systematically enacted it in their proofs. The alignment between the rules endorsed and enacted in these submissions suggests that one’s communication about RoPs may come hand-in-hand with the capability to employ them to prove mathematical statements. In these cases, we distinguished between different versions of the same RoP based on students’ formulations, i.e., specific rule-narratives that positioned the rule at different discursive levels. For instance, “Just because you gave an example where the claim is true, doesn’t mean it’s always true” (S8) can be seen as metamathematical since it is not grounded in a specific mathematical discourse. Other rule-narratives were discourse-specific; for example, “It is incorrect to use certain values of \(x\) and \(y\) as it may only be true in special cases. […] this may not be true for all other irrational numbers” (S20). Accordingly, we were impressed with eight submissions where students demonstrated fluency with RoPs by generating metamathematical and discourse-specific versions of the same rules as well as enacting them in their proofs.

The rest of the categories capture tensions between endorsed and enacted rules. “Conventional rule formulation—inaccurate enactment” emerged from four submissions where students endorsed certain RoPs in their scripts but deviated from them in one of their proofs. For instance, in their script, S1 insisted that “you must have a claim at the top of your proof so that other readers and markers can understand what you are trying to do.” Yet, one of S1’s proofs started instantly without recapping the focal statement. Such enactments occurred occasionally rather than systematically, and this is why we refer to them as inaccurate: it does not appear that the deviations from the endorsed RoPs were intentional.

Through the voice of their Student character, S4 argued that “when you are trying to disprove a statement by using a counterexample that requires a proof itself, it must be proved, no matter how clear it is” (our italics). However, when rejecting the statement in Problem 1(a), S4 operated with \(-\sqrt{2}\) and \(0\) as irrational and rational numbers, respectively, without proving these properties. These properties were also not proven in the classroom. In the interview, Patrick confirmed that students were expected to draw on these properties as S4 did, and that “separate sub-proofs would be redundant.” In other scripts, the students claimed that “a statement needs to be proven for all numbers simultaneously” (S23) and “when saying something is rational, you say it equals \(m\) over \(n\)” (S29). Yet, in their proofs, S23 and S29 did not always prove all numeric cases at once and used different notations to refer to rational numbers. We refer to such rule versions as radical since, while generally addressing a legitimate RoP, their formulations insisted on an exhaustive implementation of the rule in every proof step or in one particular way. The presented examples illustrate that the rule-narratives that students endorsed in their scripts were more demanding than the rules they enacted in their own proofs. We identified this phenomenon in seven submissions.

Nine scripts addressed valid RoPs but without specifying the scope of their applicability. For instance, S6 argued that “proving by using examples is the wrong way to prove something” and S11 called on their fictitious peer to “remember not to make assumptions next time you do a proof.” These narratives capture rules that apply to universal but not to existence statements and rules about making assumptions within a proof. By not specifying the rule scope, such formulations made the students’ rules sound overgeneralized, as if they were applicable to all kinds of proofs and any mathematical statements. However, in their actual proofs in Problems 1–3, the students enacted these RoPs conventionally (hence, we labeled this category “overgeneralized rule formulation—conventional enactment”). For instance, S6 used counterexamples to reject existence statements and S11 used negations in their proof by contradiction. As with the previous characteristic, this may be seen as another way in which students’ enactment of RoPs was more conventional than their communication about them.

The last characteristic was found in 43 (out of 46) scripts, where Student characters highlighted what is not supposed to happen in written proofs. However, drawing on restricting RoPs only was insufficient in Problems 1–3 where actual proofs were required. For instance, in their script, S44 insisted that “just because you proved it is correct for one set of values doesn’t mean it’s correct for every pair.” Yet, in Problems 1(b), (d), and (e), the student provided conventional proofs that accounted for all cases. Similarly, S11’s Student character explained that “proving that three different ways doesn’t work doesn’t prove that it’s impossible.” But in Problem 2, the scriptwriter proved the impossibility of the tessellation by contradiction. In other words, while many scripts were silent about how proofs should unfold, the scriptwriters managed to carry them out to prove statements.

Overall, the guiding rules the students endorsed mostly referred to proof layout (see Table 2). These RoPs often involved deictic words such as “something” and “it”, and they were tightly linked either to specific problems in the assignment or to a numeric discourse (e.g., “to prove them in questions 1, use letters instead of numbers”; S22). Guiding metadiscursive rule-narratives were rare, as was students’ use of mathematics-general keywords characteristic of \({D}_{3}\) (e.g., “counterexample,” “claim,” “contradiction”).

6 Summary and discussion

We join scholars who maintain that it is unrealistic to expect newcomers to discover on their own how proof functions in mathematics (e.g., Hanna & Jahnke, 1996; Stylianides & Stylianides, 2022). Accordingly, we proposed that direct engagement with RoPs may be of didactical value for proof learners. Building on the commognitive framework, we distinguished between three types of mathematical discourses: \({D}_{1}\) where the production of new narratives is treated as simultaneous substantiation of these narratives; \({D}_{2}\) where narrative generation and substantiation are distinct, and the latter is conducted via the implementation of RoPs; and \({D}_{3}\), a metadiscourse of \({D}_{2}\), where RoPs constitute an object of communication. Then, we proposed that students’ transition to \({D}_{2}\) might be supported by explicit engagement in \({D}_{3}\).

To trigger this engagement, RoPs were first presented explicitly in the classroom, and then students were asked to script a fictional dialogue about a mistake they experienced with the new activity. The analysis was consistent with our conceptualization: while the implementation of RoPs was evident in students’ proofs (i.e., \({D}_{2}\)), the scripts provided an arena for rule formulation, explanation, and justification (i.e., \({D}_{3}\)). Indeed, most of our students used the voices of their fictitious characters to discuss the rules. These findings strengthen previous research on the potential of scriptwriting to advance proof learning and to analyze this process (e.g., Brown, 2018; Gholamazad, 2007). That said, we acknowledge that many facets of students’ emerging discourses remained outside the scope of their eventual scripts. Hence, we concur with Zazkis and Cook (2018), who proposed that scriptwriting may be a useful complementary, but not necessarily exclusive, way to study learners’ transition to proof-based mathematics.

The RoPs discussed in the explored classroom are not new to the mathematics education literature. Specifically, research has been reporting on students’ struggle with these rules, as it emerged from students’ infelicitous proof attempts (e.g., Selden & Selden, 1987; Weber, 2002). Our study adds to this body of research by demonstrating that after a relatively short period of explicit teaching, some students can operate with RoPs conventionally and communicate about them on some metalevel. Nevertheless, we acknowledge that our findings emerged from a special cohort that is hardly representative of a broader student population. Therefore, we call for further research into how newcomers come to grips with RoPs when being taught them explicitly.

Skeptics may argue that we should not “read too much” into the students’ discussions as they could be “just” drawing on RoPs that the teacher demonstrated in the classroom. Indeed, some rules addressed in the students’ scripts were similar to Patrick’s. Moreover, the students’ reliance on the classroom rules was possible since they had access to lecture captures and notes. But even so, we do not take the students’ discussions for granted. First, from the commognitive standpoint, transitioning to a new discourse is impossible without a ritual phase, where learners imitate discourse oldtimers (Sfard, 2008). In other words, ritual participation is an unavoidable learning phase. Second, an overlap between the students’ and Patrick’s rules draws attention to the impact of proof instruction on students’ proof learning. In their comprehensive review of the literature on university proof-based courses, Melhuish et al. (2022) conclude that “we know less about how lecturers’ actions influence students’ learning. […] We do not understand the consequences of [lecturers’ didactical] choices” (p. 11). The identified overlap suggests that the way a teacher formulates, substantiates, and implements RoPs may become a reference point for students’ subsequent use of the rules. Third, the students’ deviations from the presented rules illustrate that explicit teaching of RoPs may be insufficient for students to follow them to the letter. Indeed, violations of the classroom rules were not rare in our data, especially when students attempted to prove invalid statements and reject true ones. Teaching also does not explain the myriad qualitative differences between the students’ and the teacher’s versions of the rules. Section 5.1 shows that some students’ narratives expanded on, elaborated on, innovated beyond, or conflicted with the teacher’s formulations. This provides evidence that articulating these rules was not an act of ritual repetition for the students. Colloquially speaking, the students’ RoPs often appeared fuller, crisper, and more precise compared to what was taught. These findings make us believe that the relationship between teaching and learning RoPs is more complex than it may appear, and we encourage further research into this issue.

The identified similarities and differences between the students’ and teacher’s RoPs can be considered through the lens of deritualization, i.e., newcomers’ shifting from rituals to outcome-oriented participation in the discourse of proof-based mathematics. Even in the case of similar rules, the students generated them in the context of statements and proofs that were different from those discussed in the classroom. This can be viewed as an expansion of the rules’ scope of applicability. Some students elaborated on the reasonableness of the rules, which is a marker of substantiability. Agentivity was evident in most scripts, especially those where the students expanded, elaborated, and innovated compared to what was taught in the classroom. That said, it is important to remember that these conclusions pertain to students’ participation in a proof-based version of a single discourse (mostly a numerical one). Thus, there is no reason to presume that one’s sensitivity to RoPs in a particular discourse marks their competency with RoPs in general. Further research is needed to reveal whether and how newcomers’ familiarity with proof in one mathematics area transfers to another area.

Deritualization markers of objectification and canonicality are also of interest. Regarding the former, we found that the students discussed restricting rules more frequently than guiding ones. On the one hand, this finding is consistent with Patrick’s rule introduction. This may also be a side effect of our task, which asked students to discuss a problematic issue, potentially nudging them to focus on what is not supposed to happen in a proof. On the other hand, the formulation of many guiding rules requires a special vocabulary that our students did not yet possess. Accordingly, their focus on restricting rules could be a matter of necessity rather than choice. Indirect support for this hypothesis comes from the fact that the students’ attempts to formulate guiding rules often involved deictic language and non-objectified formulations. Further research may explore whether the capabilities to communicate about restricting and guiding RoPs develop in parallel.

Lastly, Section 5.2 demonstrates that the relation between one’s communication about a RoP and its implementation is not straightforward. On the one hand, we found that in nearly 70% of the submissions, conventionally endorsed rules were systematically enacted by the students in their proofs. In some submissions, the students generated different versions of the same RoP, flexibly shifting between formulations on metamathematical and discourse-specific levels while successfully implementing the rule in a proof. This flexibility can be viewed as another marker of deritualization. On the other hand, we found instances of conventional rule enactment that were accompanied by radical and overgeneralized rule formulations. In some submissions, the students endorsed restricting rules but enacted guiding ones. These findings resonate with research on students’ learning to work with definitions (e.g., Hershkowitz & Vinner, 1984). Like RoPs, definitions of mathematical objects constitute metarules, prescribing one’s manipulations with object instances (e.g., determining whether something is an object example or a non-example). Yet, a commonly reported finding is that students can cite a conventional definition while struggling with its implementation (e.g., Tabach & Nachlieli, 2015). Our findings attest to an opposite phenomenon: the students’ implementations of RoPs were more conventional than their communications about them. Notably, some version of this phenomenon was also evident in Patrick’s teaching: despite his advanced experience with proof and proof teaching, it was easier for him to demonstrate the implementation of RoPs than to capture them in words. This suggests that challenges with explicit communication about RoPs may not be an exclusive appanage of newcomers to proof.