Argumentation and proof are undoubtedly complex and challenging parts of mathematics. In particular, during the transition from school to university, which coincides in many countries (such as Germany) with the introduction to proof-based mathematics, several activities related to proof have been shown to be challenging for many students (e.g., Kempen, 2019; Moore, 1994; Recio & Godino, 2001; Weber, 2001). Furthermore, since proof and argumentation are also major learning goals in school mathematics (e.g., Kultusministerkonferenz, 2012), it is important to investigate the knowledge and difficulties of (preservice) teachers. When teachers have an insufficient understanding of proof, as research suggests, it is not surprising that their students do as well (e.g., Reid & Knipping, 2010).

In Chapter 3, research findings on students’ and teachers’ proof skills and understanding were discussed, and several gaps were identified. So far, research on proof and argumentation has mainly focused on activities related to the construction of novel arguments and partly to the reading of given arguments (Mejía Ramos & Inglis, 2009a; Sommerhoff et al., 2015). The comprehension of statements that are to be proven (or for which a proof has to be read) and of the underlying principles, however, has largely been neglected in research. Understanding the generality of mathematical statements and proofs is an essential part of the comprehension of statements and of students’ proof skills, because mathematical generality is the defining element of mathematical proof and what makes mathematics unique (see Section 2.2). However, to my knowledge, neither the extent to which students lack understanding of the generality of statements nor the relation to reading and constructing different types of arguments has been researched yet. Therefore, this thesis particularly aims at investigating students’ understanding of the generality of statements and its potential connections to activities related to proof. This seems especially relevant in light of studies that have reported on students’ high conviction regarding empirical arguments (and their validity) (e.g., Healy & Hoyles, 2000; Martin & Harel, 1989) and students’ belief that it is impossible to prove a universal statement (for every case) at all (Chazan, 1993). Such an understanding of proof might be related to an insufficient understanding of generality. Without an understanding of the generality of statements, it might be difficult for students to develop an intellectual need for proof and appropriate conceptions of proof.
Further, because previous studies have reported ambiguous research findings regarding students’ conviction, comprehension, and construction of different types of arguments (see Sections 3.2.3, 3.2.4, and 3.2.5, respectively), the present study aims to provide more clarity on this matter and to investigate the relation between students’ understanding and conviction of different types of arguments and their understanding of generality. The types of arguments that are of interest in this thesis are empirical arguments, generic proofs, and ordinary proofs, because of their prominent role in mathematics education (see Sections 2.3.2 and 2.4.2). Because research findings have provided evidence for the influence of the truth value on students’ performance in several proof-related activities, such as the estimation of truth (e.g., Barkai et al., 2002), both true and false statements were considered. Moreover, the influence of the familiarity with statements (and arguments) on the understanding and acceptance of proof has often been highlighted in the literature (e.g., Dubinsky & Yiparaki, 2000; Hanna, 1989; Stylianides, 2007; Weber & Czocher, 2019). The familiarity with the statement was therefore also considered as a potential influence on students’ understanding of generality of statements and their performance in other proof-related activities. In short, in this study, the type of statement refers to characterizing statements by their truth value and students’ (expected) familiarity with the statement.

The research framework is based on the adapted framework shown in Figure 3.2. That is, it is assumed that reading a statement, in particular estimating its truth, is influenced by the reading and/or construction of arguments. Understanding the generality of a universal statement, as part of the comprehension of statements and underlying (logical) principles, is defined here as consistent responses regarding the estimation of truth and the potential existence of counterexamples (see Sections 2.1 and 2.2, as well as Section 5.3.5 for the exact definition used in this study).
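To make the notion of consistency concrete, the following minimal sketch codes a single response pair. It assumes, purely for illustration, that both the estimation of truth and the judgment about the existence of counterexamples are recorded as yes/no answers. The function name `is_consistent` and its parameters are hypothetical and not part of the instrument; the operationalization actually used in this study is the one given in Section 5.3.5.

```python
def is_consistent(judged_true: bool, counterexample_possible: bool) -> bool:
    """Return True if a response pair is logically consistent.

    Hypothetical simplification: both judgments are treated as yes/no answers.
    A universal statement is true exactly when no counterexample exists, so a
    consistent respondent gives opposite answers to the two questions.
    """
    return judged_true != counterexample_possible


# Illustrative (made-up) response pairs: (judged_true, counterexample_possible)
responses = [(True, False), (True, True), (False, True), (False, False)]
consistency = [is_consistent(t, c) for t, c in responses]
print(consistency)
```

Under this simplification, a student who judges a statement to be true but still considers a counterexample possible would be coded as inconsistent, which is the pattern of interest regarding a limited understanding of generality.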

To control for individual differences in students’ responses, resources and background information were taken into account. Thereby, the focus was on cognitive resources, because firstly, research on non-cognitive resources such as beliefs and affects has not provided evidence for major direct effects independent of cognitive resources (e.g., Furinghetti & Morselli, 2009; Herppich et al., 2017; Semeraro, Giofrè, Coppola, Lucangeli, & Cassibba, 2020), and secondly, the scope of the survey should be reasonable for participants (see also Sommerhoff, 2017). Cognitive resources that are commonly considered (see Section 3.3) are content-specific knowledge, domain-specific knowledge (such as mathematical strategic knowledge), and domain-general knowledge (such as problem-solving skills and general reasoning skills). Due to its short length and positive correlation with problem-solving and reasoning skills, a cognitive reflection test (see Section 5.3.6) was used to control for individual differences in participants’ general cognitive skills. Further control variables are specified in Section 5.4.1. Figure 4.1 provides an overview of the research questions and their relation.

Figure 4.1. Overview and relation of research questions

The focus of the first set of research questions is on students’ conviction of the truth of universal statements and potential relations to reading different types of arguments. Students’ conviction of the truth of statements was thereby divided into students’ performances in two proof-related activities: the estimation of truth and proof evaluation regarding conviction. Firstly, the potential influence of reading different types of arguments on students’ estimation of truth was investigated. As previous research has provided evidence for the influence of characteristics of the statement on the estimation of truth (e.g., Barkai et al., 2002, see Section 3.2.2), the effect of the type of statement was also analyzed. The second research question in this set aims at investigating students’ evaluation of different types of arguments. More precisely, the aim is to find out how students rate the conviction they gain regarding the truth of statements from reading different types of arguments. In contrast to other studies (e.g., Martin & Harel, 1989; Stylianou et al., 2015; Tabach, Levenson, et al., 2010; Ufer et al., 2009), the interest is not in students’ validation of arguments (i.e., which arguments students identify as valid proofs). Furthermore, the aspects of arguments that students report as not convincing were identified. It was of particular interest whether students refer to the (lack of) generality and what role the comprehension of the statement and proof plays regarding conviction.

RQ1: Conviction of the truth of universal statements and its relation to reading different types of arguments
  1. RQ1.1: How do the type of argument and the type of statement influence students’ estimation of the truth of universal statements?
  2. RQ1.2: How do the type of argument, the type of statement, and the level of comprehension influence how convincing students find different types of arguments? What aspects of mathematical arguments do students identify as not convincing?

Regarding students’ conviction of arguments (re. RQ1.2), I expected significant differences regarding the evaluation of different types of arguments, with ordinary and generic proofs receiving higher levels of conviction than empirical arguments. Even though findings on students’ conviction of different types of arguments are ambiguous (see Section 3.2.3), many university students, the population investigated in this project, seemingly (and desirably) tend to be more convinced by deductive proofs than by empirical ones (e.g., D. Miller & CadwalladerOlsker, 2020; Weber, 2010). A positive effect was also expected regarding familiar statements, because the role of being familiar with statements and modes of argumentation for the acceptance of proof has been highlighted in the literature (e.g., Hanna, 1989). However, so far, no such influence on proof evaluation has been shown (e.g., Kempen, 2021; Martin & Harel, 1989). As research findings suggest that the comprehension of the argument is important regarding students’ conviction of it (e.g., Sommerhoff & Ufer, 2019; Weber, 2010), a positive effect was expected, i.e., students with higher levels of comprehension also have higher levels of conviction. Further, it was expected that many students would claim to have difficulties with understanding the argument when asked about aspects that they find not convincing. I also expected students to refer to the generality of the argument, in particular regarding empirical arguments, but also generic proofs, because generality has also been identified as an important aspect in previous research (e.g., Bieda & Lepak, 2014; Ko & Knuth, 2013; Lesseig et al., 2019; Tabach, Barkai, et al., 2010).

In contrast, regarding the influence of different types of arguments on students’ estimation of truth (re. RQ1.1), no studies have been conducted so far that would lead to a respective hypothesis. If students evaluate the proofs regarding conviction based on their actual conviction of the truth of the statement, I would expect a similar influence of different types of arguments on students’ estimation of truth, that is, students should be more likely to evaluate (true) statements as true when reading generic and ordinary proofs than when they receive empirical arguments. Otherwise, their responses would be inconsistent. However, previous research findings suggest that students as well as mathematicians often use empirical arguments to estimate the truth value of a statement before proving it (e.g., Alcock & Inglis, 2008; Buchbinder & Zaslavsky, 2007; Lockwood et al., 2016). Therefore, it would also be possible that empirical arguments provide students with an intuition of the truth value of the statement, and thus make them more likely to evaluate the statement as true. As mentioned above, prior research suggests that the truth value of a statement as well as being familiar with a statement may influence students’ (and teachers’) estimation of truth (e.g., Barkai et al., 2002; Buchbinder & Zaslavsky, 2007; Dubinsky & Yiparaki, 2000; Hanna, 1989; Ko, 2011). In particular, researchers have argued that it is more difficult for students to correctly estimate the truth value of false statements than the truth value of true statements (Buchbinder & Zaslavsky, 2007; Ko, 2011). Therefore, I hypothesized a negative effect for the false statement compared to true statements.
Regarding familiar statements, I expected a positive effect on students’ estimation of truth as suggested by other researchers (Dubinsky & Yiparaki, 2000; Hanna, 1989; Stylianides, 2007; Weber & Czocher, 2019), and simply because students should have gained (extensive) experience with these statements during school.

The second set of research questions refers to the comprehension of generic and ordinary proofs. Firstly, I investigated the effect of the type of proof and the familiarity with the statement on students’ self-reported proof comprehension. Since proof comprehension relates only to the understanding of correct proofs (see, e.g., Neuhaus-Eckhardt, 2022), I excluded the false statement from the analysis. The second research question then aimed at identifying aspects of generic as well as ordinary proofs that students claim not to have understood. Of particular interest was the extent to which the aspects students claim not to understand differ between these types of proofs.

RQ2: Proof comprehension
  1. RQ2.1: How does students’ (self-reported) proof comprehension differ between students who receive generic proofs and those who receive ordinary proofs? How does the familiarity with the statement influence students’ proof comprehension?
  2. RQ2.2: What aspects of mathematical arguments do students identify as not understandable? How do these aspects differ regarding generic and ordinary proofs?

As was pointed out in Section 3.2.4, only a few studies have investigated differences in students’ proof comprehension regarding different types of arguments. With respect to generic and ordinary proof, research findings are ambiguous. However, experimental studies suggest that no differences in students’ comprehension of generic and ordinary proof exist (e.g., Lew et al., 2020). These studies made use of proof comprehension tests, while the present study relied on students’ self-reported level of comprehension and aspects they did not understand. I nevertheless expected similar findings, namely, no significant difference between students’ comprehension of the arguments (re. RQ2.1). Similar to proof evaluation, I expected a positive influence on proof comprehension for familiar statements in contrast to unfamiliar statements. Based on previous research on proof comprehension (e.g., Conradie & Frith, 2000; Moore, 1994; Neuhaus-Eckhardt, 2022; Reiss & Heinze, 2000, see Section 3.2.4), it was expected that students mainly refer to local aspects (Mejía Ramos et al., 2012) such as surface features (A. Selden & Selden, 2003) when asked what they did not understand. In particular, I expected students to refer to not knowing definitions, expressions, and the meaning of terms, and to a lesser extent to the logical structure of the arguments (re. RQ2.2). It was further expected that the (un)familiarity with the form of generic proofs would lead to a greater proportion of students referring to the proof framework when asked what they did not understand in comparison with ordinary proof.

The focus of the third set of research questions is on the types of arguments students use to justify the truth or falsity of universal statements. At first, students’ responses were allocated to different proof schemes (coding categories were based on Section 3.2.5). Then, the relation between students’ proof schemes and their level of conviction regarding the truth of statements was investigated to identify differences with respect to the type of proof scheme. This research question derived from the discussion about relative and absolute conviction, initiated by Weber and Mejia-Ramos (2015) (see paragraph on the Assessment of Research Findings on Proof Evaluation in Section 3.2.3).

RQ3: Construction of arguments to justify the truth of universal statements (students’ proof schemes)
  1. RQ3.1: What types of arguments do students themselves use to justify the truth or falsity of a universal statement? How do students’ proof schemes differ regarding the type of statement (i.e., familiarity and truth value)?
  2. RQ3.2: What potential relation between the type of argument used by students and the level of conviction of the truth of the statement exists?

Research findings discussed in Section 3.2.5 suggest that students (and teachers) mainly have empirical proof schemes and, depending on the context, external conviction proof schemes (e.g., Barkai et al., 2002; Bell, 1976; Recio & Godino, 2001; Sevimli, 2018; Stylianou et al., 2006). Thus, it was expected that the majority of students would use empirical arguments, in particular regarding unfamiliar statements, and to a lesser degree, regarding familiar statements, external proof schemes such as authoritarian arguments (re. RQ3.1). I expected only a few students to use deductive arguments. The relation between the type of proof scheme and students’ level of conviction of the truth of the statement is less clear (re. RQ3.2). To my knowledge, no studies have explicitly investigated this question. However, students who are able to construct a (valid) proof might be more convinced of the truth of a statement (having absolute conviction) than students who only use empirical arguments (who might only have relative conviction). On the other hand, as Polya (1954) has pointed out, “without ... confidence [in the truth of the theorem] we would have scarcely found the courage to undertake the proof ...” (pp. 83–84), which suggests that a proof might not be necessary to gain high levels of conviction in the truth of a statement.

Lastly, the fourth set of research questions investigates the primary outcome variable of this study—students’ understanding of the generality of mathematical statements and potential relations to proof reading and construction. Due to the scarcity of related research, the purpose of the first question in this set is to find out the proportion of first-year university students—enrolled in different study programs, namely, primary, lower secondary, and upper secondary education as well as mathematics—who have limited understanding of the generality of mathematical statements. The remaining research questions in this set then aim at identifying factors that potentially influence students’ understanding of the generality of statements. Of particular interest is the influence of reading different types of arguments—empirical arguments, generic proofs, ordinary proofs, or no arguments at all (RQ4.2). Further, it was investigated how the type of statement—both its truth value and the familiarity with the statement—influences students’ understanding of generality. The third question in this set aims at investigating the influence of students’ proof comprehension and level of conviction on students’ understanding of generality of statements. Moreover, the potential relation between types of arguments students use to justify a statement (proof schemes) and their understanding of generality was examined (RQ4.4).

RQ4: Students’ understanding of the generality of mathematical statements
  1. RQ4.1: What proportion of first-year university students have a correct understanding of the generality of statements?
  2. RQ4.2: What is the influence of reading different types of arguments on students’ understanding of the generality of mathematical statements? How does the type of statement influence students’ understanding of its generality?
  3. RQ4.3: How do students’ comprehension and conviction of arguments influence their understanding of the generality of statements?
  4. RQ4.4: What potential relation exists between students’ proof schemes and their understanding of the generality of statements?

Only a few studies have reported on students’ understanding of the generality of statements (e.g., Balacheff, 1988b; Buchbinder & Zaslavsky, 2019; Chazan, 1993; Galbraith, 1981). In particular, I am not aware of (large-scale) quantitative studies that have explicitly investigated students’ understanding of the generality of mathematical statements without relating it to the understanding of generality of proof. Based on the findings reported in prior studies (e.g., Buchbinder & Zaslavsky, 2019), it was expected that only a minority of students would have limited understanding of the generality of statements, with a higher proportion regarding preservice primary school teachers and a lower proportion regarding mathematics majors (re. RQ4.1). Even though the study was conducted at the beginning of their first semester, which implies no influence of the study program itself yet, it can be assumed that the students’ choice of a respective study program (e.g., primary education vs. mathematics major) is influenced by resources that also influence their understanding of proof (and generality), for instance, their previous mathematical knowledge based on their school education.

From the studies conducted so far, it is unclear whether the understanding of generality differs by the (type of) statement and/or is related to the reading and construction of (different types of) arguments, or whether it is solely influenced by the knowledge of the meaning of mathematical generality. To my knowledge, no studies on the influence of the type of argument on students’ understanding of generality (of statements) have been conducted so far. As has been mentioned above, research that has investigated the influence of the type of argument on other activities, for instance, proof comprehension, suggests no or only minor effects with respect to generic and ordinary proof (e.g., Lew et al., 2020). However, findings on students’ proof evaluation reported by Kempen (2018, 2021) suggest that students are less convinced by empirical arguments than by generic and ordinary proofs. What remains unclear and is difficult to derive from these findings is whether the understanding of generality of statements is affected by the reading of (different types of) arguments at all. Should a relation exist, a positive but weak correlation was hypothesized for analytical arguments (generic and ordinary proofs) compared to reading no arguments or empirical arguments (re. RQ4.2), because valid proofs should at least in theory provide the reader with certainty that no counterexamples to true universal statements exist (Reid & Knipping, 2010). A similar hypothesis was made regarding potential correlations between students’ proof schemes and their understanding of generality, that is, it was expected that students with empirical proof schemes have limited understanding of the generality of statements (see also Conner, 2022) and students with deductive proof schemes are more likely to have a correct understanding (re. RQ4.4). Regarding the effect of the truth value of statements on students’ performance in proof-related activities, two contradictory observations have been reported.
On the one hand, students seem to be more familiar with (proving) true statements than with (disproving) false ones, which suggests that they might be more successful in handling them (Buchbinder & Zaslavsky, 2007; Ko, 2011). On the other hand, it seems to be easier to disprove a false universal statement by providing a counterexample than to prove a true universal statement (Barkai et al., 2002). In this case, the role of a counterexample might be more apparent for students, thus resulting in a better understanding of the (absence of) generality of the statement. Previous research findings suggest that the familiarity with a statement seems to play no or only a minor role for proof evaluation (e.g., Kempen, 2021; Martin & Harel, 1989), even though its general importance for the acceptance of proof has often been highlighted (e.g., by Hanna, 1989). However, the familiarity with the statement might nevertheless lead to a higher awareness of its generality, as students would have applied the statement to many cases before. Therefore, a positive effect with respect to familiar as well as false statements was expected (re. RQ4.2). The relation between students’ understanding of generality and their level of comprehension as well as conviction is also difficult to derive from prior research. A positive relation regarding both activities might be expected, because higher levels of conviction and comprehension could potentially result in higher levels of certainty regarding the absence of counterexamples (re. RQ4.3).