We are not very pleased when we are forced to accept a mathematical truth by virtue of a complicated chain of formal conclusions and computations, which we traverse blindly, link by link, feeling our way by thought. We want first an overview of the aim and the road; we want to understand the idea of the proof, the deeper context. Hermann Weyl (Weyl 1932).

1 Introduction

In the last three to four decades the learning and teaching of proof has been a central focus of research in mathematics education throughout the world. This was due to the recognition of the critical role proof plays in mathematics as a discipline and the role it plays in advancing students’ conceptualization of and beliefs about mathematics. Against this recognition stood the overwhelming findings about the difficulties students experience in understanding, producing, and appreciating proofs (Stylianides et al., 2016). One of the aspects addressed by this research concerned the role of proof as an explanatory process. Traditionally, the question whether a mathematical proof is explanatory belonged to philosophy and history of mathematics, but pedagogical aspects of this question have been addressed by mathematics educators as well (e.g., Hanna & Jahnke, 1996; Balacheff, 2010; Nunokawa, 2006; Stylianides et al., 2016; Inglis & Mejia-Ramos, 2019; Harel, 2013). In essence, the philosophical debate on mathematical explanation revolves around the question, what distinguishes explanatory proofs from non-explanatory proofs? I will argue later in this paper that in the eyes of the theoretical framework underlying this study, this question, as stated, is misdirected. Nevertheless, at the heart of this question is another question: What is explanation? Frans’ (2021) position that understanding is a condition for explanation suitably relates to our concern. But what is understanding? More relevantly, what is mathematical understanding? Frans focuses on unificatory understanding, “the type of understanding that involves seeing how phenomena are the result of a general pattern and not as a collection of isolated events,… [a] type of understanding [that is] valuable in mathematics.” (p. 1106).
While unificatory understanding is undoubtedly a genuine type of understanding—especially in advanced subjects such as linear algebra, which is the referent subject of this paper—it is only one type of mathematical understanding. Other types exist, including understanding resulting from applications of mathematics in natural science and engineering, the discovery and accountability for the laws of the physical world, visualization, abduction, analogies, and empirical observations.

The goal of this paper is to offer a conceptual framework, illustrated by learning-teaching events, for constituent elements of one form of mathematical understanding, called epistemological justification (Harel, 2013). Epistemological justification manifests itself with an individual or a community through perturbation-resolution cycles revolving around the questions, why and how was a piece of mathematical knowledge conceived? As such, epistemological justification goes beyond mere comprehension and validity of concepts and claims into the reasons—conceptualized as such by an individual or a community—for the origins of mathematical knowledge. Epistemological justification, thus, is not limited to proof but pertains to any piece of mathematical knowledge. The goal of this paper is to address cognitive and instructional aspects of epistemological justification.

The paper comprises five sections aside from this first introductory section: Sect. 2 situates epistemological justification within the philosophical debate on mathematical explanation by pointing to their shared concerns on the one hand and fundamental differences on the other hand. Section 3 provides a theoretical framework for epistemological justification through discussions of three questions: (a) What exactly is epistemological justification? (b) What is its underlying theoretical basis? (c) What are the criteria for its occurrence? Section 4 outlines the experimental data sources for the learning-teaching events used to illustrate epistemological justification. Section 5 discusses the question: (d) What instances of students’ mathematical behaviors manifest epistemological justifications? Section 6 concludes with implications for instruction and research by addressing the question: (e) What instructional approaches might facilitate the development of ways of thinking that can enable students to seek not only certainty but also enlightenment in the form of epistemological justification?

2 Relation of epistemological justification to mathematical explanation

This paper concerns cognitive and instructional aspects of epistemological justification, not its philosophical underpinning. It is natural, however, to wonder about its relation to mathematical explanation. The goal of this section is to point out some shared concerns and differences between the two notions.

Steiner (1978), in a landmark publication, debated the question, are mathematical proofs explanatory? He used an ontological characterization to distinguish between explanatory proofs and non-explanatory proofs; namely, an explanatory proof is one that “makes reference to a characterizing property of an entity or structure mentioned in the theorem, such that from the proof it is evident that the result depends on the property.” (p. 143). Hafner and Mancosu (2005), among others, pointed to difficulties with Steiner’s approach on the grounds that the term “characterizing property” is too vague to determine whether a proof is explanatory, and offered examples that count as explanatory proofs according to Steiner’s definition and yet involve objects that lack characterizing properties. Resnik and Kushner (1987), too, reject Steiner’s objective characterization and offer instead a distinction that is relative to the new episteme one gains from the proof: the more why-questions a proof can help to answer, the more explanatory it is. Weber and Verhoeven (2002) agree with Resnik and Kushner’s rejection of Steiner’s criteria but articulate the importance of Steiner’s theory and further offer a way to “fix” it by exploring the knowledge a person gains from being aware of and acting upon what the person conceives as a “characterizing property”.

This debate is intrinsically linked to a general question that had occupied philosophers of the 16th and 17th centuries; namely, is mathematics a “scientific” domain? Adhering to the Aristotelian definition of “scientific” as “causal”, some philosophers of the Renaissance argued that mathematics is not a perfect science because mathematical proofs (proofs by contradiction, for example) are concerned with mere certainty rather than causality (Mancosu, 1996). They further pointed to logically equivalent statements in mathematics to support their argument that mathematical implications are not scientific in the Aristotelian sense. If mathematics were causal, they posited, then the dual implications in “a if and only if b” would mean a causes b and b causes a, which implies that a causes itself, an absurdity. As I will discuss in Sect. 5, certain students’ mathematical behavior in relation to proofs can, hypothetically, be accounted for in terms of the need for causality, which is reminiscent of the intellectual perturbation that had instigated this philosophical debate.

Mathematical explanation, thus, concerns the nature of mathematical understanding involved in proving. Inglis and Mejia-Ramos (2019) used Wilkenfeld’s (2014) notion of functional explanation—“things that in an appropriate manner and at an appropriate time generate understanding” (p. 56369)—to show that philosophical accounts of mathematical explanation are consequences of Wilkenfeld’s functional account and human cognitive architecture. Understanding, however, is achieved through questions. As Resnik and Kushner (1987) indicate, the more why-questions a proof can help to answer, the more explanatory it is for the person who asks the questions.

Epistemological justification, like mathematical explanation, concerns proof understanding, but it is about a specific kind of understanding that provides an answer to a specific kind of question: why and how did a piece of knowledge come to be? A proof can be viewed as explanatory by an individual and yet he or she might seek to find out what intellectually necessitated the idea underlying it, that is, its epistemological justification. For example, learners who have developed the habit of seeking epistemological justifications often are not satisfied when their successful construction of a proof was aided by a hint. They would seek to find out what necessitated the invention of the intermediate step(s) offered by the hint. Likewise, these learners, upon obtaining or being presented with a counterexample to a particular assertion, would seek to understand the foundational reason for the existence of the counterexample, thereby seeking to ensure that the counterexample was not the result of a random successful search, but a necessary outcome of the structure at hand. Other learners avoid proof by contradiction (some, such as constructivist mathematicians, reject it outright; see, for example, Bishop, 1967) on the grounds that while this method of proof provides certainty it does not demonstrate cause.

Epistemological justification differs from mathematical explanation in another aspect. Namely, while the latter concerns mathematical proofs and only mathematical proofs, epistemological justification concerns mathematical knowledge broadly: why and how a piece of mathematical knowledge—an axiom, definition, theorem, proof, counterexample, or abstract structure—was conceived.

3 Theoretical framework

This section outlines a theoretical framework for epistemological justification. The framework resides within the DNR-based instruction framework (DNR stands for the framework’s three foundational principles: duality, necessity, and repeated reasoning). DNR was discussed at length elsewhere (e.g., Harel, 2008a, b, c) and so it will not be elaborated upon here. However, to make the paper somewhat self-contained within its page limit, this section will provide a skeletal description of certain elements of DNR, focusing on those that underlie the definition of epistemological justification and the observational criteria for its occurrence. The elements are:

  (a) A subset of DNR premises underlying knowledge and knowing, as well as their linkage and subjective nature.
  (b) Definition of epistemological justification and its antecedent concept of intellectual need.
  (c) A refined taxonomy of intellectual need.
  (d) Observational criteria for epistemological justification.

3.1 Knowledge and knowing: their linkage and subjectivity

DNR is based on eight premises, seven of which were discerned from or based on known theories. Four of these premises are highly relevant to epistemological justification. The first premise, called the knowledge of mathematics premise, was theorized in Harel (2008a). It states that knowledge of mathematics consists of two related but different categories of knowledge: all the ways of understanding and ways of thinking (terms to be described shortly) that have been institutionalized throughout history. The second premise, called the knowing premise, is after Piaget (1985). It states that knowing is a developmental process that proceeds through a continual tension between assimilation and accommodation, directed toward a (temporary) equilibrium. The third premise, called the knowledge-knowing linkage premise, is also after Piaget, and is consistent with Brousseau’s (1997) notion of fundamental situation. It states that any piece of knowledge humans know is an outcome of their resolution of a problematic situation, conceived as such by them. Lastly, the fourth premise, called the subjectivity premise, is also Piagetian. It states that any observations humans claim to have made are due to what their mental structure attributes to their environment. This premise is the basis for rejecting the objective ontological dichotomy between “proof that proves” and “proof that explains” independent of the epistemic subject’s conceptualization of the proof.

The constructs way of understanding and way of thinking are defined precisely and discussed broadly in various publications (e.g., Harel, 2008a; Thompson et al., 2014). The following descriptions are sufficient for our discussions here. A way of understanding is the specific meaning that results from having assimilated to a scheme. A way of thinking, on the other hand, is the habitual anticipation of specific meanings in reasoning. Examples of ways of understanding include one’s definition of multiplication of a matrix \( A_{m\times n}=[A_{1}\dots A_{n}] \) by a vector \( v=[a_{1}\dots a_{n}]^{T} \) as the column vector \( \sum_{i=1}^{n}a_{i}A_{i} \); one’s meaning of row reduction as a mere algorithm for solving systems of linear equations \( Ax=b \), devoid of understanding why the algorithm preserves the solution set; one’s meaning of invertible matrix in terms of one of its properties rather than in terms of its core definition; or one’s meaning of linear independence as a list of vectors where one of the vectors in the list is not a linear combination of the other vectors in the list (an erroneous way of understanding resulting from a faulty negation of the definition of linear dependence). The term “meaning” here is used broadly. One’s justification for an argument or a solution of a problem, whether correct or erroneous, is also an example of a way of understanding.
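Two of these ways of understanding, the matrix-vector product as a combination of columns and the faulty reading of linear independence, can be made concrete with a small worked instance. The particular matrix and vectors below are chosen here purely for illustration and do not come from the study’s data.

```latex
% Matrix-vector product as a linear combination of the columns of A
\[
A=\begin{bmatrix}1&2\\3&4\\5&6\end{bmatrix},\quad
v=\begin{bmatrix}2\\-1\end{bmatrix},\quad
Av=2\begin{bmatrix}1\\3\\5\end{bmatrix}+(-1)\begin{bmatrix}2\\4\\6\end{bmatrix}
=\begin{bmatrix}0\\2\\4\end{bmatrix}.
\]
% Why "one vector is not a combination of the others" does not capture independence
\[
u_{1}=\begin{bmatrix}1\\0\end{bmatrix},\quad
u_{2}=\begin{bmatrix}2\\0\end{bmatrix},\quad
u_{3}=\begin{bmatrix}0\\1\end{bmatrix}:\qquad
u_{3}\notin\operatorname{span}(u_{1},u_{2}),\ \text{ yet }\ 2u_{1}-u_{2}+0u_{3}=0.
\]
```

In the second display, \( u_{3} \) is not a linear combination of the other two vectors, so the list fits the erroneous description; nevertheless the list is linearly dependent, because a nontrivial combination equals zero. The correct negation of linear dependence requires that no vector in the list be a linear combination of the others.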

A student’s habitual anticipation in reasoning (that is, his or her way of thinking) might be procedural, in that it is typically restricted to how to obtain a result rather than seeking to know what makes the result the way it is. Likewise, the habitual anticipations that, for example, a concept can be understood in different ways and that it is advantageous to do so are instances of ways of thinking.

Fig. 1 Proof of the rank-nullity theorem

Epistemological justification is a way of thinking. To illustrate, before introducing its precise definition, consider the following episode. Figure 1 depicts a proof of the Rank-Nullity Theorem taken from Garcia and Horn (2017). I showed the proof to a mathematician with extensive experience teaching linear algebra. To ensure subjectivity, I only asked the mathematician to comment on the proof. The mathematician quietly studied the proof, and then exclaimed something to the effect that the proof is “unmotivated” (his word). Asked to elaborate, he responded that to understand the proof he had to translate it into linear-transformation terms. Specifically, the mathematician explained that he thought of \( A \) as a matrix transformation from \( F^{n} \) to \( F^{m} \). As a linear transformation, \( A \) has a kernel and a range. Assuming \( \dim\ker A=r \), the question was, the mathematician continued, what would \( \dim\operatorname{range}A \) be? Bringing to bear his meaning of linearity and linear combination, he mapped the product \( AW \) in the proof onto \( \operatorname{span}(Aw_{1},\dots,Aw_{n-k}) \), where \( w_{1},\dots,w_{n-k} \) are the columns of \( W \). The rest of the proof steps he then mapped onto the process which shows that \( Aw_{1},\dots,Aw_{n-k} \) form a basis for \( \operatorname{col}A \), concluding that \( \operatorname{rank}A=n-r \). He then added that this is how he understood the proof and how he would teach it to his students.
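The translation the mathematician described can be written out as the familiar linear-transformation argument. The display below is a sketch under assumptions made here, not a reproduction of Fig. 1 or of the mathematician’s exact words: it writes \( r \) for \( \dim\ker A \) throughout, introduces a basis \( u_{1},\dots,u_{r} \) of \( \ker A \) (a name not used in the original), and takes \( w_{1},\dots,w_{n-r} \) to be vectors extending it to a basis of \( F^{n} \).

```latex
% Sketch: dim range A = n - r, where r = dim ker A
\[
\begin{aligned}
&\text{For } x=\sum_{i=1}^{r}c_{i}u_{i}+\sum_{j=1}^{n-r}d_{j}w_{j}\in F^{n}:\quad
Ax=\sum_{j=1}^{n-r}d_{j}\,Aw_{j},
\ \text{ so }\ \operatorname{range}A=\operatorname{span}(Aw_{1},\dots,Aw_{n-r}).\\
&\text{If } \sum_{j=1}^{n-r}d_{j}\,Aw_{j}=0,\ \text{ then }\ \sum_{j=1}^{n-r}d_{j}w_{j}\in\ker A,
\ \text{ so }\ \sum_{j=1}^{n-r}d_{j}w_{j}=\sum_{i=1}^{r}b_{i}u_{i}\ \text{ for some scalars } b_{i}.\\
&\text{Independence of } u_{1},\dots,u_{r},w_{1},\dots,w_{n-r}\ \text{ then forces } d_{1}=\dots=d_{n-r}=0.\\
&\text{Hence } Aw_{1},\dots,Aw_{n-r}\ \text{ is a basis of } \operatorname{range}A,
\ \text{ and }\ \dim\operatorname{range}A=n-r.
\end{aligned}
\]
```

The first line uses only linearity and \( Au_{i}=0 \); the remaining lines are the independence check that the mathematician associated with the later steps of the textbook proof.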

We see here two important linear-algebraic ways of thinking, one rooted in matrix theory, where a matrix is conceived as a conceptual entity, the other in the theory of linear transformations, where a matrix is conceived as a process, that is, a function that maps vectors to vectors. Also, consistent with the knowing premise, the mathematician, in his attempt to comprehend the proof, situated it into a probe about the relation between \( \dim\ker A \) and \( \dim\operatorname{range}A \). The mathematician used his own way of thinking to situate the proof into his scheme of actions. And in doing so, he constructed his own epistemological justification for the textbook’s proof. Cognitively and pedagogically, this is fundamentally different from comprehending the proof as is, without constructing a conceptual basis for its birth.

The indefinite article, “a”, in the latter clause is to highlight the subjective nature of epistemological justification: it is not the conceptual basis but a conceptual basis. The textbook Matrices and Matlab (Marcus, 1993), for example, in its entire 710 pages does not include the term linear transformation. Its entire approach entails a linear-algebraic way of thinking that takes a matrix as an entity constituted by the structure of its entries, rather than as a function that preserves linear combinations. It is safe to stipulate, therefore, that the late Professor Marcus’s epistemological justifications for the construction of the content of his book were matrix-based. It is equally safe to stipulate that Garcia and Horn’s proofs throughout their comprehensive book are rooted in sophisticated practices of epistemological justification. The two ways of thinking are not mutually exclusive, however. Some linear algebra textbooks utilize a combination of the two approaches, as does Garcia and Horn’s book.

The focus of this section is not to address the relative pedagogical advantage of these ways of thinking; rather, the goal is to analyze the concept of epistemological justification by examining its theoretical basis and criteria for its presence. The above discussion was to illustrate the DNR notion of way of thinking and provide an initial image for the technical definition of epistemological justification.

3.2 Definitions and criteria

The definition of epistemological justification is formulated based on the above four premises. By the knowledge-knowing linkage premise, if an individual possesses a piece of knowledge K (which by the knowledge of mathematics premise is either a way of understanding or a way of thinking), then there exists a problematic situation P out of whose resolution K was constructed. By the knowing premise and the subjectivity premise, P is subjective in that it is a perturbational state resulting from an individual’s encounter with a situation that is incompatible with, or presents a problem that is unsolvable by, her or his current knowledge. Such a problematic situation P, prior to the construction of K, is referred to as an individual’s intellectual need. One might experience P without ever constructing K. But if the person elicits K from a resolution of P and is cognizant of how K resolves P, then we say that the person has constructed an epistemological justification for K. Colloquially, I describe epistemological justification as a person’s conception of the reason for the birth of a piece of knowledge.

Entailed by these definitions are criteria for the constitutive elements of epistemological justification; they are:

  (a) A subjective perturbational experience P constituting intellectual need for the learner, referred to as the intellectual need condition.
  (b) Elicitation of K from a resolution of P, referred to as the elicitation condition.
  (c) Awareness by the learner of the elicitation process, referred to as the awareness condition.

There remains the question, what constitutes intellectual need? Elsewhere (Harel, 2013), I offered a taxonomy of five categories of intellectual need: need for certainty, need for causality, need for computation, need for communication, and need for structure. These needs were discussed extensively in Harel (2013) and will not be elaborated upon here. They are briefly outlined next.

3.2.1 Need for certainty

The need for certainty is a human’s desire to know whether a conjecture is true—whether it is a fact. Consonant with the subjectivity premise, when a person fulfills this need, through whatever means deemed appropriate by her or him, the person gains new knowledge—the knowledge that the conjecture is true or false.

3.2.2 Need for causality

The need for causality is the need to determine the cause of a phenomenon. It has roots in the history of the debate during the Renaissance about the scientificness of mathematics, as was discussed in Sect. 2.

3.2.3 Need for computation

The need for computation encompasses various aspects of quantification and representation. Examples include the need to quantify a physical sensation (e.g., speed as quantification of “fastness”, weight of “heaviness”, directional derivative of steepness), the need to determine the value of an abstract object (e.g., dimension of a subspace, determinant of a matrix, orthogonality of functions), and the need to determine the values that satisfy quantitative constraints (e.g., solving a system of scalar or differential equations).

3.2.4 Need for communication

The need for communication refers to two acts: formulation and formalization. Formulation is the act of transforming strings of spoken language into algebraic expressions. Formalization is the act of externalizing the logical foundations underlying a mathematical concept or claim.

3.2.5 Need for structure

The need for structure encompasses a broad range of cases. It includes, but is not limited to, instantiations of pattern generalization, reduction of an unfamiliar structure into a familiar one, and reasoning in terms of conceptual entities, what Dubinsky and McDonald (2001) call object conception.

In sum, any epistemological justification process is anchored in these five categories of intellectual needs, in that the latter serve as stimuli for the former.

4 Sources of illustrative events

This paper is theoretical, not empirical. As such, the episodes accompanying the discussions do not purport to serve as supporting empirical evidence; rather, they are merely illustrations for the theoretical analyses addressed in the paper. The illustrative events were taken from empirical studies, but these studies were not initially designed to investigate epistemological justification; rather, they were designed to understand the development of linear-algebraic knowledge among students, including the difficulties they encounter with foundational concepts and ideas of the field (see, for example, Harel, 2017). As often happens in research on student learning, certain segments of the data analysis invoke ideas and questions not initially intended as part of the study at hand. Epistemological justification, a notion which took years to form as a construct of the DNR framework, emerged as a side product of the analyses of these studies.

There were three data sources, all in linear algebra: the first was a teaching experiment with 12 in-service secondary school teachers; the second was conducted in the form of an exploratory teaching experiment (Steffe & Thompson, 2000) with a class of 48 undergraduate students; the third was a longitudinal case study with a single 11-year-old learner.

A lesson in the teaching experiment with teachers typically lasted 6 h, during which the participants worked collaboratively in small groups on linear algebra problems, followed by group presentations and whole group discussion. A lesson in the undergraduate teaching experiment lasted 110 min, twice a week for 10 weeks. The case study was conducted with a single learner, currently a sixth grader (11 years old), referred to in this paper by the initials LB. The study began with LB’s first steps in forming his early counting schemes and continued progressively to arithmetic, Euclidean geometry, elementary algebra, calculus, and linear algebra, with recreational mathematics in between. The linear algebra program started as he entered fourth grade with systems of linear equations over the reals, gradually covering the usual terrain of matrix-based elementary linear algebra over the complex field. By the end of his fourth grade, LB began studying abstract vector spaces.

5 Learning-teaching epistemological justification events

This section discusses the question, what instances of students’ mathematical behaviors manifest epistemological justifications? These behaviors are illustrated in a series of events classified into two categories: discerned epistemological justification and embedded epistemological justification. Discerned epistemological justification occurs when an individual discerns, or attempts to discern, an epistemological justification for a piece of mathematical knowledge created by others, for example in the process of reading a mathematical text. An individual can successfully comprehend the text without ever attempting to discern an epistemological justification for it. In fact, current mathematical instruction is typically driven toward mere comprehension rather than construction of epistemological justification. Embedded epistemological justification occurs as an individual invents a piece of mathematical knowledge, for example as one solves a problem or constructs a proof.

5.1 Embedded epistemological justification

This section is structured around four learning-teaching events. Each event is labeled by the data source from which it was extracted (TE for Teaching Experiment; ETE for Exploratory Teaching Experiment; CS for Case Study) and the linear algebra concept it addresses.

5.1.1 Existence and uniqueness of solution (TE)

The event discussed below occurred as the teacher participants were gradually transitioning from questions about the validity of the Gauss-Jordan elimination process to theoretical questions about existence and uniqueness of solutions of linear systems. One of the conjectures produced by the teacher participants was: If the equations of a consistent n × n linear system S are independent, then S has a unique solution. The proof produced by the group is remarkable in its innovative quality, and it offers a glimpse into the conceptual transformation that occurred with the participants during the teaching experiment. As was the case throughout the teaching experiment, the proofs offered by the participants were typically “messy” and mostly generic. In this case, the proof was formulated in the context of a general 3 × 3 linear system, but it was clear that conceptually the referent was a general n × n system. Figure 2 depicts the proof distilled by the teacher-researcher from the group’s presentation.

Fig. 2 Representation of the participants’ proof of the theorem “If the equations of a consistent square system are independent, then the system has a unique solution.”

As was mentioned earlier, epistemological justification concerns the origins of mathematical knowledge broadly, not only proof. To illustrate this feature, the discussion below is divided into two parts. The first part deals with the elicitation of the theorem by the participants; the second part deals with their construction of its proof.

Elicitation of the theorem.

The investigation conducted by the participants was about the relationships among three constructs: order relation between the number of equations and number of variables in a linear system, dependency relation among the system’s equations, and the “size” of the system’s solution set (empty, singleton, or infinite). It is within this need that the participants elicited the above theorem, first in the form of a conjecture and then as an assertion to be proved. In this respect, the theorem as a piece of mathematical knowledge was elicited from the problematic situation characterized by the need for structure (specifically, the need to investigate relations among different mathematical constructs), thereby fulfilling both the intellectual need condition and the elicitation condition.

A possible indication for the awareness condition is that during the group presentation the participants described in detail how they encountered the need to systematically list the various possible relations among the above three constructs and how they gradually narrowed the list by eliminating redundancy and impossibilities into a short list of conjectures, among which was the theorem under consideration.

Construction of the proof.

The proof of the theorem was elicited from a problematic situation constituted by a combination of intellectual needs: the need for structure, the need for formulation, and the need for formalization. The need for structure stemmed from the participants’ background knowledge about homogeneous systems, a topic addressed in an earlier stage of the teaching experiment. This need was expressed by one of the participants, who suggested considering the homogeneous system associated with S (Step 2). The need for formalization manifested itself as the group struggled to persuade each other of certain arguments. Their debate converged into an agreement to pursue the suggestion made by one of the group members to consider the previously established fact that if the system does not have a unique solution, then it must have a free variable (Step 1). This, in turn, raised the need to formulate this idea algebraically. Through sustained effort, the participants resolved this need by constructing a system analogous to the homogeneous system in Step 2. At this stage the participants encountered a roadblock, which required the intervention of the teacher-researcher, who suggested considering the fact stated in Step 3. This, in turn, led the participants to the breakthrough expressed in Steps 4. Following this, the participants hastily concluded Step 6, without verifying that f is different from zero, as their conclusion requires. This matter was resolved through further discussion with the teacher-researcher about the relation between the value of f and the assumption made in Step 1.

We see here cycles of perturbation-resolution pairs, illustrating the presence of the intellectual need and elicitation conditions. The data, however, provide no record of the participants’ self-reflection and awareness of the elicitation processes.

5.1.2 Cross product (ETE)

The instructor posed the question “Given two noncollinear vectors \( a=[a_{1},a_{2},a_{3}]^{T} \) and \( b=[b_{1},b_{2},b_{3}]^{T} \), find a vector that is orthogonal to the plane spanned by them.” Students’ responses indicate that the problem constituted an intellectual need for them, in that they were engaged in attempts to look for ways to express algebraically the conditions for a vector \( x=[x_{1},x_{2},x_{3}]^{T} \) to be orthogonal to the plane spanned by a and b. This segment of the event, thus, fulfills the intellectual need condition.

At first, the students attempted to express the condition that x is orthogonal to the plane by stating that \( x\cdot z=0 \) for each z that is a linear combination of a and b. With further discussion, they came to realize (first visually and later algebraically) that it is sufficient (and necessary) to require that the vector x is orthogonal to the noncollinear vectors a and b; that is, \( x\cdot a=0 \) and \( x\cdot b=0 \). Finding a vector x satisfying the latter two conditions amounted to solving a system of two linear equations in the three unknowns \( x_{1},x_{2},x_{3} \). The solution obtained was \( x=[a_{2}b_{3}-a_{3}b_{2},\ a_{3}b_{1}-a_{1}b_{3},\ a_{1}b_{2}-a_{2}b_{1}]^{T} \), which was then converted by the teacher-researcher into the symbolic determinant defining cross product. This segment of the event, thus, fulfills the elicitation condition, for the formula for obtaining a vector orthogonal to a plane was elicited by means of resolving a problematic situation understood as such by the students.
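The result of the students’ elimination can be checked directly. The display below is a verification sketch written for this purpose, not a transcript of the students’ work; the standard basis vectors \( e_{1},e_{2},e_{3} \) are notation introduced here.

```latex
% The two orthogonality conditions as a linear system in x_1, x_2, x_3
\[
\begin{aligned}
a_{1}x_{1}+a_{2}x_{2}+a_{3}x_{3}&=0,\\
b_{1}x_{1}+b_{2}x_{2}+b_{3}x_{3}&=0.
\end{aligned}
\]
% Substituting the solution into each condition: the six terms cancel in pairs
\[
\begin{aligned}
x\cdot a&=a_{1}(a_{2}b_{3}-a_{3}b_{2})+a_{2}(a_{3}b_{1}-a_{1}b_{3})+a_{3}(a_{1}b_{2}-a_{2}b_{1})=0,\\
x\cdot b&=b_{1}(a_{2}b_{3}-a_{3}b_{2})+b_{2}(a_{3}b_{1}-a_{1}b_{3})+b_{3}(a_{1}b_{2}-a_{2}b_{1})=0.
\end{aligned}
\]
% The same vector as a symbolic determinant, expanded along the first row
\[
a\times b=\begin{vmatrix}e_{1}&e_{2}&e_{3}\\a_{1}&a_{2}&a_{3}\\b_{1}&b_{2}&b_{3}\end{vmatrix}
=(a_{2}b_{3}-a_{3}b_{2})e_{1}+(a_{3}b_{1}-a_{1}b_{3})e_{2}+(a_{1}b_{2}-a_{2}b_{1})e_{3}.
\]
```

Because the system has two equations and three unknowns, its solutions form a line through the origin when a and b are noncollinear; the vector above is one convenient nonzero representative, and every other solution is a scalar multiple of it.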

As to the awareness condition, I can only say that the instructor conducted a brief discussion reflecting on the students’ construction process of cross product. Absent from the data of this event is a clearer indication that the latter instructional move resulted in the fulfillment of this condition.
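The solution formula reported in this event can be checked numerically. The following is a minimal sketch, not part of the classroom data: the function names `cross` and `dot` and the sample vectors are assumptions introduced here for illustration. It evaluates the formula for x and confirms that the result satisfies both conditions the students arrived at, \(x\cdot a=0\) and \(x\cdot b=0\), and hence is orthogonal to a linear combination of a and b.

```python
def cross(a, b):
    """Evaluate the solution formula for x.a = 0 and x.b = 0 (hypothetical helper)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return [a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1]


def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Sample noncollinear vectors, chosen only for illustration.
a = [1, 2, 3]
b = [4, 5, 6]
x = cross(a, b)

# A linear combination z = 2a - b of the two spanning vectors.
z = [2 * ai - bi for ai, bi in zip(a, b)]

print(x, dot(x, a), dot(x, b), dot(x, z))
```

Checking a single combination z does not prove orthogonality to the whole plane; that follows from the linearity of the dot product, which is the algebraic realization the students reached.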

5.1.3 Rank (CS)

About two years after LB was first introduced to the concept of rank of a matrix, he was asked about the purpose of the concept of rank. The following paraphrased exchange captures the essence of the dialogue that ensued.

  1. LB: The rank of a matrix A tells us the maximum number of linearly independent columns in A.

  2. I: What questions might the concept of rank answer?

  3. LB: If we know the rank of a matrix A, we know the number of free variables in the homogeneous system associated with A. So, we know the dimension of its solution set.

  4. I: How does rank provide this information?

  5. LB: Because of the fundamental theorem of linear algebra [known also as the rank-nullity theorem], number of free variables plus rank equals number of columns.

  6. I: Do you know the proof of this theorem?

LB answered affirmatively and upon request he provided a complete linear-transformation-based proof along the lines offered by the mathematician (Sect. 3).

To understand how this dialogue might be interpreted in terms of the epistemological justification criteria, it is necessary to analyze it in the context of how LB was initially introduced to the concept of rank, two years prior to this dialogue. Through repeated experience of solving systems of linear equations and representing their solutions in the vector form \(x=v_0+t_1v_1+\dots+t_kv_k\), LB came to recognize, gradually, that k represents the “size” of the solution set. In a web of activities involving the conceptual interlinks among linear combination, linear independence, basis, and dimension, “size” was necessitated and formulated in this context as “dimension”. This illustrates how a stimulus for an epistemological justification for the concept of rank was developed through a resolution of the need for computation, leading up to the formal concept of dimension. Considering this background, I stipulate that LB’s response indicates the fulfilment of the intellectual need condition and the elicitation condition.

Indication for LB’s awareness of the elicitation process might be inferred from the fact that LB recognized the mathematical reason for why rank determines the “size” of the solution set, not just as a fact but as an intellectual need.
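The count LB cited, number of free variables plus rank equals number of columns, can be illustrated on a concrete system. The sketch below is an illustration only: the helper `pivot_columns` and the sample matrix `A` are assumptions introduced here, and the row reduction is a generic Gauss-Jordan elimination over exact fractions, not a procedure taken from the teaching experiment. Columns without a pivot correspond to the free parameters \(t_1,\dots,t_k\) in the vector form of the solution.

```python
from fractions import Fraction


def pivot_columns(matrix):
    """Return the pivot column indices found by Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivots = []
    r = 0  # index of the row that will hold the next pivot
    for c in range(len(rows[0])):
        # Look for a nonzero entry in column c at or below row r.
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue  # no pivot here: column c corresponds to a free variable
        rows[r], rows[p] = rows[p], rows[r]
        # Clear column c in every other row.
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    return pivots


# Sample coefficient matrix, chosen only for illustration.
A = [[1, 2, 1, 0],
     [2, 4, 0, 2],
     [3, 6, 1, 2]]

pivots = pivot_columns(A)
rank = len(pivots)
free = [c for c in range(len(A[0])) if c not in pivots]

print("rank:", rank, "free variables:", len(free), "columns:", len(A[0]))
```

For this matrix the homogeneous system has two free variables, so its solution set is described by two parameters, which is the "size" that the dialogue refers to as dimension.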

5.1.4 Isomorphism (CS)

During a review session on elementary canonical forms, LB was asked why we care about diagonalizable operators. Below is a synopsis of the dialogue that ensued.

  1. LB: If an operator T has an ordered basis \(\alpha\) of eigenvectors [for an n-dimensional vector space V over a field F], then we can easily determine if the operator is injective or surjective; and we can as well determine the dimensions of the operator’s null space and range.

  2. I: How can we determine these properties and values of the operator from its matrix representation?

  3. LB: It is because of the explicit formula. The explicit formula establishes an isomorphism between \(\mathcal{L}(V,V)\) and \(F^{n}\). If we want to know something about a linear operator, the explicit formula allows us to look for it in the matrix representation.

  4. I: Can you give an example of how this is done?

  5. LB: An example is when computing the eigenvalues of a linear operator through the characteristic polynomial of its matrix representation. Another example is injectivity. T is injective if and only if the matrix representation \([T]_{\alpha}\) is invertible.

Upon request, LB proved the injectivity claim. His proof is depicted in the left-hand column of Fig. 3. The brevity of the proof was due to LB’s level of internalization of the reasons underlying the proof’s steps: he no longer saw a need to state them explicitly. For clarity, the right-hand column pairs each line of LB’s proof with the explanation he provided upon request.

Fig. 3

LB’s proof of the theorem “A linear operator is injective iff its matrix representation is invertible”

At this point, I reminded LB that he didn’t attend to his claim concerning eigenvalues—that isomorphism facilitates the computation of eigenvalues of an operator. He responded something to the effect that this is done by computing the roots of the characteristic polynomial of the operator’s matrix representation. The following (paraphrased) exchange ensued.

  6. I: Could you explain how this is done?

  7. LB: We calculate the roots of the characteristic polynomial of the operator with respect to any basis. It doesn’t matter which basis because the determinant of a matrix representation of an operator is independent of the basis with respect to which the operator is represented.

  8. I: Can you prove this independence?

LB then turned to his notebook and produced a complete proof using the explicit formula.

Note that in Line 1 LB’s response attended not only to the need for computation, in determining the values of quantities associated with the operator (\(\dim(\operatorname{null} T)\) and \(\dim(\operatorname{range} T)\)), but also to the need for structure, in determining the structure of the operator as injective and surjective. I infer that the respective problems associated with a linear operator constituted problematic situations for LB, whereby his response satisfies the intellectual need condition.

LB’s responses in Lines 1–5 suggest that he interpreted the question “why we care about diagonalizable operators?” in a broader context of (a) the role of isomorphism as a tool for transferring questions about a linear operator to questions about its matrix representation, and (b) the concept of matrix representation as a vehicle for the application of this role. These responses could have been mere statements not supported by actions. His proof, however, gives credence to the claim that this is not so: LB conceptualized the role of isomorphism as he claimed it to be, a resolution of the need for computation. We might speculate, then, that at some point during his learning of this subject LB elicited the concept of isomorphism from the need to investigate properties associated with linear operators, thereby fulfilling the elicitation condition.

The awareness condition is typically harder to document. However, in this case not only was LB explicit as to how the explicit formula facilitates inference of properties of linear operators through their matrix representations, but he was also aware of the critical condition (2nd sentence in Line 7) for this act to work.
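The basis independence that LB invoked in Line 7 can be sketched with a standard similarity argument. This is not necessarily the proof LB wrote in his notebook, which is not reproduced in the data. Assume, as notation introduced here, that \(\beta\) is a second ordered basis of V and that P is the invertible change-of-coordinates matrix for which \([T]_{\beta}=P^{-1}[T]_{\alpha}P\). Then, for every scalar \(\lambda\),

\[
\det\left([T]_{\beta}-\lambda I\right)
=\det\left(P^{-1}\left([T]_{\alpha}-\lambda I\right)P\right)
=\det\left(P^{-1}\right)\det\left([T]_{\alpha}-\lambda I\right)\det\left(P\right)
=\det\left([T]_{\alpha}-\lambda I\right).
\]

The first equality uses \(P^{-1}(\lambda I)P=\lambda I\), and the last uses \(\det(P^{-1})\det(P)=1\). Under these assumptions the characteristic polynomial, and therefore its roots, are the same for every choice of basis.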

5.1.5 Summary

This section addressed the question, what instances of students’ mathematical behaviors manifest epistemological justifications? It offered a series of learning-teaching events illustrating the presence or absence of one or more of the epistemological justification conditions: the intellectual need condition, the elicitation condition, and the awareness condition. The events discussed here are of the embedded epistemological justification kind, which occur as learners engage in the process of inventing a piece of mathematical knowledge.

5.2 Discerned epistemological justification

5.2.1 Dependence lemma (ETE)

The first lesson of an upper division linear algebra course included a review of several theorems from the prerequisite course of the same subject. The review included the dependence lemma: Suppose \(v_1,\dots,v_m\) is a linearly dependent list in a vector space V over a field \(F\). Then there exists a \(j\in\{1, 2, \dots, m\}\) such that (a) \(v_j\in \operatorname{span}(v_1,\dots,v_{j-1})\); (b) if the \(j\)-th term is removed from \(v_1,\dots,v_m\), the span of the remaining list equals \(\operatorname{span}(v_1,\dots,v_m)\). Figure 4 depicts the proof the instructor presented in class.

Fig. 4

The instructor’s proof of the dependence lemma

At the end of the lesson, Elana, one of the students in the class, approached the instructor to tell him that she did not understand the “first part of the proof”. As the instructor proceeded to explain the proof, Elana interjected, saying something to the effect that she understood the steps of the proof, but she still was uncomfortable (her phrase) with it. Elana clarified that she was referring to a comment the instructor made as he stated the lemma; namely, “no matter in what order linearly dependent vectors are listed there will always be a vector in the list that is a linear combination of its preceding vectors.” This comment puzzled Elana: “what order has to do with linear dependence?” she wondered.

Elana was not seeking to better understand the proof as a validating statement. Rather, she was puzzled about what she saw as an unexpected relation between order and linear dependence. This type of puzzlement, together with its resolution, if constructed, is an instance of epistemological justification. The resolution that the instructor offered, which seemed to quell Elana’s wonder, was something to the effect that the lemma provides an answer to a question about construction. Namely, given a set of linearly dependent vectors, how would one extract from the list a vector that is a linear combination of the rest? The lemma, the instructor explained, can be thought of as an algorithm: list the vectors in any order, \(v_1,\dots,v_m\), and set up the equation \(a_1v_1+\dots+a_mv_m=0\). Now start with the last addend, \(a_mv_m\), in the equation. If \(a_m\ne 0\), then the process ends since \(v_m\) can be extracted from the equation as a linear combination of its preceding vectors. If \(a_m=0\), proceed to the next addend, \(a_{m-1}v_{m-1}\), and apply the same process. Continue this process recursively. Since, by definition, not all the weights \(a_1,\dots,a_m\) are \(0\), the process must end; and it ends at the largest index \(j\) for which \(a_j\ne 0\), thereby providing the desired \(v_j\) as a linear combination of its preceding vectors in the list. For Elana, had she internalized the instructor’s response, the lemma and its proof would be elevated from a mere statement and its validation to an epistemological justification that accounts for the emergence of the lemma and its proof.
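The instructor's reading of the lemma as an algorithm can be sketched in code. The following is a minimal illustration under stated assumptions: the function name `extract_dependent_vector` is hypothetical, and a dependence relation (the weights \(a_1,\dots,a_m\)) is taken as given rather than computed. The function scans the weights from the last addend backward, stops at the largest index with a nonzero weight, and returns the coefficients expressing that vector in terms of its preceding vectors.

```python
from fractions import Fraction


def extract_dependent_vector(vectors, weights):
    """Find the last vector with a nonzero weight in a dependence relation.

    Assumes weights[0]*vectors[0] + ... + weights[m-1]*vectors[m-1] = 0 with
    not all weights zero. Returns (j, coeffs), where j is a 0-based index and
    vectors[j] = coeffs[0]*vectors[0] + ... + coeffs[j-1]*vectors[j-1].
    """
    if all(w == 0 for w in weights):
        raise ValueError("the dependence relation must be nontrivial")
    # Confirm that the supplied weights really give the zero vector.
    for k in range(len(vectors[0])):
        if sum(Fraction(w) * v[k] for w, v in zip(weights, vectors)) != 0:
            raise ValueError("weights do not form a dependence relation")
    # Start with the last addend and move backward until a weight is nonzero.
    for j in range(len(weights) - 1, -1, -1):
        if weights[j] != 0:
            break
    # Solve the relation for vectors[j]; all later weights are zero.
    coeffs = [-Fraction(weights[i]) / Fraction(weights[j]) for i in range(j)]
    return j, coeffs


# The same three vectors listed in two different orders (illustrative data).
print(extract_dependent_vector([(1, 0), (0, 1), (1, 1)], [1, 1, -1]))
print(extract_dependent_vector([(1, 1), (1, 0), (0, 1)], [-1, 1, 1]))
```

In both orderings the procedure terminates at some vector that is a combination of the vectors listed before it, which is the relation between order and linear dependence that puzzled Elana.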

5.2.2 Linear dependence relations (ETE)

Occasionally, students ask questions such as “How did you know to partition a matrix in a particular way?” “How did you know to choose a particular vector to arrive at a certain result?” “What is the purpose of a particular definition?” “Where was a particular condition used in the proof?” I attribute the emergence of such questions to the effort by the instructor to model epistemological justification considerations throughout the course. Some questions, however, presented a challenge to interpret. In the rest of this section, I discuss two episodes where such questions occurred, and theorize a possible conceptual basis for them in terms of the Aristotelian causality discussed in Sect. 2.

The first episode occurred following the presentation of the proof depicted in Fig. 5 of the claim “\(n+1\) vectors in \(R^{n}\) are linearly dependent.”

Fig. 5

Proof of the claim “\(n+1\) vectors in \(R^{n}\) are linearly dependent”

A group of students working collaboratively said something to the effect that while they understood the proof’s steps, they thought that the proof was incomplete. For—according to them—the proof is valid in the case that the system is homogeneous but, they wondered, “what about the case where the system is not homogeneous?”

A similar and equally puzzling question was asked by students in the context of the proof of the theorem “If eigenvectors \(v_1,\dots,v_m\) of an operator \(T\) correspond to different eigenvalues \(\lambda_1,\dots,\lambda_m\), then the eigenvectors are linearly independent” (Fig. 6). Some students asked something to the effect that the proof does not seem to be complete because it deals with one kind of polynomial: “what if different polynomials were selected?”, they asked.

Fig. 6

Proof of the theorem “eigenvectors corresponding to different eigenvalues are linearly independent”

At the time of their occurrence these questions, as well as explicit requests by some students to avoid the use of proofs by contradiction, were puzzling, especially since they were typically raised by the better students in class. In what follows, I offer a hypothetical conceptual basis for this phenomenon in terms of Aristotelian causality. Historically, the mathematicians of the Renaissance subscribed to this philosophy, which equated scientific understanding with causality. Some of these mathematicians questioned, and others rejected, the scientific status of mathematics on the basis of the claim that mathematical proofs are not causal. To support this claim, they analyzed proofs of Euclid’s propositions involving the use of auxiliary lines. They argued, for example, that the proof of the triangle sum theorem is not causal because the theorem’s conclusion that the sum of the triangle’s angles is 180° is independent of the auxiliary line drawn at one of the triangle’s vertices. As was mentioned in Sect. 2, these mathematicians also rejected proofs by contradiction on the ground that these proofs provide certainty but do not demonstrate cause. Mancosu (1996) argues that this very need had a profound effect on the development of mathematics. Mathematicians such as Cavalieri and Guldin explicitly avoided proofs by contradiction to conform to the Aristotelian causality position. Descartes, whose work represents the most important event in 17th-century mathematics, appealed to constructive proofs because they are causal and ostensive.

Might the students in the first episode, seeking to identify cause, have interpreted the homogeneous system as a cause for the \(n+1\) vectors in \(R^{n}\) to “become” linearly dependent? Similarly, might the students in the second episode have interpreted Lagrange polynomials as a cause for the eigenvectors to “become” linearly independent? Recall that the need for causality was theorized as one of the five intellectual needs. Based on this hypothetical account then, we might speculate that students’ questions in the above two events, even if erroneous, fulfill the intellectual need condition for epistemological justification. If we assume that epistemological obstacles encountered by the individual echo those that occurred historically (Brousseau 1997), then investigations of these phenomena might shed light on an obstacle with roots in the historical development of mathematics not yet addressed in the literature on students’ conceptualization of proof.

6 Implications for research and instruction

One might wonder what Weyl meant by his statement in the quote appearing in the opening of this paper about the mathematicians’ disposition to understand the ideas underlying proofs. Weyl was clear that mere acceptance of “a mathematical truth by virtue of a complicated chain of formal conclusions and computations” is not the sole desire. In essence, the literature on mathematical explanation is an effort to articulate the kinds of proof understanding desired by mathematicians. Stanford Encyclopedia of Philosophy (2009) outlines the historical development of the debate on mathematical explanation in philosophy of mathematics—from Aristotle’s distinction between demonstration of “facts” and demonstration of “reasoned facts” to Steiner’s (1978) and Kitcher’s (1989) models of mathematical explanation. This and other literature reviews show that such understanding is multi-faceted. This paper, being pedagogical rather than philosophical, does not belong to the genre of this literature, but its contribution is highly germane to it, in that it pours meaning into a critical aspect of proof understanding, production, and appreciation. Furthermore, the contribution goes beyond proving in that epistemological justification attends to mathematical conceptualization broadly, as the discussions of Sect. 5 demonstrate.

The analyses presented in this paper raise a range of questions. Examples include the following:

  1. What are the relations between epistemological justification and mathematical explanations? The historical discussion of the last episodes (Sect. 5.2.2) points to one potential relation.

  2. Are there connections between the development of epistemological-justification way of thinking and means by which one obtains certainty and seeks explanation? Weber, Lew, and Mejia-Ramos (2020) found that most mathematics majors do not obtain certainty by means of empirical evidence. Might this finding be due to the mathematical sophistication of the subjects being mathematics majors who likely developed elements of epistemological-justification way of thinking? Theoretically, there is a reason to assume such connections since epistemological justification necessarily involves reflection on one’s process of problem solving and proving, which might, in turn, lead one to recognize the different spheres of explanatory practice, those that belong to mathematics versus those that belong to science and to everyday life (see, however, Baker, 2012).

  3. What instructional practices promote the habit of seeking epistemological justification among students? Based on my classroom observations, I hypothesize that two factors play a significant role in the promotion of this goal. The first is a persistent effort to model the application of intellectual need by raising questions whose answers lead to new knowledge rather than offering new knowledge a priori in the absence of such questions. The second factor is the focus on ways of thinking broadly as central instructional objectives. To state the obvious, epistemological justification cannot be promoted, nor does it occur, in isolation. Rather, it is interwoven in a network of other ways of thinking, as the following summary illustrates.

Algebraic representation

Recall that a critical stage in the construction of the proof by the participants in the Existence and Uniqueness event was the recognition by the working group of the need to translate the ideas offered by the group members into linear-algebraic language. This recognition played a crucial role in enabling the participants to assemble the different components of their ideas into a coherent whole in the form of a valid proof. Experience suggests that the way of thinking of converting statements from a spoken language into algebraic expressions is not acquired easily. In our teaching experiments it receives major attention. This is done by bringing the participants to witness repeatedly that often it is not sufficient to just declare a mathematical relation; rather, one often needs to go a step further and formulate the relation in symbolic algebraic form. The level of difficulty that students encounter in acquiring this way of thinking is rather surprising. We observed students unable to engage in the solution of problems because it does not occur to them to translate statements such as “a vector v is in the span of \(v_1, \dots, v_m\)”, “\(u_1, \dots, u_k\) are orthogonal”, “the point \(X\) is between points \(A\) and \(B\)” into algebraic expressions.
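For concreteness, one possible algebraic translation of each of the three statements quoted above is sketched here. The scalars \(c_1,\dots,c_m\) and the parameter \(t\) are symbols introduced for this illustration, "orthogonal" is read as pairwise orthogonal, and "between" is read as lying strictly inside the segment from \(A\) to \(B\); other readings are possible.

\[
v\in \operatorname{span}(v_1,\dots,v_m)
\iff
v=c_1v_1+\dots+c_mv_m \ \text{for some scalars } c_1,\dots,c_m,
\]

\[
u_1,\dots,u_k \ \text{are orthogonal}
\iff
u_i\cdot u_j=0 \ \text{for all } i\ne j,
\]

\[
X \ \text{is between } A \text{ and } B
\iff
X=(1-t)A+tB \ \text{for some } t \text{ with } 0<t<1.
\]

In each case the translation replaces a declared relation with an equation in unknowns, which is what makes the statement available for computation.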

Thinking in general terms

In the discussion of the same event, it was also reported that the participants discussed their claims in the context of a \(3\times 3\) system, albeit with general entries, but not in terms of a general \(n\times n\) system. This reflects a stage in the instructional effort to help the participants transition from empirical reasoning, where the context considered is entirely specific (e.g., the coefficients of the matrix are numbers), toward the ability to reason in general terms, that is, in terms of a general \(n\times n\) system in this case.

Structural reasoning

Another example of a way of thinking that was set as a major cognitive objective is structural reasoning. Structural reasoning is a powerful and essential way of thinking in abstract mathematical areas, such as linear transformations on a general vector space. One of the instantiations of structural reasoning is the ability to reason in terms of conceptual entities and networks of conceptual entities, what APOS theory (Dubinsky, 1991; Arnon, Cottrill, Dubinsky, Oktac, Roa, Trigueros, & Weller, 2014) refers to as object conception and schema conception, respectively. Consider LB’s use of isomorphism (Fig. 3) through the explicit formula \(Tv=\alpha [T]_{\alpha}[v]_{\alpha}\). To begin with, an essential feature of understanding the symbol \([v]_{\alpha}\) at the level of process conception is the ability to imagine taking any vector v in an abstract vector space \(V\), representing it as a linear combination of the basis list \(\alpha=(\alpha_1, \dots, \alpha_n)\) and forming, as a result, a column vector whose entries are the coefficients of, and are sequenced in the order they appear in, the combination. LB’s treatment of this symbol suggests that he was able to carry out this process in thought and with no limitation on the vector v considered.
Furthermore, his use of the symbol \([T]_{\alpha}\) as a representation of a single matrix whose columns are coordinate vectors with respect to \(\alpha\) (i.e., \(\left[[T(\alpha_1)]_{\alpha}, \dots, [T(\alpha_n)]_{\alpha}\right]\)) suggests that LB (a) conceived of \([v]_{\alpha}\) as an input of a process and therefore as an object; and (b) encapsulated a cascade of objects \([T(\alpha_1)]_{\alpha}, \dots, [T(\alpha_n)]_{\alpha}\) into a single entity \([T]_{\alpha}\).
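The process behind the symbols \([v]_{\alpha}\) and \([T]_{\alpha}\) can be made concrete in a small case. The sketch below is an illustration under stated assumptions, not material from the study: it takes \(V=R^{2}\), a hypothetical operator T given by the matrix `A` in standard coordinates, and a basis \(\alpha\) whose vectors are the columns of `P`. It computes the coordinate vector and the matrix representation, then checks the explicit formula by recombining the basis vectors with the computed coordinates.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def mat_vec(A, v):
    """Product of a matrix and a column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def inverse_2x2(M):
    """Exact inverse of a 2x2 matrix."""
    (p, q), (r, s) = M
    det = Fraction(p * s - q * r)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[s / det, -q / det], [-r / det, p / det]]


# Hypothetical operator T on R^2, given in standard coordinates.
A = [[2, 1],
     [0, 3]]
# Basis alpha = (alpha_1, alpha_2) = ((1, 0), (1, 1)), stored as columns of P.
P = [[1, 1],
     [0, 1]]
P_inv = inverse_2x2(P)

v = [3, 5]
v_alpha = mat_vec(P_inv, v)              # coordinates of v with respect to alpha
T_alpha = mat_mul(mat_mul(P_inv, A), P)  # columns are [T(alpha_i)]_alpha

# Explicit formula: recombine the basis vectors with coordinates [T]_alpha [v]_alpha.
Tv_from_formula = mat_vec(P, mat_vec(T_alpha, v_alpha))

print(v_alpha, T_alpha, Tv_from_formula, mat_vec(A, v))
```

In this example \(\alpha\) happens to consist of eigenvectors of T, so \([T]_{\alpha}\) is diagonal and properties such as injectivity can be read off its diagonal entries.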

In all, the paper addresses five questions: (1) What exactly is epistemological justification? (2) What is its underlying theoretical basis? (3) What are the observational criteria for its occurrence? (4) What instances of students’ mathematical behaviors manifest epistemological justifications? (5) What instructional approaches might facilitate the development of epistemological justification? The analyses of these questions presented in this paper amount to field-based hypotheses which might serve as initial foundations for follow-up empirical studies.