1 Introduction

What constitutes good (mathematics) teaching is discussed in various contexts. Not only scientists from different disciplines, but also parents and those responsible for the education system and education policy have answers to this question. However, these answers are not always theoretically supported and empirically verified. Too often, perspectives about good mathematics teaching are subjective theories, or personal beliefs with underlying, unquestioned premises. Generally speaking, there are several generic aspects of teaching that are consensually described as essential regarding the quality of teaching, independent of the subject that is taught. Such dimensions include classroom management or learning support (Praetorius et al., 2018), cognitive activation (Kunter et al., 2013; Praetorius et al., 2018) and the importance of access to the topic for all learners (Schoenfeld et al., 2014; Schoenfeld et al., 2023). While there are similarities in the various dimensions identified by the many different frameworks, the number of generic quality dimensions often differs. For example, some researchers propose three such dimensions, namely emotional support, classroom organization and instructional support (Pianta & Hamre, 2009), or classroom management, cognitive activation and learning support (Praetorius et al., 2018). Others conceive of five: conceptual focus, cognitive demand, student focus, adaptivity, and longitudinal coherence (Prediger et al., 2022), or the content; cognitive demand; equitable access to content; agency, ownership, and identity; and formative assessment (Schoenfeld et al., 2014, 2023). Such mainly generic dimensions would appear to have validity and importance for determining the quality of mathematics teaching.

There are additional considerations to attend to in our attempt to characterize high quality mathematics teaching. For example, does the quality of mathematics teaching also differ from quality teaching in other subjects, such as history? Some studies suggest, for instance, that the nature and quality of discussions in these two subjects are quite different (Pauli & Reusser, 2018). Other studies show that the cross-subject consistency of a teacher’s quality of teaching is only moderate (Cohen et al., 2018). Studies such as these imply that teaching quality is not a uniform and subject-independent issue. Rather, quality of mathematics teaching seems to call for the inclusion of subject-specific quality characteristics as well. But which subject-specific characteristics are the most crucial ones for helping us describe high-quality mathematics teaching?

Even if we restrict our interest to the teaching of mathematics, additional complexity remains. Does the quality of teaching also depend on what one means by “mathematics”? Good mathematics teaching is often discussed as if there is uniformity and agreement in what we mean by mathematics. But is it important to characterize good algebra teaching as possibly distinct from good geometry teaching, for example, such as in the need for specific visualizations or enactive manipulations in geometry? An attempt to capture this type of topic-dependent nuance in teaching quality can be seen in the Teaching Algebra Project (Schoenfeld, 2019; Wernet & Lepak, 2014) in which five specific “robust criteria (RCs)” are defined that describe the specificity of understanding of algebra (Wernet & Lepak, 2014, p. 5 ff.). Therefore, these RCs would have to be part of high-quality teaching of algebra in particular. Furthermore, with respect to algebra, mathematics education differentiates between “early algebra” (e.g., Kieran et al., 2016) and later “abstract algebra” (Veith et al., 2022). This implies that even within the topic of algebra, it is difficult to think about high-quality algebra teaching, as this might depend on students’ age or their level in school.

Moving beyond subject (e.g., mathematics) and topic (e.g., algebra), we might wish to consider whether different mathematical processes such as reasoning and proving or operating with numbers and variables differ in what is required to teach them effectively. Should the teaching of mathematical processes play a role in the articulation of high-quality mathematics teaching as well? Furthermore, could an argument be made that high-quality mathematics teaching may look quite different, depending on the particular content of a specific lesson? What about considerations related to the age of the students? What similarities and differences exist between high-quality mathematics teaching at the elementary level versus at the secondary level?

Questions such as these are fundamentally about the appropriate grain size for any analysis of high-quality mathematics teaching. What are the most powerful ways to characterize good teaching in mathematics? To what extent does this vision of mathematics teaching rely upon (a) broad distinctions such as differences between the teaching of mathematics and the teaching of other subjects; (b) somewhat narrower distinctions between topics within the mathematics curriculum such as algebra and geometry, (c) or between problem solving and proving, (d) or between elementary and secondary mathematics teaching; and/or (e) quite narrow distinctions between the content taught in any given lesson?

2 A selection of issues in the field

Research on the quality of mathematics teaching is a cross-cultural endeavor, as is evident (for example) in the TIMSS videostudy (Hiebert et al., 2003) and in work on Mathematics Classroom around the World (Kaur, 2017; Kaur et al., 2013; Shimizu et al., 2010). However, it would go beyond the scope of this article to present and duly acknowledge all the diverse and numerous findings from around the globe. We therefore concentrate in this section on the presentation of central results that stem from the tradition of German-language as well as U.S. empirical research. In these two regions, the question of the quality of (mathematics) teaching has been discussed particularly intensively—mainly based on broad empirical studies such as the COACTIV project (Brunner et al., 2006; Kunter & Baumert, 2013) in Germany or the Pythagoras study (Klieme et al., 2009) in Germany and Switzerland, and in the U.S. context the international TIMSS videostudy (Givvin et al., 2005; Hiebert et al., 2003) or the TEDS-M research program (Blömeke et al., 2022; Kaiser & König, 2020)—with quite different accentuations, which might be fruitful to combine. These studies had a strong influence on the discussion of quality in mathematics education, in particular with their linking to learning and instruction (Klieme et al., 2009; Kunter & Baumert, 2006), their relation to teachers’ skills and competences (Baumert & Kunter, 2013a, b), and their examination of the quality of applied tasks in mathematics education (Adleff et al., 2023; Jordan et al., 2006; Neubrand et al., 2013; Shimizu et al., 2010).

2.1 Good teaching, effective teaching and quality in teaching

The investigation of teaching quality is a central focus of empirical educational research and, at the same time, subject of theoretical and normative discussions from a wide variety of perspectives. Empirical educational research has so far identified several central characteristics of teaching quality that are positively linked to learners’ performance. These empirically proven characteristics of effective teaching are often generic and not subject-specific. Although subject-specific aspects are increasingly included in the determination of the quality of mathematics education, the relationship between generic and subject-specific teaching quality characteristics is still unclear; there remains an open and important question of whether there is “good teaching” or “good mathematics teaching” and to what extent this differentiation plays a role at all.

Supplementing the generic view with a subject- and a topic-specific one is likely to make a substantial contribution to an interdisciplinary understanding of teaching quality, especially when it comes to the quality of subject-specific teaching. In this regard, the distinction made by Berliner (2005), who mentions the “sheer impossibility of good teaching” and its measurement (Berliner, 2005, p. 205), may be helpful: According to him—following Fenstermacher and Richardson (2005, p. 189) who differentiate between “good teaching” and “successful teaching”—a distinction can be made between “good teaching” and “effective teaching,” which he summarizes as “quality in teaching”. Following this idea, quality in mathematics teaching can be characterized on the one hand by demonstrably effectiveness-related aspects (“effective teaching”) and on the other hand by normative aspects (“good teaching”). The normative aspects include, for example, the demands of the subject or of society on mathematics education, while the effectiveness-related aspects must show a measurable benefit regarding the learning and achievement development of the students. If a normative claim of good teaching does not prove to be effective empirically, it would be necessary to justify why the corresponding feature should nevertheless be retained in the sense of “good teaching”.

2.2 “Quality in teaching”—generic or subject-specific?

With respect to generic conceptions of quality in teaching, research on teaching quality has recently identified numerous characteristics that have been shown to positively influence student learning and achievement and therefore are effective. Many of these studies have been conducted based on mathematics teaching and thus pertain to mathematical achievement development. In this context, the three basic dimensions of quality in teaching, namely classroom management, cognitive activation, and learner orientation (Klieme et al., 2006; Praetorius et al., 2018), have become well known in the German-speaking world and their positive effect on achievement development has been empirically proven, although the findings regarding cognitive activation are mixed.

These three basic dimensions of teaching quality are based on assumptions concerning their specific effects on the students’ outcome and achievement (Klieme & Rakoczy, 2008). Cognitive activation is assumed to lead to a greater depth of processing of the content and therefore to a higher outcome. Good classroom management in terms of disruption prevention and clearly structured lesson management, is seen as a prerequisite for working as intensively as possible and making the best possible use of learning time (e.g., increased time on task). Teaching that supports learning not only serves this outcome but also impacts motivational aspects and promotes social inclusion and a climate of cooperation. While individual instruments for assessing cognitive activation typically incorporate operationalizations influenced by subject-specific educational considerations (e.g., Neubrand et al., 2013), these three dimensions are generally conceived of in a generic manner.

The existence of these three basic and generic dimensions of quality instruction has been repeatedly discussed and criticized in recent times or seen as a “starting point toward a common conceptualization of instructional quality in mathematics” (Mu et al., 2022, p. 6). Criticism has been aimed at fundamental questions of theoretical conceptualization (Praetorius et al., 2020) as well as necessary supplementation and specification, including whether additional generic dimensions—such as cognitive learning support (Kleickmann et al., 2020) and practice support (Praetorius & Gräsel, 2021) as specific subject education features (e.g., Brunner et al., 2014; Drollinger-Vetter, 2011; Schlesinger et al., 2018)—may be needed.

The main focus of the three basic dimensions is on the learning and thinking processes of the students, rather than on specific forms of teaching. We know from empirical research that certain forms of teaching more reliably lead to good learning results (Lee & Anderson, 2013). Empirical research shows that different teaching forms have their own advantages and disadvantages (Koedinger et al., 2012), given that it is likely not the teaching form itself that matters but rather the quality of its implementation (Alfieri et al., 2011; Henningsen & Stein, 1997; Stein et al., 1996). In particular, the quality of the implementation is effective only if it impacts the students’ learning and thinking process. Given that the learning and thinking process has both generic and subject-specific components, the relationship between generic and subject-specific aspects comes to the fore. Yet it has rarely been discussed how quality characteristics—generic or subject-specific—relate to each other and whether they should be conceptualized in a hierarchical way (Brunner, 2018) as prerequisites of subject-specific teaching, for example in the case of effective classroom management and efficient time management.

Furthermore, an empirically confirmed truism, namely that the quality of teaching also depends on the teacher and his or her competencies (Lipowsky, 2006), is receiving increased attention in recent research on teaching quality. Therefore the quality of (mathematics) teaching is increasingly being linked to the professional competencies of teachers (e.g., Jentsch et al., 2021; König et al., 2021), as the COACTIV and the TEDS-M research programs have shown.

2.3 Subject-specific characteristics of mathematics instruction

As mentioned above (see Sect. 1), the question arises as to which features and quality aspects can be considered to be subject-specific. The answer to this question is also reflected in subject-specific aims about classroom activities and the areas in which the students have to develop subject- and content-specific skills. Frameworks such as those developed by NCTM (2000), the Common Core State Standards Initiative (CCSSI; Common Core State Standards Initiative, 2012), or PISA 2022 (OECD, 2018) define such specific aims and give a direction and guidelines for teaching a particular subject within a temporal or cultural context. The more recent frameworks (e.g., the PISA 2022 framework, but also the CCSSI) build upon prior frameworks in important ways. In the PISA 2022 framework, “mathematical reasoning” and “mathematical problem solving” gain in importance for mathematics education, while “mathematical modelling” on real-world problems still remains a relevant topic in mathematics classrooms but is not the principal competency that it was for several years in the former PISA frameworks with their focus on “literacy” (OECD, 2006). Such frameworks make assumptions about the necessary future skills that learners must be able to acquire and correspondingly define educational goals (“Bildungsziele”) on this basis at a normative level in the sense of “good teaching”. Additionally, PISA 2022 requires the application of mathematical knowledge in areas that have not yet been part of the curriculum of most students. Examples of such new areas of content are “conditional decision making” and “growth processes” (OECD, 2018). In all these subject- and content-specific topics, the first requirement concerns the correctness and coherence of their application.

2.3.1 Correctness and coherence

One straightforward example of a subject-specific characteristic of mathematics instruction is mathematical correctness and coherence. Teaching mathematics without regard to its correctness and coherence cannot be considered as quality in mathematics teaching. But although the centrality of this feature may appear to be self-evident, it is not always explicitly reflected in instruments to observe teaching quality. From a subject-specific point of view, correctness and coherence are indeed crucial and a prerequisite of teaching quality (Brunner, 2018; Drollinger-Vetter, 2011; Schlesinger et al., 2018), both with regard to “good teaching” (Berliner, 2005) as well as in reference to accurate knowledge as one of the three dimensions of accountable talk in mathematics classrooms (Greeno, 2006, 2015; Michaels et al., 2013). Coherence here refers to “longitudinal coherence” (Prediger et al., 2022, p. 9) or curriculum-related coherence that includes a spiral-curriculum perspective of the learning trajectories. Coherence as well as correctness is an aspect of “good teaching” (Berliner, 2005).

2.3.2 Additional subject-specific aspects

There are other characteristics of high quality mathematics instruction that can be considered as subject-specific. Because mathematics is an abstract domain, suitable visualizations (Presmeg, 2020) and the appropriate use of mathematical models (e.g., Gravemeijer, 2010; Sproesser et al., 2018) are seen as important for a deep understanding as well (NCTM, 2000; Presmeg, 2006). Such models serve as tools to “construct meaning” (Hiebert et al., 1997, p. 54). However, the use of these models is not self-explanatory and therefore may be error-prone (e.g., Rösken & Rolka, 2006). Therefore, the use of appropriate mathematical models and visualizations can be considered a subject-specific quality feature.

Instructional features such as open tasks with multiple solutions are considered to be important as well in mathematics education because of their positive effect on students’ understanding and creativity (Chu et al., 2017; Neubrand, 2006) and students’ interest and motivation (Schukajlow & Krug, 2014). In a more subject-specific sense, open tasks in mathematics can be solved in many different ways and as a result offer insights into mathematical connections and relations, potentially providing evidence for students’ deeper understanding of the mathematical content and the possibility for gaining agency (Achmetli & Schukajlow, 2019; Nieminen et al., 2022; Sullivan et al., 2000). Generally, tasks with multiple solutions can positively influence the cognitive activation of the students, which is a generic criterion of instructional quality as well (Kunter et al., 2013; Praetorius et al., 2018).

More generally, the quality of mathematical tasks is also closely related to the potential for subject-specific cognitive activation and is therefore a subject-specific aspect of quality (Henningsen & Stein, 1997; Stein et al., 1996). Task quality is important because of its connection to effective (formative) assessment, content-specific support, and the opportunity to provide mathematically accurate information and feedback. The importance of task quality is reflected in many models of teaching quality in mathematics (e.g., the MAIN-TEACH-Model; Charalambous & Praetorius, 2020).

An additional subject-specific and normative characteristic of teaching quality relates to students’ possible application of mathematical knowledge to their everyday life (OECD, 2013, 2018). Therefore, the incorporation of real-life modeling problems (e.g., Kaiser & Schwarz, 2010; Schukajlow et al., 2018) is also discussed as a characteristic of high quality mathematics teaching.

Finally, appropriate use of mathematical language is also considered to be a subject-specific feature of high quality mathematics teaching (Neugebauer, 2022) because of the strong relationship between language skills and mathematical achievement (e.g., Dröse & Prediger, 2019; Paetsch, 2016; Plath & Leiss, 2018; Prediger & Krägeloh, 2015; Stanat, 2006; Ufer & Bochnik, 2020).

2.3.3 Topic-specific aspects of instructional quality in mathematics education

A broad discussion of topic-specific instructional quality features (Brunner, 2018, 2020) is largely missing from the literature. Consistent with this assessment, Mu et al. (2022, p. 4) talk in their literature review about “evidence for construct-underrepresentation” in various conceptualizations and measurements for teaching quality in mathematics education. Such topic- and concept-specific features could include the understanding of a particular mathematical topic such as the Pythagorean theorem (Drollinger-Vetter, 2011), or topic-specific basic conceptions (e.g., Salle & Clüver, 2021; vom Hofe, 2003) or key concepts (e.g., Gravemeijer et al., 2017) and are also known as “robust criteria of understanding” (Schoenfeld et al., 2014). Topic-specific elements of understanding are based on normative views about a particular mathematical topic and describe the object of understanding in its breadth and depth. As such, they are not only subject-specific but topic-specific as well. High quality teaching of such a topic requires addressing its key concepts and “elements of understanding” (Drollinger-Vetter, 2011) in the sense of “disentangling the core and the intertwinement of subject-matter contents” (Prediger et al., 2022, p. 3). To organize and relate these elements of understanding to each other provides a “structural clarity” of teaching (Drollinger-Vetter, 2011).

With respect to topic-specific teaching quality, there still are many unanswered questions. In particular, it seems necessary to theoretically and empirically clarify which (and how many) elements of a particular mathematical concept must be understood in order to gain in-depth understanding. For example, is it necessary to address all elements of understanding of the topic or is it sufficient to understand and work on only some of these elements to gain in-depth understanding of the whole topic?

2.4 Focus on specific mathematical processes: aspects of instructional quality in mathematics education

Instructional goals for students’ mathematical learning can be described not only with respect to a particular topic but also with regard to the processes of working and thinking mathematically. Central mathematical processes—such as mathematical reasoning, modelling, problem solving, operating (operieren, i.e., calculating, constructing, and manipulating) or communicating—are reflected in many models of mathematical competencies (Blum, 2006; Common Core State Standards Initiative, 2012; NCTM, 2000). Processes are described as the “how” (Prediger et al., 2022, p. 2) of mathematical learning (Westbury et al., 2000). Some of these processes (such as modelling and problem solving) are quite similar, while others appear fundamentally different (such as reasoning and calculating). Fostering growth in each type of mathematical process may require the use of particular lesson activities. For example, a mathematics lesson that aspires to promote mathematical reasoning and proving may do so by incorporating the exploration of mathematical structure, a search for connections and regularities, the formulation of an argument, the evaluation of conjectures, and the organization of arguments into a logical chain (Boero, 1999; Jeannotte & Kieran, 2017; Stylianides, 2007, 2016). These process-specific activities can support an instructional focus on reasoning and proving and are thus essential for high quality teaching of reasoning and proving. Therefore, such process-specific activities can also be considered process-specific quality features. As another example, if the aim of a mathematics lesson concerns the process of modeling (Schukajlow et al., 2018), quality features such as changes in representation (Prediger & Krägeloh, 2015) or references to reality (Besser et al., 2020) come into focus as process-specific quality features.

2.5 Focus on different school levels

There are instruments and frameworks such as the model of the “three basic dimensions” (Klieme et al., 2006) that are not only generic but are also designed independently of school level and are therefore suitable for all levels. The TRU framework with its five dimensions (Schoenfeld et al., 2023), most of which are generic but defined with a clear focus on and relationship to mathematics education, and the MQI (Hill et al., 2008a, 2008b) are independent of the school level as well.

However, as soon as a certain school level has special features as compared to the other levels—as is the case for early mathematics education in kindergarten (with a focus on “natural learning situations” [Gasteiger, 2012], play and games [van Oers, 2010]) or high school (with a focus on the ability to study academically)—a specific quality of mathematics education in these school levels is important (e.g., Burchinal, 2018; Cerezci, 2021; Howard et al., 2018; Ufer & Praetorius, 2022), even if it is not implemented or explicitly defined in most of the instruments to measure the quality of mathematics education and teaching.

2.6 Measuring teaching quality

2.6.1 Measurement of teaching quality: generic or subject-specific?

The relationship between generic and subject-specific aspects of teaching quality is a frequently-discussed topic (Charalambous & Praetorius, 2018). But long before the emergence of this discussion, instruments for measuring teaching quality have been developed that include both generic and subject-specific characteristics but which do not clarify the relationship between these different perspectives. For many, the established view is now that teaching is “always generic and subject-specific” (Reusser & Pauli, 2021) and therefore should always be considered “two sides of the same coin” (Lipowsky et al., 2018).

In particular, various instruments for assessing the teaching quality of mathematics (Hill et al., 2008b; Learning Mathematics for Teaching Project, 2011; Schoenfeld et al., 2014; Walkowiak et al., 2018) have included normative dimensions such as the richness of the mathematical task or the justification of mathematical statements and have thus made a significant contribution to the development of subject-specific instruments.

However, differences exist in the focus of the operationalization of subject-specific instructional features. In particular, does subject specificity result from the operationalization of generic features but in a subject-specific manner (e.g., cognitive activation as the cognitive activational potential of the mathematics task)? Or are subject-specific and topic-specific characteristics (e.g., use of appropriate mathematical model) operationalized in and of themselves?

A further distinguishing characteristic of measurement instruments regards the relationship between the generic and subject-specific aspects of teaching quality that they use (Brunner, 2018). In particular, there are instruments that focus primarily on generic aspects (e.g., Klieme et al., 2006; Praetorius et al., 2018). Yet other instruments use generic aspects of teaching quality but also incorporate some subject-specific aspects (e.g., Jentsch et al., 2021). This approach has been labeled as an “additive view” (Brunner, 2018, 2020), in that the generic and the mathematics-specific aspects are not generally related to each other. Other frameworks and instruments adopt what has been termed a more “integrative view” (Brunner, 2018, 2020): Such instruments focus on generic characteristics of good teaching such as “cognitive activation” (Kunter et al., 2013) yet operationalize them in subject-specific ways (e.g., Klieme et al., 2009; Neubrand et al., 2013). A further group of frameworks and instruments have been described as reflecting an “inclusive view” (Brunner, 2018, 2020), in that they are subject-specifically designed and bring generic and subject-specific aspects together without a clear distinction between the two of them. By “inclusive” tools, we mean those that are specifically designed for mathematics education and use both generic and subject-specific aspects, but relate them completely and consistently to mathematics education. The primacy here is with mathematics and mathematics learning. In contrast, we understand “integrative” instruments to be those that also focus on mathematics teaching, but use generic and subject-specific quality features and do not operationalize these consistently in a subject-specific way. 
Examples of inclusive instruments are TRU (Schoenfeld et al., 2014, 2023), MQI (Learning Mathematics for Teaching Project, 2011), M-Scan (Walkowiak et al., 2018), and IQA (Boston & Candela, 2018) (see Charalambous & Praetorius, 2018; Schlesinger & Jentsch, 2016).

These frameworks not only relate generic and subject-specific aspects of teaching quality to each other but also define and compare “good mathematics teaching” with “good teaching”. But in summary, as Mu et al. (2022, p. 7) point out, “a coherent analysis of conceptual and operational indicators used to describe teaching quality in mathematics is not currently available.”

2.6.2 Challenges with measuring quality of mathematics teaching

Determining the generic and/or subject-specific nature of teaching quality in mathematics is challenging, as much depends on the particular conceptualization and operationalization of these characteristics. For example, several studies have demonstrated that the same mathematics lesson was assessed differently depending on the particular measurement instrument that was used (e.g., Boston & Candela, 2018; Brunner, 2018; Charalambous & Litke, 2018; Praetorius et al., 2018). Thus, not only does a unitary conception of quality in teaching seem not to exist, but the same teaching episode may also be assessed differently depending on the framework used. Lindmeier and Heinze (2020) therefore questioned the significance of the subject-specific perspective in previous teaching quality research and concluded that subject-specific quality characteristics would have to be analyzed by professionally trained raters. In addition, depending on the particular feature of teaching quality in question, for example cognitive activation, many different observation occasions may be required to arrive at a reliable assessment, whereas the assessment of classroom management can be done comparatively reliably based on a single lesson (Praetorius et al., 2014).

Many prior papers have considered these questions, including several that have been published in prior issues of ZDM—Mathematics Education. First, Schlesinger and Jentsch (2016) pointed out theoretical and methodological challenges in measuring teaching quality in mathematics education using classroom observations and gave an overview about subject-specific aspects of teaching quality measured in different studies. Second, a strongly focused discussion on the applied frameworks and instruments used to measure the quality of mathematics teaching was published in ZDM—Mathematics Education (2018, Vol. 50, Issue 3) (Charalambous & Praetorius, 2018). In that issue, different frameworks were applied to analyze the same mathematics lesson. The analyses indicated a strong link between the particular instrument used and the resulting analyses and measurement of the quality of mathematics teaching, a finding that is consistent with other similar analyses published elsewhere (e.g., Brunner, 2018).

3 Quality in teaching mathematics: open questions and conclusion

3.1 Open questions

3.1.1 Consideration of phases of the learning process?

The relationship between quality features and the phases of the intended learning process is also still to be worked out. If the lesson focuses on a first introduction to a new topic, there might be aspects of quality that are more important than in a lesson that aims at practice in the use of basic operations. In the second case, a meaningful quality characteristic would be the number of practice tasks completed by the students (Aebli, 2003). If the lesson gives an introduction into fractions or focuses on problem solving, such a quality characteristic recedes into the background. Therefore, from a mathematics educational point of view, quality features should not only be related to the topic of and instructional goals of the lesson, but also to the phase of the learning process in which the lesson takes place. In both of these cases, the quality aspects have to be operationalized subject-specifically.

3.1.2 Contributions of the learners to instructional quality?

In the context of teaching quality, the role of the learners has not yet been sufficiently investigated. If we look at teaching from the perspective of what is offered and how it is used, we can assume that the interaction between the learner and the teacher and the use of what is offered also play a role in determining how sustainable the teaching is in terms of learning performance. Seidel (2020) therefore proposes to further conceptualize both the supply and the use side and to integrate psychological theories regarding the use side. This also applies to the more technical and educational quality characteristics. More generally, the question arises as to the contribution of the students, individually and as a group, to the quality of teaching. For example, in the case of mathematical correctness as a subject- and content-specific feature of quality, this might mean that we can ask whether a high quality of mathematics teaching requires complete correctness or merely a level of correctness that is sufficient for the students’ current state of knowledge. In both cases the teachers’ mathematical knowledge can restrict or open up the students’ input and contributions on mathematical correctness.

3.1.3 Consideration of specific grade level and age of the learners?

Currently, we do not know if there is a grade-level specific quality of mathematics teaching. Does the quality of mathematics learning and teaching in kindergarten differ from what is seen in secondary schools? We can assume it will, not only because of definitions of quality of mathematics teaching in secondary school (Ufer & Praetorius, 2022) but also because of specific quality aspects of early mathematics education such as “natural learning situations” (Gasteiger, 2012), “learning with picture books” (Björklund & Palmer, 2022; Elia et al., 2010) and the important role of play (van Oers, 2010). There might be other characteristics for other grade levels that could play an important role and should therefore be included. Therefore, there is a need to investigate early childhood mathematics more precisely and more level-specifically, as some studies try to do (e.g., Burchinal, 2018; Cerezci, 2021; Howard et al., 2020).

3.1.4 Instructional pattern dependent on the focused topic?

Does a particular mathematical topic require a particular implementation and teaching pattern that favors learning? For instance, can high quality mathematical reasoning occur when a discursive activity lacks what a reasoning process necessarily requires? Or during lessons on proving, do students benefit more from an active discovery approach or from a direct instructional approach? If we determine that proving as a challenging process is more effectively learned within a strongly guided instructional pattern than only through active discovery, this might mean that a high quality proving sequence should engage students in proving in a first step through active discovery and afterwards in a guided construction of a proof.

We know that not all students benefit equally from the same instructional patterns. If this is the case, it could be that there is no single effective teaching method for a particular content and topic, but rather one with variations that depend on the individual student’s prerequisites. We still do not know.

3.1.5 Differential effects of teaching quality on the learners?

Additionally, we do not yet understand the differential effects of the quality of mathematics teaching. We know for example that the perception of quality by the students depends on their mathematical achievement (e.g., Göllner et al., 2018; Meissner et al., 2020) and also on the specific teaching method (Bieg et al., 2017). But what does this mean for future research? And how could this be taken into account in future research?

3.1.6 Cultural context and cultural values?

Several studies show differences in mathematics teaching and teaching practices when cultural background is considered (e.g., Hiebert et al., 2003). But in what ways does the quality of mathematics teaching depend on cultural beliefs and values (e.g., Dreher et al., 2018; Gasteiger et al., 2021; Hiebert et al., 2003; Kaur, 2017; Kaur et al., 2013; Leung, 2002, 2017; Zhu & Kaiser, 2019)? Additionally, it is unclear whether the different instruments used to measure teaching quality are suitable for different contexts or not. Mathematics education is always context-dependent and refers to different normative foundations, which is why cultural differences always exist (e.g., Kaur et al., 2013).

3.1.7 Relationship between teachers’ professional competences and teaching quality?

A further open question concerns the relationship between the professional competences of the teacher and the quality of teaching (Jentsch et al., 2021; König et al., 2021). Do teachers with high professional competences teach better than others? We hope so, but do we know? And which facets of mathematical pedagogical knowledge (e.g., Baumert & Kunter, 2013a; Hill et al., 2008b) are particularly important to predict a high quality of mathematics teaching?

3.1.8 Are all characteristics equally important?

It might be important to investigate whether complete correctness of mathematical content is always important for deep understanding or if there is a level of sufficient correctness that is not absolute. Do there exist aspects of quality that are more important than others and can compensate for deficiencies in another quality aspect? Is there a hierarchical structure of instructional quality? All these questions are still open.

3.2 Envisioning the relationship between two continua

Quality in mathematics teaching can be considered along a continuum or spectrum between generic and specific (subject- and/or topic-specific) features of instruction. There is implicit recognition in the field that extremes (in both directions) along this continuum offer diminishing returns in our ability to describe high quality teaching. For example, if we consider quality of teaching in only the broadest and most generic sense, this is problematic because it does not take into account the quality of a professional treatment of a specific topic. But if we advance along the continuum toward maximum specificity and only consider teaching quality from a lesson-by-lesson or moment-by-moment perspective, this is similarly problematic from a measurement perspective, as each subject and topic would then need its own instruments. It seems clear that the more specific the perspective and the instrument chosen, the smaller their scope. As a result, existing instruments all lie toward the middle, incorporating various degrees of generality and specificity. In studies of teaching quality, researchers should be expected to use instruments whose specificity and/or generality is a suitable match for their research objectives.

A second dimension that is useful in representing instruments that assess teaching quality foregrounds the continuum between a focus on generic and specific aspects of quality and shows the different ways that instruments conceive of the relationship between generic and specific aspects of quality (as discussed above). A “pure” instrument focuses only on one perspective—generic or subject- or topic-specific. An “additive” instrument supplements one particular perspective with (a few) features of the other. An “integrative” instrument operationalizes the features of one perspective along the others as well. An “inclusive” instrument combines the two perspectives depending on the research goal and creates an interdisciplinary appropriate instrument.

Attempting to place an instrument in this schematic can help to make the point of view and the methodological approach of researchers and studies reasonable and comprehensible. This might foster a differentiated and multi-perspective discourse on teaching quality. But these two dimensions are not the only ones that could be taken into account. There is a variety of open questions that illustrate the areas in which a critical dialogue and additional research will be needed.

4 Introduction to the papers in this special issue

The papers in this special issue, from author teams who have served as leaders in investigations of teaching quality in mathematics, explore several issues described above and represent a significant step forward in attempting to address the unanswered questions and challenges that we have noted. The papers generally fall into four categories.

First, several papers investigate challenges that are inherent to the assessment of teaching quality. These challenges include whether teaching quality should be determined based on rating of live lessons or videotaped lessons (Jentsch et al., this issue); the relative weight of content-generic and content-specific aspects of instructional quality in assessments (Kyriakides et al., this issue; Besser et al., this issue); and whether the relationship between content-specific and content-generic aspects of teaching quality may vary depending on the content area (Rogh et al., this issue).

A second group of papers explores issues that arise when researchers attempt to incorporate students’ perceptions of classroom lessons into assessments of teaching quality. Among these issues are interpreting the heterogeneity of views that often emerge when students’ perceptions of lesson quality are taken into account, both within a school (Lindermayer et al., this issue) and also across several countries (Liu et al., this issue); and the relationship between students’ perceptions and experts’ perceptions of teaching quality (Pauli et al., this issue).

This special issue also includes papers that examine the relationship between particular aspects of teaching and teaching quality. These aspects of teaching include teachers’ use of mathematical questions (Torbeyns et al., this issue) and adaptive learning support (Dunekacke et al., this issue). Authors also investigate the complex relationship between culture, school context, and teaching quality, including the relationship between culturally-responsive teaching and other measures of high-quality teaching (Thomas et al., this issue), between school context and teaching quality (Quabeck et al., this issue), and between culture and high-quality use of representations (Dreher et al., this issue).

Finally, special issue paper authors also explore various aspects of mathematical content and tasks that are instrumental in determining and assessing teaching quality. These aspects include teachers’ judgements about the differentiation potential of mathematical tasks (Bardy et al., this issue), the relationship between teachers’ evaluation of tasks and teachers’ competence (Glegola, Jentsch, Ross, König, and Kaiser, this issue), the relationship between pedagogy and tasks in determining teaching quality (Litke et al., this issue), and dimensions of content and pedagogy that are associated with high quality teaching.

5 Conclusion

In sum, the quality of mathematics teaching includes aspects that affect the development of the students’ achievement positively as well as normative aspects of “good” teaching. These aspects are not only based in a generic perspective on teaching but also in a more subject-specific view. Furthermore, the subject-specific features can be further differentiated and supplemented by topic-specific aspects and characteristics that refer to certain mathematical processes or particular learning situations. Therefore, it can be helpful to situate one’s perspective with respect to the research objective on a continuum between genericity and subject- and topic-specificity.

In this paper we focus particularly on German and U.S. literature and concepts of teaching quality. This can be seen as a limitation. With this specific and limited focus, we can provide a reflection on various important aspects that should be taken into account in future research. But we cannot conclude that these aspects matter in culturally different mathematics teaching in an equal or similar way.

Even if we know several effective and relevant characteristics of high quality mathematics teaching, there are still a variety of important open questions. These open questions can be used to guide the elaboration of a future research agenda which should include a more interdisciplinary perspective and a deep collaboration between researchers of different disciplines.