Introduction: Need for investigating and promoting teacher expertise for fostering at-risk students’ understanding

Mathematics is cumulative content in which basic concepts from earlier years are crucial for understanding later content; for example, understanding decimal numbers in Grade 6 requires place value understanding for natural numbers from Grade 2 (van de Walle, 2007). Students who have not acquired basic concepts (such as place value understanding or the meaning of multiplication and division) have been empirically shown to be at risk of being left behind in their later school career unless they get a second chance to understand these basic concepts (Andersson, 2010; Moser Opitz, 2007).

Over recent decades, several intervention programs have shown efficacy in fostering at-risk students’ understanding of basic concepts (Brophy, 1996; Gersten et al., 2009; Slavin & Madden, 1989). However, most of these intervention programs were evaluated only in highly controlled trials under laboratory conditions with small groups of well-qualified tutors, whereas professionalizing teachers at the scale necessary for conducting the interventions remains an ongoing challenge (Karsenty, 2010).

Although some intervention programs have also been implemented successfully in field studies with regular teachers (Slavin & Madden, 1989; Prediger et al., 2019; Cobb & Jackson, 2021), their scaled-up implementation seems to have been hindered by a research gap on the specific expertise that teachers need for fostering at-risk students’ understanding (Scherer, Beswick, DeBlois, Healy, & Moser Opitz, 2016) and by a lack of evaluated professional development (PD) programs that can initiate professional growth of this specific expertise (Cobb & Jackson, 2021; Karsenty, 2010).

This article reports on a PD research project that contributes to this research demand by (1) suggesting a conceptual model of teacher expertise for fostering at-risk students’ understanding of basic concepts, (2) presenting a PD program for initiating professional growth in this expertise, (3) empirically evaluating the initiated professional growth, and (4) identifying the most crucial obstacles in the growth of teachers’ practices.

The article starts with the background of the innovation in view by defining the terms at-risk students and understanding of basic concepts and by giving a brief overview of existing intervention programs fostering at-risk students’ understanding of basic concepts. The second section suggests a conceptual model for the respective content-related teacher expertise that synthesizes different research traditions of teachers’ relevant orientations and practices. The third section presents the PD program Mastering Math (presented in more detail by Prediger et al., 2019), before the fourth section outlines the research design aimed at evaluating the growth of teachers’ expertise. Finally, the findings are presented and discussed with respect to the identified challenges.

State of research on fostering at-risk students’ understanding of basic concepts and teacher expertise in general

Prevalence of at-risk students and learning needs in understanding basic concepts

Slavin and Madden (1989) defined an at-risk student as “one who is in danger of failing to complete his or her education with an adequate level of skills. Risk factors include low achievement, retention in grade, behavior problems, poor attendance, low socio-economic status, and attendance at schools with large numbers of poor students” (p. 4). Thirty years later, large-scale assessments still show the high prevalence of students who reach only a risky competence level (OECD, 2016) and add language proficiency as another critical background factor (Paetsch et al., 2016). In Germany (the context of the presented PD research), 24.3% of students are at risk in Grade 9 (Stanat et al., 2019) and 15.4% are already at risk in Grade 3 (Stanat et al., 2017), not including the 3–4% of students with certified learning disabilities (Scherer et al., 2016) and the 4% with other special educational needs (e.g., deafness and severe intellectual impairments).

Within the last 30 years, mathematics education (and special education) research has identified typical mathematical learning needs for at-risk students. Whereas early research mainly emphasized the relevance of basic skills underlying the current topics, increasing consensus emerged that skills must be intertwined with the understanding of basic concepts (Moser Opitz, 2007; Maccini et al., 2007; Andersson, 2010; Cobb & Jackson, 2021; for all students, see Kilpatrick et al., 2001).

All students’ learning progressions need to cover both conceptual understanding and procedural skills, for both the current topics and for foundations from previous years (see Fig. 1 for a typical learning progression indicated by the curvilinear arrow).

Fig. 1

Learning progressions from understanding basic concepts via basic skills and more complex conceptual understanding to complex procedures (shortened from Prediger, 2020)

Figure 2 shows an example of how the four different domains from Fig. 1 can be intertwined in compacted procedures and concepts: Understanding the procedure of multiplying decimal numbers (“new procedures” in Fig. 1) requires the decomposition of the decimal and natural numbers (place value understanding for decimal numbers as current conceptual understanding and place value understanding for natural numbers as understanding of basic concepts) into their digits and digit-wise multiplication (basic skills). Basic concepts involved here for at-risk students include the meaning of multiplication and place value concepts, with their different knowledge elements such as the positional property (the place of a digit determines its value), the additive property (two-digit numbers can be decomposed into tens and ones), and the multiplicative property for the place values (the second and third digits count the bundled units of tens and hundreds; van de Walle, 2007; Ross, 1989).
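
The decomposition described above can be made concrete with a small worked example. The following Python sketch is an illustration added here, not part of the Mastering Math materials: the function names `place_value_parts` and `partial_products` and the example 3.4 × 27 are assumptions chosen only for this purpose. It unfolds each factor into its place-value parts (positional and additive properties) and then multiplies the parts pairwise, so that the product appears as a sum of partial products.

```python
from decimal import Decimal


def place_value_parts(number):
    """Unfold a non-negative decimal string into its place-value parts.

    Hypothetical helper for illustration: '3.4' -> [3, 0.4], '27' -> [20, 7].
    """
    _sign, digits, exponent = Decimal(number).as_tuple()
    parts = []
    for index, digit in enumerate(digits):
        if digit == 0:
            continue  # empty places contribute nothing to the sum
        # Positional property: the place of a digit determines its value.
        power = len(digits) - 1 - index + exponent
        parts.append(Decimal(digit) * Decimal(10) ** power)
    # Additive property: the number equals the sum of these parts.
    return parts


def partial_products(a, b):
    """Multiply every place-value part of a by every part of b."""
    return [x * y for x in place_value_parts(a) for y in place_value_parts(b)]


if __name__ == "__main__":
    products = partial_products("3.4", "27")
    print(products, "sum:", sum(products))
```

For 3.4 × 27, the factors unfold into 3 + 0.4 and 20 + 7, and the four partial products 3 × 20, 3 × 7, 0.4 × 20, and 0.4 × 7 add up to 91.8. Each partial product relies on a digit-wise multiplication (basic skill) combined with place value understanding for natural and decimal numbers.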

Fig. 2

Understanding as a compacted network of connected knowledge elements: Overall categories of compacted procedures and concepts in current content and foundations (elements marked with black margins) can be unfolded into subcategories with more detailed elements (marked with grey margins)

The cumulative nature of mathematical concepts and procedures condensing earlier concepts and skills is particularly critical with respect to at-risk students because teachers still tend to believe that low-achieving students should concentrate on procedural skill development (Beswick, 2007; Boaler, 2002; Wilhelm et al., 2017). However, missing conceptual learning opportunities can hinder the learning progression in later years. This longitudinal relevance of basic concepts for at-risk students has been underlined by empirical findings that students who have not understood the basic concepts of arithmetic mentioned above (place value understanding, meaning of multiplication and division) by Grade 4 have difficulty progressing along the learning progression through Grades 5 to 8 (Moser Opitz, 2007; Andersson, 2010; Maccini, Mulcahy, & Wilson, 2007). Ignoring the cumulative nature of the learning progression therefore risks leaving students behind.

Intervention programs for at-risk students and the relevance of teachers

In their seminal review of interventions for at-risk students, Slavin and Madden (1989) characterized effective interventions as (1) well-planned, comprehensive programs with curriculum material and teacher guides, (2) intensive teaching, for instance, by one-to-one or small-group teaching, and (3) programs that frequently assess student progress and adapt instruction to individual needs. Beyond these generic conditions, Gersten et al. (2009) specified instructional components that proved effective for at-risk students, among them using graphical representations and manipulatives, thoughtfully selecting and sequencing instructional examples, and encouraging students to verbalize their own strategies or the strategies modeled by the teacher. According to the research findings reported in the previous subsection, a focus on basic concepts must be added to this list of characteristics for effective interventions (Andersson, 2010; Moser Opitz et al., 2017; Cobb & Jackson, 2021).

The high relevance of tutorial programs for at-risk students as interventions supplemental to regular classrooms stems from the need for curricular adaptivity with respect to students’ learning progress (Janney & Snell, 2006; Karsenty, 2010). This often involves the need to remediate the conceptual foundations from previous years rather than to focus on the current content prescribed by the syllabus (Fig. 1). Supplemental tutorial programs in small groups have the additional strength of allowing intense communication, which is crucial for oral monitoring of students’ learning progress as well as for engaging the students in cognitively demanding, conceptually oriented discourse practices (Henningsen & Stein, 1997).

Based on these identified content needs and on findings about how to design tutorial programs for at-risk students (Gersten et al., 2009; Moser Opitz et al., 2017; Slavin & Madden, 1989), the intervention program Mastering Math was developed and investigated in iterative design research cycles (Selter et al., 2014) for basic arithmetic understanding of fifth graders. It is based on three design principles:

  • focusing on the understanding of basic concepts

  • allowing deep insights into student thinking using rich diagnostic tasks

  • promoting discourse in teacher-moderated small groups

The Mastering Math tutorial program was shown to be effective in fostering students’ understanding of basic concepts, with significantly higher learning gains than the control group taught with other intervention programs (Prediger et al., 2019). However, the learning gains varied substantially and relied heavily on the teachers’ practices in enacting the program.

Other research projects have also identified teachers as crucial for reproducing the proven efficacy of small-group tutorial programs (Cobb & Jackson, 2021; Gersten et al., 2009; Slavin & Madden, 1989). Haycock (1998) emphasized that teachers’ expertise matters, especially for the learning progress of at-risk students. Karsenty (2010) emphasized that “teaching mathematics to at-risk students is therefore a specific area of expertise, which is by no means an easy undertaking” (p. 5). Chazan (1996) also outlined that “even in the best of circumstances … the job of teaching … students who have not been successful in mathematics will remain a difficult challenge for those teachers willing to take it on” (p. 475).

To prepare (non-professional) tutors for this challenging task, Karsenty (2010) developed and documented a PD approach with a strong focus on promoting tutors’ pedagogical content knowledge about conceptual understanding of the mathematical concepts in view, including the related representations and students’ difficulties, as well as teaching strategies for overcoming them (see Fig. 3). Based on promising first indications for the program’s efficacy, Karsenty (2010) called for more PD research to investigate how teachers or even non-professional tutors can develop their expertise in fostering at-risk students’ conceptual understanding.

Fig. 3

PD approach for preparing tutors for teaching at-risk students (Karsenty, 2010, p. 10)

The current study pursues this call for further research. We build upon Karsenty’s (2010) work in reproducing the strong mathematical focus in our PD program accompanying the Mastering Math program. The foundation of this PD program is a conceptual model of teacher expertise that comprises more than the tutors’ pedagogical content knowledge about basic concepts. This model is introduced in the next section.

Conceptualizing teachers’ expertise for fostering at-risk students’ understanding of basic concepts

To present our conceptual model of teachers’ expertise for fostering at-risk students’ understanding of basic concepts, we first outline the general framework in which the model will be articulated and then the model itself, together with further research findings warranting the relevance of its components.

General conceptual framework for content-related teacher expertise

The search for a conceptual framework that allows design researchers to specify and describe teacher expertise in a certain area of PD content was inspired by the research synthesis of Goldsmith et al. (2014), who criticized the lack of content-related PD research that goes into the details of different areas of PD content.

To compile a general conceptual framework as a search space for content-related teacher expertise, Prediger (2019) synthesized generic frameworks by Bromme (1992) and Schoenfeld (2010) and specified how to substantiate them for different areas of PD content (e.g., language-responsive mathematics teaching in Prediger, 2019).

The framework combines a situated perspective on teachers’ classroom practices with a cognitive perspective on underlying knowledge categories and orientations (Depaepe et al., 2013), two domains formerly considered complementary (Blömeke et al., 2015) yet often unconnected. Based on early ideas of Bromme (1992), the general framework considers teachers’ situated practices to cope with situational demands in classroom situations and their interplay with the underlying orientations, categories, and concrete pedagogical tools. More precisely, five constructs of the framework for content-specific teacher expertise are defined as follows:

  • Jobs: typical and often complex situational demands that teachers have to master in classrooms, here in particular those relevant for the PD content in view.

  • Practices: recurrent patterns of teachers’ utterances and actions for managing the jobs. Teachers’ practices can be characterized by the underlying categories, pedagogical tools, and orientations on which the teacher implicitly or explicitly draws:

  • Pedagogical tools: concrete and visible tools applied to manage the jobs (e.g., facilitation moves enacted, diagnostic tasks, manipulatives, or other didactical artifacts).

  • Categories: Conceptual (i.e., non-propositional) knowledge elements that filter and focus teachers’ perceiving and thinking. Although generic pedagogical knowledge provides relevant categories too, we focus here on the content categories that teachers explicitly or implicitly choose as their filters for perceiving and thinking (“conceptual tools” in the framework of Grossman et al., 1999).

  • Orientations: Generic or content-related beliefs and pedagogical attitudes about mathematics and its teaching and learning that implicitly or explicitly guide the teacher’s perception and prioritization of jobs (see Schoenfeld, 2010, p. 29).

The general conceptual framework can be used to determine which categories and orientations are required for or are used by teachers. In a descriptive mode, the practices of teachers are observed and analyzed with respect to the pedagogical tools, categories, and orientations they use for managing certain jobs. Complementarily, in a prescriptive mode, PD design researchers determine the pedagogical tools, categories, and orientations that expert teachers should use (as Bass & Ball, 2004, suggested in their job analysis). The pathway from the current to the intended expertise can be unpacked by specifying the necessary orientations for teachers’ practices (both current and intended) to cope with the jobs and by identifying the underlying categories that should or do guide their practices.

Substantiating the model of teacher expertise for fostering at-risk students’ understanding of basic concepts

The general framework was used to substantiate the conceptual model of teacher expertise for the PD content in view in this paper: fostering at-risk students’ understanding of basic concepts. It was developed during several years of practical PD work and succeeding years of PD design research, in an iterated interplay of thorough literature review and qualitative case studies of PD processes (as documented in Prediger, 2020; Prediger & Buró, 2021). Three main jobs are the starting point of our specification:

  • Specify learning content (in basic concepts)

  • Monitor students’ learning progress (in basic concepts)

  • Enhance students’ understanding (of basic concepts)

For each job, teachers can enact multiple practices. The practices for the three jobs are highly intertwined, because what teachers focus on while monitoring or enhancing students’ understanding is guided by underlying orientations and the content categories they (perhaps implicitly) choose to specify the learning content. The relevant content categories in our context are given in Figs. 1 and 2. For example, a teacher who focuses only on the content category of multiplication procedures while specifying the learning content cannot monitor details in students’ understanding of basic concepts. In contrast, a teacher who is able to unpack condensed concepts into detailed categories for different concept elements (e.g., unpacking place value understanding into the positional property, the multiplicative property, and the additive property) can set these categories as learning goals and enhance students’ understanding towards these detailed and targeted goals. This is why activating detailed concept categories shapes the teachers’ practices for all three jobs and must therefore be treated in the PD (Karsenty, 2010; Morris et al., 2009).

The addressed categories that guide teachers’ practices are not only shaped by the teachers’ pedagogical content knowledge but also by the teachers’ underlying orientations. It is therefore important for planning and evaluating a PD program to consider teachers’ multiple practices for managing these three jobs and practices identified as less productive in the reported classroom research (presented in Sect. 1), as they may display the addressed categories, pedagogical content knowledge, and orientations.

From the huge and heterogeneous body of research and our own qualitative research about teachers’ multiple practices in working with at-risk students and the underlying orientations that have been identified, we extracted four main pairs of orientations underlying the observed practices:

  • Compass: diagnostic or syllabus-bound orientation

  • Content: conceptual or procedural orientation

  • Goals: long-term or short-term orientation

  • Pedagogy: communicative or individualistic orientation

In the following, we briefly report the state of research on these pairs of orientations and their articulation in practices for managing the three jobs with certain content categories. In each pair, the first orientation has been shown to be more productive than the second, but it must be emphasized that the paired orientations must not be interpreted as mutually exclusive, as most people combine both orientations and activate them depending on the situation. As an advance organizer, Fig. 4 summarizes typical practices enacted for each job in the different orientations. We start by presenting the orientations with the job specifying learning content, then continue by switching to enhancing students’ understanding and, after that, monitoring, as the monitoring practices depend on both.

Fig. 4

Excerpt of the conceptual model for teacher expertise in fostering at-risk students’ understanding: typical practices for three jobs and four pairs of orientations (grey practices not considered in the empirical part due to time restrictions in teachers’ questionnaires)

Different practices of specifying learning content

In general, various studies have indicated the high relevance of the job of specifying the learning content. This job entails two sub-jobs, starting with unpacking the learning content from the highly compacted content into detailed pedagogical content categories for the knowledge elements in view (Karsenty, 2010; Morris et al., 2009; Pillay & Adler, 2015). Based on these unpacking practices, the second sub-job is setting learning goals. The teachers’ explicit or implicit goal-setting practices are guided by different orientations:

  • In a syllabus-bound orientation, the learning goals are drawn from the official syllabus (perhaps concretized in the textbooks) without strongly considering at-risk students’ learning needs. In contrast, more adaptive goal-setting practices (Janney & Snell, 2006, speak about curricular adaptivity; Krähenmann, Moser Opitz, Schnepel, & Stöckli, 2019, emphasize the differentiated goal setting) rely upon a diagnostic orientation, which tightly connects practices for unpacking learning content and goal-setting practices to monitoring students’ learning progress. For at-risk students, a diagnostic orientation in specifying the learning content is of particular importance as teachers cannot presume that they have already reached all official syllabus goals; rather, they need to go back to the basics.

  • In a procedural orientation, mainly procedural knowledge elements are unpacked when the required categories are addressed and prioritized as learning goals, whereas in a conceptual orientation, mainly conceptual knowledge elements are unpacked through activating categories and prioritized as learning goals. In spite of the empirical results on most at-risk students’ principal abilities to achieve conceptual learning goals when provided with suitable learning opportunities, many teachers have been shown to prioritize procedural orientations for vulnerable students. These procedural goal-setting practices that are guided by low expectations limit these students’ conceptual learning opportunities and turn them into at-risk students (Beswick, 2007; Wilhelm et al., 2017; Zohar et al., 2001).

  • In her seminal paper in personality psychology, Dweck (1996) distinguished two forms of setting achievement goals: mastery orientation (which is described as a focus on the long-term learning progress) and performance orientation (referring to the short-term demonstration of competence without improving mastery). For the context of teachers specifying learning content for at-risk students, we adapt this classical distinction between the performance and mastery orientations to a short-term orientation on the current content related to short-term categories, involving short-term repair practices aiming at quick success (an orientation that dominates many remediating tutoring groups) and a long-term orientation on long-term categories, involving long-term foundation practices focusing on basic concepts and skills from previous years (which is strongly advised by the experts; see Moser Opitz, 2007; Watson & Geest, 2005). Teachers enacting long-term foundation practices focus their goal-setting practices on categories involving basic concepts and basic skills, in contrast to short-term repair practices that exclusively focus on current content (see Fig. 1 for the distinction between categories for deep progress on current content and basic foundations).

Different practices for monitoring students’ learning progress

Teachers need to monitor students’ learning progress in particular for adaptive enhancement practices, but these monitoring practices can still differ substantially according to the underlying orientation (see the middle column of Fig. 4).

  • In a syllabus-bound orientation, monitoring is not necessary, whereas the diagnostic orientation is defined by taking monitoring seriously. The importance of monitoring practices has been underlined by many researchers in the last decades. While using different terms and prioritizations (e.g., listening to students, Empson & Jacobs, 2008; diagnostic competence, Leuders, Philipp, & Leuders, 2018; noticing, Sherin et al., 2011; dealing with errors, Brodie, 2013), all these approaches emphasize the need for teachers to actively engage in working with diagnostic information on at-risk students’ ideas, conceptions, or learning progress.

  • The pedagogy can influence the mode of monitoring. Within an individualistic orientation, the monitoring is mainly realized by written formative assessments, whereas a communicative orientation also allows teachers to use oral communication for spontaneously investigating student thinking (Empson & Jacobs, 2008). Both practices can be useful and can be combined.

  • The content is crucial for successfully managing the monitoring job, and it is again directly related to the specification of learning goals. A conceptual orientation consists mainly of assessing conceptual understanding in the conceptual diagnostic practices by activating conceptual categories, while a procedural orientation consists mainly of assessing the fluency of skills by activating procedural categories. Teachers of at-risk students tend to adopt mainly procedural diagnostic practices; for example, Son (2013) showed that teachers mostly identified conceptual obstacles as being procedural ones.

  • Finally, a short-term orientation shapes monitoring as determining how to bring students to task completion in the current content with fitting categories underlying the practices (Watson & Geest, 2005; Prediger, 2020), whereas a long-term orientation towards learning progress makes it necessary to identify, monitor, and articulate the foundation from previous years with relevant categories in order to apply enhancement practices aiming at individual learning progress.

Different practices for enhancing students’ learning

The orientations applied for specifying learning content (unpacking and goal setting) often also shape the practices for the job of enhancing students’ understanding operating with the same categories for their mathematical focus. These categories must be set up as part of teachers’ knowledge and then be addressed explicitly or implicitly by the teachers. The jobs of specifying learning content and enhancing students’ learning are tightly connected, as teachers will only promote what they have chosen as relevant learning content, in particular when a focus on basics is needed (see Fig. 4).

  • Many studies have revealed that, in particular, teachers of at-risk students hold procedural orientations, where learning opportunities are provided for fluency in basic or advanced skills in line with fitting categories rather than, as would be intended in a conceptual orientation, for understanding of basic or advanced concepts in line with fitting categories (Beswick, 2007; Boyd & Bargerhuff, 2009; Wilhelm et al., 2017).

  • The same applies to the so-called compass: Teachers with a diagnostic orientation (called “ethical compass” or “student-led compass” by Gheyssens, Couberg, Griful-Freixenet, Engels, & Struyven, 2020) have been shown to adapt their teaching to students’ learning progress (called “curricular adaptivity” by Janney & Snell, 2006). In contrast, teachers with a syllabus-bound orientation (called “syllabus-led compass” by Gheyssens et al., 2020) teach what the syllabus suggests, even if this does not meet the at-risk students’ needs (Karsenty, 2010). When they do make adaptations, these are only so-called instructional adaptations that focus on simplifying access to the task (Janney & Snell, 2006), which leads to the next pair of practices.

  • Prediger and Buró (2021) elaborated the distinction between enhancement practices and compensation practices according to the kind of adaptivity and underlying orientation and category: Enhancement practices aim at students’ learning progress in a long-term orientation (“mastery orientation,” according to Dweck, 1996). When adaptivity is necessary due to the diagnostic orientation, it is realized as curricular adaptivity with long-term categories in view (Janney & Snell, 2006). In contrast, compensation practices also try to realize adaptivity, but with a focus on task completion instead of learning progress and on categories relevant to completing the task. This second adaptation practice has been described by Corno (2008) as circumventing missing readiness; it can be highly relevant when compensating for a student’s difficulties in a general ability (such as reading) supports access to mathematical enhancement. However, compensation practices fall short if they scaffold away all mathematical demands (Henningsen & Stein, 1997). Hence, since compensation is another counterpart to practices in mastery orientation that deviates from performance orientation in its tools and aims, the terms “long-term orientation” and “short-term orientation” are more appropriate for our context than the popular terms “mastery orientation” and “performance orientation.”

  • Many researchers have emphasized a connection between pedagogy and the possible learning goals pursued: Various studies have shown that at-risk students can develop conceptual understanding much better in communicative pedagogies when the teachers engage them in rich discourses, whereas higher achieving students can also have inner discourses (Moschkovich, 2015). Accordingly, completely individualized pedagogies where each student works independently on individual tasks have been shown to often coincide with procedural goal-setting practices (Krähenmann et al., 2019), as conceptual learning goals are harder to pursue without communication, at least for many at-risk students.

Promoting teachers’ growth of expertise: the Mastering Math PD program

Based on the presented theoretical background, the Mastering Math PD program was initially developed in 2014 and iteratively refined in several cycles to promote teachers’ growth of expertise for fostering at-risk students’ understanding. The PD program lasts for 2 years and is rolled out for middle school mathematics teachers or out-of-field teachers teaching mathematics tutoring groups (with teaching certificates only in other subjects, e.g., music).

As presented in detail in Prediger et al. (2019), the PD program was based upon solid research findings on the relevance of teacher collaboration (Borko & Potari, 2020; Cobb & Jackson, 2021), so it is organized in school networks, in other words, in teacher groups from five schools each, which form stable professional learning communities meeting with a facilitator every 6 weeks to discuss the experiences and prepare the next module.

Figure 5 connects the key components of the PD program in the model of professional growth (Clarke & Hollingsworth, 2002). The two external sources provided for reproducing Karsenty’s (2010) strong PD focus on unpacking pedagogical content knowledge about basic concepts are facilitated network meetings every 6 weeks and curriculum materials with teacher guides. Both of these provide access to all relevant content categories and their connections (Selter et al., 2014) for the following basic arithmetic concepts for at-risk fifth graders: understanding place value in the place value table and on the number line, connecting representations for addition/subtraction, understanding meanings and connecting representations for multiplication/division, and learning multiple calculation strategies for two-digit numbers in the first year (in view of our evaluation study) and for fractions and decimal numbers in the second year.

Fig. 5

PD environment in the Mastering Math PD program articulated in the interconnected model of professional growth (adapted from Clarke & Hollingsworth, 2002)

Based on findings that interventions for at-risk students should be supported by well-planned, comprehensive programs with curriculum material and teacher guides (Slavin & Madden, 1989), curriculum material for the tutorial groups was provided for all topics in 30 modules, each starting with a 10-min formative assessment and tasks for the teacher-led tutorial group instruction for enhancing students’ understanding. Figure 6 shows how the teacher guide provides PD opportunities for pedagogical content knowledge while monitoring students’ products in the formative assessment.

Fig. 6

Extract from Mastering Math teacher guide on monitoring student understanding in a formative assessment, here for place value understanding (Selter et al., 2014)

With this important material support (Slavin & Madden, 1989; Swan, 2007), the teachers’ cooperative inquiries into students’ thinking are stimulated in biweekly school team meetings. In this way, the collective inquiry with innovative teaching practices in the newly established tutorial groups is encouraged, as suggested by Clarke and Hollingsworth’s (2002) model and Cobb and Jackson’s (2021) focus on at-risk students.

Taking into account the high relevance of teachers’ orientations and practices, the teacher groups’ inquiries into the domain of practices are systematically connected to intense reflections in the domain of outcomes during the PD sessions in school network meetings every 6 weeks (Karsenty, 2010). With collective diagnostic activities on the teachers’ classes’ formative assessments, unpacking activities for their mathematical backgrounds, and planning activities for the tutorial groups, teachers are prepared for mastering the jobs for each of the basic concepts. The orientations are repeatedly discussed in these PD sessions, especially with respect to the particular needs of at-risk students and providing long-term, conceptually rich learning opportunities in communicative pedagogy.

The multilayered PD design of the PD program is intended to develop teachers’ practices together with the teachers’ four pairs of orientations (Fig. 4) and content categories (Figs. 2 and 3) in the personal domain. This means that within the interconnected model of professional growth, we provide multiple input, enactment, and reflection opportunities, as indicated by the arrows in Fig. 5. In the next sections, we report on the evaluation study, investigating the effects on teachers’ expertise.

Methods for the empirical evaluation study

Research question and research design of the evaluation

The evaluation study for the PD program was designed to pursue the following research question:

To what extent can the Mastering Math PD program change teachers’ expertise in fostering at-risk students’ understanding, as operationalized in practices (which are self-reported) and underlying orientations as well as in addressed categories?

The evaluation study was conducted using a pre–post-design with teacher questionnaires that were administered at the beginning of the PD in September (pre-PD) and 9 months later in June (post-PD). No control group was investigated, as earlier studies had shown no change of practices and orientations in no-treatment control groups (Prediger et al., 2019).

Methods of data gathering

Investigating the growth of teacher expertise in an evaluation study using video-based classroom observation of actually enacted practices would have been laborious and would not have been possible for this sample size. Instead, approximations of enacted practices were captured pre-PD and post-PD, both times in two complementary ways: First, a standardized self-report questionnaire revealed the wider picture of various practices and orientations for the jobs of specifying areas of learning content (in particular, goal setting) and enhancing students’ learning. Second, a vignette-based activity eliciting teachers’ diagnostic judgments revealed an in-depth view of practices-in-action in unpacking learning content and monitoring students’ progress: This instrument elicited teachers’ content categories and allowed the researchers to infer the underlying orientations.

Vignette-based open diagnostic judgement tasks capturing content specifying and monitoring practices

Parallel open diagnostic vignettes were administered pre-PD and post-PD to capture teachers’ practices-in-action in a situated way (Blömeke et al., 2015) by eliciting categories for specifying learning content and monitoring students’ understanding. Extracts from one of the vignettes are shown in Fig. 7. To simulate an oral monitoring situation as a typical job in tutorial groups, the prompt asks teachers to provide diagnostic judgements on students’ resources and obstacles. The vignette starts with a typical error in multi-digit multiplication and expands to a dialogue on the basic meaning of multiplication. Its content spans from the procedure for the current learning content in Grade 6 (multiplication algorithms of decimal numbers) to the procedure and understanding of natural numbers from Grade 3 into which it can be unpacked (see Fig. 2), with various content categories (procedural elements such as decomposition into sub-products and automatized multiplication facts as well as compacted basic concepts and their unpacking into more detailed concept elements). Figure 7 also lists the maximum set of possible content categories, which were identified by experts prior to the investigation (Dröse & Prediger, submitted) and were not visible to teachers on the vignette-based questionnaire.

Fig. 7

Diagnostic pre-PD vignette and its analysis by experts with respect to relevant content categories

Questionnaire for self-reported practices

Teachers’ self-reported practices for the three jobs (as drawn from the literature review) were captured in a standardized questionnaire structured according to the reported four pairs of orientations. For each orientation, one to three items were developed and administered on 6-point Likert scales (1 = strongly disagree, 6 = strongly agree). Examples are listed in Table 1.

Table 1 Practices captured in self-reports or in-action for three different jobs

For scales with three items, the internal consistencies are reported in Table 1. Given the small number of items, we consider the internal consistencies to be acceptable enough (α between 0.52 and 0.81, with one exception) to report means and standard deviations for these scales. In contrast, reporting the dual orientations on one scale would not have been adequate as many dual orientations overlap in teachers’ acceptance.
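For readers who want to reproduce this kind of consistency check, the following is a minimal sketch of Cronbach's α in its standard variance-based form. The function name and the ratings are hypothetical illustrations, not the study's data or analysis script.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for one scale.

    item_scores: one row per respondent, one column per item
    (e.g., 6-point Likert ratings). Hypothetical helper, not the
    authors' analysis code.
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_var_sum = sum(variance(col) for col in zip(*item_scores))
    # Sample variance of the respondents' sum scores.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Invented ratings of five teachers on a three-item scale.
ratings = [[5, 4, 5], [3, 3, 4], [6, 5, 6], [2, 3, 2], [4, 4, 5]]
alpha = cronbach_alpha(ratings)
```

With only two or three items per scale, α tends to be low even for reasonably homogeneous items, which is consistent with the cautious interpretation above.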

Sample

The initial sample of the evaluation study consisted of 124 mathematics teachers who volunteered to participate in the Mastering Math PD program and experimented with a weekly tutorial group for remediating basic concepts in Grade 5. Of those participants, 95 teachers completed the standardized pre-PD and post-PD questionnaire (75 mathematics teachers and 20 out-of-field teachers with teaching certificates for subjects other than mathematics). A subsample of 63 teachers also completed both vignette-based diagnostic judgment tasks. The teachers had between 1 and 38 years of experience in math teaching (median = 8 years). They estimated that between 0 and 80% of their students were at-risk (median = 23.50%). Some of the schools were in underprivileged urban areas.

Methods of data analysis

Analysis of diagnostic judgments

A preliminary study with pre-service teachers (Dröse & Prediger, submitted) showed that teachers rarely identify all content categories identified by a group of experts (see Fig. 7, right column), and a good judgment is not necessarily complete, but focused on the most relevant elements of the basic concepts involved. Hence, the qualitative analysis of the diagnostic judgments was conducted in four steps (exemplified in Fig. 8):

Step 1. Extract the addressed categories from teachers’ diagnostic judgments; for example, the utterance “difficulties with decimal point/positions” refers to the category of place value understanding of decimals and “picture does not match expression” to the category of connecting graphical and symbolic representations.

Step 2. Classify the addressed categories according to the field being addressed (see Fig. 2) as shown in the headings in Fig. 7: current procedure, current concept, basic procedure, basic concept, and unpacked basic concept element.

Step 3. For all basic concepts or unpacked concept elements, evaluate adequacy of judgment for the turn in the transcript of the vignette.

Step 4. Determine the total of applied categories and calculate percentages for deriving the

  • degree of unpacking practices operationalized by the percentage of unpacked basic concept elements among all applied categories,

  • degree of conceptual and complementary procedural diagnostic practices operationalized by the percentage of procedures/concepts/unpacked conceptual elements among all applied categories,

  • degree of long-term and complementary short-term diagnostic practices operationalized by the percentage of basics/current content among all applied categories, and

  • degree of targeted diagnostic practices operationalized by the percentage of adequately noticed basic concepts among all applied categories on basic concepts.
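The percentages in Step 4 can be sketched as follows. This is only an illustration under stated assumptions: the class and function names are hypothetical, the field labels follow the headings named in Step 2, and unpacked basic concept elements are assumed to count both as conceptual and as basics.

```python
from dataclasses import dataclass
from typing import Optional

# Field labels as named in Step 2; the groupings below are assumptions.
UNPACKED = "unpacked basic concept element"
PROCEDURAL = {"current procedure", "basic procedure"}
BASICS = {"basic procedure", "basic concept", UNPACKED}
BASIC_CONCEPTS = {"basic concept", UNPACKED}


@dataclass(frozen=True)
class CodedCategory:
    field: str                       # result of Step 2
    adequate: Optional[bool] = None  # result of Step 3 (basic concepts only)


def percentage(part, whole):
    """Percentage of part in whole, or None if whole is empty."""
    return 100.0 * part / whole if whole else None


def diagnostic_degrees(codes):
    """Step 4: derive the four degrees from one teacher's coded judgment."""
    total = len(codes)
    if total == 0:
        raise ValueError("no categories were applied")
    fields = [c.field for c in codes]
    on_basic_concepts = [c for c in codes if c.field in BASIC_CONCEPTS]
    return {
        "unpacking": percentage(sum(f == UNPACKED for f in fields), total),
        "conceptual": percentage(sum(f not in PROCEDURAL for f in fields), total),
        "procedural": percentage(sum(f in PROCEDURAL for f in fields), total),
        "long_term": percentage(sum(f in BASICS for f in fields), total),
        "short_term": percentage(sum(f not in BASICS for f in fields), total),
        "targeted": percentage(
            sum(c.adequate is True for c in on_basic_concepts),
            len(on_basic_concepts),
        ),
    }


# Invented coding of one judgment with five applied categories.
example = [
    CodedCategory("current procedure"),
    CodedCategory("basic procedure"),
    CodedCategory("basic concept", adequate=True),
    CodedCategory(UNPACKED, adequate=True),
    CodedCategory(UNPACKED, adequate=False),
]
degrees = diagnostic_degrees(example)
```

Because each applied category is either procedural or conceptual, and either current or basic, the complementary degrees always sum to 100%.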

As the data were coded by two coders, we checked intercoder agreement for 25% of the material. The interrater reliability reached a Cohen’s kappa of 0.88.
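As an illustration of the agreement statistic, Cohen's kappa for two coders can be computed from the observed agreement and the agreement expected by chance. The sketch below uses the standard formula; the function name, the abbreviated labels, and the codings are invented for this example and are not the study's data.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labelled the same units."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of units")
    n = len(coder_a)
    # Proportion of units on which both coders chose the same category.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement from each coder's marginal category frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both coders used one identical category throughout
    return (observed - expected) / (1 - expected)


# Invented codings of eight utterances (labels abbreviate the fields).
coder_1 = ["bc", "bc", "bp", "ue", "ue", "cp", "bc", "ue"]
coder_2 = ["bc", "bc", "bp", "ue", "bc", "cp", "bc", "ue"]
kappa = cohens_kappa(coder_1, coder_2)
```

In this invented example the coders disagree on one of eight units, so the raw agreement of 87.5% is corrected downward for chance.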

Fig. 8

Example of pre-PD and post-PD diagnostic judgement of one teacher: Increase in addressed and unpacked basic concepts

Analysis of percentages and of standardized questionnaire

Descriptive results for all variables and percentages are reported with respect to the means and standard deviations pre-PD and post-PD. Pre–post-differences were tested using paired t-tests on a 5% level of significance for each orientation. Effect size d was calculated by relating the mean differences of subsamples to the pooled standard deviation (Cohen’s d).
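A minimal sketch of these two computations is given below. It assumes that the pooled standard deviation is the root of the mean of the pre-PD and post-PD variances (both groups have the same size in a paired sample); the function names and the scores are hypothetical. The p value is not computed here; it would be obtained from the t distribution with n − 1 degrees of freedom, for example with `scipy.stats.ttest_rel`.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t_statistic(pre, post):
    """t statistic of a paired t-test on post minus pre (df = n - 1)."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need at least two matched pre/post pairs")
    diffs = [b - a for a, b in zip(pre, post)]
    # Mean difference divided by its standard error.
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def cohens_d_pooled(pre, post):
    """Mean difference relative to the pooled standard deviation.

    Assumption: equal weighting of both variances, since the paired
    sample has the same n at both measurement points.
    """
    pooled_sd = sqrt((variance(pre) + variance(post)) / 2)
    return (mean(post) - mean(pre)) / pooled_sd


# Invented scale means of five teachers on a 6-point scale.
pre_scores = [3, 4, 2, 5, 4]
post_scores = [2, 3, 2, 4, 3]
t_value = paired_t_statistic(pre_scores, post_scores)
d_value = cohens_d_pooled(pre_scores, post_scores)
```

A negative d in this convention indicates that the practice was reported less often post-PD than pre-PD.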

Findings on growth and stability of teachers’ expertise

Change of teachers’ content specifying and monitoring practices in the diagnostic judgments

Pre-PD and post-PD diagnostic judgments were elicited by vignettes as shown in Fig. 7. Figure 8 displays the pre-PD and post-PD diagnostic judgements of one exemplary teacher and the steps of coding by the applied categories as well as the deduced degrees for the unpacking practices and monitoring practices. The comparison of pre-PD and post-PD codings reveals how the teacher developed her unpacking and monitoring practices in the intended direction, as more categories on basic concepts are addressed and unpacked into elements of the basic concepts. The teacher strengthened her conceptual diagnostic practices and long-term diagnostic practices on basics slightly, and the unpacking practices most substantially.

This exemplary insight into the case of one teacher is typical for the whole sample, as the frequencies in Table 2 show: In the pre-PD diagnostic judgments, teachers unpacked on average 52.1% of the addressed categories for basic concepts. In the post-PD diagnostic judgments, the teachers unpacked 71.9% of the explicitly addressed categories for basic concepts. The inferential statistics of the t-test reveal that this change is significant (with a small effect size of d = 0.38). We can interpret this result as evidence that teachers refined their specifications of the learning content with more detailed basic concept elements.

Table 2 Changes in teachers’ content specifying and monitoring practices in the diagnostic judgments (significant changes marked in bold)

In addition, the productive monitoring practices-in-action increased (and the unproductive practices decreased) significantly:

  • With respect to conceptual diagnostic practices, on average 58.6% of the categories addressed by teachers in the pre-PD diagnostic judgments were related to conceptual elements. This percentage significantly increased to 69.1% of the categories addressed in post-PD diagnostic judgement (p = 0.014, d = 0.35). Complementarily, the procedural diagnostic practices decreased from 41.4% to 30.9% (complement to 100%, not reported in Table 2).

  • With respect to long-term diagnostic practices, on average 78.3% of the addressed categories in the pre-PD diagnostic judgments referred to basics and 21.7% to current content. This relation remained stable with 79.1% basics and 20.9% current content in the post-PD. As the derived long-term diagnostic practices were already strong in the pre-PD diagnostic judgments, no further development was found during the PD.

  • Finally, the diagnostic practices on basic concepts also tended to improve in their degree of correctness, from 79.5% in the pre-PD to 85% in the post-PD diagnostic judgments, although the higher degrees of unpacking might pose more challenges for the evaluation. Although this spans 12% of SD, the change was not significant.

In total, Table 2 shows that the fine-grained PD work in the Mastering Math project (with the teacher communities on unpacking concept elements and their experimentation with tutoring groups of at-risk students) resulted in the intended growth of unpacking and monitoring practices with productive underlying orientations.

Change of teachers’ self-reports on practices

Whereas the previous section revealed substantial growth in teachers’ practices for the jobs of unpacking the learning content and monitoring learning progress with relevant content categories, other practices changed much less, as the results in Table 3 on teachers’ self-reported goal-setting and enhancement practices show. As the productive and unproductive orientations can overlap, both are reported separately.

Table 3 Change of practices captured in self-reports for different jobs and pairs of orientations (1 = do not agree at all, 6 = agree absolutely)

Teachers’ self-reports of conducting practices in conceptual orientation stayed stable on a quite high level for conceptual goal-setting practices (mpre = 5.07 in the pre-PD questionnaire and mpost = 5.14 in the post-PD questionnaire) and for conceptual enhancement practices (staying the same at 4.77 pre- and post-PD). The practices in procedural orientation developed differently: Whereas the procedural goal-setting practices were reported significantly less after the PD (from 3.55 to 3.01 with a small effect size of d = -0.27), the teachers’ self-reported procedural enhancement practices did not change.

Whereas the adaptive goal-setting practices in diagnostic orientation did not change, syllabus-bound goal-setting practices for the regular classroom increased significantly (with a moderate effect size of d = 0.73).

Also, for the long-term orientation, the intended long-term foundation practice remained stable, whereas for the short-term orientation, a significant decrease in teachers’ self-reported practices was found (with moderate effect size of d = -0.64).

With the increase in self-reported communicative pedagogy practices, the communicative orientation was the only productive orientation that changed significantly. While this effect was small (d = 0.26), the significant development towards less individualized pedagogy revealed a large effect size (d = -0.82).

The most surprising and unfavorable pre–post-development was found for the short-term orientation regarding the job of enhancing students’ understanding. While teachers reported similar enhancement practices in long-term orientation, they reported even more short-term compensation practices after participating in the PD, albeit with a small effect size (d = 0.34).

Conclusion

Summary and embedding of findings in the literature

As intervention programs for fostering at-risk students’ understanding of basic concepts always rely on teachers, PD researchers have called for more PD design research to promote teachers’ expertise in enacting such intervention programs (Karsenty, 2010; Slavin & Madden, 1989). So far, PD research has been rather unspecific in capturing what exactly teachers learn while developing this expertise. With the Mastering Math intervention program for tutoring groups (Selter et al., 2014), we were in the same situation: Although the overall program was assessed as having been successfully implemented with measurable effects on students’ achievement (Prediger et al., 2019), large differences in the teachers’ effectiveness suggested that we needed to investigate in more detail what teachers need to learn in the Mastering Math PD program and what they can learn within 9 months.

In this paper, we have presented (a) a conceptual model to specify the relevant teacher expertise, (b) the Mastering Math PD program we have developed for promoting this expertise (with components suggested by Cobb & Jackson, 2021), and (c) a detailed evaluation of the changes in expertise that we have achieved (or have not yet achieved) by the PD program.

The conceptual model of teacher expertise for fostering at-risk students’ understanding is in itself an important contribution of this paper. It synthesizes what has so far been a fragmented and incoherent international discourse on mathematics teacher expertise for at-risk students into a connected and coherent account of teachers’ practices for three jobs: specify learning content in basic concepts (i.e., unpacking learning content and setting learning goals), monitor students’ learning progress, and enhance students’ understanding. The state of research on teachers’ practices for managing these jobs for at-risk students was synthesized in a way that was guided by four pairs of underlying orientations, which can overlap and coexist. The model systematizes earlier suggestions of what teachers need to learn (e.g., by Beswick, 2007; Brodie, 2013; Gheyssens et al., 2020; Karsenty, 2010; Watson & Geest, 2005).

The suggested conceptual model for teacher expertise informed the PD design and allowed us to provide detailed insights into the professional growth that teachers achieved. Rather than presenting only a success story on an overall scale, the conceptual model allows us to scrutinize the achievements and constraints of the evaluated Mastering Math PD program for different orientations and jobs (see Fig. 9).

Fig. 9

Summary of achievements and constraints of the evaluated Mastering Math PD program: effect sizes d from pre-PD to post-PD for each captured practice (light color marks significant effects in intended direction, italics in unintended direction)

First of all, the pedagogical orientations changed substantially, with increased relevance for communicative pedagogies (d = 0.26) and decreased relevance for individualized pedagogies (d = − 0.82). This achievement in changing teachers’ pedagogies confirms that “handing over” pedagogical tools (such as activity structures or tasks) is much easier than deeper changes in subtler orientations. Apparently, the provided curriculum materials offered sufficiently concrete support for tutoring groups so that teachers felt able to engage students in rich mathematical communications (Clarke & Hollingsworth, 2002).

Second, it is encouraging to see that the teachers’ monitoring practices-in-action in the diagnostic judgments all changed in our intended directions. This is an important program achievement since teachers’ expertise in noticing and monitoring students’ learning has repeatedly been identified as crucial for success (Brodie, 2013; Empson & Jacobs, 2008; Leuders et al., 2018; Sherin et al., 2011). In particular, we identified a significant change toward less procedural practices (d = − 0.33), which is highly relevant because work with at-risk students is often shaped by expectations that are too low (Beswick, 2007; Brodie, 2013; Karsenty, 2010). These practices changed together with improved unpacking practices-in-action (d = 0.38), which means that teachers substantially improved their expertise in unpacking the mathematical content into relevant content categories. It would be highly interesting to analyze further whether this covariation indicates that increased conciseness of available content categories may also have an impact on the orientations (Clarke & Hollingsworth, 2002).

However, these achievements in the specifying and monitoring practices and their underlying orientations were constrained by the teachers’ growing self-reports that their decisions in classrooms were mostly led by the syllabus (unproductive increase by d = 0.73). This means that although they made significant strides in developing their expertise in monitoring their students’ learning progress, this expertise might not unfold its potential for adaptive enhancement practices when the selection of what content to focus on in classrooms is constrained by a syllabus-led orientation. This constraint was already described by Gheyssens et al. (2020). In the context of the Mastering Math PD program, the unproductive increase in self-reports might be partially traced back to the effect of increased awareness of a teaching dilemma: By comparing the adaptive, extra-curricular work in the supplemental tutoring group with their work in mainstream classrooms, the teachers might have become more aware that their mainstream classrooms are led more by the syllabus than by diagnostic insights. If this interpretation is correct, then becoming sensitive to this challenge might be a first step in overcoming it.

An indication that counters this optimistic interpretation, however, is the second constraint marked in red in Fig. 9: The short-term orientation has only decreased significantly in self-reported goal-setting practices (fewer short-term repair practices; d =− 0.64) and slightly in the short-term monitoring of practices-in-action (d = − 0.03). However, it has increased in the third job in unintended directions: Whereas the long-term enhancement practices aimed at learning progress were reported to be stable, the teachers reported more compensation practices post-PD than pre-PD (d = 0.34). This means that after 9 months of participating in the PD program, teachers’ short-term orientation to use compensation practices increased, even if this circumvents the job of enhancing students’ understanding (an example item of this scale is “When I realize that my low achievers lack basic concepts for solving a task, I scaffold them until they solve it nevertheless.”).

As self-report questionnaires only provide a limited insight into teachers’ enacted practices and underlying orientations, we further investigated this critical constraint using the richer data of two qualitative studies on teachers working with diverse abilities: Both studies confirmed the persistence of compensation practices inspired by short-term orientations (Prediger, 2020; Prediger & Buró, 2021). These studies also indicated that the prevalence of compensation practices was coherent with an increased awareness of syllabus-bound goal-setting practices: When the syllabus dictates advancing to the next area of content in spite of monitored difficulties, then there is some internal rationality to circumventing the goal of students’ learning progress and restricting it to task completion in compensation practices. This can be explained as a particular version of performance orientation, which has been investigated in personality psychology since Dweck (1996) distinguished it from mastery orientation (described as a focus on long-term learning progress). We claim that for teachers’ work with at-risk students, performance orientation takes the particular form of compensation practices that lead students to complete tasks without progress in understanding. The prevalence of these compensation practices was also found in a parallel classroom video study in which compensation practices were observed twice as often as enhancement practices (Prediger & Buró, 2021), so this prevalence will require future research and particular attention in the next year of the ongoing PD program.

Implications for future PD projects

Based on the identified effects of the current PD program, we recommend that similar PD programs be established in the future that provide ample opportunities to deepen the pedagogical content knowledge on understanding basic concepts, to experiment with formative assessment and curriculum materials in working with at-risk students, and to reflect on the four pairs of orientations. Whereas the program was successful with respect to weakening procedural orientations in favor of conceptual orientations and to overcoming individualized orientations in favor of communicative orientations, there were also lessons learned from the identified constraints in changing teachers’ syllabus-led orientations and short-term orientations (Watson & Geest, 2005).

In this way, the conceptual model as well as the local constraints that have been detected in it by our evaluation study might influence future PD research. Future PD design research projects should search for PD approaches that are optimized to overcome the prevalence of compensation practices and strengthen real enhancement practices aimed at learning progress. This might start with emphasizing the coherence of long-term orientations across different jobs: When long-term orientation informs goal setting and monitoring, it should consequently also shape the enhancement practices. Compensation practices should be critically discussed and questioned, not only for the experimental supplemental tutorial groups, but also for overcoming them in mainstream classrooms (Gheyssens et al., 2020).

Methodological limitations and outlook on future research

Even if we can already draw consequences for PD programs, the results should be interpreted with caution in view of the methodological limitations. The study was conducted with only one treatment condition and without a control group. In order to single out which kind of support mattered most for the growth of expertise, different PD and support conditions should be compared systematically, and a waitlist control group should be included to provide a baseline for comparison. Additionally, the questionnaires should be further elaborated into scales with high internal consistency for all relevant practices and orientations. Questionnaires on self-reported practices always risk deviations from actually enacted teaching; however, in our case, they nevertheless helped us to detect two important areas of constrained achievement, and the triangulation with two other qualitative studies validates these findings (Prediger, 2020; Prediger & Buró, 2021). As the vignette-based instrument for capturing practices-in-action provided deeper insights by eliciting teachers’ addressed content categories, future vignette-based instruments should also capture enhancement practices, not only specifying and monitoring practices.

Most importantly, similar studies should be conducted in other contexts, with other teachers, other topics, and different PD programs, in classrooms that include all students, not only at-risk students, in order to learn what is context specific and what is generalizable across contexts.