Introduction

Due to teachers’ significant impact on students’ learning, teacher education represents a main focus of education policy and society. Teachers’ professional competencies and their development have been primary concerns in educational—particularly mathematics pedagogical—research in many parts of the world for over two decades, as demonstrated by seminal large-scale studies in this field (Blömeke et al., 2010; Kunter et al., 2011). In recent years, situation-specific competence facets, especially teacher noticing, which is a construct mediating between dispositions and teaching performance, have increasingly come into focus (Blömeke et al., 2015). The construct of teacher noticing describes teachers’ ability to selectively perceive relevant information in complex teaching situations, to analyze competently, and thus to act appropriately (Jacobs et al., 2010; Kaiser et al., 2017). Therefore, teacher noticing is a central component of teachers’ cognition, an important characteristic of teaching expertise, and a crucial factor affecting students’ learning success (Blömeke et al., 2022; Lachner et al., 2016; Yang et al., 2021).

As a consequence, understanding the construct and developing pre-service and in-service teachers’ noticing skills are central goals of mathematics pedagogical research (Amador et al., 2021; Dindyal et al., 2021). However, to date, only a few studies have investigated the development of expertise in teacher noticing and considered the influence of teaching experience on this development (König et al., 2022). In particular, there are no empirical studies addressing how the cognitive demands of teacher noticing depend on teaching experience (Bastian et al., 2022; Yang et al., 2021). Classroom situations, tasks, and problems, as well as items of the teacher noticing instrument that refer to them, impose cognitive demands on teachers (for a discussion on cognitive demands, see Turner et al., 2015).

Due to the limited number of studies examining the cognitive demands of teacher noticing (imposed by the noticed event, the related situation, or the accompanying items in the evaluation), this study focuses on demand-related differences in teacher noticing between three groups with different lengths of teaching experience. A detailed category system describing the cognitive demands that mathematics teachers face during noticing was developed, and the experience groups’ handling of different cognitive demands was compared.

Theoretical Framework and State of the Art

Professional noticing of classroom events as a core activity of teachers entails perceiving (instructionally) relevant events in demanding situations and then processing this information (Sherin et al., 2011; van Es & Sherin, 2002). The construct of teacher noticing can be generally understood as a set of skills that enables teachers to deal with the complexity of teaching situations and to act professionally within them (Dindyal et al., 2021; Sherin et al., 2011). Since the early 2000s, exploration of this construct has become increasingly important in the educational sciences, especially in mathematics pedagogical research (see overviews by Amador et al., 2021; Dindyal et al., 2021; König et al., 2022; Schack et al., 2017). Moreover, teacher noticing has been shown to be an essential factor for teachers’ professional competence, teachers’ performant behavior in the classroom, and students’ learning success (Blömeke et al., 2022; Kaiser et al., 2017; Santagata & Yeh, 2016).

Currently, there is no consensus on the structure and characteristics of teacher noticing (Dindyal et al., 2021; Yang et al., 2021), as different theoretical perspectives have influenced the academic discourse and conceptualization of the construct. For instance, in a recent systematic literature review, König et al. (2022) identified four influential perspectives and research traditions in the current research. Within the cognitive–psychological perspective, teacher noticing is understood as cognitive processes that take place mentally in the individual teacher. Protagonists include, among others, Jacobs et al. (2010), Seidel and Stürmer (2014), and van Es and Sherin (2002, 2021). This perspective can be considered the most influential one at present (König et al., 2022), and it forms the theoretical basis of this study. In their survey of theoretical frameworks for noticing, Amador et al. (2021) described two cognitive–psychological approaches, the learning to notice framework proposed by van Es (2011) and the professional noticing of children’s mathematical thinking framework proposed by Jacobs et al. (2010), as predominant in the discussion and understanding of teacher noticing. Other perspectives that influence the discourse include the socio-cultural approach, which focuses on the social construction of professional vision (Goodwin, 1994); the discipline-specific approach introduced by Mason (2002); and the expertise paradigm (e.g., Berliner, 1994).

Within the cognitive–psychological perspective, different facets of teacher noticing have been analytically distinguished: perceptual, interpretive, and, if applicable, decision-related skills (König et al., 2022). For example, Sherin et al. (2011, p. 5) described the facets “attending to particular events in an instructional setting” and “making sense of events in an instructional setting” as crucial for teacher noticing. In addition, a considerable number of recent studies included a decision-related facet in teacher noticing conceptualizations (Amador et al., 2021; König et al., 2022). For example, Jacobs et al. (2010, p. 172) identified the facets attending to children’s strategies, interpreting children’s understandings, and deciding how to respond on the basis of children’s understandings. Likewise, our own framework includes a decision-related facet, namely decision-making (Kaiser et al., 2015).

As pointed out in the literature review performed by König et al. (2022), students’ mathematical thinking and understanding are central topics in the research on teacher noticing. Some studies have investigated mathematical or mathematics pedagogical topics, such as representations (Dreher & Kuntze, 2015) or algebra (Walkoe, 2015). In addition, studies have examined teacher noticing in relation to general pedagogical issues, such as classroom management, instructional materials, and curricula (Dietiker et al., 2018; Gold & Holodynski, 2017; Stahnke & Blömeke, 2021) as well as social participation and equity (Louie et al., 2021).

Teacher Noticing as a Competence Facet

To complement the conceptualization of teachers’ professional competence with situation-specific skills, teacher noticing has been included as a central component in competence frameworks (Kaiser et al., 2017). In their competence as continuum model, Blömeke et al. (2015) conceptualized perceiving, interpreting, and decision-making as the mediating components between teachers’ dispositions and classroom performance. Thus, professional competence was understood neither solely as disposition nor performance, but as a horizontal continuum from one person’s knowledge and affective dispositions through situation-specific skills, which embed dispositions in context, to teachers’ actual performance in the classroom. As a consequence, teacher noticing was understood as strongly dependent on teachers’ dispositions, particularly knowledge facets, which aligns with prior research on teacher noticing and recent empirical studies (Blömeke et al., 2022; van Es & Sherin, 2002, 2021).

The competence as continuum model was further developed in other conceptualizations. For example, Santagata and Yeh (2016) emphasized the bidirectional relationship between dispositions and situational skills, which contrasts with the rather linear understanding of the competence as continuum model. In their recent work, Metsäpelto et al. (2021) developed this model into a multidimensional adapted process model of teaching, which combines teacher noticing, as a situation-specific skill, with teaching practices and professional practices under the term “teaching competences.” In addition, this model enriches dispositions with cognitive and social skills and includes students’ learning success by looking at teachers’ effectiveness. Nevertheless, teacher noticing remains a central point in this conceptualization of competence (Metsäpelto et al., 2021).

Empirical studies have highlighted the relevance of situation-specific skills in the development of teachers’ professional competence, and have supported corresponding models (Santagata & Yeh, 2016). Previous studies have found significant correlations between teacher noticing and knowledge (e.g., Blömeke et al., 2016; Dreher & Kuntze, 2015; Meschede et al., 2017), belief facets (e.g., Meschede et al., 2017), and instruction (e.g., Blömeke et al., 2022), as well as the construct’s empirical separability from other competence facets (Bastian et al., 2022; Blömeke et al., 2016; Copur-Gencturk & Tolar, 2022), which justify its integration into these competence models.

Expertise Development in Teacher Noticing and the Impact of Teaching Experience

Thus far, only a few empirical studies have investigated teacher noticing in the context of expertise and the impact of teaching experience (König et al., 2022; Yang et al., 2021). A seminal exception is the American study by Jacobs et al. (2010), in which four different groups of primary teachers with varying levels of expertise were cross-sectionally compared. The results of the study demonstrated that skills in teacher noticing significantly increased across the four expertise groups. The evidence suggests that teaching experience as well as professional development, another expertise characteristic used in this study, influenced teacher noticing. There was a strong increase in noticing skills from the group of pre-service teachers to the group of in-service teachers, especially in the noticing facets of perception and interpretation; apparently, these facets increase with teaching experience. In contrast, expertise in the decision-related facet seemed to develop mainly in combination with professional development (Jacobs et al., 2010). The significant increase in expertise in teacher noticing between pre-service and in-service teachers, and thus the substantial influence of teaching experience on teacher noticing, was confirmed in further quantitative studies (Gold & Holodynski, 2017; König & Kramer, 2016; Meschede et al., 2017; Yang et al., 2021). However, few studies have investigated the reasons for these findings or the areas of development in detail.

Extensive research from the German context has demonstrated that master’s students (MS) have significantly lower skills in all three facets of teacher noticing than early-career teachers (ECT) and experienced teachers (ET) (Bastian et al., 2022), but no significant differences were found between the ECT and ET, except for a slight advantage for the ECT in the decision-making facet. However, the ET scored nominally lower on average in all three noticing facets, providing evidence of a stagnation, or even a decline, in teacher noticing expertise. A similar finding was demonstrated by Kleickmann et al. (2013) for teachers’ professional knowledge. Additional regression analyses showed significant negative slopes in teacher noticing with increasing length of teaching experience (Bastian et al., 2022). These results contrast with the analyses performed by Jacobs et al. (2010), described above, and with East Asian studies, in which the development of mathematics teachers’ noticing skills progressed nearly linearly over the groups of MS, ECT, and ET (Yang et al., 2021). These differences indicate that further research is needed to explore differences in noticing skills between these groups of teachers and to examine the stagnation or slight decline of noticing skills among ET in the Western context. Both topics will be addressed in this study.

Recently, innovative techniques, such as eye-tracking procedures, have been used to investigate noticing. Studies using these techniques have reported that expert teachers focus on areas of teaching situations that contain information relevant to instruction, whereas novices let their gaze wander more frequently (Wolff et al., 2016). Moreover, the perception of experts seemed to be more knowledge-based than that of novices (Gegenfurtner et al., 2020; Seidel et al., 2021; Wolff et al., 2016). Novices observing classroom interactions were often solely focused on teachers, while experts concentrated on students and observed events related to instruction units, curricula, or school culture (Gegenfurtner et al., 2020; McDonald, 2016). Further, experts diagnosed more accurately, interpreted more often, detailed what they saw, and offered more alternative actions for observed lessons (Seidel et al., 2021; Stahnke & Blömeke, 2021).

Expertise research has pointed out that expertise must be conceptualized as domain-specific and confined to particular contexts (Berliner, 2001; Boshuizen et al., 2020), which has been observed for expertise in teacher noticing. For example, studies by Yang et al. (2021) in an East-Asian context suggested that ET demonstrate expertise in dealing with teaching-related reflections, student thinking, and assessment and diagnosis of mathematical content. In contrast, ECT excel in topics related to student participation in the classroom, cooperative teaching methods, and more recent mathematics pedagogical topics, such as the teaching of mathematical modelling.

Conceptualization and Theoretical Framework in the TEDS-M Research Program

Situation-specific components have been incorporated into the measurement of teachers’ competence. In the Teacher Education and Development Study in Mathematics (TEDS-M) research program, teacher noticing was conceptualized as a part of the competence framework and has been included in competence measurement since the TEDS-Follow-Up (TEDS-FU) study (Blömeke et al., 2014; Kaiser et al., 2015). Referring to the competence as continuum model proposed by Blömeke et al. (2015), teacher noticing is understood here as situation-specific skills and is conceptualized as consisting of three noticing facets:

(a) perceiving particular events in an instructional setting; (b) interpreting the perceived activities in the instructional setting; (c) decision-making, either as anticipating responses to students’ activities or as proposing alternative instructional strategies. (Kaiser et al., 2015).

This analytical, cognitive–psychological definition has perceptual, interpretive, and decision-related facets. With reference to the expertise paradigm (Berliner, 1994; Carter et al., 1988), the first facet describes a perceptual process in which one observes something without or with only minimal interpretation (Bastian et al., 2022), i.e., identifying discernable events. Interpretation of these incidents is understood as analysis based on the person’s knowledge and beliefs and connection of new knowledge chunks with existing knowledge or each other. The third facet of the conceptualization explicitly includes decision-making as a significant component of teacher noticing. In line with the work of Jacobs et al. (2010), decision-making is defined as the meaningful construction of possible continuations of the lesson or instructional incident, the formulation of reactions to student behavior, or the offering of well-founded alternatives for observed teacher acts (Yang et al., 2021).

Thematically, this understanding of teacher noticing does not focus on single mathematical topics. Instead, it includes a broad field of teaching events and conditions that are relevant for high-quality mathematics education. These include, among others, the cognitive activation of students, the design of teaching–learning processes, and effective classroom management. In comprehensive descriptions of the construct, each facet is understood from both a general pedagogical and mathematics pedagogical perspective (Kaiser et al., 2015; Yang et al., 2021).

Research Question

Based on the current state of the art presented above and identified research gaps—particularly the contradicting results concerning teachers’ development of expertise in teacher noticing with increasing length of teaching experience—this study aims to investigate the teacher noticing skills of different teaching experience groups. It goes beyond statistical comparisons of average scores in teacher noticing facets and takes a closer look at the in-depth differences among those experience groups in their handling of different cognitive demands. The following research question is examined:

What differences exist between the teacher noticing of groups with different levels of teaching experience in managing the cognitive demands of teacher noticing measures and related teaching situations?

We followed the generally accepted assumption that teaching experience facilitates higher expertise levels in teacher noticing, giving in-service teachers an advantage over pre-service teachers. These advantages of teaching experience may vary in effect size across cognitive demands and experience groups. We therefore assume that ECT might be more competent in dealing with recent mathematics pedagogical topics (Yang et al., 2021).

Methodological Approach

The sample, test instrument, and item and data analysis are described below. Supplementary details are provided in the electronic supplemental materials (ESM).

Study Context and Sample

The sample consists of N = 457 secondary mathematics pre-service and in-service teachers, who participated in one of four studies from the TEDS-M research program (see Table 1).

Table 1 TEDS-M studies included in the sample

The participants were divided according to their length of teaching experience: MS (n = 110) without extensive and systematic teaching experience, ECT (n = 193) with an average of 4.6 years of in-service teaching experience (SD = 0.5, Min = 3.5, Max = 6.0), and ET (n = 154) with an average of 19.6 years of in-service teaching experience (SD = 10.4, Min = 6.5, Max = 41.5). As teaching experience is a necessary, although not solely sufficient, prerequisite for expertise in the teaching profession (Caspari-Sadeghi & König, 2018; Palmer et al., 2005), we intend to explore the influence of teaching experience on the development of expertise in noticing. The three groups represent three different stages of teaching experience; thus, we hypothesize that they are sufficiently different in light of the expertise paradigm (Berliner, 1994; Gruber et al., 2019; Stigler & Miller, 2018). As the data collection took place at different times for the three expertise groups (first the ECT were tested, then the ET, and then the MS), differences in noticing may also be caused by differences in the educational background of the three groups. We will come back to this aspect in the discussion and interpretation of the results.

Further characterizations of the three experience groups can be found in Table 2. The groups show no significant differences in terms of gender, the diploma from German secondary school qualifying for university admission or matriculation (the so-called Abitur), and relevant teacher education grades. However, there are significant differences between the groups with regard to school type, whose differing selectivity (i.e., the different socio-economic backgrounds of the students and their orientation toward academic-track or non-academic-track schools) is of high importance in the German school system. This will be considered in the results section.

Table 2 Descriptive characterization of the total sample and the experience groups

TEDS-M Instrument for Measuring Teacher Noticing

The TEDS-M instrument for measuring teacher noticing was developed in the TEDS-FU study. It is an established instrument for measuring teacher noticing as a contributor to teacher expertise. We use the achievements in this test as a norm- or task-oriented measure of expertise to identify expert teachers (Krauss, 2020; Stigler & Miller, 2018). The evaluation of noticing as a facet of teachers’ professional competence is based on three scripted (i.e., staged) video-vignettes with lengths of 2.25 to 3.5 min. The entire test instrument consists of three randomly arranged test units and has a duration of approximately 90 min. In each test unit, the participants first received background information regarding the class shown in the video-vignette, the general teaching conditions, and the mathematical content covered. Afterward, they received one-time access to the corresponding video-vignette. This allowed the teaching situation to be observed as realistically as possible in the test environment. Overall, the use of scripted video-vignettes instead of filmed natural instruction or text-vignettes enabled a manageable, cognition-activating, and strongly classroom-related measurement (Kramer et al., 2020; Piwowar et al., 2018; Santagata et al., 2021). The three video-vignettes contained a range of different mathematical topics (e.g., surface and volume calculation, functions, modeling) and teaching phases, such as the introduction of a mathematical task and work on it or a plenary with discussion of results (for further description, see Kaiser et al., 2015). They represented lessons in the 9th to 10th grades at different school types (Kaiser et al., 2015). After watching the video-vignette, the participants answered open-response and Likert-type rating scale items that addressed their abilities in one of the three teacher noticing facets with a mathematics pedagogical or general pedagogical focus (Blömeke et al., 2014; see Table 3 for the respective number of items). Across all three video-vignettes, the measurement instrument included a total of 77 items.

Table 3 Number of items in the noticing measure by noticing facet and subject-related perspective

As displayed in Table 3, the pedagogical perspective on teacher noticing (P_PID) is well represented in the item set measuring perception, while decision-making items particularly comprise items concerning the mathematics pedagogical perspective (M_PID), which was a result of scaling and test construction. For the interpretation facet, items are more or less equally distributed in terms of subject-related perspectives.

Concerning the item formats, a rating scale item consists of a statement and a four-point Likert scale, with which the statement is to be assessed. The example item in Fig. 1 requires perception of a specific classroom event.

Fig. 1 Rating scale item for the perception facet (P_PID)

Example items for the open-response format and the interpretation and decision-making facets are displayed in Figs. 2 and 3. The interpretation item asks participants to analyze a student’s way of solving a volume and surface calculation and identify indicators of a formal mathematical approach. Consequently, this item is related to M_PID. The decision-making item concerning P_PID asks for possible changes to the course of instruction to better deal with the class’s heterogeneity.

Fig. 2 Open-response item concerning the interpretation facet (M_PID)

Fig. 3 Open-response item concerning the decision-making facet (P_PID)

Responses to the rating scale items were dichotomously coded as incorrect or correct based on intensive expert interviews (Hoth et al., 2016). In these interviews, experts answered the items and had the opportunity to evaluate and comment on them. Items with at least 60% agreement on one of the four possible answers were included in the test with the agreed answer as the correct choice. Items with a minimum of 80% agreement for one tendency (approval or disapproval of the given statement) were revised considering the experts’ comments and then reviewed again. To evaluate answers to open-response items, a coding manual was developed based on theoretical considerations and discussions with experts. The manual provided detailed descriptions and numerous anchor examples to illustrate which answers are to be considered correct. To test the reliability of this procedure, 20% of the answers to each item were double-coded, and the intercoder reliability was examined based on Cohen’s kappa (Cohen, 1960). Good overall agreement (κmean = 0.85) was achieved (see ESM for details). The validity of the measurement instrument was ensured through various procedures, e.g., extensive workshops with mathematics pedagogy and general pedagogy experts concerning the authenticity of the classroom situations in the video-vignettes and the adequacy of the test items, as well as curricular analyses of the covered content (Kaiser et al., 2015). The measurement of teacher noticing was independent of the video-vignettes; the items for the three vignettes measure the underlying construct (Blömeke et al., 2015). An adaptation of the instrument in China achieved good fit values in confirmatory factor analyses, which supports its cross-cultural validity (Yang et al., 2018).
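To make the reliability check described above concrete, the following minimal Python sketch computes Cohen's kappa for two coders who have each coded the same set of responses. It is an illustration only: the function name `cohens_kappa` and the ten double-coded responses are hypothetical and are not taken from the study's data or coding manual.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who coded the same responses."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty set of responses")
    n = len(coder_a)
    # Observed agreement: share of responses that received the same code.
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the coders' marginal proportions per code.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both coders used one identical code throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical double-coded responses (1 = correct, 0 = incorrect).
coder_a = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
coder_b = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 3))
```

In this toy example the coders agree on nine of ten responses, but kappa is lower than 0.9 because part of that agreement would be expected by chance given how often each coder assigns each code.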

Item Analyses

We developed a category system to perform a detailed comparison of the abilities of teacher groups with different lengths of teaching experience in managing different cognitive demands related to teacher noticing and its facets. Cognitive demands were defined as competence components relating to knowledge areas or abilities that are needed to apply teacher noticing and its facets in certain classroom situations as well as to answer items on the noticing measures referring to these situations (Turner et al., 2015). To develop the categories, qualitative text analysis and rational task analysis were carried out on all 77 items (Kuckartz, 2014; Resnick, 1975), which resulted in three different perspectives on cognitive demands: knowledge (5 categories), practice-oriented competence facets (5 categories), and additional characteristics (2 categories).

Relevant literature on the topics of teachers’ competence domains, cognitive processes, and particularly knowledge served as the basis for the deductive category development. The conceptualization of competence, which we generally understood as the available or learnable set of cognitive skills and abilities to successfully handle a complex demand as well as the motivational, volitional and social willingness to do this (Weinert, 2001), has been regarded as a particularly relevant component of item analyses (Turner et al., 2015). To ensure theoretical as well as practical perspectives are considered, a combination of research-oriented and practice-oriented frameworks of teacher competence was applied. To include multiple aspects of the research-oriented approach of competence as continuum model (Blömeke et al., 2015), categories specifying knowledge-based aspects of noticing were developed in addition to the existing teacher noticing facet categories and subject-related perspectives (Blömeke et al., 2010). Since the situation-specific perspectives P_PID and M_PID are strongly connected to general pedagogical knowledge and mathematics pedagogical content knowledge, categories assigning the corresponding items to specific knowledge areas were developed (e.g., mathematics pedagogical curricula and planning or students’ learning). Curricular analyses of competence frameworks for various parts of teacher education were included as practice-oriented competence frameworks, which led to categories such as dealing with heterogeneity.

Inductive categorization with an experienced coder led to additional categories concerning general pedagogical and mathematics pedagogical topics, which can be described as new in the current academic discourse. An excerpt of the category system, with the number of items allocated to each category, is shown in Table 4. Due to space restrictions, the complete category system is provided only in the ESM.

Table 4 Excerpt from the category system and number of occurrences

The strictly rule-guided procedure and creation of a detailed coding manual ensured internal study quality and internal validity (Kuckartz, 2014). With the help of two experienced coders, the coding manual was revised and inductively sharpened using the collected data. After double-coding 40% of all items, at least substantial intercoder reliability was achieved for all items (κMean = 0.833, κMin = 0.684, κMax = 1.000; Landis & Koch, 1977). Content validity was established through the integration of existing theories, e.g., several existing competence frameworks, and methods, e.g., rational task analysis, related to item demand analysis. External quality criteria regarding transferability were met by involving external experts and comparing the results with existing measurement instruments.

Category System and Occurrence

Given the comprehensiveness of the category system, the presented analysis is limited to a few particularly interesting and relevant categories (see the ESM for the entire category system). A short description of these categories is given in Table 5.

Table 5 Short description of categories and subcategories

The absolute frequencies of the occurrence of cognitive demand categories are displayed in Tables 3 and 4. Knowledge of mathematics pedagogy is divided more or less equally between curricula and planning (20.8%) and interaction (26.8%). Similar results are found for students’ learning (20.8%) and teaching–learning processes (27.3%) for knowledge of pedagogy. Dealing with heterogeneity, which is an important topic on the global pedagogical agenda (e.g., König et al., 2017), is necessary in 31.2% of the items and particularly necessary in 14.3%. Consequently, this essential teacher skill is represented by the instrument. Knowledge of the particular relevance of recent (mathematics) pedagogical topics is needed in 37.7% of items. Standard correlation measures—Cramer’s V and Spearman’s rank correlation—were used to examine the correlations between categories. Mostly moderate correlations were observed, suggesting that distinct but correlated categories describe different aspects of cognitive demands (see ESM Table 2).
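As a hedged illustration of one of the association measures named above, the following Python sketch computes Cramér's V from two categorical codings of the same items, using the uncorrected Pearson chi-square statistic. The function name `cramers_v` and the toy category codings are hypothetical; the source does not describe how the coefficients were computed, so this is only one standard way of doing so.

```python
import math
from collections import Counter


def cramers_v(x, y):
    """Cramér's V for two categorical variables (uncorrected chi-square)."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    row_totals, col_totals = Counter(x), Counter(y)
    observed = Counter(zip(x, y))
    chi_square = 0.0
    for r, row_n in row_totals.items():
        for c, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi_square += (observed.get((r, c), 0) - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # one variable is constant, so no association can be measured
    return math.sqrt(chi_square / (n * k))


# Hypothetical codings of eight items on two demand categories (1 = demand present).
category_a = [1, 1, 1, 0, 0, 0, 1, 0]
category_b = [1, 1, 0, 0, 0, 1, 1, 0]
print(round(cramers_v(category_a, category_b), 3))
```

Values near 0 indicate that two categories are assigned largely independently of each other, whereas values near 1 indicate that they almost always co-occur or exclude each other.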

Scaling and Quantitative Data Analysis

To examine the relationships between the handling of cognitive demands and experience group membership, scales for each subcategory were developed and validated using Rasch models. These models achieved good model fit (for the one-factor model, RMSEA = 0.027). As an indicator of measurement invariance, item estimates were calculated for each experience group and then correlated (cf. Bond et al., 2021; König et al., 2017). These analyses indicated imperfect but acceptable measurement invariance for the analysis of group differences (rMS-ECT = 0.81, rMS-ET = 0.77, rECT-ET = 0.92).
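The invariance check described above can be sketched as follows: given item estimates obtained separately for each experience group, the estimates are correlated pairwise across groups. This is a minimal sketch under assumptions. The source does not state which correlation coefficient was used, so the Pearson coefficient is assumed here, and the item estimates shown are invented for illustration.

```python
import math
from itertools import combinations


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of item estimates."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two items")
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical item difficulty estimates (in logits) for the same five items,
# estimated separately in each experience group.
item_estimates = {
    "MS": [-1.2, -0.4, 0.1, 0.9, 1.5],
    "ECT": [-1.0, -0.6, 0.3, 0.7, 1.4],
    "ET": [-0.9, -0.5, 0.2, 0.8, 1.6],
}

# One correlation per pair of groups; higher values suggest that the items
# are ordered similarly by difficulty in both groups.
invariance = {
    f"{a}-{b}": pearson_r(item_estimates[a], item_estimates[b])
    for a, b in combinations(item_estimates, 2)
}
for pair, r in invariance.items():
    print(pair, round(r, 2))
```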

The extent to which experience group membership explains variance with respect to the cognitive demands of items was investigated by applying multivariate analyses of variance (MANOVAs) for each demand category and its subcategories. Following the MANOVAs, post hoc tests were performed to identify significant differences between individual groups. In the case of variance homogeneity, the Scheffé test was used; it is more conservative than other post hoc tests and can be used when the comparison groups differ in size. In the absence of homoscedasticity, the Games-Howell post hoc test was used.
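To illustrate why the Games-Howell procedure is suitable when variances differ between groups, the following Python sketch computes its two core ingredients for a single pairwise comparison: a test statistic based on separate (unpooled) group variances and the Welch-Satterthwaite degrees of freedom. The final step, comparing the statistic with the studentized range distribution to obtain a p-value, is deliberately omitted because it requires a statistics library. The function name and the two small samples are hypothetical.

```python
import math
from statistics import mean, variance


def games_howell_pair(sample_a, sample_b):
    """Test statistic and Welch-Satterthwaite df for one pairwise comparison."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    # Squared standard errors are computed per group, without pooling variances.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, df


# Hypothetical scale scores for two groups with clearly different variances.
group_1 = [1, 2, 3, 4, 5]
group_2 = [2, 4, 6, 8, 10]
t_stat, df = games_howell_pair(group_1, group_2)
print(round(t_stat, 3), round(df, 2))
```

Because the degrees of freedom shrink when one group has a much larger variance, the comparison becomes more cautious in exactly the situation in which a pooled-variance test such as Scheffé's would be inappropriate.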

Results

We first present the results concerning the overall effect of teaching experience group membership on the handling of cognitive demands. Then, we more closely compare and contrast consecutive groups.

Relationship Between Teaching Experience Group Membership and Cognitive Demands

Examples of mean scores for two cognitive demands by experience group are presented in Table 6. These scores can be considered exemplary for the other categories.

Table 6 Mean values according to experience group in example categories

Overall, analysis of the mean values in individual demand categories replicates the pattern described by Bastian et al. (2022) for the overall construct of teacher noticing and its facets. In-service teachers perform significantly better than MS, while ECT and ET hardly differ from each other (see Table 6 and ESM). However, ECT are nominally better than ET, suggesting stagnation or even a slight decline in ability to deal with the respective cognitive demands. These differences between the two teacher groups with teaching experience were not significant in prior analyses except for the decision-making facet (Bastian et al., 2022). Thus, it is of great interest to investigate the extent to which significant differences exist when individual cognitive demands are considered. The newly developed category system facilitates these investigations.

MANOVAs were conducted to analyze whether experience group membership is a decisive factor affecting the handling of cognitive demands. As Table 7 illustrates, all MANOVAs, evaluated using Wilks’ λ, suggest that experience group membership is a significant factor affecting performance in dealing with every cognitive demand. Since the groups differed significantly in school type (see Table 2), this characteristic was included as a covariate in an additional analysis. However, it showed no significant influence and thus was not considered in further analyses. The variance explained by group membership for the individual demand categories (between 7.5% and 10.5%) can be classified as a medium effect (Cohen, 1988). This means that experience group membership has medium practical significance for the handling of different cognitive demands. The largest effects are found for knowledge of pedagogy, with 10.5% of variance explained, followed by knowledge of mathematics pedagogy, with 9.4%.

Table 7 Effect of experience group membership on the handling of cognitive demands

Deeper investigation of the variance explained for individual subcategories reveals a more differentiated picture of small to large effects (see ESM). Interestingly, group membership has large effects for the demand categories of curriculum and planning in mathematics pedagogy (η² = .156), knowledge of pedagogy not applicable (η² = .170), the subject-related perspective M_PID (η² = .170), and the particular relevance of recent (mathematics) pedagogical topics existent (η² = .149). Thus, combining the large effects observed for items for which no knowledge of pedagogy is needed and for items belonging to the M_PID subject-related perspective, a large effect of group membership can be found for mathematics-related items. This result accounts for the variance explained for knowledge of pedagogy reported above, which is probably due not to pedagogical items but to mathematics pedagogical items, which are coded in this category not as missing but as “not applicable”. The large effect for items concerning recent (mathematics) pedagogical topics is of particular interest, as these comprise, among others, the latest insights related to proficient teaching.
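
The effect size labels used in this section can be reproduced with a small helper, sketched here under the assumption that the conventional benchmarks attributed to Cohen (1988) are applied (η² of about .01 small, .06 medium, .14 large). The function name is hypothetical; the values passed to it are those reported in the text.

```python
def classify_eta_squared(eta_sq):
    """Label a proportion of explained variance using conventional
    benchmarks (.01 small, .06 medium, .14 large); assumed thresholds."""
    if eta_sq >= 0.14:
        return "large"
    if eta_sq >= 0.06:
        return "medium"
    if eta_sq >= 0.01:
        return "small"
    return "negligible"


# Values reported in the text for main categories and selected subcategories
reported = {
    "knowledge of pedagogy": 0.105,
    "knowledge of mathematics pedagogy": 0.094,
    "curriculum and planning": 0.156,
    "M_PID": 0.170,
    "recent topics existent": 0.149,
}

for name, eta_sq in reported.items():
    print(f"{name}: eta squared = {eta_sq:.3f} ({classify_eta_squared(eta_sq)})")
```

Under these thresholds the main categories fall in the medium range, while the listed subcategories fall in the large range.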

The pattern indicated by the mean comparisons (see Table 6) is confirmed by the post hoc analyses following the MANOVAs (see Table 8 and ESM Table 4). There are mostly highly significant differences (p < 0.001) between in-service teachers and MS, but in many categories, there are no significant differences between ECT and ET. Below, we discuss deviations from this pattern and comparisons of consecutive experience groups based on the post hoc tests.

Table 8 Post hoc tests for experience group comparisons in select subcategories

Comparisons of the Consecutive Teaching Experience Groups

This section presents the results of comparisons of consecutive experience groups (i.e., MS and ECT as well as ECT and ET). The aspects of decision-making, mathematics pedagogical knowledge, and the particular relevance of recent (mathematics) pedagogical topics will be of particular interest in these analyses.

When contrasting MS directly with ECT (see Table 8), significant differences are found for all but one subcategory. These differences vary in effect size from d = 0.28 to d = 1.13. The largest differences, each exceeding one standard deviation, are found for M_PID (d = 1.13), curriculum and planning in mathematics pedagogy (d = 1.05), and the particular relevance of recent (mathematics) pedagogical topics (d = 1.02). This may indicate particularly large developments in the ability to deal with these demands through the acquisition of teaching experience. For students’ assessment, no significant difference between the three experience groups was found. Analysis of the four items in this cognitive demand category revealed the items to be comparatively easy and thus non-differentiating.
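
To make the effect sizes above concrete, the following sketch computes Cohen’s d for two groups. It assumes the pooled-standard-deviation variant, which the text does not state explicitly, and the scores are invented for illustration only.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Hypothetical scale scores for two experience groups
ect_scores = [2.0, 4.0, 6.0]
ms_scores = [1.0, 3.0, 5.0]
print(cohens_d(ect_scores, ms_scores))  # 0.5
```

A value of d = 1 means that the group means lie one pooled standard deviation apart, which is why the three largest contrasts between MS and ECT can be described as differences of more than one standard deviation.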

Since the mean comparisons imply stagnation or regression after some length of teaching experience, the comparison of ECT and ET is of particular interest. The analysis (see Table 8) revealed some deviations from the pattern of non-significant differences. The significant advantage of ECT over ET in dealing with decision-related demands of teacher noticing has already been mentioned (d = 0.34; Bastian et al., 2022). In addition, our new analyses document significant differences in favor of ECT related to the M_PID subject-related perspective (d = 0.28).

Further differences in favor of ECT are observed for cognitive demands in the categories of mathematics pedagogical curriculum and planning (d = 0.32), particular focus on inclusive mathematics teaching (d = 0.28), students’ learning (d = 0.31), and particular relevance of recent (mathematics) pedagogical topics (d = 0.38). For the latter demand, the differences are significant at the 1% level and show the largest effect size in the comparison of the in-service teacher groups. It should be noted that the other three cognitive demands relate, at least in part, to recent mathematics pedagogical or general pedagogical topics. For instance, dealing with new education plans, students’ competencies, and basic ideas related to recent educational topics makes up a large part of curriculum and planning. In addition, students’ learning refers to dealing with heterogeneity as well as adaptive learning methods. Inclusive mathematics teaching can itself be understood as a recent pedagogical topic. Therefore, differences in teacher noticing skills between ECT and ET in favor of ECT may be attributed to knowledge and skills related to dealing with recent mathematics pedagogical and general pedagogical issues. In addition, the difference in the category of students’ learning could suggest that ECT have a strong ability to deal with students’ individual motivations and cognitions.

Discussion

The present study aimed to investigate which differences exist between three teacher groups with varying levels of teaching experience in their teacher noticing skills, specifically in their ability to deal with the cognitive demands of tasks related to teacher noticing. These differences were measured by a video-based teacher noticing instrument and evaluated with a newly developed category system. The category system was conceptually developed based on research-oriented and practice-oriented competence frameworks from relevant literature on teachers’ competence domains, cognitive processes, and knowledge facets.

MANOVAs demonstrated that teaching experience has a significant influence on the handling of all cognitive demands. Post hoc tests revealed that in-service teachers significantly outperform MS in terms of ability to deal with all cognitive demands, particularly those associated with mathematics pedagogical items and items referring to curriculum and planning. The latter may be explained by the necessity of teaching experience for developing skills in lesson planning (König et al., 2022). Such performance differences between MS and ECT are well documented for German samples in knowledge tests in terms of general pedagogical knowledge, mathematics content knowledge, and mathematics pedagogical content knowledge (e.g., Blömeke et al., 2014; König, 2013; Kleickmann et al., 2013), supporting the findings for situation-specific skills reported in this study.

Although ECT nominally perform slightly better than ET, these differences are mostly not significant. However, a detailed analysis of the two groups of in-service teachers demonstrated that there are deviations from this pattern; for some cognitive demands, ECT have a significant lead over ET with small effect sizes. This concerns three areas and groups of cognitive demands. First, significant differences for decision-making might indicate that ECT have stronger skills in the decision-oriented facets of teacher noticing, which lead to higher performance in making professional decisions in the classroom. This unexpected result may imply that ET rely on routinized action patterns and do not question them in situation-specific contexts. However, it could also point to a lack of expertise among ET in the situations in which their decision-making was tested and, thus, indicate a lack of knowledge (Berliner, 2001; Boshuizen et al., 2020; Yang et al., 2021). Second, the results suggest differences in mathematics pedagogical knowledge and its application in the context of teacher noticing in favor of ECT. In other words, mathematical knowledge, especially mathematics pedagogical knowledge, is less present in ET. This could imply a decline in professional knowledge among ET, since their initial teacher education, which provided formal learning opportunities explicitly fostering such declarative-conceptual knowledge, dates back several years (Liu & Phelps, 2020). This is in line with studies that demonstrate a decline in mathematics content knowledge from ECT to ET using pencil-and-paper tests (Blömeke et al., 2014; Kleickmann et al., 2013). The result emphasizes the essential role of professional knowledge as a prerequisite for the development of expertise in teacher noticing (Gegenfurtner et al., 2020), since ECT, who demonstrate greater expertise in our teacher noticing instrument, particularly excel in tasks strongly connected to professional knowledge. Third, and of particular interest, significant differences can be observed for the cognitive demands concerning recent mathematics pedagogical and general pedagogical topics, such as inclusive or competence-oriented mathematics education. This raises questions regarding the long-term effects of teacher education, similar to the results of Liu and Phelps (2020), and especially the long-term benefits of respective professional development activities (Goldschmidt & Phelps, 2010).

These results provide further evidence that teacher expertise and teaching experience are by no means equivalent (Stigler & Miller, 2018), as ECT demonstrated greater expertise in teacher noticing than ET, who have more teaching experience. This could also suggest a shortage of deliberate practice opportunities for ET (Ericsson et al., 1993), especially in recent topics of teaching and learning such as competence orientation. Nonetheless, experience was shown to be a necessary condition for expertise development, as indicated by the large achievement differences between MS without systematic teaching experience and ECT.

The study has limitations that must be considered. First, the analyzed data are based on convenience samples and were collected cross-sectionally. Thus, statements about generalizations and developments are limited and must be treated with caution. In particular, the teachers received their teacher education at different times, which might have influenced the development of their teacher noticing and caused different levels of noticing skills. However, descriptive sub-sample characteristics speak for the comparability of the three groups (see Table 2). Second, only three groups with different teaching experience were compared, giving a rather rough representation of the stages of teacher education and professional development. Future studies should perform analyses with more differentiated groups or longitudinal designs with several measurement points. Finally, the subsamples were collected over a period from 2011 to 2020. This may have had an unknown influence on the data and, due to the ongoing implementation of teacher noticing fostering in teacher education and professional development (Amador et al., 2021; König et al., 2022; Stahnke et al., 2016), may have favored the groups whose data were collected later, namely ET and MS.

Despite these limitations, this study aligns with the existing body of research and offers new and detailed results on how groups with different teaching experience deal with cognitive demands when applying their teacher noticing skills, enriching existing findings on expertise development in this area. For example, significant differences between both groups of in-service teachers in terms of recent mathematics pedagogical and general pedagogical topics, and thus modern teaching methods in general, were also found in the Chinese context (Yang et al., 2021). They are now replicated for a Western context. This finding may be the result of low participation of ET in appropriate professional development activities (Pedder et al., 2008), low effects of such interventions (Liu & Phelps, 2020), or a lack of expertise in modern mathematics education among ET. Therefore, it is necessary to develop well-designed and, in particular, thoroughly evaluated professional development interventions, especially in areas that differentiated between teacher groups in this study (i.e., recent mathematics pedagogical topics and general pedagogical topics like competence-oriented teaching and dealing with heterogeneity).