1 Introduction

The question of what constitutes instructional quality is crucial in mathematics education. On the one hand, several aspects were described as relevant for high-quality mathematics instruction across different cultural contexts, such as responding to student thinking or using representations. Among those aspects, the potential of mathematical tasks and its use in instruction is typically considered since tasks are central for students when learning mathematics (e.g., Doyle 1988; Hsieh et al. 2017; Lin and Li 2009; Praetorius and Charalambous 2018; Stein and Lane 1996). On the other hand, research indicates that perspectives on what constitutes high-quality instruction may also vary between cultural contexts, like East Asian and Western ones (Clarke 2013; Hsieh et al. 2017). Exemplarily, this can be seen in the way tasks are used. For instance, one aim of East Asian instruction is to guide students to memorize solution strategies and reflect on them to acquire efficient strategies. In contrast, Western instruction aims at supporting the development of various individual solution ideas (Leung 2001).

One explanation for perspectives being different across cultural contexts builds on the notion of instructional norms, which characterize what is seen as appropriate behavior in instruction. Moreover, these norms shape expectations about people’s actions in instruction (Herbst and Chazan 2011). For example, when researchers are asked to evaluate a task’s potential and its use in a specific instructional situation, they evaluate the situation against their expectations. The researchers’ expectations, in turn, correspond to the (often implicit) instructional norms, which may be specific to the cultural context to which the person belongs.

Although there are good reasons to assume that perspectives on what constitutes high-quality use of task potentials depend on culturally shaped instructional norms, there has been little systematic research on the culture-specificity of instructional norms. This is particularly critical since such instructional norms are typically used as a frame of reference when designing assessment instruments for cross-cultural comparative research (Dreher et al. 2021). Thus the lack of knowledge threatens the validity of cross-cultural research (Clarke 2013).

Addressing this problem, we systematically investigate whether instructional norms regarding high-quality use of task potential in mathematics instruction are shared across cultures or culture-specific. We exemplarily compare the instructional norms in Taiwan and Germany because we consider this contrast to be particularly informative: Students in these countries perform very differently in comparative studies like PISA or TIMSS (e.g., Mullis et al. 2020; Reiss et al. 2019), and thus, differences regarding the instructional quality are plausible. Moreover, prior research indicates differences between East Asian and Western countries from a broader perspective, which allows inferring expectations regarding specific aspects of mathematics instruction against which new findings can be evaluated. To pursue our aim to explicate and compare instructional norms across the two countries, we contrast the evaluations of mathematics education researchers from Taiwan and Germany regarding the high-quality use of word problems using vignettes of three instructional situations.

2 Theoretical Background

To set the theoretical background, we introduce the concept of instructional norms as specific social norms and discuss reasons for their potentially culture-specific nature. According to our focus on instructional norms regarding the high-quality use of tasks, we analyze the role of tasks, especially word problems, in mathematics instruction and present expected differences regarding the high-quality use of word problems in East Asian and Western mathematics instruction.

2.1 Instructional Norms

In their endeavor to understand what matters in instructional interactions, researchers use sociological concepts like the notion of social norms. Social norms are generally considered rules of behavior on which there is—potentially implicit—agreement within a certain social context. They pre-structure action and shape expectations regarding the behavior of members of the context (Coleman 1990). Thus, members of a certain social context are largely familiar with the corresponding norms, even if they are not necessarily conscious of them.

Classrooms have been described as constituting social contexts of their own where social norms affect students’ and teachers’ actions. In particular, Herbst and Chazan (2011) explicated the general concept of classroom social norms by defining instructional norms as “a collective sense of what is conceivable and perhaps desirable to happen in classrooms” (Herbst and Chazan 2011, p. 406). Instructional norms permeate the actions in an instructional situation, their temporal structure, and the expectations of who should respond when and how (Herbst and Miyakawa 2008). Thus, instructional norms require specific situations to become evident.

Research on social norms specific to mathematics instruction typically incorporates the idea that the nature of the subject of instruction shapes the norms (e.g., Herbst and Chazan 2011; Yackel and Cobb 1996). However, despite this commonality, different norm concepts were considered in prior research with a focus on mathematics instruction. For example, Herbst and Chazan’s (2011) perspective is motivated by the question of how practical rationalities may be explained through instructional norms, aiming at explaining how specific situations are typically handled in mathematics instruction, given the various obligations teachers face. Obligations to the discipline of mathematics constitute one aspect, while obligations to students and other components of the schooling process make up others (Herbst et al. 2011, p. 223). Yackel and Cobb (1996) defined sociomathematical norms as what is considered mathematically normative in a mathematics classroom, emphasizing obligations to mathematics. They aimed to clarify how teachers and students construct shared mathematical concepts, like what counts as “an acceptable mathematical explanation and justification” (Yackel and Cobb 1996, p. 461). Sociomathematical norms were also productive in explaining observed differences in instructional practices (Yackel and Cobb 1996). However, in contrast to the broader concept of instructional norms, they may be understood more narrowly as capturing the practical rationalities of the inner mathematical structure of a situation. Thus, the concept of instructional norms seems better suited to describe differences that may occur in contexts that differ in terms of various obligations, which, as we explain below, would be expected when contrasting cultural contexts.

Finally, we briefly discuss how instructional norms originate. As with any social norm, they are learned via socialization processes (Coleman 1990). During teacher education, training, and further professional development, these norms are subject to (implicit or explicit) social negotiation processes between (prospective) teachers and teacher educators (Herbst 2003; Tatto 1998). In Germany and Taiwan, mathematics education professors play an essential role in the socialization processes: As teacher educators, the professors transmit norms by deciding on learning opportunities and grading (future) teachers with consequences for teacher admission. Due to their position as researchers, their associated enculturation and experiences, which are typically not limited to single classes or schools, and their involvement in policy decisions on teacher education, the professors also play an essential role in shaping instructional norms within their cultural contexts (see Schwille et al. 2013, especially p. 82 for Taiwan).

2.2 Cultural Influences on Instructional Norms

As the social context of education is part of an overarching cultural context, instructional norms regarding mathematics instruction are part of a broader cultural norm system (Coleman 1990). By culture, we refer to the “shared motives, values, beliefs, identities, and interpretations or meanings of significant events that result from common experiences of members of collectives that are transmitted across generations” (House et al. 2004, p. 15). Accordingly, countries with a shared history of Confucianism are often called East Asian cultures. Countries in Western Europe that share the experience of the Renaissance and Enlightenment, as well as the Anglo-Saxon countries, are called Western cultures (Leung 2001).

Prior research surfaced differences in mathematics instruction in East Asian and Western cultures, spanning questions of the nature of mathematics to broader questions of the organization of instruction (e.g., Clarke 2013; see also Leung 2001 who synthesized the differences along six dichotomies). For Taiwan as an East Asian country, for example, mathematics instruction is characterized as exam-, content- and product-oriented (Leung 2001; Yang et al. 2022). The mastery of procedures is accordingly valued highly. In Germany, as a Western country, mathematics instruction follows a competence-based curriculum emphasizing mathematics as a tool to solve real-world problems, valuing mathematics as a process (Chang 2014; Leung 2001; Verschaffel et al. 2020). Meaningful learning is accordingly valued higher than mastery of procedures. Furthermore, the instruction in Germany and Taiwan may differ in the orientations towards the organization as whole class teaching versus individualized learning, which may stem from different perspectives on the role of teachers: teachers are seen as subject-matter specialists in East Asia but are primarily seen as pedagogues in Western countries (Leung 2001). The latter refers to the observation that, in Western countries compared to East Asian countries, being able to facilitate individual learning processes may be relatively more important for teachers than being an exemplary mathematician.

These differences were reasoned to reflect cultural traditions and result in different obligations of teachers. Accordingly, it may be assumed that (potentially implicit) norms of mathematics instruction may also be culture-specific. At the same time, there are reasons for some instructional norms being similar across different cultural contexts. Mathematics is considered an internationally homogeneous discipline, and there is an increasing international research discourse on instructional quality in mathematics education (e.g., Praetorius and Charalambous 2018). To systematically investigate the open questions regarding the culture-specificity of instructional norms, situations in which the practical rationalities can become visible must be brought into focus. Among the many possible focuses, situations that center on the use of tasks may be considered particularly suitable as learners in mathematics instruction engage most of the time with tasks, and how tasks are used in instruction is internationally considered crucial for instructional quality (Mu et al. 2022; Stein and Lane 1996).

2.3 Focusing on Word Problems to Investigate Instructional Norms

Word problems are specific types of tasks characterized by verbal descriptions of problem situations, which raise questions that can be answered by applying mathematical operations (Verschaffel et al. 2020). Word problems require students to understand the given problem situation and model it mathematically. This includes transforming the verbal description into a mathematical model (making assumptions about the situation as necessary), working mathematically with the model, and relating the mathematical solution (steps) to the problem context (repeatedly), for instance, when validating results (e.g., modeling cycle, Blum and Leiss 2007; Chang et al. 2020).

Word problems can be used with different goals in instruction. In addition to their potential to promote modeling skills, word problems can provide opportunities to practice problem-solving strategies and support learners in building mathematical concepts or thinking creatively (Sun-Lin and Chiou 2019; Verschaffel et al. 2020). Despite the high potential of word problems for mathematical learning, whether this potential can be used fruitfully for learning depends on how teachers and students use the problems in instruction (Doyle 1988; Neubrand et al. 2013; Stein and Lane 1996). Relevant aspects of the task’s use that may support or impede instructional quality are, for instance, the alignment with the learning goal, the product a teacher expects, the actions and resources used to create the product, and the importance of the task in the accountability system (Doyle 1988). For example, how students work mathematically may align with a task’s potential, available resources may support or hinder mathematical thinking, and how a teacher implements a task may affect what students perceive as relevant about mathematics (Doyle 1988; Herbst 2003). As practices of using word problems in mathematics instruction can vary greatly, a focus on the use of word problems is suited to investigate instructional norms by way of example.

2.4 Word Problems in Western and East Asian Mathematics Instruction

Using the theoretical considerations of instructional norms, it can be expected that what is considered normal practice regarding the use of tasks for mathematical learning may vary between cultural contexts (Herbst and Chazan 2012). Particularly, there are indications that the perspectives on the use of word problems may differ in East Asian and Western contexts. To substantiate this argument, we will draw again on the characterizations of East Asian and Western mathematics instruction by Leung (2001) and infer specific assumptions for the contrast between Germany and Taiwan.

The dominant process-oriented view on mathematics in Western countries (vs. the product-oriented view in East Asian countries) manifests in the observation that mathematical modeling is valued more in Western countries (e.g., Denmark, Australia, the United States of America, Italy, or Sweden; Niss 2018) than in East Asian countries (Leung 2001). Particularly in Germany, the role of modeling and real-world contexts in mathematics instruction has grown strong since the realistic turn of mathematics education in the 1970s (Sträßer 2019). For instance, the modeling cycle (Blum and Leiss 2007), which characterizes mathematical modeling as a process, is widely known and used in research and practice in Germany. Mathematical modeling has been an explicit part of the competence-based curriculum in Germany since 2003 (Chang 2014; Chang et al. 2020). In Taiwan, mathematical modeling only recently gained curricular attention in teaching reforms (Chang et al. 2020; see also for China, Cao and Leung 2018). Accordingly, one would assume that instructional norms regarding the high-quality use of word problems in Germany reflect a stronger emphasis on their use to engage students in mathematical modeling processes in comparison to Taiwan.

The findings of comparative studies also resonate with these expected differences. For instance, Chang et al. (2020) compared the modeling competences of German and Taiwanese students. They found that despite the Taiwanese students being more mathematically knowledgeable, German students were more capable of setting up models than Taiwanese students. Similar observations were made for pre-service teachers in Taiwan whose performance in abstraction and mathematical work was better than in process-oriented tasks (Hsieh et al. 2012).

In contrast, acquiring mathematical knowledge as a product is more important in East Asian mathematics instruction. Instructional content in general, and thus also word problems in particular, would be expected to support the consolidation or application of learned knowledge (Pratt et al. 1999). Memorizing content (often misinterpreted as rote learning without understanding) is valued highly and seen as essential before applying it to problems since a broad knowledge base is understood to be a key to performance (Leung 2001; Pratt et al. 1999). Hence, learning is first oriented towards acquiring mastery or an ideal way of solving, often in a way it was shown by the teacher (Pratt et al. 1999). Repetitive practice with variations is one path to achieving this end (Wong 2006).

Studies indicate that this focus on optimal solution strategies in East Asian mathematics instruction is also relevant when engaging in problem-solving processes like mathematical modeling (Cai 2006). Remarkably, Xu et al. (2022) found that Chinese mathematics teachers prefer to teach modeling in a teacher-centered way (representing the subject-matter expert). It was also reported that in East Asian instruction, the usefulness of solutions is valued over the novelty of solutions (Morris and Leung 2010).

However, another essential aspect of East Asian instruction is the idea that content learned by students should be reflected on because there is no real learning without thinking and reflection (Leung 2001). Under this perspective, teachers are encouraged to reflect on solution strategies or to ask students to reflect on the strategies by themselves so that students can understand why a strategy is considered ideal. Accordingly, one would assume that instructional norms regarding the high-quality use of word problems in Taiwan show a stronger emphasis on reflecting and building on existing, optimal, and efficient solution methods when applying mathematical concepts compared to Germany (Morris and Leung 2010).

In contrast, Western traditions value abilities to flexibly solve mathematical tasks by developing individual strategies according to the task’s characteristics (Morris and Leung 2010). Comparing different solution strategies has indeed been shown to be helpful for flexibly solving tasks in studies in Germany and the USA (Silver et al. 2005; Schukajlow et al. 2015). Hence, exploratory work by students that supports them in finding and comparing individual solutions is recognized to support meaningful learning in Western contexts (Dreher et al. 2021; Leung 2001). In particular, it is considered essential to scaffold students’ interactions with tasks and their multiple solutions (for Germany: Borneleit et al. 2001). Hence, teachers should enable individual experiences, attend to solutions of the learners, compare strategies and use students’ work to establish meaning. Consequently, one would assume that instructional norms regarding the high-quality use of word problems in Germany more strongly reflect the importance of meaningful learning through establishing connections.

To sum up, Western and East Asian instruction, as discussed in this chapter, can be expected to differ in three aspects referring to practicing modeling processes, to reflecting and understanding the efficiency of solution strategies, and to comparing different individual strategies.

3 Research Interest

Although our argumentation shows that there are good reasons that culturally shaped instructional norms may impact what counts as high quality in mathematics instruction, there has been little systematic research on the culture-specificity of instructional norms. This may be a problem for cross-cultural comparative research, where frameworks or instruments are used whose underlying norms often remain implicit. Our study addresses this problem by explicating instructional norms regarding a high-quality use of task potential based on German and Taiwanese researchers’ evaluations of the use of tasks in three exemplary instructional situations. Furthermore, we examine whether the norms are culturally shared or culture-specific.

Against the background of cultural differences and the resulting assumptions about differences in instructional norms, which are in turn expected to influence evaluations of teaching within the cultural contexts, we ask: RQ1) Do mathematics education researchers’ evaluations indicate that there are, as anticipated, culture-specific instructional norms regarding the high-quality use of tasks for mathematical learning in Germany and Taiwan? RQ2) Do the evaluations reflect further differences supporting the assumption of culture-specific influences on conceptions of high-quality use of tasks for mathematical learning?

4 Context and Methods

This study is part of the binational project “Teacher noticing in Taiwan and Germany” (TaiGer Noticing). The project aims to investigate the role of potentially culture-specific norms regarding aspects of instructional quality (Dreher et al. 2021; Lindmeier et al. 2024). To achieve this goal, the project focuses on three different aspects of instructional quality, which are relevant in Taiwanese and German mathematics instruction and may show cultural differences: the use of representations (Dreher et al. 2024), responding to students’ thinking (Dreher et al. 2021), and the use of task potential. This contribution focuses on the last of these aspects.

4.1 Instruments

As instructional norms are often implicit but become evident in instructional situations, we follow a situated approach using representations of practice to elicit instructional norms (Herbst and Chazan 2011). The approach is based on the ethnomethodological notion of a breaching experiment (Mehan and Wood 1975) and indirectly allows inferring instructional norms via a person’s reaction when confronted with the breach of an anticipated norm. Specifically, we use descriptions of instructional situations where teachers use tasks in a way not in line with the behavior expected according to an anticipated instructional norm. If someone evaluates a situation negatively and expresses the corresponding points of criticism, it can be concluded that the person had expected the (breached) instructional norm to hold.

Following this approach, we designed text vignettes that all include breaches of anticipated instructional norms regarding high-quality use of task potential in mathematics instruction from the perspectives of the authors from Germany or Taiwan (Dreher et al. 2021).

When designing the vignettes, a specific focus was on ensuring ecological validity (i.e., the vignettes represent instructional situations that are conceivable in both countries’ secondary mathematics instruction). Hence, developing the vignettes followed a sophisticated concurrent development process conducted symmetrically in Germany and Taiwan. This process was documented in detail by Dreher et al. (2021). It included the consolidation of a joint item development process, the independent development of vignettes in each country, as well as methods of reviewing and translating the vignettes for the intended binational use. As part of this process, the research team also decided to focus on the topic of (linear and quadratic) functions and equations since it is central in the secondary curriculum (grades 7–9) in Germany and Taiwan.

Each vignette consists of 1) a picture of the task implemented in the instructional situation, which is assumed to have a high potential for mathematical learning from the perspective of the authoring research team from Taiwan or Germany, and 2) a fictitious transcript of a classroom interaction (about 200 words). Since the vignettes were designed following breaching experiments, they are supposed to illustrate a non-optimal use of the task’s potential (breach of an instructional norm).

In total, we developed six vignettes focusing on high-quality use of tasks, three in each country. During the development process, we did not specify which kind of task the vignettes should focus on. Only three vignettes finally focused on word problems (see Sect. 2.3), so this report is based on these three vignettes (see Lindmeier et al. 2024, for another vignette with a focus on the use of tasks not built on a word problem). Two of the three vignettes were developed in Germany (Task1, Task2), whereas one (Task4) was developed in Taiwan. Each vignette includes a breach of an anticipated norm from the perspective of the authoring research team from one country but not necessarily from the perspective of the research team from the other country.

Vignette Task2 focuses on the task “cliff-jumping” (topic: quadratic functions, Fig. 1). The task prompts students to engage with a real-world scenario presented as graphically supported text. It requires students to understand the real-world situation, make an educated guess about the solution based on the real-world context and its graphical visualization and then determine the immersion point by a given mathematical model (mathematization and working mathematically). The quadratic equation has two solutions, but only one can be the point of immersion. Hence, the mathematical results need to be interpreted and validated. Thus, the learning potential of this task refers to the known difficulty of students connecting real-world situations and mathematical models. Considering that the highlighted steps correspond to the modeling cycle (Blum and Leiss 2007), the vignette’s authors assume that the task has a high potential to support students to engage in mathematical modeling processes and expect teachers using this task to scaffold the students in connecting their understanding of the situation to the mathematical model and in using their understanding of the situation to validate the results.

Fig. 1 Vignette Task2

The instructional situation in vignette Task2 starts after a group working phase. The teacher collects the students’ guesses and then shifts the focus toward using the given function. In an interactive teaching style, two different ways of finding the roots of the equation are presented verbally (solution formula, factoring). A student mentions that one of the roots is irrelevant as a solution to the word problem. The teacher confirms and redirects the interpretation of the remaining result to a student. The instructional situation is closed with a remark on how to find the roots of quadratic equations. The teacher does not emphasize the relationship between the real-world situation and its mathematical model by using educated guesses or the visualization of the situation. From the perspective of the vignette’s authors, the teacher neglects the task’s potential to support students to engage in modeling processes.
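The validation step that the vignette’s authors find missing can be illustrated with a minimal sketch. The quadratic used here is a hypothetical trajectory chosen for illustration only; it is not the function given in the task in Fig. 1, and the helper name `real_roots` is likewise an assumption. The sketch solves the equation and then uses the real-world context (the jumper moves away from the cliff) to discard the root that cannot be the point of immersion.

```python
import math


def real_roots(a, b, c):
    """Return the real roots of a*x**2 + b*x + c = 0 in ascending order."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    sqrt_disc = math.sqrt(disc)
    # A set removes the duplicate when the discriminant is zero.
    roots = {(-b - sqrt_disc) / (2 * a), (-b + sqrt_disc) / (2 * a)}
    return sorted(roots)


# Hypothetical model (NOT the function from the task):
# height above the water h(x) = -x**2 + 2*x + 8,
# where x is the horizontal distance from the cliff edge.
roots = real_roots(-1, 2, 8)

# Validation against the situation: only a positive distance
# from the cliff edge can be the point of immersion.
immersion_candidates = [x for x in roots if x > 0]

print(roots)                 # both mathematical solutions
print(immersion_candidates)  # the solution that fits the context
```

In this illustrative model, the negative root is a correct solution of the equation but has no meaning in the situation, which is the kind of link between model and context that the task is assumed to make discussable.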

Vignette Task4 focuses on the task “student camp” (topic: systems of linear equations, Fig. 2). The task requires students to understand a real-world scenario presented as text and to set up a system of equations to determine the solution. The potential of this task from the perspective of the Taiwanese authoring team is that the task can support the acquisition of efficient or ideal solution strategies, as it allows for different variable assignments: a) The assignment of x and y as the number of student groups or b) the assignment of x and y as the number of students in line with the question formulated in the word problem. The first approach results in a simpler calculation, while the second approach may be quicker to find for the students because of the congruency with the word problem formulation. Hence, the Taiwanese team members would expect a teacher to discuss the various characteristics of the different variable assignments. A teacher should support students’ reflection on the pros and cons of each solution path to let them understand which one is the most efficient to solve the task.

Fig. 2 Vignette Task4

This expected behavior is breached in the instructional situation represented in the vignette. The teacher presents how to set up the system of equations with the variables as the number of groups (a). The students express confusion, and one asks why the variables were not set to represent the number of students (b). The teacher presents the corresponding approach, asks for preferences, and highlights the first resulting in a simpler calculation. From the perspective of the Taiwanese authoring team, the teacher does not use the potential of the task to discuss the pros and cons of different variable assignments but merely states that the students should go with the first approach because of the simpler calculation.

Vignette Task1 focuses on the task “medicine dosage” (topic: proportional relationship, Fig. 3). The task prompts students to engage with the real-world situation presented by a text and an illustrative image. First, the task requires students to determine a specific dosage for Paul. Further, the task prompts students to think about different solution strategies and find at least two. The potential of this task from the perspective of the vignette’s authors is that it may support students in learning to solve word problems flexibly by actively prompting them to find different ways to solve the problem. Accordingly, when using this task in instruction, the German authors of the vignette would expect the teacher to discuss different solution strategies of the students to support flexible problem-solving and to emphasize connections between the solutions.

Fig. 3 Vignette Task1

In the instructional situation represented in vignette Task1, the teacher first collects the solutions of two students verbally, praises them without specific feedback, and marks them verbally as strategies without further elaboration. A third student’s way of solving based on equations is elaborated on and noted on the blackboard by the teacher. From the perspective of the vignette’s authors, the teacher misses the opportunity to connect the different ways of solving the task. Thus, the teacher fails to focus on meaningful learning because s/he just checks the result for correctness and notes one solution, which is considered a non-optimal use of the task’s potential.

4.2 Sample and Data Collection

To examine whether there are, as anticipated, culture-specific instructional norms regarding the high-quality use of tasks, we invited mathematics education professors in Germany and Taiwan to evaluate the teacher’s use of the task in the exemplary instructional situations. The selection criteria were that they are active in mathematics education research, have a background related to secondary mathematics teaching, and are involved in educating future mathematics teachers. We aimed for a sample size of 15 participants in each country. Because of an assumed completion rate of 50%, a random sample of 30 professors in Germany meeting the criteria was invited. In Taiwan, only 32 persons met the criteria. Hence, all of them were invited. In total, 19 Taiwanese professors (6 female, 13 male) from 10 universities and 17 German professors (7 female, 10 male) from 13 universities worked on the vignettes. Completion rates were similar in both countries (TW 59%, GER 56%). All participants answered each of the vignettes.

On average, the German researchers spent 18.4 years (SD = 9.8) on research in mathematics education and they were involved in teacher education for 15.9 years (SD = 7.6). Eight German researchers also did research in mathematics. The Taiwanese researchers spent, on average, 16.9 years (SD = 7.5) on research in mathematics education. They were involved in teacher education for 11.47 years (SD = 7.3). Five Taiwanese researchers also did research in mathematics.

To elicit the implicit instructional norms held by the researchers, the vignettes were presented in the form of an online survey in the participant’s native language (German or Chinese) with the following prompt: “Please evaluate the teacher’s use of the task in this situation and give reasons for your answer.” The researchers answered in their native language.

4.3 Translation and Analysis

Translation

Subsequently, the national research teams translated the answers into English as the common language within the research team. The other team members reviewed the translations, and an external trilingual person checked each translation. The original answers are attached in the digital supplementary material.

Development of the Coding Manual

Afterward, the German and Taiwanese members of the research team jointly created a coding manual. The coding follows a two-step process for each evaluation. Step 1: Did the researcher criticize the teacher’s use of the task represented in the vignette? (Code 0 = no, 1 = yes; Table 1, codes used across vignettes.) If so, step 2: Did the researcher see the breach of a norm anticipated by the vignette’s authors, or did the researcher give other reasons for critique? (Codes per reason per vignette.)

Table 1 Deductively developed part of the coding manual regarding coding step 1

In step 2, the codes are vignette-specific with a deductively determined code for critique regarding the breaches of the potentially culture-specific anticipated norms (RQ1). The corresponding part of the coding manual is shown in Table 2.

Table 2 Deductively developed part of the coding manual regarding coding step 2

To analyze whether the researchers’ evaluations reflect further cultural differences (RQ2), the coding manual for step 2 was enriched for each vignette via inductive category formation (Mayring 2014). Codes were added for each discernible reason for critique that repeatedly occurred in the answers. Finally, a code for other reasons that occurred only singularly was added. These inductively generated codes are shown in Table 3. The final coding manual contained all codes shown in Tables 1, 2 and 3. All codes are explained in the findings section in detail.

Table 3 Inductively developed part of the coding manual regarding coding step 2

Coding Process

After the coding manual was finalized, independent parallel coding of all answers was conducted in Taiwan and Germany. In detail, every team member (from Germany as well as Taiwan) coded each answer independently. During the coding process, the coders knew the nationality of the participant as it was helpful to code answers where, for instance, critical aspects were not clear from the English translation. If an answer contained more than one reason for a negative evaluation, multiple codes could be assigned to reflect the reasoning comprehensively. Note that the inductively generated codes could also be assigned in addition to those referring to the breaches of the anticipated norm.
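One way to represent the two-step coding with multiple step 2 codes per answer is sketched below. This is an illustrative data structure, not the research team's actual tooling; the class name, field names, code labels, and participant identifiers are all invented for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CodedAnswer:
    participant: str   # hypothetical ID with country prefix, e.g. "GER-A"
    vignette: str      # e.g. "Task2"
    criticized: bool   # step 1: code 1 (True) or 0 (False)
    reasons: set = field(default_factory=set)  # step 2: any number of codes

    def __post_init__(self):
        # Step 2 codes are only assigned when step 1 was coded as critique.
        if self.reasons and not self.criticized:
            raise ValueError("step 2 codes require a step 1 code of 1")


def count_reasons(answers, country):
    """Number of answers per step 2 code for one country."""
    counts = Counter()
    for answer in answers:
        if answer.participant.startswith(country):
            counts.update(answer.reasons)  # each code counts once per answer
    return counts


# Invented example codings; the labels do not come from the coding manual.
answers = [
    CodedAnswer("GER-A", "Task2", True, {"anticipated_norm", "explanations"}),
    CodedAnswer("GER-B", "Task2", True, {"anticipated_norm"}),
    CodedAnswer("GER-C", "Task2", False),
    CodedAnswer("TW-A", "Task2", True, {"explanations"}),
]

print(count_reasons(answers, "GER"))
```

Storing the reasons as a set reflects that an inductively generated code can be assigned in addition to the code for the breach of the anticipated norm, while each code is counted at most once per answer.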

Differences in the coding were first discussed within the national teams, and a consensus coding was established at the national level. This resulted in two independent codings of the complete data set. The interrater reliabilities (Cohen’s kappa) between the German and the Taiwanese coding regarding whether the answers include critique (step 1) and on the level of the vignette-specific codes (step 2) are displayed in Table 4. Agreement was almost perfect (> 0.80, Landis and Koch 1977), except for step 2 for Task4 (moderate agreement). The differences between the two national codings were then discussed by the whole research team, and a final coding was generated.

Table 4 Cohen’s Kappa to assess interrater reliabilities between the independent codings from Germany and Taiwan
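For readers less familiar with the statistic, the following sketch shows how Cohen's kappa can be computed for two independent codings and how the value maps onto the verbal benchmarks of Landis and Koch (1977). The codings in the example are invented for illustration and are not the study's data; the function names are likewise hypothetical.

```python
from collections import Counter


def cohens_kappa(coding_a, coding_b):
    """Cohen's kappa for two codings of the same answers (nominal codes)."""
    if len(coding_a) != len(coding_b) or not coding_a:
        raise ValueError("codings must be non-empty and of equal length")
    n = len(coding_a)
    # Observed agreement: share of answers that received the same code.
    observed = sum(a == b for a, b in zip(coding_a, coding_b)) / n
    # Chance agreement: product of the coders' marginal proportions per code.
    freq_a, freq_b = Counter(coding_a), Counter(coding_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if expected == 1.0:
        # Both coders used one and the same code throughout; kappa is
        # undefined here, and returning 1.0 is a convention of this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def landis_koch_label(kappa):
    """Verbal benchmark following Landis and Koch (1977)."""
    if kappa > 0.80:
        return "almost perfect"
    if kappa > 0.60:
        return "substantial"
    if kappa > 0.40:
        return "moderate"
    if kappa > 0.20:
        return "fair"
    if kappa >= 0.0:
        return "slight"
    return "poor"


# Invented step 1 codes (0 = no critique, 1 = critique) for ten answers.
german_team = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
taiwanese_team = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]

kappa = cohens_kappa(german_team, taiwanese_team)
print(round(kappa, 2), landis_koch_label(kappa))
```

In this invented example the two codings agree on nine of ten answers, yet kappa is lower than the raw agreement of 0.9 because part of that agreement is expected by chance.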

Subsequently, we counted the number of answers per step 2 code. Since there are no established cutoff values for the proportion of agreement needed to identify an instructional norm, we assumed that, to be able to speak of a norm, the members of one context should be largely familiar with it. We therefore applied the following criterion: If at least half of the researchers from one country issued the same reason for critique, we concluded that this reason could be considered to reflect an instructional norm in their context (majority criterion, Dreher et al. 2021).
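The majority criterion can be written down compactly. The sketch below applies it to the counts of critique regarding the breach of the anticipated norm as reported in Section 5, using the sample sizes from Section 4.2. The function names and the labels returned by `classify` are illustrative assumptions, not terminology fixed by the coding manual.

```python
from fractions import Fraction

# Sample sizes per country (Section 4.2).
SAMPLE_SIZE = {"GER": 17, "TW": 19}

# Researchers per country who criticized the breach of the anticipated
# norm, as reported in Section 5.
ANTICIPATED_NORM_COUNTS = {
    "Task1": {"GER": 10, "TW": 11},
    "Task2": {"GER": 9, "TW": 3},
    "Task4": {"GER": 4, "TW": 11},
}


def meets_majority_criterion(count, n):
    """True if at least half of the n researchers gave this reason."""
    return Fraction(count, n) >= Fraction(1, 2)


def classify(counts):
    """Label a norm by the countries in which the criterion is met."""
    countries = [
        country
        for country in ("GER", "TW")
        if meets_majority_criterion(counts[country], SAMPLE_SIZE[country])
    ]
    if len(countries) == 2:
        return "shared"
    if len(countries) == 1:
        return "specific to " + countries[0]
    return "no norm inferred"


for task, counts in ANTICIPATED_NORM_COUNTS.items():
    print(task, classify(counts))
```

Exact fractions are used instead of floating-point percentages so that a count of exactly half the sample would be handled unambiguously.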

5 Findings

5.1 Overview and Organization of the Report

We will describe the findings in detail per vignette in the following way: First, we report how many researchers criticized the teacher’s use of the task represented in the vignette in each country (coding step 1). Second, to answer RQ1, we report whether the majority of researchers per country mention critique regarding the breach of the anticipated norm and illustrate with sample answers how the researchers addressed the anticipated norms. Third, we explain further points of critique as captured by the inductively generated codes by means of sample answers and show their frequency per country. This provides evidence of whether the evaluations reflect further culture-specific differences (RQ2).

5.2 Task2 “cliff jumping”

Regarding the use of the task in this instructional situation, 13 German (76.5%) and 17 Taiwanese (89.5%) researchers issued criticism.

As nine German (52.9%) and only three Taiwanese researchers (15.8%) mentioned critique regarding the breach of the anticipated norm, we infer that this anticipated norm regarding the high-quality use of word problems to support students in engaging in modeling processes is a culture-specific norm in Germany (RQ1).

The sample answer of GER1 (Fig. 4) illustrates this: S/He stated that the teacher largely failed to address mathematical modeling, which is identified as a potential of the task. To justify the critique, GER1 suggested improving the use of the task’s potential through teacher questioning focused on mathematical modeling processes, including a stronger engagement with the context and the educated guesses.

Fig. 4 Example of a German researcher’s answer referring to the anticipated culture-specific norm in vignette Task2

GER2’s answer (Fig. 5) illustrates another critical aspect. S/He was concerned that the context is not used for ruling out the second algebraic solution. As there is no indication that the researcher saw the teacher’s support to connect the understanding of the situation to the mathematical model as deficient, we inductively coded a separate category that was assigned to four cases in total (critique regarding the use of the context).

Fig. 5 Example of a German researcher’s answer mentioning further critique regarding vignette Task2

Similarly, we inductively derived a code regarding the treatment of the educated guesses. It was assigned to answers that criticized how the teacher used the prompt presented in the task without indicating that the teacher’s support of mathematical modeling processes was considered deficient. Remarkably, this code occurred five times in the Taiwanese subsample but only once in the German subsample.

The first sentence of the Taiwanese researcher TW1’s answer (Fig. 6) represents a negative evaluation of the teacher’s use of the task regarding the algebraic explanations: The researcher criticized that the mathematical concepts “equation” and “function” are not used consistently by the teacher and that s/he did not consider asking the students to explain their solution strategies. Researchers from both countries mentioned that the teacher should have paid more attention to the algebraic solution paths, for example, by further elaborating on them or supporting their formulation through written notes. Nine Taiwanese researchers pointed to this aspect, so it is the main point of criticism issued by the Taiwanese participants. In the German answers, such reasons were mainly presented along with reasons indicating that the researchers saw the breach of the anticipated norm.

Fig. 6 Examples of two Taiwanese researchers’ answers mentioning further critique regarding vignette Task2

The Taiwanese researchers generally focused more on aspects of the mathematical content than the German researchers when evaluating the teacher’s use of the task. This can also be seen in the answer of TW2 (Fig. 6). It indicates that the researcher expected the teacher to propose a conclusion regarding using or not using the formula to solve quadratic equations. This critique regarding a lack of conclusion was observed five times in the Taiwanese sample but was not mentioned by any German researcher.

Finally, four German and three Taiwanese researchers evaluated the teacher’s use of the task negatively because of aspects related to heterogeneity, questioning whether weaker students could follow the instruction.

Table 5 summarizes the coding of the researchers’ answers for Task2.

Table 5 Distribution of codes regarding further critiques for vignette Task2

5.3 Task4 “student camp”

The use of the task, as presented in this vignette, was criticized by 18 Taiwanese (94.7%) and 15 German researchers (88.2%). In total, eleven Taiwanese (57.9%) and four German researchers (23.5%) mentioned that the teacher should have discussed the pros and cons of the strategies of assigning variables. Referring to the majority criterion, we can infer that the anticipated norm regarding the use of the potential of the task to discuss the pros and cons of different variable assignments is a culture-specific norm in Taiwan (RQ1).

For example, TW3 (Fig. 7) pointed out that the students could have seen the advantage (pro) of the first approach (being easier to calculate) better if they had compared both sets of equations. S/He particularly emphasized discussing how to set the variables in the different solution paths and criticized the teacher for insisting on a uniform method.

Fig. 7 Example of a Taiwanese researcher’s answer referring to the anticipated culture-specific norm in vignette Task4

Further criticism is, for instance, illustrated by GER3’s answer (Fig. 8). GER3 also saw a benefit for the students in solving the sets of equations. However, in contrast to the Taiwanese researcher TW3, s/he argued that the teacher should finally have considered the “contextual equivalence” of the solutions instead of discussing the pros and cons. So, the answer focuses on understanding the meaning of the results of the calculations regarding the situation in the word problem rather than on effective solution strategies. Seven German researchers mentioned such critique regarding the connections between the different approaches to assigning the variables. Focusing on the equivalence of the two approaches appears to be a German perspective in this situation since none of the Taiwanese researchers mentioned it.

Fig. 8 Example of a German researcher’s answer mentioning further critique regarding vignette Task4

However, researchers from both countries criticized that the students had no chance to set up the equations on their own (critique regarding the teacher setting up the equation, 12 cases in total) and that the teacher was too focused on his or her own solution strategy (critique regarding the teacher’s preference, 12 cases in total). Both critiques were often mentioned in addition to other reasons but also occurred as the only reason.

Compared to the other vignettes, the largest number of other reasons for criticism was found regarding vignette Task4 (four cases). One German researcher (GER4, Fig. 9) related the task to the modeling cycle and criticized an insufficient implementation. S/He also saw an issue in not naming the variable as the task suggests. GER4 mentioned that this leads to an insufficient relation between the mathematical model and the real-world situation.

Fig. 9 Example of a German researcher’s answer mentioning another reason for critique regarding vignette Task4

Table 6 summarizes the coding of the researchers’ answers for Task4.

Table 6 Distribution of codes regarding further critiques for vignette Task4

5.4 Task1 “medicine dosage”

The use of the task, as presented in this vignette, was evaluated negatively by 12 German (70.6%) and 13 Taiwanese researchers (68.4%). The majority of the Taiwanese (11, 57.9%) and the majority of the German researchers (10, 58.8%) noticed the breach of the anticipated norm. Hence, we infer that the anticipated norm regarding the use of the task’s potential to support flexible problem-solving through emphasizing the connections between the solutions illustrated in this vignette can be considered an interculturally shared norm (RQ1). Figure 10 presents two sample answers—one of a German and one of a Taiwanese researcher—that both addressed insufficient support to connect the different solution strategies to learn to solve word problems flexibly (breach of the anticipated norm). The German researcher criticized that the teacher “does not go to the bottom” and elaborated on this as the missed opportunity to relate the strategies in which the researcher saw different degrees of sophistication. The Taiwanese researcher also discussed a missing integration of the solution strategies, which are seen as different methods to deal with proportional relations.

Fig. 10 Examples of a German and a Taiwanese researcher’s answer referring to the interculturally valid norm in vignette Task1

Further criticisms indicated that some researchers considered the explanation of the solution paths insufficient (critique regarding the explanations). Six German and three Taiwanese researchers mentioned this. In four of these cases in Germany, the researchers also mentioned the missing connection between the solutions (breach of the anticipated norm). The sample answer of the German researcher (GER5, Fig. 10) exemplifies this case: After criticizing the teacher for “not going to the bottom”, this evaluation is warranted first by a lack of explanation of S2’s solution. The missing connection of the solution paths is mentioned after that, indicating that the researcher expected the teacher to use the explanation of the strategies to connect them. Finally, one researcher’s answer could not be assigned to any category (other reasons). The coding of the researchers’ answers to Task1 is summarized in Table 7.

Table 7 Distribution of codes regarding further critiques for vignette Task1

6 Discussion and Conclusion

This contribution reports a study that systematically investigated whether anticipated instructional norms regarding the use of tasks in mathematics instruction are culture-specific. Based on prior work on cultural differences in education, these norms were expected to show certain differences across cultural contexts, despite a common understanding of the importance of high-quality use of tasks for mathematical learning.

To explicate and compare the potentially implicit instructional norms, we used a situated approach and contrasted the evaluations of mathematics education professors from Germany and Taiwan regarding specific instructional situations represented by vignettes. These vignettes were designed to include breaches of anticipated norms regarding the use of word problems.

By qualitatively analyzing the researchers’ written evaluations of the teacher’s use of the task in these situations through a multi-step binational coding process, we identified differences and commonalities in the perspectives of the researchers from Germany and Taiwan. Regarding RQ1, we found that the perspectives regarding two situations differed between the cultural contexts in line with the assumptions of the vignettes’ authors. The answers indicate that using a task’s potential to support student engagement in modeling processes is a culture-specific norm in Germany (Task2), and that discussing the pros and cons of different ways of assigning variables is a culture-specific norm in Taiwan (Task4). We did not find any cultural specificity regarding the use of tasks to promote flexible solving by connecting different solution paths. Instead, supporting flexible solving appears to be a shared norm in both countries (Task1).

To answer RQ2, we situate the findings regarding the instructional norms and further critiques as mentioned in the evaluations from Germany and Taiwan in previous research regarding cultural differences. Specifically, we revisit the assumptions regarding differences in the high-quality use of word problems. The German instructional norm elicited with Task2 (supporting students’ engagement in modeling processes focusing on connecting the real-world situation and the mathematical model) aligns with the important status of modeling in Germany (e.g., Sträßer 2019) and the importance of a process orientation in Western countries (Leung 2001). The Taiwanese researchers, in contrast, focused more on algebraic aspects like using or not using a formula when evaluating vignette Task2, in line with the content orientation valued in East Asian mathematics instruction (Leung 2001).

The Taiwanese instructional norm regarding discussing the pros and cons of solution methods elicited with Task4 mirrors especially the expectation of reflection of optimal solution methods (Cai 2006). Remarkably, a similar perspective became visible in some evaluations of Task2, where only researchers from Taiwan criticized the lack of conclusion.

Regarding Task1, the researchers’ perspectives across cultural contexts showed agreement regarding the teacher missing the opportunity to connect the different ways of solving a problem to develop flexibility. It is important to note that interculturally shared norms may be observed for various reasons. We suggest two plausible explanations.

First, developing flexible problem-solving may be considered an instructional aim that can be associated with word problems under a lens of mathematical modeling, but also under a lens of applying mathematical concepts. Connecting different mathematical concepts explicitly is emphasized in Taiwan (Hsieh et al. 2017). It can be seen as one way to reflect and learn to apply mathematical contents flexibly (Leung 2001). Thus, connecting solutions that use different mathematical concepts, the absence of which constitutes the breach of the anticipated norm in vignette Task1, can be seen as beneficial from both a process-oriented perspective of mathematical modeling and a content-oriented perspective on dealing with proportional relations, as the sample answers also illustrate (Fig. 10).

Second, the results may also reflect Western influences on Taiwanese instructional norms. As Morris and Leung (2010) stated, flexible solving is especially emphasized in Western educational traditions but has been widely discussed in the international research community. With growing globalization and as one side effect of international comparative studies, certain trends in education have been observed (Clarke 2013). For example, Taiwanese curricula have been reformed several times under the influence of global trends, while more recently Taiwan’s own educational context has also been explicitly reconsidered (Yang et al. 2022). Hence, conceptions of instructional quality regarding mathematics in Taiwan today may reflect not only traditional perspectives but may also be shaped by Western ideas of instruction (Hsieh et al. 2020).

Among the study’s limitations, we want to draw attention to the limited number of vignettes focusing on word problems: we could use only three instructional situations to compare the researchers’ perspectives. However, given that the overarching study investigated three different aspects of instructional quality, it was not possible to incorporate more vignettes. The sample size is limited, yet it can be considered exceptionally large for a sample of researchers, i.e., experts. Despite the sampling strategy aiming at representativeness, it is, of course, not clear whether the participating mathematics education researchers fully represent the perspectives of the mathematics education researchers in Germany and Taiwan. The nature of the study also allows no conclusion as to whether our findings can be generalized to other Western or East Asian countries, even though the findings largely align with major cultural characteristics as expected.

One may further argue that the applied majority criterion to determine norms within a country is too lenient and that the observed percentages may indicate low agreement among the researchers in one country. This is true to a certain extent. Still, it should be considered that we did not ask researchers directly whether they agreed with certain anticipated norms of instructional quality but instead analyzed their written evaluations of instructional situations where a not entirely obvious breach of the anticipated norm was included. Hence, we considered the majority criterion to be appropriate.

Finally, in this study, we only considered the perspective of mathematics education professors to infer whether the instructional norms regarding mathematics instruction in the two contrasted countries apply as anticipated. This decision was based on their important role in forming instructional norms because of their duties in teacher education and research, but it would be important to investigate whether the findings can be replicated with other relevant groups from the same cultural contexts (e.g., teachers).

Bearing in mind these limitations and the need for further research, our findings indicate that despite mathematics being a relatively homogeneous discipline and growing international consensus regarding important aspects of instructional quality in mathematics education, instructional norms in different contexts with different educational traditions may still differ substantially and in line with cultural differences when looking at specific instructional situations. There is now a certain awareness of the risk that (implicit) instructional norms, for example, those underlying assessment instruments, could impair findings in comparative research (see, e.g., Lindmeier et al. 2024). Nevertheless, systematic investigations of the role of instructional norms are so far rare. Talking about mathematics education in a common research language may easily mask differences at first sight. Thus, in order to draw attention to potentially different instructional norms across cultures, we focused explicitly on making them visible by means of instructional situations in our overarching binational research project with equal participation of researchers from a Western and an East Asian country. Beyond the evidence for influences of culture on norms regarding high-quality use of tasks reported in this article, we made similar observations regarding other aspects of instructional quality, which are widely used in comparative research (student thinking, use of representations: Dreher et al. 2021, 2024; for an overview: Lindmeier et al. 2024). In times of increasingly global mathematics education research endeavors, one of the biggest challenges will be finding ways to address cultural specificities and use them to advance our mutual understanding of conceptions of high-quality teaching.