1 Introduction

Numerous studies have shown that disciplinary behaviour by teachers is often discriminatory. In general, ethnic minority pupils suffer more from disciplinary sanctions than ethnic majority pupils (Aud et al. 2011; Musu-Gillette et al. 2016; Nichols 2004; Nicholson-Crotty et al. 2009; Peguero and Shekarkhar 2011; Petras et al. 2011; Rocque and Paternoster 2011; Skiba et al. 1997, 2002), even when pupil behaviour is experimentally controlled for (Glock 2016; Okonofua et al. 2016). One of the possible explanations for the discriminatory treatment of ethnic minority pupils could be collectively shared stereotypes, understood as "cognitive structures containing the recipient's knowledge, beliefs and expectations of a human social group" (Macrae et al. 1996, p. 42; Hamilton and Trolier 1986). In the US for instance, school and kindergarten teachers perceive African-American children as less capable, less socially competent, and more disruptive (Chang and Demyan 2007; Kumar and Hamer 2012; Minor 2014; Neal et al. 2003; Pigott and Cowen 2000). African-American pupils are also more likely to be rated as "troublemakers" than white pupils even if their behaviour is no different (Okonofua and Eberhardt 2015), and teachers expect more disruptive behaviour of this group in future even if current behaviour is the same as that of white pupils (Kunesh and Noltemeyer 2019). In Germany, preservice teachers judge pupils with Turkish roots less favourably (Froehlich et al. 2016), and they remember disturbing behaviour by these pupils much better than similar behaviour by native pupils (Glock and Krolak-Schwerdt 2014). Hence, negative stereotypes about ethnic minority pupils seem to contribute to discriminatory treatment and judgments by teachers.

In this contribution, we focus on two theoretical approaches in order to understand the empirical results cited. First, they can be understood theoretically following social identity theory (Tajfel 1969, 1970). This theory assumes a strong identification between individuals belonging to the same relevant social group (e.g. ethnic group, socioeconomic class, gender), and assumes that—in intergroup situations—individuals favour members of their in-group over members of their out-group. Stereotypes help them to differentiate between the two groups so that they can favour their own group (Snyder and Miene 1994). Since the vast majority of teachers in most countries have an ethnic majority background, the discriminatory treatment of ethnic minority pupils could mainly be the result of in-group favouritism by ethnic majority teachers, who punish pupils of their in-group more mildly and pupils of their out-group more severely compared to ethnic minority teachers. However, the empirical evidence in the educational context is weak, although some results indicate that ethnic minority pupils are less severely punished by ethnic minority teachers than by ethnic majority teachers (Glock and Schuchart 2020; Lindsay and Hart, 2017). Nevertheless, despite this rather weak empirical support, politicians and educators often call for an increase in the proportion of ethnic minority teachers in schools (Villegas and Irvine 2010).

Other results indicate that ethnic minority pupils are not judged more favourably by ethnic minority teachers and that ethnic minority teachers do not differ from ethnic majority teachers in their punishment of ethnic minority pupils (Bradshaw et al. 2010; Cullinan and Kauffman 2005; Pigott and Cowen 2000; Takei and Shouse 2008). This could be explained in terms of a different theoretical approach, namely system justification theory (Jost et al. 2004), which assumes that it is not only the desire to protect positive social identity that motivates behaviour towards a competing out-group. Following this theory, members of disadvantaged groups (e.g. ethnic minorities) have internalized the legitimization of social hierarchies to such an extent that they prefer the out-group to their in-group and even strive to leave their in-group in order to "move up" to the out-group. This is supported by research showing that members of ethnic minorities often favour the out-group instead of showing in-group favouritism because of their lower status (Jetten et al. 2000; Livingston 2002) and because they have the same stereotypes (Dasgupta 2004). Following this view, there should be no difference between the disciplinary practice of ethnic minority and ethnic majority teachers, and the discriminatory treatment of ethnic minority pupils reported above should be adopted by both groups of teachers. Since the number of studies on the effects of an ethnic match in schools (teachers share the ethnic background of their pupils) on the disciplinary behaviour of teachers is small, we do not yet have a sound evidence base to either justify or reject empirically the above-mentioned demand for a higher proportion of ethnic minority teachers in schools.

In this study, we test various hypotheses which we derive from the two theoretical approaches. Our aim is to investigate whether an ethnic match influences the disciplinary behaviour of teachers. We also analyse the implicit and explicit stereotypes of teachers as one of the explanations for their disciplinary practice. This study makes several important contributions to existing research: (1) We analyse the judgments and sanctions of ethnic minority as well as ethnic majority teachers for both ethnic minority and ethnic majority pupils, and thus all the possible categories for an ethnic match or mismatch. This also allows us to consider an often neglected aspect of in-group favouritism—whether ethnic majority pupils are disadvantaged by ethnic minority teachers. (2) We conduct an experimental study, controlling for the type of misbehaviour by pupils. Most of the studies cited above used administrative or survey data, with the behaviour mostly reported by teachers. (3) The majority of experimental studies in this field focus on one or two types of behaviour and their external validity can therefore be questioned. In this study, we cover a wide range of different types of misconduct in schools, using a scale with 18 different types of misbehaviour.

2 Research background

2.1 Theoretical explanations for in-group- and out-group favouritism

Our general assumption is that membership of a particular social group influences the behaviour towards and the stereotypes about this group (Dasgupta 2004). However, different theories have different perspectives on the function that membership of a social group has for its members: The theory of social identity (Tajfel 1969, 1970; Tajfel and Turner 1986) assumes that individuals try to maintain or enhance positive self-esteem by positively distinguishing their in-group (the social group they belong to) from a competing out-group in intergroup situations in terms of judgment and behaviour (Dasgupta 2004). They tend to perceive and treat “members of the out-group as undifferentiated items in a unified social category, rather than in terms of their individual characteristics” (Tajfel and Turner 1986, p. 283). A stereotypical view of the out-group justifies discriminatory behaviour towards its members. One of the theoretical principles of social identity theory is that it must be possible to positively distinguish the in-group from the out-group. This may be difficult for members of low-status, negatively stereotyped groups (Steele et al. 2002). Following Tajfel and Turner (1986), members of these groups will strive to perceive their in-group more positively, for instance by using other dimensions for comparison or by changing the value of the adjectives assigned (cf. “black is beautiful”). A different strategy is to dissociate themselves from or even to leave their in-group in order to move into a group that is perceived more positively (Tajfel and Turner 1986).

However, it is often the case that individuals cannot easily leave their in-group—for instance, they cannot change their ethnicity. Other theories such as system justification theory assume that there are conflicting motivations that regulate judgment and behaviour towards the in-group and the out-group. Following this approach, it is not only identification with one’s social group but also the legitimization of social hierarchies that plays a role in the perception of the in-group and the out-group (Jost et al. 2004). The members of privileged groups (e.g. the ethnic majority) legitimize their position with positive in-group and negative out-group descriptions such as stereotypes (Jost and Banaji 1994). Their in-group favouritism may be rooted in their desire to legitimize their privileged social position as well as in their desire to achieve and to protect positive self-esteem. For the members of disadvantaged groups (e.g. ethnic minorities), these two motivations work in opposite directions. If they have internalized the legitimization of social hierarchies, their desire to protect positive self-esteem by favouring their in-group may conflict with their acceptance of a social order that assigns them a disadvantaged position (Jost and Burgess 2000). As a consequence, they may have a more positive view of the socially successful out-group than of their own in-group, and this may result in behaviour that favours the out-group and not the in-group.

2.2 Implicit and explicit stereotypes

However, different motivations can contradict each other in such a way that minority members may not want to explicitly admit their negative view of their in-group, and this may lead them to control their explicit stereotypes (Devine 1989; Jost et al. 2004). Access to a level of stereotypes that is largely unconscious and non-controlled can be gained by the measurement of implicit stereotypes (Bargh 1999; Devine 1989). These are defined as "the introspectively unidentified (or inaccurately identified) traces of past experience that mediate attributions of qualities to members of a social category" (Greenwald and Banaji 1995). They are recursively related to explicit stereotypes (i.e. they are the result of explicit stereotypes and influence them; Glock and Böhmer 2018). However, explicit stereotypes can also deviate from implicit stereotypes since the former are subject to the influence of social norms and social desirability (Greenwald and Banaji 1995). Research on stereotypes has shown that while members of ethnic minorities control their stereotypes about their in-group at the explicit level, their implicit stereotypes often correspond to those of the majority (Chasteen et al. 2002; Devine 1989). For example, in some studies, African Americans favoured their in-group at the explicit level while they had even more positive implicit stereotypes about white Americans than white Americans themselves (Ashburn-Nardo et al. 2003; Nosek et al. 2002). The implicit preference for the out-group was accompanied by corresponding behaviour, e.g. a preferred choice of out-group members to cooperate with in a demanding task (Ashburn-Nardo et al. 2003). However, whether the out-group is implicitly favoured more than the in-group depends on various factors, including the status differences between the in-group and the out-group and the extent to which people collectively agree with shared stereotypes (cf. Dasgupta 2004). Moreover, implicit stereotypes influence the motivation to control stereotypical behaviour and the ability to reflect on whether and to what extent implicit stereotypes are related to stereotypical behaviour. Highly motivated individuals may even tend to overcompensate for implicit stereotypes (Wegener and Petty 1997).

2.3 Empirical findings in the school context

Both social identity theory and system justification theory predict that ethnic majority teachers have negative stereotypes about ethnic minority pupils, judge their behaviour as more disruptive and punish them more severely than pupils of their in-group. There is some evidence to support this assumption. Compared to ethnic minority kindergarten teachers, ethnic majority kindergarten teachers are more likely to perceive untidiness, school immaturity and poor ability to follow instructions in children of their out-group (Rimm-Kaufman et al. 2000; see also Downey and Pribesh 2004; Dee 2005). The findings of Bates and Glick (2013) suggest that white kindergarten teachers perceive the behaviour of African-American children (their out-group) as more disruptive than African-American teachers do, although no positive effects were observed for an ethnic match between Latin-American and Asian kindergarten teachers and children. In the study of Takei and Shouse (2008), an unfavourable judgment of the behaviour of African-American pupils by teachers of their out-group applies only to history and English teachers but not to mathematics teachers. McGrady and Reynolds (2013) find that white teachers perceive African-American pupils as less attentive and more disruptive than white pupils, but not as less willing to work hard or as less socially competent (for limited support of this assumption, see also Cullinan and Kauffman 2005). Some studies also show that ethnic majority teachers punish misbehaviour by ethnic minority pupils more harshly than the same behaviour by ethnic majority pupils (Glock and Schuchart 2020; Lindsay and Hart 2017).

Regarding stereotypes and the behaviour of ethnic minority teachers, social identity theory would expect them to favour their own in-group and disadvantage their out-group. Some findings point in this direction. McGrady and Reynolds (2013) show for the judgment of social behaviour in English classes that ethnic minority teachers favour ethnic minority pupils over ethnic majority pupils. However, they did not find this result for three other behaviour judgments in English classes, and they found it for no behaviour judgments at all in mathematics classes. Cullinan and Kauffman (2005) found no in-group favouritism by African-American teachers at any level. Downey and Pribesh (2004) found that both kindergarten and eighth grade African-American children and adolescents benefit from African-American teachers, and in eighth grade, African-American pupils are rated even better by teachers of their in-group for their learning effort than white pupils by white teachers. The results of Dee (2005) indicate that the behaviour and the attention of white pupils are less favourably rated by ethnic minority teachers (see also Glock and Böhmer 2018). There is less evidence for the disciplinary behaviour of ethnic minority teachers towards their in-group. Although the results of Lindsay and Hart (2017) and Glock and Schuchart (2020) indicate that ethnic minority pupils are punished less severely by ethnic minority teachers, a number of studies show no clear support for in-group favouritism among ethnic minority teachers (Alexander et al. 1987; Bradshaw et al. 2010; Cullinan and Kauffman 2005; Pigott and Cowen 2000; Rocque and Paternoster 2011; Takei and Shouse 2008).

This suggests that—for ethnic minority teachers—maintaining a positive social identity may not be the motivation for stereotypes and behaviour or may not be the only motivation. It is in this direction that studies must be interpreted that do not show systematic preference of ethnic minority students by teachers of their in-group. Pigott and Cowen (2000) found that the behaviour of African-American children was more negatively rated by both white and African-American teachers, and an ethnic match had no significant influence on the ratings. Alexander et al. (1987) show that African-American and white teachers, in particular those from privileged families, perceived the behaviour of African-American pupils as more negative than that of white pupils, irrespective of their own ethnic affiliation (see also Dee, 2005). The study conducted by Takei and Shouse (2008) even shows that African-American mathematics and science teachers judge the behaviour of African-American pupils less favourably than white teachers do. However, this result could not be replicated for history and social science teachers. The results of Bradshaw et al. (2010) indicate that an ethnic match between pupils and teachers had no effect on the risk of office disciplinary referral of African-American pupils (see also Abacioglu et al. 2019).

2.4 Summary and research questions

2.4.1 Disciplinary behaviour of teachers and ethnic match

According to social identity theory and system justification theory, differences in the disciplinary behaviour of teachers should depend on whether a pupil belongs to their in-group or to their out-group. Since the proportion of ethnic minority teachers in German schools is still very low (about 6%, Massumi, 2014), the data in our study are from preservice teachers, where the proportion with an ethnic minority background is about 16%. The following research questions and hypotheses are therefore related to preservice teachers. Our first question is:

  1. Does the ethnic match or mismatch between preservice teachers and pupils influence the disciplinary behaviour of preservice teachers?

The few studies on ethnic match and disciplinary behaviour (Abacioglu et al. 2019; Bradshaw et al. 2010; Glock and Schuchart 2020; Lindsay and Hart 2017; Rocque and Paternoster 2011) show heterogeneous results. In the following, we formulate theoretically based hypotheses. Following social identity theory, preservice teachers of the ethnic majority should be expected to punish ethnic majority pupils (members of their in-group) less severely than ethnic minority pupils (members of their out-group), while the opposite should be true of ethnic minority preservice teachers, who should punish ethnic minority pupils (members of their in-group) less severely than ethnic majority pupils (members of their out-group). Compared to ethnic minority preservice teachers, ethnic majority preservice teachers should punish ethnic majority pupils less severely and ethnic minority pupils more severely.

However, following system justification theory, different hypotheses must be formulated for ethnic minority preservice teachers. If they are considered as members of a low-privileged social group, they may have internalized the legitimacy of the social hierarchy. As a result, they may punish ethnic minority pupils (members of their in-group) more severely than ethnic majority pupils (members of their out-group). This theory should not result in different hypotheses for ethnic majority preservice teachers since they are in a socially privileged position compared to ethnic minority teachers. A comparison of disciplinary behaviour by different groups of preservice teachers should therefore not reveal any differences, and ethnic minority as well as ethnic majority pupils should be punished in the same way by both ethnic majority preservice teachers and ethnic minority preservice teachers.

2.4.2 Stereotypes and ethnic match

One reason for the group-specific disciplinary measures adopted by teachers can be stereotypes, and we distinguish between explicit stereotypes (conscious, with control possible) and implicit stereotypes (unconscious, with control not possible). In general, whether implicit and explicit stereotypes are present may depend on whether they concern a participant's in-group or out-group.

Our question is:

  2. Do the stereotypes of preservice teachers about pupils depend on whether pupils belong to their in-group or their out-group?

Not all the studies that focus on the school sector find negative explicit stereotypes among ethnic majority preservice teachers about ethnic minority pupils (e.g. Glock and Böhmer 2018). One reason may be that teachers and preservice teachers are sensitized by public discussion of discrimination against ethnic minority pupils. For this reason, implicit stereotypes could differ more than explicit stereotypes, depending on whether there is an ethnic match or mismatch between teachers and pupils.

Following social identity theory as well as system justification theory, we assume that ethnic majority preservice teachers have negative implicit stereotypes about ethnic minority pupils (their out-group). Social identity theory further predicts that ethnic minority preservice teachers have more favourable stereotypes about ethnic minority pupils (members of their in-group) than about ethnic majority pupils (members of their out-group), and they should thus differ from ethnic majority preservice teachers in this respect. By contrast, some studies outside the educational context that draw on system justification theory have found that explicit stereotypes are in favour of the in-group, but implicit stereotypes are in favour of the out-group (Ashburn-Nardo et al. 2003; Nosek et al. 2002). A further consideration is that ethnic minority preservice teachers may see themselves as people who are going to succeed in gaining access to the privileged positions often held by ethnic majority members as a result of their academic qualifications and their future socioeconomic status. Thus, they may have more implicit negative stereotypes about ethnic minority pupils (their in-group) than about ethnic majority pupils (their out-group) and should not differ in this respect from ethnic majority preservice teachers.

2.4.3 The explanatory contribution of stereotypes to disciplinary behaviour

Although some studies indicate that stereotypes are associated with behaviour (e.g. Dasgupta 2004), evidence is still limited for teacher behaviour. Therefore, in the present study we ask:

  3. Is the disciplinary behaviour of preservice teachers influenced by their ethnic match or mismatch with pupils, and can this be explained by the stereotypes that they have?

Irrespective of the theoretical approach, the extent to which stereotypes can be controlled by individuals is important for hypotheses about the relationship between stereotypes and behaviour. For instance, if behaviour is stereotypical but explicit stereotypes are well controlled, less controllable implicit stereotypes may be more appropriate to explain the behaviour. In contrast, if stereotypes are poorly controlled, implicit as well as explicit stereotypes should contribute to an explanation of discriminating disciplinary behaviour.

3 Method

3.1 Participants

In order to answer our research questions, we conducted an experimental study among 226 preservice teachers at a university in North Rhine-Westphalia, Germany. Due to missing data, the net sample consisted of N = 196 participants. Those who themselves or whose parents were born in Germany and whose family language was German were classified as members of the ethnic majority. Participants who themselves or whose parents were not born in Germany or whose family language was not German were classified as members of an ethnic minority. Of the 196 participants, n = 31 (15.8%) belonged to one of a number of different ethnic minorities. Due to the small group size of each minority, we treated all ethnic minority members as one group. In North Rhine-Westphalia, raising awareness of the situation and language difficulties of ethnic minority pupils in the school system has become an important part of the teacher training curriculum (LABG 2009). Of the respondents, 58% were at an advanced stage in their study programme, but we have no information on whether they had attended a course on this part of the curriculum and, if so, what exactly had been taught.

3.2 Materials

3.2.1 Selection of types of disruptive pupil behaviour

In a first step, types of pupil misbehaviour were selected from existing instruments such as the "Pupils undesirable behaviour questionnaire" (Kokkinos et al. 2004, 2005). We eliminated less appropriate items (e.g. daydreaming, untidy homework) and added items from the questionnaire for the judgment of undisciplined behaviour formulated by Romi and Freund (1999), namely cheating, late arrival and forgery of a parental signature. Much of the behaviour selected for our study can also be found in the questionnaire "Teacher Observation of Child Adaptation-Revised" (TOCA-R, Werthamer-Larsson et al. 1991), on which some of the US studies cited above are based (e.g. Bradshaw et al. 2010; Petras et al. 2011).

3.2.1.1 Pilot study of scenes of misbehaviour

In total, we selected 18 types of behaviour (see Appendix, Table 5). In a pilot study among N = 25 preservice teachers, we asked the participants to judge this behaviour regarding the degree of disturbance on a 5-point Likert scale ranging from 1 (not disturbing) to 5 (very disturbing) and the need for intervention on a 4-point Likert scale from 1 (unnecessary) to 4 (urgently needed). A wide range of types of behaviour was judged as mildly to moderately disturbing (1–12), for example stealing a mobile phone, forging an excuse for missing lessons, cheating or coming to class late. Verbal and physical violence against classmates, teachers and objects was judged as very disturbing (13–18). In these situations, intervention was perceived as necessary to urgent. However, a need for intervention was also seen for all situations perceived as less disturbing, with the exception of repeated forgery of a written parental excuse for absence. Thus, the pilot study showed that the situations described covered a wide range of behaviour classified as disturbing, and that intervention was considered necessary in almost all cases.

3.2.1.2 Presentation in the questionnaire of the situations involving pupil misbehaviour

The 18 types of misbehaviour were described as part of school-related situations (e.g. a pupil pinches his classmate during a lesson so that he yells out). These were performed by 14- to 17-year-old pupils in schools in the context of a "theatre workshop", which we organized and photographed. The protagonists were exclusively male pupils because teachers find their behaviour more disturbing than that of female pupils (see Glock 2016; Bradshaw et al. 2010; Skiba et al. 2011).

3.2.1.3 Protagonists in the scenes

The protagonists had a clearly German or Middle Eastern appearance. They were also given German or Turkish names taken from websites with the most popular first names. We focused on Turkish names because Turkish people represent the largest ethnic minority in North Rhine-Westphalia.

3.2.2 Disciplinary measures adopted by teachers

For each situation of misbehaviour, participants were asked to choose one of six disciplinary response options. These options were adapted to German school practice and arranged in ascending order of strictness. The options were (here for a pupil named Marvin): (1) I don't even react, (2) I talk to Marvin in private and ask him to not repeat such behaviour in future, (3) I rebuke Marvin in front of the whole class in plain language and threaten consequences, e.g. a disciplinary entry in the class register, (4) Marvin gets a disciplinary entry in the class register, (5) I inform Marvin’s parents, (6) I exclude Marvin from the rest of this lesson. We treated the response options as ordinally scaled.

Based on the results of the pilot study, we grouped scenes 1–13 as scenes of mild to moderate misbehaviour (M = 3.31, SD = 1.65, Cronbach's alpha = 0.91) and scenes 14–18 as scenes of severe misbehaviour (M = 4.25, SD = 1.53, Cronbach's alpha = 0.89).

3.2.3 Stereotypes

3.2.3.1 Explicit stereotypes

The participants were asked to judge a Turkish and a native protagonist in the scenes using a semantic differential. The scale included 6 bipolar adjectives, always in the same order, with the positive pole (e.g. peaceful) corresponding to 1 and the negative pole (e.g. aggressive) to 5. Cronbach's alpha for the scale including all 6 adjectives is α = 0.92, M = 2.68, SD = 0.42.
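
As a reminder of how the reliability coefficient reported here is obtained, the following is a minimal sketch of Cronbach's alpha for a six-item scale. The ratings are invented for illustration and are not the study's data; the function name `cronbach_alpha` is likewise hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item scores.

    rows: one list of item scores per respondent, all of equal length.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of each respondent's sum score
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings on 6 bipolar adjectives
# (1 = positive pole, 5 = negative pole); NOT data from the study
ratings = [
    [2, 2, 3, 2, 2, 3],
    [3, 3, 3, 4, 3, 3],
    [1, 2, 1, 1, 2, 1],
    [4, 4, 5, 4, 4, 4],
    [2, 3, 2, 2, 3, 2],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 3))
```

Alpha approaches 1 when respondents who rate a protagonist negatively on one adjective also do so on the others, which is what a value such as 0.92 indicates.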

3.2.3.2 Response latency as an indicator of implicit stereotypes

We measured the response latency of each judgment in milliseconds with the program "Inquisit", assuming that the more time participants need for their judgment, the less accessible this judgment is and the more likely it is that their implicit stereotypes will deviate from their explicit stereotypes (Fazio 1990). Cronbach's alpha for the total scale of the reaction times is α = 0.90, M = 2754.71, SD = 1133.57.

3.3 Procedure and design

The 18 scenes were presented to each participant in an online questionnaire with a description (for example, scene 10: Marvin pinches his classmate so that he yells out; see Table 1). Participants were then asked to choose the most appropriate disciplinary measure. Following the presentation of all the scenes, participants were presented with photos of the protagonists and asked to judge each protagonist using the semantic differential. Finally, they answered questions about their own ethnic background (see 3.1).

Table 1 Disciplinary behaviour of participants and ethnic match (regression coefficients, robust standard errors; level of analysis: disciplinary measures)

Each participant was shown each scene, performed by either a German or a Turkish protagonist. We created two sets of scenes, differing in the ethnic background of the protagonist. Thus, the scene that was performed by a Turkish protagonist in Set 1 was performed by a German protagonist in Set 2 and vice versa. This ensured that in the overall dataset, each behaviour was performed once by a Turkish protagonist and once by a German protagonist, and also that each participant was shown each scene once.
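
The counterbalancing described above can be sketched as follows. Which scenes had a Turkish protagonist in Set 1 is not stated in the text, so the odd/even split below is purely an assumption for illustration, as are the names `build_sets`, `set1` and `set2`.

```python
SCENES = range(1, 19)  # the 18 misbehaviour scenes


def build_sets(turkish_in_set1):
    """Return two mirrored mappings from scene number to protagonist background."""
    set1 = {s: ("Turkish" if s in turkish_in_set1 else "German") for s in SCENES}
    # Set 2 swaps the protagonist's background for every scene
    set2 = {s: ("German" if bg == "Turkish" else "Turkish") for s, bg in set1.items()}
    return set1, set2


# Assumption for illustration only: odd-numbered scenes are Turkish in Set 1
set1, set2 = build_sets({s for s in SCENES if s % 2 == 1})

# Across both sets, every scene is shown once with each background
for s in SCENES:
    print(s, set1[s], set2[s])
```

With nine scenes per background in each set, a participant sees every behaviour exactly once, while the pooled data contain every behaviour with both a Turkish and a German protagonist.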

3.3.1 Ethnic match

There was an ethnic match if an ethnic minority participant was shown a scene with a Turkish protagonist or if an ethnic majority participant was shown a scene with a German protagonist. The opposite, an ethnic mismatch, was the case if an ethnic minority participant was shown a scene with a German protagonist or if an ethnic majority participant was shown a scene with a Turkish protagonist.
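
The coding rule can be written out as a small function. This is a sketch of the definition given above; the function name and the string labels are hypothetical and chosen only for readability.

```python
def is_ethnic_match(participant_group, protagonist):
    """Code one judgment as an ethnic match (True) or mismatch (False).

    participant_group: "minority" or "majority" (hypothetical labels)
    protagonist: "Turkish" or "German" (hypothetical labels)
    """
    matches = {("minority", "Turkish"), ("majority", "German")}
    mismatches = {("minority", "German"), ("majority", "Turkish")}
    pair = (participant_group, protagonist)
    if pair in matches:
        return True
    if pair in mismatches:
        return False
    raise ValueError(f"unknown combination: {pair!r}")


# The four possible cells of the design
for group in ("majority", "minority"):
    for protagonist in ("German", "Turkish"):
        print(group, protagonist, is_ethnic_match(group, protagonist))
```

Note that, because all ethnic minority participants are treated as one group, a match for a minority participant means a Turkish protagonist regardless of the participant's own specific background.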

3.4 Methodological approach

The data had a multi-level structure as a total of 196 participants had to choose a disciplinary measure for 18 different scenes, 9 of which were performed by an ethnic minority pupil and 9 by a native pupil. Thus, N = 3528 disciplinary measures (1764 for scenes with an ethnic minority pupil) were nested in 196 participants. We collapsed the 18 scenes into the two categories “moderate misbehaviour” and “severe misbehaviour”.

In order to answer our research questions, we ran regression models. We took account of the multi-level structure methodologically by correcting standard errors with the option "robust cluster" in the statistics program Stata 15.
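
For readers unfamiliar with this correction, the general idea of cluster-robust standard errors can be sketched for a linear model. This is an illustration under assumptions: the text does not specify the model family, and the notation below is not taken from the source. Judgments by the same participant are allowed to be correlated, so residuals are aggregated within each participant before the variance is computed.

```latex
% Illustrative notation, not taken from the source:
%   g = 1, ..., G indexes participants (clusters), here G = 196
%   X_g        design-matrix rows for participant g's 18 judgments
%   \hat{u}_g  residual vector for participant g
\hat{V}_{\mathrm{cluster}}(\hat{\beta})
  = (X^{\top}X)^{-1}
    \left( \sum_{g=1}^{G} X_g^{\top}\hat{u}_g\hat{u}_g^{\top}X_g \right)
    (X^{\top}X)^{-1}
```

The coefficient estimates themselves are unchanged by this correction; only their standard errors are adjusted. Statistical software typically also applies a small-sample factor to this expression.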

4 Results

4.1 Question 1: Disciplinary practice and ethnic match

In question 1 we addressed the disciplinary practice of preservice teachers as a function of the ethnic match. Table 1 shows the results for preservice teachers with an ethnic minority or majority background (columns 1–2) and for every combination of ethnic match (columns 3–6). Following theoretical approaches, we assumed that participants with an ethnic majority background punish pupils of their in-group more mildly than pupils of their out-group. However, the results show that this group does not punish pupils from different ethnic groups differently, and this holds true both for moderate and for severe misbehaviour.

Various hypotheses were derived regarding the disciplinary practice of ethnic minority preservice teachers towards pupils of their in-group. As Table 1 (column 2) shows for moderate misbehaviour, ethnic minority pupils were punished more mildly by this group than ethnic majority pupils. No differences were found for severe misbehaviour. In order to investigate whether ethnic minority pupils indeed benefited from ethnic minority teachers, we compared teachers from different groups.

This comparison (columns 3–6) indicates that ethnic minority pupils did not benefit from ethnic minority preservice teachers and were treated the same by preservice teachers of both their in-group and their out-group. However, for moderate misbehaviour, ethnic minority teachers treated pupils of their out-group more harshly (column 5) than ethnic majority preservice teachers treated pupils of their out-group (= reference group). Thus, ethnic majority pupils were disadvantaged by teachers of their out-group, but they seemed not to benefit from teachers of their in-group (column 4). However, this result could not be found for severe misbehaviour.

4.2 Question 2: Stereotypes and ethnic match

4.2.1 Explicit stereotypes

The general results have already shown that ethnic majority pupils were perceived less favourably than ethnic minority pupils. This also applied to different groups of participants. Both ethnic minority and ethnic majority preservice teachers (Table 2, columns 1–4) had more positive explicit stereotypes about ethnic minority pupils than about ethnic majority pupils.

Table 2 Stereotypes and ethnic background of pupils and participants (results of t-tests)

A direct comparison of participants showed that ethnic minority pupils (column 5) were judged the same by ethnic majority and ethnic minority preservice teachers. However, compared to ethnic majority teachers, ethnic minority teachers had slightly more negative stereotypes about ethnic majority pupils (column 6).

4.2.2 Response latencies

Ethnic majority preservice teachers (Table 2, columns 1 and 2) needed on average more time to judge pupils of their in-group than pupils of their out-group. This suggests that these participants controlled their (less positive) stereotypes about their in-group more than their stereotypes about their out-group. A direct comparison of the two groups of preservice teachers (columns 5 and 6) indicates that ethnic majority preservice teachers needed more time than ethnic minority preservice teachers to judge both ethnic minority and ethnic majority pupils. Moreover, ethnic minority preservice teachers needed the same time to judge ethnic majority and ethnic minority pupils (columns 3 and 4). Thus, their explicit stereotypes seem to be less controlled than those of ethnic majority preservice teachers.

4.2.3 Relationships between explicit stereotypes and response latencies

Table 3 shows the correlations between the explicit stereotypes and the response latencies for the different pupil groups, again broken down according to the ethnic background of the participants. For ethnic majority preservice teachers, no correlations were found between the explicit stereotypes about pupils with different ethnic backgrounds and the corresponding response latencies. The interpretation given above, that these participants controlled their judgment of pupils of their in-group better, must thus be treated with caution. For ethnic minority preservice teachers, the significant correlations indicate that response latency decreased with an increase in negative stereotyping of pupils of their in-group. The same relationship could not be found for the judgment of pupils of their out-group. This indicates that a less favourable judgment of ethnic minority pupils is closer to the implicit stereotypes of ethnic minority preservice teachers.

Table 3 Correlations between explicit and implicit stereotypes and response latency of participants according to the ethnic background of pupils (level of analysis: individuals)

4.3 Question 3: Ethnic match, stereotypes and sanctioning behaviour for mild to moderate misbehaviour

The results have shown so far that ethnic minority participants, but not ethnic majority participants, treated pupils of their in-group and their out-group differently in the case of mild to moderate misbehaviour. For this reason, the following analyses focus on the reactions of ethnic minority participants to pupils whose misbehaviour was moderate. Only the significant results will be reported.

In Table 4, we try to predict the harsher treatment of ethnic majority pupils compared to ethnic minority pupils by ethnic minority preservice teachers by introducing into the model stereotypes and the interaction between stereotypes and the ethnic background of pupils. Looking at model 1, it can be seen that the previously observed effect of pupils' ethnic background on the disciplinary behaviour of ethnic minority preservice teachers (Table 1) is explained by the interaction of the explicit stereotypes about ethnic minority pupils and the ethnicity of the misbehaving pupil: With increasing explicit stereotypes about their in-group, participants belonging to an ethnic minority tend to punish pupils of their out-group more severely (0.46 +). However, the effect is very small.

Table 4 Regression of disciplinary measures of ethnic minority participants on stereotypes for moderate misbehaviour (regression coefficients, robust standard errors; level of analysis: disciplinary behaviour)

Next, the response latencies and their interaction with the ethnic background of pupils are considered (Table 4, model 2): Ethnic majority pupils are more harshly punished (0.70**). There is no direct effect of the response latencies. The interaction coefficient (significant at the 10% level) indicates that with increasing response latencies for stereotypes about ethnic minority pupils, ethnic majority pupils were punished less severely (negative sign). The correlation analyses have already shown that a more positive judgment of the in-group (= ethnic minority pupils) correlates with longer response latencies of ethnic minority preservice teachers and vice versa. This means that the more negatively connoted the implicit stereotypes of ethnic minority preservice teachers about ethnic minority pupils are (= shorter response latencies), the more severely they punish ethnic majority pupils. However, R² suggests a low explanatory power of stereotypes and response latencies in both models.
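The reading of the interaction in model 2 can be made explicit with a schematic equation. The notation is an assumption introduced here for illustration and is not taken from Table 4: D denotes the disciplinary measure, Maj is a dummy equal to 1 for an ethnic majority pupil, and RL is the response latency for stereotypes about ethnic minority pupils.

```latex
\begin{align}
  D &= \beta_0 + \beta_1\,\mathit{Maj} + \beta_2\,\mathit{RL}
       + \beta_3\,(\mathit{Maj} \times \mathit{RL}) + \varepsilon \\
  D(\mathit{Maj}=1) - D(\mathit{Maj}=0) &= \beta_1 + \beta_3\,\mathit{RL}
\end{align}
```

Under this specification, the reported coefficient of 0.70 corresponds to \beta_1, that is, the harsher treatment of ethnic majority pupils at the reference value of RL (zero, or the sample mean if the variable was centred, which the text does not specify). Because the reported sign of \beta_3 is negative, this difference shrinks as response latencies increase and grows as they decrease, which matches the interpretation given above.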

5 Discussion

In our research, we applied different theoretical approaches to the investigation of in-group and out-group bias and underlying stereotypes about the ethnic background of pupils. According to social identity theory (Tajfel, 1969, 1970; Tajfel and Turner 1986) as well as system justification theory (Jost et al. 2004), ethnic majority preservice teachers should punish pupils of their in-group less severely and, at least implicitly, judge them more favourably than pupils of their out-group. Our analyses did not confirm these hypotheses. Participants belonging to the ethnic majority judged ethnic minority pupils even more positively than pupils of their in-group. The response latencies showed no clear indication that implicit stereotypes deviated from explicit stereotypes in a systematic way. The more favourable judgment of ethnic minority pupils was not reflected in more favourable treatment of misbehaviour by ethnic minority pupils, since here ethnic majority participants treated pupils of their in-group and their out-group in the same way. Our results therefore support those studies which have also not found indicators of an in-group/out-group bias among ethnic majority teachers (Alexander et al. 1987; Bradshaw et al. 2010; Cullinan and Kaufmann, 2005; Pigott and Cowen, 2000; Rocque and Paternoster, 2011; Takei and Shouse, 2008).

However, our results contradict the findings of an experimental study by Glock (2016) that identified more severe punishment of ethnic minority students by ethnic majority preservice teachers (ethnic majority pupils were not part of that study). One reason could be that in Glock’s study only one scenario of misbehaviour was selected, whereas our dependent variables consist of a number of different types of behaviour. One of the strengths of our study is that we measured the reaction of preservice teachers to 18 different scenes of misbehaviour by pupils. The differences observed by Glock (2016) could be due to the selection of a scene that simply by accident yields a difference between the punishment of ethnic minority and ethnic majority pupils. Overall, ethnic majority teachers in our study seemed motivated to see ethnic minority pupils in a positive light and to treat all pupils in the same way. One explanation could be that in North Rhine-Westphalia (the federal state in which our study was conducted), raising awareness of the situation and language difficulties of ethnic minority pupils in the school system has become an important part of the teacher training curriculum (LABG, 2009). Fifty-eight percent of the participants in our study were at an advanced stage in their study programme, and this part of the curriculum may have already had an effect on their judgments and their behaviour. It is an open question whether the recent attention given to minority pupils in the teacher training curriculum of most federal states (Becker-Mrozeck et al. 2017) has led in general to less-stereotyped judgments about and behaviour towards ethnic minority students by preservice teachers. Even though such a development would be assessed as positive in principle, it must be questioned whether such judgments and such behaviour are maintained beyond the period of teacher training (Kumar and Hamer, 2012).

We have significantly contributed to existing research by systematically investigating all the combinations of ethnic match between teachers and students. We were therefore able to investigate the behaviour of ethnic minority preservice teachers towards ethnic minority and ethnic majority pupils in more detail than many previous studies in this field. Following social identity theory, more favourable judgments and less severe punishment of ethnic minority pupils by ethnic minority teachers were expected. Our results show that ethnic minority participants judged ethnic minority pupils more positively and treated them less harshly than ethnic majority pupils. However, when the comparison with ethnic majority participants was taken into account, it became clear that ethnic minority pupils were judged and treated in the same way by teachers of their in-group and their out-group. As is the case in other studies (Bradshaw et al. 2010; Dee 2005; Pigott and Cowen 2000; Takei and Shouse 2008), we could not find support for the assumption drawn from social identity theory that ethnic minority pupils benefit from ethnic minority preservice teachers. In fact, the results seem to partly confirm the hypothesis drawn from system justification theory since ethnic minority preservice teachers seemed to explicitly favour their in-group, but tended to implicitly have a more negative view of their in-group. However, this did not lead to preference being given to pupils of their out-group as other studies suggest (Ashburn-Nardo et al. 2003): The more negative explicit and implicit stereotypes ethnic minority preservice teachers had about moderately misbehaving pupils of their in-group, the more severely they punished pupils of their out-group. If we reverse the reference group of the pupils’ ethnic background, we can also say that the more negative explicit and implicit stereotypes ethnic minority preservice teachers had about moderately misbehaving pupils of their in-group, the less severely they punished them.

One interpretation of this finding could be that ethnic minority preservice teachers felt the desire to compensate for a negative view of their in-group. Wegener and Petty (1997) have pointed out that individuals are able to compensate for their implicit negative stereotypes by better treatment of the negatively stereotyped group. The design of our study does not allow us to conclude whether ethnic minority preservice teachers compensate for a more negative view of their in-group with better treatment of their in-group, harsher treatment of their out-group, or both. Since we know from the comparison with ethnic majority teachers that differences between these two groups are to the disadvantage of ethnic majority pupils, this could suggest that ethnic minority preservice teachers in our study tended to act against the implicitly preferred group. However, regardless of its interpretation, this finding should be treated with caution since the interaction coefficient was significant only at p ≤ 0.10. Moreover, a significant coefficient was found only for moderate but not for severe misbehaviour. One reason for this finding could be that the situations we described as "moderate" were those that often occur in class and at school and which can be interpreted differently by teachers. In contrast to these, the situations described as "severe" refer to extreme and at the same time less frequent behaviour, resulting in a higher degree of agreement on the required reaction (Glock, 2016).

6 Limitations

This study has some limitations. While the pupils shown in the scenes had Turkish names, two thirds of the participants with an ethnic minority background did not belong to this group. This raises the question of whether these participants perceived Turkish ethnic minority pupils as their relevant in-group. For instance, Glock and Schuchart (2020) and Kleen et al. (2019) show that ethnic minority preservice teachers only favour ethnic minority pupils if these pupils belong to their particular ethnic group. Thus, stereotypes and disciplinary behaviour towards ethnic minority pupils in particular may have been different if we had been able to differentiate between ethnic groups among our participants (e.g. Bates and Glick 2013). However, there are plausible reasons to assume that ethnic minority participants tend to perceive ethnic majority pupils as their out-group. In our study, ethnic minority participants themselves and/or their parents had immigrated to Germany. Some participants had Turkish roots, but the majority or their parents were from countries which traditionally supplied Germany with immigrant workers (Italy, Greece, Portugal) or from countries of the Global South. These ethnic groups are not considered to be fully assimilated or acculturated (see e.g. Diehl et al. 2016). One indication of this is that 60% of the ethnic minority participants but only 12% of the ethnic majority participants stated that "many" or "all" of their friends had an immigrant background. It can therefore plausibly be assumed that ethnic majority pupils were more likely to be perceived as an out-group by ethnic minority participants.

Implicit stereotypes could not be adequately measured by response latencies since participants were not required to evaluate pairs of adjectives but to respond to a semantic differential. Recording implicit stereotypes by reacting to pairs of adjectives (Fazio 1990) might have provided better measurement of the implicit stereotypes among preservice teachers and might also have more effectively contributed to accounting for the discriminatory behaviour of ethnic minority preservice teachers (Glock and Böhmer 2018). Furthermore, employing an Implicit Association Test (Greenwald et al. 1998) in order to measure implicit stereotypes (Greenwald and Banaji 1995) as well as assessing stereotypes explicitly using a Likert scale (Banaji and Greenwald 1995) would have allowed us to test the assumptions of dual process models such as the Reflective Impulsive Model (Strack and Deutsch 2004), which assume that all behaviour and judgments are the result of both automatic and controlled components.

A further limitation stems from the combinations of pupil misbehaviour and ethnic background. As all the participants judged ethnic minority and ethnic majority pupils, they had an opportunity to vary their judgments in the light of the context given (Bless and Schwarz 1998), as teachers always do in the classroom context (Trautwein et al. 2006). Hence, in a context in which the ethnicity of the pupils is held constant, the findings could be different, as participants might not feel the need to control their stereotypes and to vary their judgments accordingly.

Although the advantage of experimental studies is that participants are randomly assigned to pupils and in this way frequent selection problems (see e.g. Neugebauer, 2011) are avoided, experimental situations have little to do with reality. This can be illustrated with reference to three aspects: a) Stereotypical behaviour becomes more likely when people find themselves in stressful situations in which they can draw on readily accessible knowledge and experience (Crocker et al. 1998; Meister and Melnick, 2003). The participants in our study had sufficient time to reflect on the situation and to decide on judgments and disciplinary measures. b) Okonofua and Eberhardt (2015) and Vavrus and Cole (2002) have pointed out that harsh and discriminatory sanctioning behaviour by teachers is time-dependent. This means that in particular a repetition of disruptive behaviour is more likely to result in more severe punishment of ethnic minority pupils. This time dependency was not modelled in our study. c) Disruptive behaviour by pupils and disciplinary behaviour by teachers always take place in the social context of a class or school. The higher the proportion of ethnic minority pupils in a class, the more severe the disciplinary measures adopted and the more likely it is that there is discrimination against ethnic minority students (see e.g. Rocque and Paternoster, 2011, but contrary findings in Bradshaw et al. 2010). Some studies also point to the existence of disciplinary cultures in schools which are all the more severe the higher the proportion of ethnic minority pupils (Rocque and Paternoster, 2011; Welch and Payne, 2010). The classroom and school contexts are difficult to model in experimental studies.

7 Conclusions

Even if these limitations question the generalizability of our study and the transferability of the results to practice, it can cautiously be concluded that the desire to maintain a positive social identity may not be the only individual motivation that regulates the behaviour and judgment by (preservice) teachers of pupils of their in-group and their out-group. We have found no empirical support for the view that the judgments and behaviour of ethnic majority teachers are biased. What is needed is a closer examination of the individual and of the situational conditions under which discriminatory behaviour in terms of in-group or out-group favouritism is more likely.

As far as teacher training is concerned, it can be stated that overemphasis on differences related to ethnic minority status is accompanied by the assignment of an inferior status to ethnic minorities as a group. This not only emphasizes the focus on deficits but may also make it easier for ethnic minority preservice teachers to legitimize unjust punitive measures against ethnic majority pupils. What is needed instead is the strengthening of a professional perspective of diversity that focuses on individual needs and strengths. From this point of view, fair treatment of ethnic minority pupils is a matter of teacher professionalism and not a matter of ethnic affiliation.