1 Introduction

For adolescents of immigrant descent, being stereotyped and experiencing discrimination can undermine academic and socioemotional adjustment (Berry et al., 2006; Kunyu et al., 2020; Titzmann et al., 2011). One approach to understanding this harmful process is through considering social identity. Social identity theory emphasizes that individuals can derive a positive sense of self through identification with, and a sense of belonging to, a particular social group (Tajfel & Turner, 1986). When social identities are devalued and threatened due to stereotyping or discrimination, however, this can compromise adolescents’ academic experiences (Verkuyten et al., 2019). Studies consistently show that discrimination based on social identities (i.e., being harassed due to perceived social group membership based on ethnicity-race, gender, disability, etc.) is linked to greater distress and poorer academic performance (Russell et al., 2012).

To counter social identity threats, interventions such as brief self-affirmation writing tasks that focus on affirming values were developed (Cohen & Sherman, 2014; Cohen et al., 2006; Steele et al., 2002). Studies testing this type of intervention have proliferated, culminating in a recent meta-analysis of 58 studies in education (Wu et al., 2021). Notably, most studies were from the US except for nine (15%). The results showed that for identity-threatened students, i.e., students potentially experiencing social identity threat because of their group identity and associated negative stereotype, there was an overall positive and significant average affirmation effect (Hedges’ g = 0.15, a medium effect according to benchmarks for educational interventions, Kraft, 2020). For identity non-threatened students, the overall average affirmation effect was small and not significantly different from zero (Hedges’ g = 0.01). Importantly, this meta-analysis also showed moderate to high heterogeneity of effects for identity-threatened students. Consequently, the next generation of testing brief self-affirmation writing tasks should continue to pinpoint factors that lie behind these variations (Easterbrook et al., 2021; Hanselman et al., 2017; Wu et al., 2021). The aim of our longitudinal field experiment, then, is to test whether a randomized trial of a brief self-affirmation writing intervention protects against poorer school-related adjustment for identity-threatened students (adolescents of immigrant descent who may be more likely to experience being the target of negative stereotypes and discrimination) vs. identity non-threatened students (adolescents of non-immigrant descent). In addition, tapping both into student characteristics and context conditions, we consider heritage group, experiences of discrimination, and classroom cultural diversity climate as potential moderators that may explain the heterogeneity of effects found in previous research.

1.1 Being a target of stereotypes and discrimination

The experience of being the target of negative stereotypes and living with stereotype threat, the fear of confirming a negative stereotype about one’s group (Steele & Aronson, 1995), can be stressful. One of the most prominent stereotypes of adolescents of immigrant descent in Germany is that they are less competent academically (Froehlich et al., 2022), even when their competence is not lower than that of their non-immigrant peers (Zander et al., 2014). Pre-service teachers who read scenarios of students that confirmed stereotypes (e.g., a student of Turkish heritage performing low academically) vs. scenarios that disconfirmed stereotypes (e.g., a student of Turkish heritage being the best student in the class) were more likely to agree with statements that were negatively biased against students of immigrant descent (Glock & Krolak-Schwerdt, 2013). Pervasive negative stereotypes of immigrant students in Germany and throughout Europe, particularly of those with a Turkish or Muslim background, increase the risk that these adolescents experience stereotype threat and also more explicit discrimination (Baysu & Phalet, 2019; Frankenberg et al., 2013; Vedder et al., 2007). When stereotype threat was activated, Turkish-heritage adolescents in Germany performed worse on math tasks (Martiny et al., 2014). However, subsequent meta-analyses show that these effects are not strong, especially under real-life conditions (Appel et al., 2015; Froehlich et al., 2022; Shewach et al., 2019).

Related to (implicit) stereotype threat, explicitly experiencing stereotyping and discrimination in the school context, not just from students but also from teachers, can contribute to disengagement from school and negatively affect academic achievement (Baysu & Phalet, 2019; Cohen et al., 2009; Easterbrook et al., 2021). A meta-analysis of 67 studies found that teacher-based racial-ethnic discrimination relates to lower well-being and academic performance among students (Civitillo et al., 2023). Taken together, being the target of stereotypes and discrimination in school by classmates and teachers undermines positive academic and socioemotional adjustment if there is nothing to offset the negative effects.

1.2 Self-affirmation theory and intervention

Self-affirmation theory suggests that if an adolescent is threatened (i.e., devalued) in an important area of the self (such as being the target of pervasive, negative stereotyping concerning academic performance), then affirming a different area of the self (e.g., reminding adolescents of the important values that they hold) can defuse the threat by reinforcing self-integrity, promoting a sense of belonging, and encouraging a more expansive view of the self (Cook et al., 2012; Sherman & Cohen, 2020; Steele et al., 2002). Therefore, self-affirmation is an important psychological intervention, especially for identity-threatened students.

Brief self-affirmation writing interventions are short writing exercises where students reflect on values that are important to them. These self-affirmation interventions have effectively defused threat to the self and, importantly, prevented academic declines for ethnic minority students in the United States (Cohen et al., 2006, 2009; Sherman et al., 2013; Yeager & Walton, 2011). While subsequent studies did not always show effects (e.g., de Jong et al., 2016; Hanselman et al., 2017), a study in England showed that another “underperforming” stereotyped group, students of low SES, benefited from a self-affirmation writing intervention such that it reduced the academic performance gap with high SES students by 62% (Hadden et al., 2020). As noted above, while effects are not evident in every study, a recent meta-analysis of this type of intervention in education shows an overall positive, significant effect for identity-threatened students (Wu et al., 2021). Therefore, in our study, we hypothesize that the intervention will benefit school-related adjustment for adolescents of immigrant descent (who are assumed to experience higher stereotype threat, especially as they are also more likely to have lower SES, SVR-Forschungsbereich, 2016), but not for those of non-immigrant descent (who are assumed to experience lower stereotype threat).

To date, studies and the recent meta-analysis show heterogeneity of effect sizes, suggesting that potential moderators should be taken into account for a more precise understanding of for whom and under what conditions the intervention may be effective (Easterbrook et al., 2021; Hanselman et al., 2017; Wu et al., 2021). For instance, implementation by teachers is important as it conveys that the school environment reinforces students’ affirmed views. However, European research also shows the opposite: Whereas an intervention administered by researchers was effective amongst Turkish- and Arab-heritage German students, at least for short-term follow-up (8 weeks after the intervention was administered) (Lokhande & Müller, 2019; Müller & Lokhande, 2017), a study targeting Turkish- and Moroccan-Dutch students where the intervention was administered by teachers showed no effect on school achievement or problem behavior (de Jong et al., 2016). Nonetheless, in the meta-analysis, effect sizes were greater when the affirmation task was administered by teachers as part of a regular classroom assignment rather than by researchers (Wu et al., 2021). Such variations are important to consider and empirically test.

1.3 Potential moderators producing heterogeneity of effects

Theorized heterogeneity (i.e., explanations for variations in the effectiveness of the intervention based on theory) is necessary to move the field of self-affirmation interventions forward (Easterbrook et al., 2021; Hanselman et al., 2017). Therefore, we focus on three theoretically important moderators that capture student characteristics and context conditions: heritage group, experiences of discrimination, and classroom cultural diversity climate.

One important potential moderator is heritage group. Attention to variation within the broad grouping of adolescents of immigrant descent is necessary to better pinpoint for whom protective factors may be more likely to work. Although large-scale studies of academic achievement report that being an immigrant predicts poorer achievement (e.g., PISA, TIMSS), these findings obscure important variations within and between different heritage groups (Brenick & Titzmann, 2015). For example, Aussiedler (German heritage immigrants from Russia) and their descendants attain, on average, a higher level of education compared to adolescents from other heritage groups (Fuchs & Sixt, 2007). In Germany, 7th graders with Turkish heritage or heritage from Arabic-speaking countries (but not those from Eastern European or other heritage groups) showed higher math competence when they participated in a brief writing affirmation intervention than when they were in a control group (Lokhande & Müller, 2019; Müller & Lokhande, 2017). As different heritage groups are distinguished by different migration histories and experiences, lumping them together in one group erases important variations. Therefore, in our study we tested whether the effectiveness of the intervention may vary by specific heritage group.

Another important moderator is experiences of discrimination. Focusing on categories of heritage groups is a proxy for group-related experiences (Easterbrook et al., 2021). A more precise way to understand group variations is to directly assess those experiences. For instance, adolescents with Turkish heritage and heritage from Arabic-speaking countries report higher levels of discrimination compared to other ethnic minority groups (Antidiskriminierungsstelle des Bundes, 2013). Rather than relying solely on heritage group categorization, which may gloss over the specific mechanism that explains variation in intervention effectiveness, we include adolescents’ self-reported experiences of discrimination to test whether the affirmation intervention would be more consequential for those experiencing more discrimination (and thus more identity threat) compared to those experiencing less.

In addition to student variables, considering the school context is also important (Hanselman et al., 2014). The classroom cultural diversity climate reflects how teachers and schools approach and manage diversity (Schachner et al., 2021). It is part of the learning environment and may also moderate the effectiveness of the affirmation intervention. Two of the most widely studied classroom climate dimensions are cultural pluralism and equality and inclusion. The cultural pluralism dimension emphasizes that cultural diversity in the classroom should be acknowledged, valued, and seen as a resource, and that opportunities should be provided to learn about this diversity. The equality and inclusion dimension emphasizes that all students from all backgrounds should be treated equally, that no one should be excluded, and that opportunities for contact, cooperation, and creating common goals among diverse groups should be intentionally supported. There are two ways moderation could work. On the one hand, studies show that a positive diversity climate (both naturally observed and experimentally induced) produces effects similar to those of self-affirmation interventions by boosting students’ sense of belonging and achievement, notably amongst identity-threatened groups (Celeste et al., 2019, 2021; Schachner et al., 2019). Thus, intervention effects may be more pronounced in the absence of a positive cultural diversity climate. In our study, we expect that students from contexts with higher contextual threat in their school (e.g., in more negative cultural diversity climates, defined by less emphasis on cultural pluralism or equality and inclusion) may benefit more from the intervention (Hanselman et al., 2014).
On the other hand, drawing on the research above (e.g., Celeste et al., 2021) and accumulating evidence that supportive contexts allow affirmation effects to take hold and grow (Bailey et al., 2017; Walton & Yeager, 2020), it is plausible that the effects of the affirmation intervention may be more pronounced in a more positive cultural diversity climate. Further, we expect that the intervention can be even more effective when it includes implicit messages valuing pluralism and diverse cultural heritages. Therefore, besides the traditional self-affirmation intervention, we included an additional culture-sensitive intervention condition that explicitly included messages valuing cultural pluralism.

1.4 The current study

The current study tests whether students of immigrant descent in 7th grade who participate in a brief self-affirmation writing intervention administered by their teacher would be protected against poorer school-related adjustment in both the short-term (at the end of 7th grade five to six months after the intervention) and longer-term (at the end of 8th grade one year after the intervention). In the federal state of Berlin, where the study was conducted, 7th grade is the year following the transition to secondary school. We chose this grade because interventions targeted at students in transitional periods (and thus periods of higher uncertainty and threat) may be more effective than in periods of more stable development (Sherman et al., 2013).

Self-affirmation studies in education have focused predominantly on academic performance in terms of grades (Cohen et al., 2006), math motivation (Gaspard et al., 2015), or math competence (Lokhande & Müller, 2019; Müller & Lokhande, 2017). Because self-affirmation interventions can affect a range of educational outcomes (Easterbrook et al., 2021), we include grades and math competence as well as other important but less studied school-related indicators of adjustment such as disruptive school behavior, reactions to academic challenges, and behavioral and emotional school engagement.

1.5 Hypotheses

H1 (intervention effects): Adolescents of immigrant descent who receive the self-affirmation writing intervention will show better school-related adjustment compared to adolescents of immigrant descent in the control group who do not receive the intervention, with effects more pronounced in the culturally sensitive intervention condition compared to the traditional self-affirmation condition. Further, these effects will be evident in the short-term (five to six months after the intervention) as well as long-term (one year after the intervention). No intervention effects will be observed for adolescents of non-immigrant descent.

H2 (moderators of intervention effects): Because we expect that the intervention will be more effective for those experiencing more identity threat, we hypothesize that the intervention will show greater effects depending on the heritage group and experiences of discrimination. More specifically, we hypothesize that the self-affirmation intervention will be most effective for adolescents with Turkish heritage and those with South-West Asian and North African (SWANA) heritage (see Footnote 1) (H2a) and those experiencing more discrimination (H2b). For classroom cultural diversity climate, our test of moderation is exploratory as previous research suggests diverging plausible outcomes (H2c).

Our study hypotheses and analyses were pre-registered before we started analyses: https://osf.io/7d4bs/?view_only=187b1e8b251d49119e7019140ebdc486. In this study, we focus on our pre-registered Research Questions 2 (Does the intervention protect against identity threat over time?) and 3 (What factors moderate the effects of the intervention?) and on a subset of outcomes (i.e., academic and school-related adjustment, not socioemotional adjustment). Covering all outcomes and all research questions from the pre-registration would have exceeded the scope of a single paper.

2 Method

2.1 Participants

All 211 secondary schools in Berlin were contacted by mail and followed up by phone, prioritizing those schools with a high share of students with a non-German heritage language (statistics on which are published on the website of the Berlin Senate Administration for Education, Youth, and Family). Fifteen principals (7%) agreed to participate in the study. The principals were told that it would be desirable for as many 7th grade classrooms as possible from their schools to participate in the study. Overall, 58 classes participated, including 38 at Integrierte Sekundarschulen (comprehensive schools or integrated secondary schools), 14 at Gemeinschaftsschulen (non-academic track secondary schools) and six at Gymnasien (academic track secondary schools) (see Footnote 2). Next, a parental consent letter was drafted, translated into Albanian, Arabic, English, French, Polish, Russian, and Turkish, and mailed to the schools for distribution amongst parents. Members of our research team introduced the study at school parent nights and were available for questions. In exchange for participation, classrooms received 25€ for their class fund at each time point. Schools received information about the study results through a newsletter.

The sample included 639 7th grade adolescents with data collected in October/November/December 2017 (T1), May/June/July 2018 (T2), and May/June 2019 (T3). Most (n = 413, 65%) adolescents were of immigrant descent, while the rest were of non-immigrant descent (Mage = 12.35 years, SDage = 0.69, 50% female). Eight percent of immigrant descent adolescents were first-generation (born outside of Germany and immigrated after they began school) and the rest were second (79%) and third (10%) generations. Adolescents represented five heritage groups: non-immigrant descent (n = 211), Turkish heritage (n = 107), South-West Asian and North African (SWANA) heritage (n = 121), Eastern European heritage (n = 115), and other heritage immigrant descent (n = 70). These groups were based on the birthplace of their parents and grandparents in combination with their self-reported heritage identity.

2.2 Procedure

We received approval from the ethics committee of the University of Potsdam and the Berlin Senate Administration for Education, Youth, and Family. Students with parental consent voluntarily completed the three surveys and two interventions. Each survey was administered by researchers with teacher support during two class periods and the survey materials were available in German, the language of instruction in the schools.

Based on information from the first questionnaire, students were grouped into heritage groups (SWANA, Turkish, Eastern European, other heritage immigrant descent, no immigrant descent). For each heritage group, the students were further classified into immigrant generations (1st vs. 2nd/3rd generation), and then into gender groups (male vs. female). Within each stratified group (heritage group by immigrant generation by gender), the students were then randomly assigned across three intervention conditions: traditional self-affirmation (n = 212), culture-sensitive self-affirmation (n = 197), and no-affirmation control group (n = 205). Based on the effect size from a previous experimental self-affirmation study in Germany with a similar sample (Cohen’s d = 0.28, Müller & Lokhande, 2017), with a Type I error rate of 0.05 and power of 0.95, a sample size of N = 370 was required for three groups. Thus, our sample size had adequate power to detect the estimated main intervention effect. In addition, we conducted post-hoc power analysis for our tests of interactions. Using semPower (Moshagen & Erdfelder, 2016), we calculated the actual power with the available sample size to detect misspecifications of a model corresponding to RMSEA = 0.05 with an alpha error of 0.05.
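The stratified assignment described above can be illustrated with a minimal sketch. The text states only that students were randomly assigned to the three conditions within each heritage group by generation by gender stratum; the shuffle-and-deal mechanism, the function name `stratified_assign`, the toy roster, and the seed below are assumptions made for illustration and are not the procedure or software actually used in the study.

```python
import random
from collections import Counter, defaultdict

CONDITIONS = ("traditional", "culture_sensitive", "control")

def stratified_assign(students, seed=0):
    """Assign students to conditions at random within each stratum.

    students: list of dicts with keys "id", "heritage", "generation", "gender".
    Returns a dict mapping student id to condition label.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for s in students:
        key = (s["heritage"], s["generation"], s["gender"])
        strata[key].append(s["id"])

    assignment = {}
    for key in sorted(strata):
        ids = strata[key]
        rng.shuffle(ids)  # random order within the stratum
        # Random starting condition, so leftover students in small strata
        # are not always dealt to the same condition (an assumption).
        offset = rng.randrange(len(CONDITIONS))
        for i, sid in enumerate(ids):
            assignment[sid] = CONDITIONS[(i + offset) % len(CONDITIONS)]
    return assignment

# Hypothetical toy roster: 2 heritage groups x 2 generations x 2 genders,
# with 7 students per stratum.
roster = []
for heritage in ("Turkish", "SWANA"):
    for generation in ("1st", "2nd/3rd"):
        for gender in ("female", "male"):
            for n in range(7):
                sid = f"{heritage}-{generation}-{gender}-{n}"
                roster.append({"id": sid, "heritage": heritage,
                               "generation": generation, "gender": gender})

assignment = stratified_assign(roster, seed=42)
print(Counter(assignment.values()))
```

Dealing shuffled students round-robin keeps the condition sizes within each stratum within one student of each other, which is one common way to keep conditions balanced on the stratification variables.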

The interventions were administered by teachers during a single class period and each student received a closed envelope containing either the self-affirmation or control group writing task. The teachers received prior training and were asked to administer the tasks with the same procedure for all conditions (Cohen et al., 2006; Sherman et al., 2013). In this procedure, teachers presented the task as a normal classroom activity. Students, teachers, and researchers did not know the treatment group to which students were assigned. After administering the intervention, teachers were asked to fill out a feedback sheet responding to questions regarding the implementation such as how well students cooperated, whether students asked about why they were doing the writing task, and whether students thought the task was affiliated with the university.

2.3 Intervention and control conditions

Established self-affirmation procedures suggest administering the intervention across several time points with slightly different intervention tasks at each time point. The self-affirmation tasks in our study were structured (Intervention 1) and open-ended (Intervention 2) and had previously been implemented by Sherman and colleagues (2013). To encourage greater engagement with the task, we presented the self-affirmation tasks in the form of a comic, based on the finding from a German study that students wrote more when the task was presented in this form (Müller & Lokhande, 2017). See Supplemental Material Figures S1 and S2 for examples of the self-affirmation materials. In terms of timing, the 2017 school year began on September 4. Of the 58 classes, 55 completed Intervention 1 in November/December 2017 and 3 completed it in March 2018. Intervention 2 was completed by 48 classes in February/March 2018 and by 10 classes in May/June 2018. We aimed to administer the interventions shortly before a math test. We had also originally planned a third intervention tailored to students’ responses at Intervention 2, but were not able to schedule it because of the long delays in completing Intervention 2 due to illnesses in schools, non-responsiveness of some teachers, and school activities.

2.3.1 Traditional self-affirmation task

The tasks followed the original materials (Cohen et al., 2006) as well as slight modifications (Müller & Lokhande, 2017). For the self-affirmation task for Intervention 1, adolescents read a comic in which a friendly alien came to earth and asked: “What is important to you in life? What do you like to do? Please read the following list. Mark the two most important things.” Twelve values were presented (e.g., being with family/friends, being free and independent, and being good at art). In the next step, as a writing aid, students were asked to note down seven words related to why the two values were important to them. This was followed by instructions asking them to write an essay about these values, using the words that they noted down. Finally, to reinforce the affirmation, students were again asked to list the two values, were asked “How do you feel when you think of these two things? When I think about these two things…”, and responded to five items such as, “I know that I belong and am not alone.”

Intervention 2 was an open-ended task. Adolescents read another comic about the alien who returned to earth and wanted to learn more, and asked them to write about what is important in their lives and why.

2.3.2 Culturally-sensitive self-affirmation task

We slightly modified the instruction of the brief self-affirmation task to explicitly affirm being of various cultural backgrounds. The modified instruction said, “In different parts of the world people find different things important. Please read through the following list. Which of these is important to you in your life? What do you like to do? Please circle the two most important things.” We also added, “You can choose to write in German or your heritage language, whatever feels more comfortable for you.” As we wanted to explore whether this culturally sensitive instruction would improve overall intervention effectiveness, half of the participants in the intervention condition received the culturally sensitive instruction. Even though most of our participants represent the second or third generation of immigrant descent and may not have sufficient writing skills in their heritage language to complete the tasks in this language (indeed, only seven students used their heritage language for Intervention 1 and three for Intervention 2), the assumption was that allowing them to use their heritage language would communicate that their background and heritage identity are valued and can be a resource when completing academic tasks in school.

2.3.3 No affirmation control group

Following the original control group task (Cohen et al., 2006) and modified as with the affirmation tasks (Müller & Lokhande, 2017), students in the no-affirmation control group were presented with a comic of an alien that came to earth and said “Some people find things important that are not so important to you. What is not important to you in life? What do you not like to do? Please read the following list. Mark the two least important things.” The same set of 12 values was presented as in the affirmation task and the rest of the writing instructions were the same.

Intervention 2 was an open-ended task. Adolescents read another comic about the alien who returned to earth and were asked to write down for the alien what time they woke up that morning, what they had for breakfast if they did, and how they got to school.

2.3.4 Fidelity check

As a check on the fidelity of the intervention, we noted whether the intervention was completed before a math test. We also had teachers note the number of students who noticed that there were different written tasks (i.e., different conditions) after doing the intervention. They also rated how much the following statements were true for them: “I was motivated to do the writing tasks with the class”, “I believe the writing assignments were helpful to the students”, and “The study and writing assignments were additional tasks that made too much work for me as a teacher”. The response scale for these three items was 1 = not at all, 2 = not quite true, 3 = neither true nor not true, 4 = rather true, 5 = that is true.

2.4 Measures

We calculated mean scores for all scales. Descriptive statistics and bivariate correlations are reported in Tables 1 and 2.

Table 1 Descriptive Information of the Measures for Immigrant and Non-Immigrant Descent Students in the Control and Intervention Groups
Table 2 Bivariate correlations of the outcomes across time

2.4.1 School-related adjustment

2.4.1.1 Math, German, and English Grades

Teachers were asked to report student grades in these subjects at all three time points. Because the schools used different grading assessments (i.e., “regular” grades with 1 as outstanding and 6 as fail, points, percentages, colors, and certificates using ratings of expert, basic, etc.), we converted all forms into percentages to make them comparable (see Supplemental Material, S1 for a chart on how grades were converted into percentages).
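To make the idea of a common percentage metric concrete, the following is a minimal sketch of one possible conversion. The authors' actual conversion chart is given in their Supplemental Material and is not reproduced here; the linear rescaling, the function names `german_grade_to_percent` and `points_to_percent`, and the example values are assumptions chosen purely for illustration.

```python
def german_grade_to_percent(grade, best=1.0, worst=6.0):
    """Linearly rescale a grade on the 1 (outstanding) to 6 (fail) scale
    to a 0-100 percentage. The linear mapping is an assumption, not the
    authors' conversion chart."""
    if not best <= grade <= worst:
        raise ValueError(f"grade must lie between {best} and {worst}")
    return 100.0 * (worst - grade) / (worst - best)

def points_to_percent(points, max_points):
    """Express a points-based result as a percentage of the maximum."""
    if max_points <= 0 or not 0 <= points <= max_points:
        raise ValueError("points must lie between 0 and max_points")
    return 100.0 * points / max_points

# Hypothetical records from schools that use different grading formats.
records = [
    {"student": "A", "format": "grade", "value": 2.0},
    {"student": "B", "format": "points", "value": 12, "max": 15},
    {"student": "C", "format": "percent", "value": 73.0},
]

percentages = {}
for r in records:
    if r["format"] == "grade":
        percentages[r["student"]] = german_grade_to_percent(r["value"])
    elif r["format"] == "points":
        percentages[r["student"]] = points_to_percent(r["value"], r["max"])
    else:  # already a percentage
        percentages[r["student"]] = float(r["value"])

print(percentages)
```

Categorical formats such as colors or certificate ratings would need an explicit lookup table, which is not attempted here because the mapping used in the study is not stated in the text.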

2.4.1.2 Math competence

A German standardized mathematics test, the DEMAT 6+, designed for sixth and seventh graders, was used to assess mathematical competence at T1, T2, and T3 (Götz et al., 2013). The math items of subscales A and C of the DEMAT were divided across waves. Each wave contained 15 items, seven of which remained the same while eight differed across waves. The average item difficulty was kept constant across waves, following the procedure of Müller and Lokhande (2017). The range of possible scores was 0 to 15.

2.4.1.3 Mastery reaction to academic challenges

A subscale of Skinner et al.’s (2009) motivational perspective on engagement and disaffection measure was used to assess how persistent adolescents were in the face of academic challenges (i.e., showing a mastery reaction). An example item is, “If I can’t get a problem right the first time, I just keep trying”. The four items were rated on a 4-point Likert scale (1 = that’s not right to 4 = that’s completely right). Cronbach’s alphas were 0.73, 0.78, and 0.79 for T1, T2, and T3, respectively.
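For readers less familiar with the reliability coefficient reported for this and the following scales, the sketch below computes Cronbach’s alpha from item-level responses using the standard formula, alpha = k / (k - 1) × (1 - sum of item variances / variance of the total score). The responses are invented toy data for a four-item scale rated 1 to 4 and are not taken from the study.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item ratings.

    rows: one tuple per respondent, one entry per item (no missing values).
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    items = list(zip(*rows))                      # one column per item
    sum_item_vars = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])  # variance of scale totals
    return k / (k - 1) * (1 - sum_item_vars / total_var)

# Hypothetical responses of six students to four items (1-4 rating scale).
toy_responses = [
    (4, 4, 3, 4),
    (3, 3, 3, 2),
    (2, 2, 1, 2),
    (4, 3, 4, 4),
    (1, 2, 1, 1),
    (3, 3, 2, 3),
]

alpha = cronbach_alpha(toy_responses)
print(f"alpha = {alpha:.2f}")
```

Alpha approaches 1 when items covary strongly relative to their individual variances; the sample variance (denominator n - 1) is used for both the items and the total score.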

2.4.1.4 Disruptive school behavior

Jenkins’ (1995) scale on school delinquency and school commitment was used to assess disruptive school behavior. Five items that were age-appropriate and used in previous research in Germany were selected (Schachner et al., 2018). An example item is, “How often did you refuse to do a task given by the teacher in the last four weeks?”. The items were rated on a 5-point Likert scale (1 = almost never to 5 = very often). Cronbach’s alphas were 0.61, 0.73, and 0.70 for T1, T2, and T3, respectively.

2.4.1.5 Behavioral and emotional school engagement

Two subscales from Skinner and colleagues (2009) were used where adolescents reported on their effort, attention, and persistence during their participation in classroom activities. The measures have been used with German adolescents (Bakadorova & Raufelder, 2014). A sample item for behavioral school engagement is, “When I’m in class, I participate in class discussions” and a sample item for emotional school engagement is, “When we work on something in class, I feel interested”. Each subscale had five items that were rated on a 5-point Likert scale (1 = No, that’s not right to 5 = Yes, that’s right). Cronbach’s alphas for behavioral engagement were 0.82, 0.86, and 0.84, and for emotional engagement, 0.81, 0.83, and 0.84 for T1, T2, and T3, respectively.

2.4.2 Moderators

2.4.2.1 Heritage group

As noted above, adolescents were categorized into five heritage groups: Turkish, SWANA, Eastern European, other heritage immigrant descent, no immigrant descent.

2.4.2.2 Discrimination

We used the perceived discrimination subscale from a measure of migration-specific hassles that was initially developed for adolescents of immigrant descent in Germany (Titzmann et al., 2011). Six items assessed events of unfair or negative treatment experienced by adolescents in the last year that were attributed to their ethnic background. Although this scale was developed for adolescents of immigrant descent, all adolescents responded to the measure. A sample item for discrimination is, “My schoolmates laughed at me because of my heritage culture”. The items were rated on a 5-point Likert scale (1 = never, 2 = 1 to 2 times, 3 = 3 to 5 times, 4 = 6 to 10 times, and 5 = more than 10 times). Cronbach’s alphas for discrimination were 0.86, 0.88, and 0.87 for T1, T2, and T3, respectively.

2.4.2.3 Classroom cultural diversity climate

We used two subscales of the Classroom Cultural Diversity Climate Scale (Schachner et al., 2021): the 5-item unequal treatment subscale to measure the opposite of equality and inclusion, and the 7-item heritage culture and intercultural learning subscale to measure cultural pluralism. A sample item for unequal treatment is “Students from certain heritage cultures are teased more often than students from other heritage cultures”. A sample item for the heritage and intercultural learning subscale is “During class we learn about the heritage cultures of fellow students”. Items were rated on a 5-point Likert scale (1 = strongly disagree to 5 = strongly agree). Cronbach’s alphas for unequal treatment were 0.81, 0.74, and 0.75 for T1, T2, and T3, respectively. For heritage culture and intercultural learning, Cronbach’s alphas were 0.87, 0.88, and 0.89 for T1, T2, and T3, respectively. Higher scores indicated higher unequal treatment and greater heritage and intercultural learning.

3 Results

3.1 Randomization check

To ensure randomization was effective, we tested for pre-intervention (T1) differences on demographics (age, SES, importance of religion, German writing, German reading, cognitive test) and main study variables between students in the control and intervention conditions. The adolescents in the control, traditional intervention, and culture-sensitive intervention groups did not differ in their scores at pre-test (see Supplemental Material, Table S2 for a more detailed description of how demographics were measured and results of this randomization check).

3.2 Opportunity gap check

We tested baseline (T1) differences on study variables between immigrant and non-immigrant descent students across all treatment conditions as an indication of the degree of “achievement disparity” (Wu et al., 2021) and varying contextual conditions. Adolescents of immigrant descent differed from those of non-immigrant descent in math competence, F(1, 621) = 49.724, p < .001, ηp2 = 0.074, and disruptive behavior, F(1, 611) = 6.448, p = .011, ηp2 = 0.010, but not on mastery reactions to academic challenges or behavioral and emotional school engagement across all treatment conditions. Non-immigrant descent students scored higher on math competence (M = 6.09, SD = 3.08) than immigrant descent students (M = 4.33, SD = 2.89). They also reported lower disruptive school behavior (M = 1.94, SD = 0.57) than immigrant descent students (M = 2.08, SD = 0.67).

We also tested whether students of immigrant descent differed from those of non-immigrant descent regarding contextual moderators. At baseline, students of immigrant descent (M = 1.28, SD = 0.61) reported higher discrimination than students of non-immigrant descent (M = 1.11, SD = 0.37), F(1, 614) = 13.270, p < .001, ηp2 = 0.021. Students of immigrant descent (M = 2.27, SD = 0.91) and non-immigrant descent (M = 2.26, SD = 0.83) did not differ in heritage and intercultural learning classroom climate, F(1, 610) = 0.005, p = .942, ηp2 = 0.000. Students of immigrant descent (M = 1.69, SD = 0.80) and non-immigrant descent (M = 1.64, SD = 0.76) also did not differ in unequal treatment classroom climate, F(1, 609) = 0.468, p = .494, ηp2 = 0.001.
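The effect sizes in this section can be recovered from the reported F statistics and degrees of freedom. The sketch below assumes these are standard fixed-effects ANOVA tests, for which partial eta squared equals F·df_effect / (F·df_effect + df_error); the function name `partial_eta_squared` is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    explained = f_value * df_effect
    return explained / (explained + df_error)


# F statistics and degrees of freedom as reported in Section 3.2.
reported = {
    "math competence": (49.724, 1, 621),
    "disruptive behavior": (6.448, 1, 611),
    "discrimination": (13.270, 1, 614),
}
for outcome, (f_value, df1, df2) in reported.items():
    print(f"{outcome}: {partial_eta_squared(f_value, df1, df2):.3f}")
# math competence: 0.074
# disruptive behavior: 0.010
# discrimination: 0.021
```

By common rules of thumb, values near 0.01 are small and values near 0.06 are medium, so the math competence difference is the only baseline gap of notable size.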

3.3 Manipulation check

Our manipulation check followed the procedures of Hanselman et al. (2017). Two trained independent coders who were blind to the experimental condition coded all essays (first coding independently, then discussing and reaching consensus on codes not in agreement) to examine whether students in the intervention group wrote more affirming responses (e.g., by expressing either the importance of the value for themselves, by expressing that they are “good in” the valued domain, or by expressing positive emotions) than students in the control group. Essays of the students in the intervention group were coded as more self-affirming than essays of the students in the control group for both the first, χ2(2) = 437.77, p < .001, and second intervention, χ2(2) = 20.180, p < .001. Most of the essays in Intervention 1 (95%) and Intervention 2 (87%) in the intervention group were coded as expressing self-affirmation.

3.4 Fidelity check

A majority of students (67%) did Intervention 1 before a math test while 33% did Intervention 1 one day after (see Footnote 3). For Intervention 2, all students did the intervention before a math test. Most of the teachers (n = 45 out of 47 teachers) who gave the intervention responded to fidelity check questions. After both interventions, 43% of teachers reported that no students noticed different written tasks and the mean number of students the teachers reported noticing was 2.54. Thus, on average, two students in a class noticed differences (see Footnote 4). Of those who filled out the teacher survey, 28% of the teachers reported either “not quite true” or “not at all” to being motivated to do the writing task while 39% reported either “rather true” or “true”. Thirty-four percent reported either “not quite true” or “not at all” to seeing the writing assignments as beneficial for the students, 11% reported “rather true” and none reported “true”. Finally, 43% of the teachers reported either “rather true” or “true” that the intervention was an additional task that caused extra workload, while 32% reported “not quite true” or “not at all”.

3.5 Attrition and missing data

A total of 639 adolescents participated in the pre-test (T1) and at least one of the two intervention tasks. Among those, 86% (n = 549) participated in the post-test (T2). Among the adolescents present in both pre-test (T1) and post-test (T2), 73% (n = 399) participated in the follow-up (T3). The attrition rates in the intervention and control groups between the pre-test (T1) and the post-test (T2) were similar (control: 12%, traditional intervention: 14%, culture-sensitive intervention: 15%, χ2(2) = 0.82, p = .66, ηp2 = 0.001). The attrition rates in the intervention and control groups between the post-test and follow-up were also similar (control: 26%, traditional intervention: 29%, culture-sensitive intervention: 23%, χ2(2) = 1.50, p = .47, ηp2 = 0.001).

For further attrition analyses, we followed the approach of another longitudinal study (Miklikowska et al., 2019). To test whether attrition from T1 to T2 (n = 90) or from T1 to T3 (n = 204) was related to the demographic and the study variables, we conducted logistic regressions testing whether attrition (0 = dropped, 1 = retained) was predicted by demographic (e.g., gender) or study variables at T1. The results showed that attrition from T1 to T2 was related to the cognitive skills test, behavioral school engagement, and discrimination. Adolescents who scored higher on the cognitive skills test were more likely to participate at both time points (T1 and T2) than adolescents who scored lower on the cognitive skills test, F(1, 637) = 8.856, p = .003, ηp2 = 0.030. Adolescents who reported lower discrimination were more likely to participate at both time points (T1 and T2) than those who reported higher discrimination, F(1, 627) = 5.697, p = .017, ηp2 = 0.013. Adolescents who reported higher behavioral school engagement were more likely to participate at both time points (T1 and T2) than those who reported lower behavioral school engagement, F(1, 627) = 16.035, p < .001, ηp2 = 0.038. Nagelkerke R2 of the regression model was 0.08 (see Footnote 5).

Attrition from T1 to T3 was not significantly related to the demographic (e.g., gender) or study variables except the cognitive skills test. Again, adolescents who scored higher on the cognitive skills test were more likely to participate in all time points (T1, T2, and T3) than adolescents who scored lower on the cognitive skills test, F (1, 637) = 15.354, p < .001. Nagelkerke R2 was 0.03. We added the cognitive skills variable to the models as an auxiliary variable (Graham, 2003).

Missing data for the study variables were less than 7% within each time point. Participants with and without complete data were compared on the study variables using Little’s (1988) missing completely at random test within each time point. Results suggested that missing values were missing completely at random for T1, χ2 (78, n = 639) = 67.114, p = .80, not missing completely at random for T2, χ2 (14, n = 536) = 27.147, p = .018, and again completely at random for T3, χ2 (21, n = 399) = 17.950, p = .652. Due to the small amount of missing data overall, we used full information maximum likelihood (FIML) to handle missing data.

3.6 Measurement invariance

We tested measurement invariance for all study measures across time and immigration status groups (immigrant descent vs non-immigrant descent; see Supplemental Material, Tables S3-S9) as measurement invariance is necessary to meaningfully interpret latent difference scores (Geiser, 2012). We first tested configural invariance (specifying the same measurement model in all groups, without constraining factor loadings or intercepts to be equal across groups), metric invariance (factor loadings were constrained to be equal across groups), and scalar invariance (intercepts were constrained to be equal across groups). Metric invariance implies that associations can be compared across groups and scalar invariance implies that means can be compared across groups.
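The three nested invariance levels can be written compactly. The notation below is our own shorthand and an assumption (it does not appear in the original analysis scripts): item j in group or occasion g has intercept τ, loading λ, latent factor η, and residual ε.

```latex
% Measurement model for item j, person i, group/occasion g
y_{ijg} = \tau_{jg} + \lambda_{jg}\,\eta_{ig} + \varepsilon_{ijg}

% Configural: same pattern of loadings, all parameters free
\lambda_{jg},\ \tau_{jg}\ \text{freely estimated in every } g

% Metric: equal loadings
\lambda_{jg} = \lambda_{j} \quad \text{for all } g

% Scalar: equal loadings and equal intercepts
\lambda_{jg} = \lambda_{j}, \qquad \tau_{jg} = \tau_{j} \quad \text{for all } g
```

Under scalar invariance, a difference in observed item means between groups or occasions can only arise from a difference in the latent mean, which is what makes latent change scores interpretable.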

All measures demonstrated scalar invariance across time and immigrant status group except for one measure. One item (i.e., How often have you been too late for class in the morning or after lunch break in the last four weeks?) of the disruptive school behavior measure was removed to establish scalar measurement invariance (with four items) to proceed to testing a latent change model.

3.7 Steps to test hypotheses

We built multigroup latent change models using Mplus, using intervention vs. control condition as a grouping variable. In each model, we allowed the same items to correlate with each other across time. We then generated latent change variables/scores for the change between T1 and T2 and the change between T2 and T3. In a latent change model, a latent state variable at T2 is a function of the latent state variable at T1 plus a variable capturing the change from T1 to T2 (Geiser, 2012). In other words, because T1 scores between intervention and control groups are the same, the calculated latent state variable at T2 reflects only the change in latent scores between T1 and T2. Likewise, the calculated latent state variable at T3 reflects only the change in latent scores between T2 and T3.
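The decomposition described above can be sketched in equation form. The symbols are our own notation and an assumption, chosen to follow the general latent change formulation in Geiser (2012): η_t is the latent state at time t and Δη is the latent change variable between two adjacent occasions.

```latex
% Latent state at T2 = latent state at T1 + latent change T1 -> T2
\eta_{2} = \eta_{1} + \Delta\eta_{2-1}

% Latent state at T3 = latent state at T2 + latent change T2 -> T3
\eta_{3} = \eta_{2} + \Delta\eta_{3-2}

% Equivalently, each change variable is a difference of adjacent states
\Delta\eta_{2-1} = \eta_{2} - \eta_{1}, \qquad
\Delta\eta_{3-2} = \eta_{3} - \eta_{2}
```

The group comparison then concerns the means of the two change variables in the intervention and control groups.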

We then used the “model test” function in multigroup analyses to compare whether the change score for T1 to T2 and T2 to T3 significantly differed between intervention and control groups. We did this separately for students of immigrant descent and non-immigrant descent (H1). We did not add any covariates to the models as groups had been randomized by participants’ heritage group, generation, and gender within classes. However, based on preliminary analysis regarding attrition, we added cognitive skills as an auxiliary variable to the models.

Regarding moderators, to test whether the intervention was effective for specific heritage groups among immigrant descent adolescents, we used a categorical variable for heritage groups and tested the direct effects of the intervention by selecting sub-samples (H2a). However, the sample size for each heritage group was small (e.g., n = 114 for SWANA, n = 107 for Turkish heritage).

To test the moderators of discrimination and classroom cultural diversity climate among immigrant descent adolescents, we used a single group model where change scores for T1 to T2 and T2 to T3 were regressed on the treatment condition (0 = control, 1 = intervention), the moderator, and the interaction term between the treatment condition and the respective moderator (H2b and H2c).
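The moderation model described above can be sketched as a regression equation. The coefficient labels are our own notation and an assumption: T is the treatment indicator (0 = control, 1 = intervention), M is the moderator (assumed here to be mean-centered), and Δη is the latent change score for a given interval.

```latex
% Change score regressed on treatment, moderator, and their interaction
\Delta\eta = b_{0} + b_{1}\,T + b_{2}\,M + b_{3}\,(T \times M) + \zeta

% Conditional (simple) effect of the intervention at moderator value m
\frac{\partial\,\Delta\eta}{\partial T}\Big|_{M=m} = b_{1} + b_{3}\,m

% Effects at low and high moderator levels (1 SD below / above the mean)
\text{low: } b_{1} - b_{3}\,SD_{M}, \qquad
\text{high: } b_{1} + b_{3}\,SD_{M}
```

A nonzero b_3 indicates that the intervention effect on change differs across levels of the moderator, which is the pattern plotted at 1 SD below and above the mean in the moderation figures.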

Against our hypotheses, we note that there were no differences between the traditional and culture-sensitive self-affirmation tasks in any of the main analyses. Therefore, we combined the two intervention conditions into one group and report main results based on this combined intervention group vs. control group, below.

3.8 Effects of the intervention on school-related adjustment

With regards to Math, German, and English grades, we could not use latent change models because each grade was assessed with just one item. Therefore, we tested for an intervention effect with two steps. First, we used a multigroup (intervention vs. control) regression model where grade (each tested separately) at T2 was regressed on T1 and grade (each tested separately) at T3 was regressed on T2. Using the model test function, we examined whether the regression parameters were equal across the two groups. Based on the best fitting model, we calculated the predicted values (the estimated model parameters) for T1, T2, and T3 to test for an intervention effect. Second, we combined the intervention and the control groups. Controlling for grade at T1, grade at T2 was regressed on a categorical variable representing the intervention conditions (0 = control, 1 = intervention). Controlling for grades at T1 and T2, grade at T3 was regressed on the intervention conditions. Based on these steps, there was no intervention effect for students of immigrant descent, non-immigrant descent (H1), or specific heritage groups (H2a). Discrimination and two dimensions of classroom cultural diversity climate did not moderate intervention effects on grades for students of immigrant descent (H2b & H2c).

We could not use latent change models for math competence because the scale was one ordinal item ranging from 0 to 15. Therefore, we followed the same analytic approach as for grades. We found no intervention effect on math competence for students of immigrant descent, non-immigrant descent (H1), or specific heritage groups (H2a). Discrimination and two dimensions of classroom cultural diversity climate did not moderate intervention effects for math competence for students of immigrant descent (H2b & H2c).

With regards to mastery reactions to academic challenges, change scores did not differ between the control and intervention groups for T1-T2 and T2-T3 for students of immigrant descent, non-immigrant descent (H1), or specific heritage groups (H2a). However, students of immigrant descent who experienced high levels of discrimination may have benefited more from the intervention, at least in the short term (H2b). More precisely, they showed a smaller decline in their mastery reactions to academic challenges compared to students who experienced high levels of discrimination in the control group (interaction term approached significance, β = 0.243, p = .064) at T2. But this effect was not significant in the long term, meaning that students of immigrant descent who experienced high levels of discrimination ended up with similar scores on mastery reactions to academic challenges at T3 regardless of whether they received the intervention (see Fig. 1 and Table 3). The post hoc power analysis suggested that with the available sample size (n = 394), we had sufficient power (99.88%) to reject a poor-fitting model (df = 85). The two dimensions of classroom diversity climate did not moderate the intervention effects on mastery reactions to academic challenges in the short- or long-term for students of immigrant descent (H2c).

Fig. 1 Discrimination as a Moderator of the Intervention Effect on the Reactions to Academic Challenges for Students of Immigrant Descent (n = 413, low = 1 SD below the mean, high = 1 SD above the mean)

Table 3 Model fit indices for the latent change models that show significant effects

With regards to disruptive school behavior, change scores differed between the intervention and control groups for T1-T2 and T2-T3 for both students of immigrant descent (Wald χ2 (1, n = 413) = 4.489, p < .05 for T1-T2; Wald χ2 (1, n = 413) = 7.257, p < .01 for T2-T3) and non-immigrant descent (Wald χ2 (1, n = 226) = 4.535, p < .05 for T1-T2; Wald χ2 (1, n = 226) = 14.214, p < .001 for T2-T3) (H1). More specifically, from T1 to T2, the intervention group initially showed greater disruptive behavior than the control group, but from T2 to T3 this reversed. By T3, the intervention group showed less disruptive behavior compared to the control group for students of non-immigrant descent (see Figs. 2 and 3 and Table 3). In other words, the intervention lessened an upward trajectory in disruptive behavior for both groups. This finding did not support our hypothesis as the intervention benefitted both students of immigrant and non-immigrant descent concerning disruptive behavior, at least in the long term. The change scores in disruptive behavior for T1-T2 and T2-T3 did not differ by intervention condition for specific heritage groups (H2a). Discrimination and the two dimensions of classroom climate did not moderate the intervention effects on disruptive behavior for students of immigrant descent (H2b and H2c).

Fig. 2 Direct Effects of the Intervention on Disruptive School Behavior for Students of Immigrant Descent (n = 413, low = 1 SD below the mean, high = 1 SD above the mean)

Fig. 3 Direct Effects of the Intervention on Disruptive School Behavior for Students of Non-Immigrant Descent (n = 226, low = 1 SD below the mean, high = 1 SD above the mean)

With regards to behavioral school engagement, change scores did not differ between the intervention and control groups for T1-T2 or T2-T3 for students of immigrant descent, non-immigrant descent (H1) or specific heritage groups (H2a). Discrimination and unequal treatment climate did not moderate the intervention effects among students of immigrant descent (H2b and H2c). But there was a significant interaction for the intervention effect and heritage/intercultural learning classroom climate between T1 and T2 (β = 0.281, p = .013) and T2 and T3 (β = 0.288, p = .025) for students of immigrant descent (H2c, see Fig. 4 and Table 3). In the short-term (T2), students of immigrant descent who perceived high heritage and intercultural learning in the classroom benefitted more from the intervention (had a smaller decline in behavioral engagement) compared to those who perceived high heritage and intercultural learning in the classroom but did not receive the intervention. At T3, the intervention effect was not sustained such that the intervention and control groups both increased in behavioral school engagement similarly. The post hoc power analysis suggested that with the available sample size (n = 413), we had sufficient power (99%) to reject a poor-fitting model (df = 127). Overall, students in the higher heritage and intercultural learning classroom showed greater behavioral school engagement compared to those in lower heritage and intercultural learning classrooms across all time points.

Fig. 4 Heritage and Intercultural Learning Climate as a Moderator of the Intervention Effect on Behavioral School Engagement for Students of Immigrant Descent (n = 413, low = 1 SD below the mean, high = 1 SD above the mean)

With regards to emotional school engagement, change scores did not differ between the control and intervention group for T1-T2 or T2-T3 for adolescents of immigrant descent, non-immigrant descent (H1), or specific heritage groups (H2a). Discrimination and two dimensions of classroom climate did not moderate intervention effects for adolescents of immigrant descent (H2b and H2c) (see Footnote 6).

4 Discussion

Adolescents of immigrant descent may experience stereotyping and discrimination, creating threats to their identity that may undermine their school-related adjustment. In our study we tested whether engaging in a brief self-affirmation writing task would protect adolescents against a decline in school-related adjustment from the beginning of 7th to the end of 8th grade. We also considered whether heritage group, experiences of discrimination, and classroom cultural diversity climate would contribute to heterogeneity of effects. Our overall conclusion is that the brief self-affirmation writing intervention shows a small effect on some aspects of school-related adjustment for adolescents of immigrant descent and may sometimes be beneficial also for adolescents of non-immigrant descent. The context of the intervention is also important, with classroom cultural diversity climate acting as a psychological affordance to enhance affirmation effects, at least in the short term.

We did not find that the self-affirmation writing intervention benefited students of immigrant descent regarding grades or math competence, two of the most widely studied academic outcomes in education (Wu et al., 2021). Our findings contrast with Müller and Lokhande’s (2017; Lokhande & Müller, 2019) study, which had a very similar sample drawn from the same city and found their intervention to be effective for boosting math competence using the same measure. Several differences from their study may explain our divergent findings. In their study the intervention was administered by the researchers and the effects were assessed immediately after the intervention and again eight weeks later. In our study the intervention was administered by teachers and effects were assessed five to six months and one year later. We trained teachers to implement the intervention based on the theoretical importance of having affirmation clearly linked to school (Easterbrook et al., 2021). While we tried to be consistent in the training, teachers also varied in their motivation. Some were enthusiastic about the intervention and others less so. Perhaps this variation in teacher training compliance and effectiveness contributed to fewer intervention effects compared to the previous study. Indeed, implementation fidelity has also been identified in previous research as a likely moderator of self-affirmation intervention effectiveness, and this applies particularly to interventions implemented by teachers (Bradley et al., 2016). For teacher-implemented studies with smaller samples, procedures can be followed and monitored more carefully and close collaborations can be established with teachers. This was difficult to achieve in a larger-scale study such as ours, and indeed some of the larger self-affirmation studies with the goal of scaling up the intervention were less successful than the initial, smaller studies.

In addition, especially in the East German federal states including Berlin, many schools are under-resourced and there is a drastic shortage of teachers. This becomes even more problematic when facing additional challenges, such as providing schooling for large numbers of refugee students who arrived in the years of our study. It means that there is less time available for teacher training and careful implementation of an intervention in addition to their regular teaching content. Scaling up paired with teacher implementation may therefore culminate in additional challenges for intervention effectiveness.

Despite the absence of a main intervention effect on grades and math competence, we did find effects on a few outcomes. Our findings for mastery reactions to academic challenges are somewhat in line with self-affirmation findings showing that the intervention prevented a decline in academic motivation and performance (Cohen et al., 2009; Sherman et al., 2013), but under specific conditions. Our findings show (tentatively as these results approached significance at p = .064) that adolescents of immigrant descent who participated in the intervention were more likely to show less decline in their mastery reactions to academic challenges, at least in the short term, from T1 to T2, but only those who experienced higher (not lower) levels of discrimination. These results suggest that those who experience the greatest direct threat to their sense of self could be targeted for affirmation interventions to reinforce and make clear that they are valued. Finding that these effects last only in the short‐term demonstrates the importance of studying longer-term effects to draw conclusions about overall effectiveness. Yet finding a short-term effect also suggests there may be windows of opportunity for intervention effectiveness that might make sense to target. The non-immigrant descent adolescents did not show any of these effects for mastery reactions to academic challenges, in line with theoretical reasoning that the intervention would not affect adolescents assumed to experience less identity threat, especially in an educational context.

Regarding disruptive school behavior, we found that the self-affirmation writing intervention prevented an upward trajectory in disruptive school behavior from the 7th to end of 8th grade for all adolescents, regardless of immigrant status. While disruptive school behavior can increase throughout secondary school (Zimmermann et al., 2013), the intervention appears to have prevented some of this increase. Engaging in disruptive behavior is consequential for relationships with teachers, school grades, and perceptions of the self (Zimmermann et al., 2013), and thus an important outcome to pay attention to.

Our results do not align with findings with sixth and seventh grade Dutch students, where a brief self-affirmation writing task did not reduce problem behaviors across an academic school year (de Jong et al., 2016). Yet our results do align with findings with U.S. students, where multiple brief self-affirmation writing tasks across three years reduced discipline incidents (taken from school records) in 7th and 8th grade, with no difference by ethnic group (Binning et al., 2019). Similarly, contrary to hypotheses, the intervention in their study did not solely benefit identity-threatened adolescents, but all adolescents.

There is some evidence that self-affirmation tasks can reduce effects of identity threat in general and not only threats tied to a specific group identity (Cohen & Sherman, 2014). In Germany, the often-used categorization of immigrant descent vs. not is contested as a relevant or meaningful identity, as the majority of people in Germany officially designated as having a “migration background” do not self-identify with this label (Nesterko & Glaesmer, 2019). The risk for susceptibility to stereotype threat has been argued to be greater for those who identify more strongly with the group that is being targeted, but there are very few studies in Europe on this topic and the few show mixed evidence for immigrant groups (Appel et al., 2015). This may be a reason why we did not find consistent differences in the effectiveness of the intervention for these two groups.

Classroom cultural diversity climate moderated the effect of the self-affirmation writing task on behavioral school engagement. In the short term, students of immigrant descent who perceived high heritage and intercultural learning in the classroom (less identity threat) showed a smaller decline in behavioral school engagement (i.e., benefitted more from the intervention) compared to students who perceived low heritage and intercultural learning in the classroom. In the long-term, however, there were increases in behavioral school engagement for all students from the end of 7th into 8th grade, regardless of treatment condition. Thus, a positive classroom diversity climate acted as a protective factor for those participating in the intervention in the short-term, and later, as an enhancing factor for all adolescents in the long-term. These results illustrate the importance of defining both vulnerabilities and opportunities when considering interventions within real-life social contexts (Walton & Yeager, 2020). Our results suggest that a positive diversity climate in the classroom could be considered a psychological affordance, a feature of the context that can act as a resource to allow an affirmation intervention to take hold (Walton & Yeager, 2020) as well as foster positive school adjustment beyond the intervention. Our findings are also in line with findings from a previous study showing that trust in teachers and school, and a more positive school climate may contribute to positive affirmation intervention effects (Binning et al., 2019).

Against our hypothesis, we did not find that the intervention was effective in changing emotional school engagement. While negative stereotypes exist regarding the school engagement of immigrant descent students and families in Germany, especially those of Turkish or Arab heritage, for instance in terms of being bildungsfern (“academically distant”) (Emmerich & Hormel, 2013), perhaps the stereotype is not as pronounced towards the entire group of students of immigrant descent. Alternatively, not all members belonging to the same negatively stereotyped group are equally affected because they may not be aware of the stereotype, do not endorse the stereotype, or do not identify with the group (Spencer et al., 2016). We did not find other studies that included this dependent variable so we await future studies to test this further.

Regarding heritage group as moderator, we did not find it moderated the effectiveness of the intervention. Heritage groups are broad categories that do not directly assess potentially important experiences, such as the degree of discrimination experienced. Discrimination indeed moderated some of the intervention effects, suggesting it is more important to consider such experiences rather than solely rely on categorizations.

Based on theory and empirical evidence, the self-affirmation writing intervention should be more effective when the “achievement gap” between identity-threatened adolescents and identity non-threatened adolescents is large (Wu et al., 2021). In our study for the whole sample at pre-test, we found that adolescents of immigrant descent differed from those of non-immigrant descent in math performance and disruptive behavior, but not on mastery reactions to academic challenges or behavioral and emotional school engagement. Further, these differences were small. The fact that the two groups did not show substantial school-related adjustment gaps in most of the study variables could also explain why we found few effects. Despite persistent educational disparities between students of immigrant descent and non-immigrant descent in Germany, the school system in the federal state of Berlin is more inclusive than elsewhere in Germany, with a longer period of primary schooling and more within-school (as opposed to between-school) tracking in integrated secondary schools (Teltemann & Schunck, 2016).

Overall, our results speak to the importance of considering a range of conditions to clarify self-affirmation effects in educational settings, as the quality and timing of implementation, demographics and experiences of students and teachers, and availability of school resources can all contribute to heterogeneity of effects (Easterbrook et al., 2021; Hanselman et al., 2017). Notably, in the Wu et al. (2021) meta-analysis, of the nine studies that were conducted outside the US, only four included adolescents, with two studies finding effects (England and Germany) and two that did not (Netherlands). Although the overall meta-analytic effect is positive, significant, and medium size for identity-threatened students, which is encouraging, the authors clearly note that because of the great heterogeneity in effect sizes, single studies may show a null effect (Wu et al., 2021). As noted earlier, future studies should continue to detail moderators on various levels to uncover when and for whom this intervention may be effective.

4.1 Limitations

There were some limitations to our study that could have affected our findings. One such limitation is that our timing and dosage were not ideal. We had planned to implement the first intervention at the very beginning of the year, directly after students transitioned to secondary school, based on the importance of starting before negative recursive processes begin, especially during or after a transition (Sherman et al., 2013). However, a delay in receiving ethics approval prevented us from doing so and thus we started further into the school year (eight weeks) than desired. Indeed, a 4-week delay can significantly decrease the effectiveness of the affirmation intervention (Cook et al., 2012). We also originally planned for three interventions but were only able to complete two. Nonetheless, dosage may be less important than timing (Wu et al., 2021) and other aspects of our study were well-timed in that we focused specifically on 7th graders after a period of transitioning into secondary school and attempted to schedule interventions before math tests.

4.2 Strengths and avenues for future research

One strength is that we tested for theorized heterogeneity and focused on student (heritage group, experiences of discrimination) and contextual (classroom cultural diversity climate) moderators. Other moderators such as implementation quality (Bradley et al., 2016), teacher-related (e.g., growth mindset), and other school-related aspects (Easterbrook et al., 2021; Hanselman et al., 2017; Wu et al., 2021) are also important. In future studies we plan to also test mediators to explain why self-affirmation writing interventions work in terms of the underlying psychological processes (Easterbrook et al., 2021; Wu et al., 2021). Based on self-determination theory (Ryan & Deci, 2020), we included measures of sense of belonging, autonomy, and competence. There is evidence that a key mechanism for why self-affirmation tasks work is enhancing a sense of belonging (Celeste et al., 2021; Cook et al., 2012; Shnabel et al., 2013). A greater sense of belonging seems to provide a context for doing well in school academically (Celeste et al., 2019). Relatedly, self-affirmation tasks can promote a sense of trust with teachers and school and contribute to better school behavior among adolescents (Binning et al., 2019). Less known is whether self-affirmation tasks enhance the two other basic needs of self-determination, a sense of autonomy and competence, so we plan to test this in the future.

Our findings are useful in that the intervention seems to offer some protection for school-related adjustment, for both students of immigrant and non-immigrant descent, depending on the particular aspect of adjustment and context. Although we did not find strong support for all hypotheses with our wide range of outcomes, the strengths of the study are the theoretical foundation, adequate sample size, real-life conditions and longer-term school-related adjustment measured. Our findings are still important to demonstrate under what conditions and for what outcomes the intervention does and does not seem to work.

We also would like to highlight our agreement with recent studies that recognize the limitations of only focusing on affirming students with an individual-level intervention (Easterbrook et al., 2021; Wu et al., 2021). Broader societal and structural level supports, as well as within the school, will be necessary to provide the resources and nurturing environment that will allow students to be affirmed, valued, and supported for who they are (Walton & Yeager, 2020).