Bullying is one of the most common forms of youth violence and is now acknowledged as a serious public health concern, affecting children and adolescents in all parts of the world. According to the Centers for Disease Control and Prevention (2014), bullying among youths is defined as

any unwanted aggressive behavior(s) by another youth or group of youths who are not siblings or current dating partners that involves an observed or perceived power imbalance and is repeated multiple times or is highly likely to be repeated. Bullying may inflict harm or distress on the targeted youth including physical, psychological, social, or educational harm. (p.7).

Systematic research on school bullying started over 40 years ago, and a growing body of research worldwide has documented the broad negative impacts of bullying, especially for the victims (Kaess, 2018). These negative outcomes highlight the need for effective intervention and prevention programs to reduce school bullying among children and adolescents around the world.

Efficacy of Anti-bullying Programs

As bullying is a serious issue in schools, considerable research has been conducted in the past decade on the effectiveness of anti-bullying programs. A recent meta-analysis including 100 independent evaluations found that overall, programs were effective in reducing school bullying perpetration (relative reduction of 19–20%) and victimization (relative reduction of 15–16%) (Gaffney et al., 2019). Four anti-bullying programs with multiple evaluations were compared, showing that evaluations of the Scandinavian Olweus Bullying Prevention Program (OBPP; Olweus, 1993) produced the largest effect sizes for bullying perpetration outcomes, while the Italian NoTrap! program (Menesini et al., 2012) was the most effective in reducing bullying victimization. In our own OBPP evaluation study, a prospective quasi-experimental design with an annual student survey (baseline, postline + 12 months, follow-up + 24 months) was used to evaluate the effect of the program. Based on data from approximately 5500 pupils (grades 5–9) who took part in the surveys between 2015 and 2018, a comparatively high effect of the program could be demonstrated, with a relative reduction in bullying perpetration as well as victimization of 25% after 2 years (Ossa et al., 2021). However, the low recruitment rate of 1.9%, an absence of program effect for boys, as well as a stronger effect for grades 5–7 should be considered in the interpretation of findings.

Importance of Teachers

Implementing a bullying prevention program does not only influence the pupils, but is also connected to teacher outcomes. The responsibility to act against bullying lies with the school staff, and by working with an anti-bullying program, adults at school are made aware of the issue of bullying and develop competencies to intervene appropriately. Teachers are often present when an episode of bullying occurs, and they are often the first adults that students contact (Wachs et al., 2019). Teachers have both the authority to address inappropriate behavior and the moral obligation to keep students safe (Cortes & Kochenderfer-Ladd, 2014). Even though teachers are key figures of a program’s effectiveness, most intervention studies have not focused explicitly on the effects of anti-bullying programs at teacher level, but on the bullying behavior and well-being of students instead (Van Verseveld et al., 2019). Van Verseveld et al. (2019) conducted a meta‐analysis on the effects of school‐based anti-bullying programs on determinants of teacher intervention (e.g., teachers’ attitudes toward bullying, subjective norms, self‐efficacy, and knowledge regarding intervention strategies), as well as on teachers’ responses to bullying (e.g., teacher intervention). Thirteen peer-reviewed papers could be included, of which only six studies contained teachers as informants for the measurement of teacher outcomes. The meta-analysis comprises a total of eight anti-bullying interventions, all of which provided a teacher training package aimed at improving teacher awareness and responsiveness to bullying situations. With regard to determinants of teacher intervention, a significant moderate positive effect of anti-bullying programs on teachers’ attitudes, subjective norms, self-efficacy, and knowledge could be shown (g = 0.531). 
Furthermore, regarding teachers’ responses to bullying, a significant small to moderate effect was found on teachers’ actual intervention practices in bullying situations (g = 0.390). However, examining the included OBPP studies in greater detail revealed wide variation in effect sizes (based on students’ ratings). While Pepler et al. (2004) reported no significant effect of the OBPP on teachers’ responses to bullying (g = 0.028), and Black and Washington (2008) found only a small effect (g = 0.075), the evaluation study of Limber et al. (2018) revealed a very large effect of the OBPP on teachers’ responses to bullying (g = 1.250). This variability in effect sizes raises the question of what explains these differences.

Contextual Variables: Job Strain

Teaching is considered one of the most stressful occupations, alongside nursing and medicine (Jennings et al., 2017). When teachers are asked about their greatest stressors, problematic student behavior, the additional support required for students in need, and the feeling of being overwhelmed by their own tasks are perceived as among the greatest burdens (Richards, 2012). A recent systematic review identified determinants of teacher exhaustion, including work climate, teacher self-efficacy in managing student behavior, and classroom disruption (Mijakoski et al., 2022). High rates of teacher absenteeism and turnover, as well as an imbalance in job roles, responsibilities, and institutional resources, can create additional teacher stress (Bottiani et al., 2019; Hong, 2009). Regarding bullying, several studies have shown that teachers feel unprepared to intervene in bullying situations, that they would like to receive additional training, and that they have difficulty monitoring bullying in addition to their regular duties (Bauman & Hurley, 2005; Bradshaw et al., 2012; Van Verseveld et al., 2021). Jennings and Greenberg (2009) theorized that a burnout cascade may result when teachers lack the social-emotional competency to manage behavioral challenges in the classroom and fail to create a healthy class climate with supportive student–teacher relationships and positive classroom management. The relationship that teachers share with their students is critical for teachers’ emotional well-being and motivation (Klassen et al., 2012). Studies suggest that teachers’ ratings of their own social and emotional skills positively relate to how they manage stress and to their levels of burnout (Brackett et al., 2010).
Therefore, successful school-based prevention has the potential to improve teacher outcomes, such as self-efficacy, general stress level, and risk for burnout, as a function of its positive impact on classroom management and student behavior (Bradshaw et al., 2009). So far, only a few studies have examined this relationship. For example, Domitrovich et al. (2016) showed a positive effect of a classroom behavior management program on teacher burnout, but only for the component of personal accomplishment, and not for emotional exhaustion. This secondary benefit of school-wide prevention programs would ultimately serve teachers’ health and may provide a justification for their use. Moreover, higher teacher job satisfaction is directly related to lower levels of bullying (De Luca et al., 2019): students can perceive teachers’ satisfaction in everyday school life, as it constantly influences the quality of interactions and relationships in the classroom.

Contextual Variables: School Climate

School climate presents an important context for teachers’ professional activities and has been defined in a variety of ways that cover student- and teacher-oriented conceptualizations. Effective anti-bullying work calls for changes in the school culture and organization, as well as in the behavioral norms, having a lasting impact on the school as a social system (Olweus & Limber, 2010). Therefore, its effects go beyond the reduction of bullying, and previous research has demonstrated an improvement in school climate through work with the OBPP (Olweus, 2012). In their meta‐analysis described above, Van Verseveld et al. (2019) inferred that strengthening the teacher ultimately leads to a change in the school climate. Beyond this, aspects of the school climate, such as teacher-teacher collaboration and communication, can influence teachers’ attitudes and behaviors toward bullying, and appear to be associated with teachers’ active responses to bullying (Kollerová et al., 2021). This indicates a complex model of interrelationships between the reduction of bullying through the implementation of bullying prevention programs, active teacher intervention, and school climate, but research is lacking on the possible associations between these influencing factors.

Importance of Implementation Fidelity

In research on anti-bullying programs, the wide variation in effect sizes for both teachers’ intention to intervene in bullying situations and the reduction of bullying (Gaffney et al., 2019; Van Verseveld et al., 2019) raises the question as to which factors are responsible for program effects. Possible explanations might be that anti-bullying programs differ in terms of focus, number of program components, and training dosage. Aside from this, aspects of implementation (such as fidelity, dosage, or quality) have been found to be moderating factors for program outcomes, even within the same program. According to Carroll et al. (2007), implementation fidelity is the degree to which programs are implemented as intended by the program developers. Results from nearly 500 implementation studies in the field of prevention and promotion targeting children and adolescents offered strong empirical support for the conclusion that the degree of implementation fidelity affects the outcomes obtained. The magnitude of mean effect sizes was at least two to three times higher when programs were carefully implemented and free from serious implementation problems (Durlak & DuPre, 2008). In anti-bullying program evaluations, however, limited attention has been paid to implementation fidelity so far. A study on the effects of the KiVa anti-bullying program on teachers investigated the associations between KiVa activities and teacher perceptions (Ahtola et al., 2012). The effects of team membership and the number of implemented student lessons were tested. While only about 2–3% of the variation in teacher perceptions could be explained at the school level, 8% of the individual variation was explained by engagement in KiVa activities at the end of the intervention year. Another person-centered KiVa trial examined the link between the implementation of the program and its effectiveness by using monthly teacher reports (Haataja et al., 2014).
Results revealed that lesson adherence as well as lesson preparation time (but not duration of lessons) were associated with reductions in victimization at the classroom level. In a second step, it was also examined how, when, and why teacher adherence to KiVa lessons varied. Different factors were associated with the degree of implementation fidelity: high starting levels were enhanced by positive beliefs about program effectiveness, while maintaining high implementation levels was enhanced by principal support. Finally, consistent and high implementation was enhanced by lesson preparation (Haataja et al., 2015). The cited studies underscore the assertion that implementation matters, and Axford et al. (2020) concluded that schools might require more intensive and responsive implementation support to achieve significant program effects.

Facilitators and Barriers for Implementing Bullying Prevention Programs

The implementation of school-wide anti-bullying programs has been facing major barriers in different school systems worldwide. A main problem is the complex structure of schools—teachers often instruct a large number of students they see only a few times a week, and this can create difficulties in building positive student–teacher relationships (Coyle, 2008). Challenges can also stem from a focus on academic achievement, lack of time, variations in staff commitment, lack of support from headmasters, uncooperative parents, short time of program implementation, or simultaneous implementation of conflicting prevention efforts (Cunningham et al., 2016; Limber et al., 2004; Nansel et al., 2003). In interviews and focus groups with OBPP participants, the following additional themes impeding OBPP implementation emerged: unanticipated changes and events, difficulties identifying bullying incidents, social media influences that exacerbated bullying behaviors, and limited fiscal and staff resources (Sullivan et al., 2021). Similarly, Herkama et al. (2022) conducted focus group interviews with teachers to explore facilitators and barriers to the sustainment of the KiVa anti-bullying program. Program-related, organizational, as well as contextual issues were discussed in the process. According to the participants, the following program-related characteristics were important for program sustainability: systematic program structure, clear guidelines on how to address acute cases of bullying, user-friendly materials, program adaptability, information about bullying as a phenomenon and practical tools for prevention, support from program developers as well as realistic expectations and recognition of program boundaries. 
In the organizational area, strategic coordination and planning, teacher motivation and commitment, time and personnel resources, headmaster’s support, teacher trainings, supportive school climate, as well as a fit of the program to the current school structures were listed as program facilitators. Finally, a national core curriculum, a school-wide bullying prevention plan, and positive media attention were mentioned in the contextual area.

Implementation fidelity and sustainability are important influencing factors on the attained outcomes and benefits of bullying prevention (Durlak & DuPre, 2008; Haataja et al., 2014). Therefore, it is essential to gain knowledge about facilitators and barriers for successful program implementation before delivering a program. In the preparation phase, program planners should consider the specific challenges that may arise in order to identify the resources needed to support program delivery with high implementation fidelity and sustainability. The identified influencing factors are intertwined in complex ways: some of them might be more influential than others, and in some cases, the presence of several facilitators is needed in order to sustain the program. In addition, high implementation fidelity might contribute to a positive cycle, where the realization of reduced prevalence of bullying may encourage further implementation (Herkama et al., 2022). However, it also has to be considered that our secondary outcomes (job strain and school climate) might be influenced by several school characteristics, even more than by the implementation fidelity of the OBPP. For example, school climate appears to be more positive in smaller schools, with more personalized relationships between teachers and pupils and a greater feeling of safety (Cotton, 2001; Newman et al., 2006). Teachers in smaller schools tended to have more positive perceptions of their abilities to influence school norms and to control their classrooms (Garrett et al., 2004), which might contribute to less job strain. Several school characteristics (school size, level of bullying victimization, number of inhabitants of the school location, and school board) were therefore integrated in our models to investigate their influence on our primary and secondary outcome variables.

Research Questions

To assess the changes in teachers’ responses to bullying and in the school climate, as well as to check for a possible link between job strain and OBPP work, we integrated an online teacher survey into our evaluation study of the OBPP in Germany. After working with the OBPP for 2 years, we expected more active responses by teachers in the case of bullying (Black & Washington, 2008; Limber et al., 2018), a lower level of job strain for teachers (Bradshaw et al., 2009; Domitrovich et al., 2016), as well as general improvements in the school climate (Olweus, 2012; Van Verseveld et al., 2019) across all schools. Furthermore, we expected that these improvements would be associated with the degree of implementation fidelity (Durlak & DuPre, 2008; Haataja et al., 2014) and therefore should be higher for certified schools, which were able to fulfill the central requirements of the program. Conversely, non-completer schools were expected to achieve no improvements at all (see “Assessment” for the definition of the different levels of implementation).

Specifically, we hypothesized that:

  1. The intention to intervene when witnessing bullying would be higher at postline compared to baseline over all schools (primary outcome).

  2. The general job strain for teachers would be lower at postline compared to baseline across all schools (secondary outcome).

  3. School climate would be better at postline compared to baseline over all schools (secondary outcome).

  4. Changes would be highest for certified schools, followed by completer schools. Non-completer schools were expected to achieve no improvements at all.

Methods

Study Population and Design

The OBPP is an evidence-based anti-bullying program which was developed in Norway in the 1980s and has since been continuously adapted and expanded. The program includes elements at four levels: school, classroom, individual, and parents/community. All program components are guided by four key principles: adults should (1) show warmth and positivity toward students; (2) set strict limits and restrictions on unacceptable student behavior; (3) apply consistent and non-aggressive consequences; and (4) act as positive and authoritative role models (Olweus & Limber, 2010). The effectiveness of the OBPP is well documented (Gaffney et al., 2019) and therefore, the Clinic of Child and Adolescent Psychiatry Heidelberg translated the program materials and trainings (Olweus, 2012), and commenced the first scientific evaluation of the program in Germany in close cooperation with Olweus International. The project was funded by the foundation of Baden-Wuerttemberg (Baden-Württemberg Stiftung). Secondary schools in our state were informed about the possibility to participate in the program and could voluntarily sign up for participation. Overall, 21 schools were enrolled in the study: eleven in 2015 (wave 1) and another ten in 2016 (wave 2). These schools can be divided into A-level schools (Gymnasium: comparable to secondary/high school for grades 5 through 12 or 13, more academic, required for enrolment at university) as opposed to B-level schools (Realschule / Werkrealschule / Gemeinschaftsschule: comprises part of general or practical secondary/high school education, generally for grades 5 through 9 or 10 and allows for the option to commence vocational training, but is insufficient for enrolment at university).

The present article is based on teacher self-reports at baseline and postline. In order to gain insights into how the OBPP works from the teachers’ point of view, regular anonymous teacher surveys were conducted. Data was collected before the implementation process started (baseline, in 2015 and 2016 respectively), five times during the implementation process (quarterly) and after the implementation process (postline, in 2017 and 2018 respectively). In the quarterly surveys, additional information was obtained on the progress of the implementation process of the individual program components, as well as on the satisfaction with these components tailored to the role of the respondent (teacher, class teacher, and/or Olweus group leader). The present article is based on the baseline and postline surveys only. For baseline, 901 teachers from 21 schools were invited to take part in the survey. As two of the non-completer schools refused to partake in the postline survey, 820 teachers from 19 schools were invited for postline. These teacher surveys are part of a wider study design aimed at determining the effectiveness of the OBPP (reduction of bullying victims and perpetrators within 2 years). To clarify this overarching question, three annual student surveys were part of the program and formed the basis for our main evaluation study. In the first wave of schools, students participated in the surveys between 2015 and 2017, while in the second wave of schools, student surveys took place between 2016 and 2018.

Study Procedures

The study was conducted in compliance with the Helsinki Declaration and was appraised and approved by the ethics committee of the faculty of medicine at the University of Heidelberg (S-341/2014) and the respective school authorities. Furthermore, the study was registered at a WHO trial registry (Deutsches Register Klinischer Studien; DRKS00008202). Informed consent was appropriately obtained, and all teachers were extensively informed about the purpose, content, and conditions of the study by members of our research team at a teacher’s conference, as well as via information leaflets. They were also given the opportunity to contact our research team for questions. Teachers were assessed using self-report online questionnaires from July 2015 until July 2018 via LimeSurvey, which had a duration of about 10 min. An e-mail was sent to invite each teacher to participate, as well as up to two reminder e-mails (if necessary). All e-mails contained an individual code and the link to the online platform. Data was saved anonymously; login codes were saved separately from the e-mail addresses and it was not possible to connect the given answers with the login codes.

Assessment

The baseline and postline surveys consisted of 31 self-created items. The present article focuses on the following seven items, which were presented to all participants:

Intention to intervene: “How often do you try to intervene when you witness bullying among students?” (1 = I almost never do anything, 2 = I very seldom do anything, 3 = I sometimes do something, 4 = I often do something, 5 = I almost always intervene, 6 = I didn’t notice that students were bullied at school).

Job strain: “How much do you currently enjoy your teaching profession?” (VAS 0 = no joy at all – 100 = a lot of joy); “How stressful do you currently find your teaching profession to be?” (VAS 0 = not stressful at all – 100 = very stressful); “How strenuous do you currently find your teaching profession to be?” (VAS 0 = not afflicting at all – 100 = very afflicted). These three items form the scale job strain (sum score, the first item was inverted, Cronbach’s α = .73).

School climate: “How well do the students in your school get along?”; “How well do the staff in your school get along?”; “How well do students and school staff in your school get along?” (all VAS 0 = very bad – 100 = very good). These three items form the scale school climate (sum score, Cronbach’s α = .71).
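The construction of the two scales can be illustrated with a minimal sketch. It assumes that inverting the first job strain item means computing 100 minus the rating on the 0–100 VAS, and it uses the standard formula for Cronbach's α; the function names are hypothetical and the sketch is not the scoring script used in the study.

```python
from statistics import variance


def job_strain_score(joy, stress, strain):
    """Sum score of the three job strain items (range 0-300).

    Assumption: the joy item is inverted as 100 - joy, so that higher
    values indicate more strain on all three items.
    """
    return (100 - joy) + stress + strain


def school_climate_score(students, staff, students_and_staff):
    """Sum score of the three school climate items (range 0-300)."""
    return students + staff + students_and_staff


def cronbach_alpha(items):
    """Cronbach's alpha for k items.

    items: list of k lists, each holding one item's scores for the
    same n respondents (already inverted where necessary).
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_variances = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))
```

For example, a teacher rating joy at 80, stress at 60, and strain at 50 would receive a job strain score of 20 + 60 + 50 = 130 under this assumption.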

In our sample of schools, the implementation fidelity varied extensively. Therefore, we created three groups of schools, based on their level of implementation:

  (a) Non-completer schools, which quit the OBPP within the first 18 months

  (b) Completer schools, which worked with the program for at least 18 months and conducted at least two annual student surveys

  (c) Certified Olweus schools, which additionally fulfilled the following quality criteria: (i) at least five meetings of study and supervision groups for teachers per year; (ii) annual presentation of survey results to teachers and parents; (iii) regular class meetings in grades 5 to 9 (at least monthly); (iv) OBPP as a topic at a regular teacher conference at least twice per year; (v) information for parents at least twice per year (parents’ evening, information letter etc.). When applying for certification, schools had to fill out a documentation sheet (program activity report) to demonstrate the implementation of the required program modules. Olweus coaches (i.e., specially trained persons responsible for implementation and use of the OBPP) were responsible for the collection of the required information. The documentation was checked by our research team, and the final decision on certification was made during an on-site audit at the respective school.
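The grouping rules above can be expressed as a short decision function. This is a sketch only: the record type and its field names are hypothetical, and the case of a school that stayed in the program for 18 months or more but conducted fewer than two student surveys is not covered by the definitions, so it is flagged rather than assigned to a group.

```python
from dataclasses import dataclass


@dataclass
class SchoolRecord:
    """Hypothetical per-school summary; field names are illustrative."""
    months_in_program: int
    annual_student_surveys: int
    teacher_group_meetings_per_year: int   # criterion (i)
    results_presented_annually: bool       # criterion (ii)
    class_meetings_at_least_monthly: bool  # criterion (iii)
    conference_topics_per_year: int        # criterion (iv)
    parent_informations_per_year: int      # criterion (v)
    audit_passed: bool                     # outcome of the on-site audit


def implementation_level(s: SchoolRecord) -> str:
    # (a) quit within the first 18 months
    if s.months_in_program < 18:
        return "non-completer"
    # Not defined in the text; flagged instead of guessed.
    if s.annual_student_surveys < 2:
        return "undetermined"
    # (c) all five quality criteria plus a positive audit decision
    meets_quality_criteria = (
        s.teacher_group_meetings_per_year >= 5
        and s.results_presented_annually
        and s.class_meetings_at_least_monthly
        and s.conference_topics_per_year >= 2
        and s.parent_informations_per_year >= 2
    )
    if meets_quality_criteria and s.audit_passed:
        return "certified"
    # (b) at least 18 months and at least two annual student surveys
    return "completer"
```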

Statistical Analyses

Data were collected anonymously at each timepoint and therefore, linking the baseline and corresponding postline data of an individual teacher was not possible. We used ordered logistic regression to estimate the categorical variable (intention to intervene, see Table 2 for categories) and linear regressions with robust standard errors to estimate the continuous scales (job strain, school climate). The proportional odds assumption was checked using the Brant test. Timepoint, implementation level, and their interaction acted as predictors. Post hoc comparisons were undertaken using the Wald test to investigate the change between baseline and postline over all implementation levels and for individual implementation levels. No missing values were imputed; we conducted an available-case analysis. To check whether the results are robust, we integrated the school characteristics of school size, baseline level of bullying (% of pupils that get bullied), number of inhabitants of the school location, and school board (private vs. public) separately in our models, to investigate their influence on our primary and secondary outcome variables. Each school characteristic was entered as a covariate together with its interaction with timepoint in our regression models. Data were analyzed using Stata 17.0 (StataCorp, 2021).
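In generic notation, the models described above can be sketched as follows. The symbols and the dummy coding (one reference implementation level, two indicator variables) are illustrative assumptions rather than a transcript of the original analysis scripts, and each outcome has its own set of coefficients.

```latex
% T_i = 1 at postline, 0 at baseline
% L_{i\ell} = 1 if teacher i's school has implementation level \ell (level 1 = reference)
\begin{align}
\eta_i &= \beta_T T_i + \sum_{\ell \in \{2,3\}} \beta_\ell L_{i\ell}
          + \sum_{\ell \in \{2,3\}} \beta_{T\ell}\, T_i L_{i\ell} \\
\operatorname{logit} P(Y_i \le k) &= \kappa_k - \eta_i
          && \text{(ordered logistic: intention to intervene)} \\
y_i &= \beta_0 + \eta_i + \varepsilon_i
          && \text{(linear: job strain, school climate)} \\
\Delta_1 &= \beta_T, \qquad \Delta_\ell = \beta_T + \beta_{T\ell} \quad (\ell = 2, 3)
          && \text{(postline vs. baseline contrast per level)}
\end{align}
```

Under this coding, each model has five slope parameters (one for timepoint, two for implementation level, two for the interaction). For the ordered logistic model, exp(Δ) is the odds ratio for the postline versus baseline contrast at the given implementation level.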

Results

At baseline, 615 of the 901 invited teachers took part in the assessment (participation rate of 68.26%). At postline, only 388 out of 820 invited teachers participated in the survey (participation rate of 47.32%). Table 1 gives an overview of the baseline and postline teacher samples concerning gender as well as for different school characteristics (school type, level of implementation, school board, school size, number of inhabitants of the school location, and level of bullying victimization).
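The participation rates follow directly from the number of participants and invited teachers; a minimal sketch of the arithmetic (the function name is illustrative):

```python
def participation_rate(participants, invited):
    """Participation rate in percent, rounded to two decimals."""
    return round(100 * participants / invited, 2)


baseline_rate = participation_rate(615, 901)   # 68.26
postline_rate = participation_rate(388, 820)   # 47.32
```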

Table 1 Description of baseline T0 (N = 615) and postline T1 (N = 388) teacher samples

Overall, the two samples are quite comparable, with a higher proportion of female teachers, teachers from B-level schools, as well as public school teachers. The biggest difference between the two points of measurement is at the level of implementation. While at baseline, 20.16% of the teachers belonged to non-completer schools, only 11.34% of the teachers at postline worked at non-completer schools. A plausible explanation is that the motivation to take part in the postline teacher survey was lowest in the drop-out schools. Two of the seven non-completer schools refused to participate at all.

Table 2 shows the distribution of the variable intention to intervene (primary outcome) separated by implementation level and measurement point.

Table 2 Descriptive statistics (frequency and percentage) of intention to intervene at baseline T0 (N = 615) and postline T1 (N = 388) separated by implementation level

The Brant tests of parallel regression assumption resulted in non-significant test statistics (p > .05), providing evidence that the parallel regression assumption holds. Because of zero-populated categories (see Table 2), we combined the lowest three categories for the test.

For intention to intervene, our main regression model achieved a significant fit (χ²(5) = 30.96; p < .001). Ordered logistic regression revealed a significant change between baseline and postline over all schools (p < .001; OR = 1.78; 95% CI = 1.39–2.29). After 2 years of work with the OBPP, teachers became significantly more active when witnessing a bullying situation. Figure 1 provides a graphical illustration of this relationship, showing a movement toward the fifth category “I almost always intervene.”

Fig. 1 Density plot for intention to intervene at baseline T0 (N = 615) and postline T1 (N = 388)

Examining the contrast between postline and baseline for the different implementation levels in a second step, we found a positive increase in activity only for completers (p < .001; OR = 2.41; 95% CI = 1.65–3.51) and certified schools (p = .004; OR = 1.76; 95% CI = 1.19–2.60), but not for non-completers (p = .569; OR = 0.83; 95% CI = 0.44–1.58). The difference between completers and certified schools was not significant (p = .255).
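As a rough plausibility check, the reported odds ratios and confidence intervals can be related to the reported p-values. The sketch below assumes that the intervals are symmetric Wald intervals on the log-odds scale and that the test statistic is approximately standard normal; because it works from rounded published values, it only approximates the p-values and does not reproduce the original computation.

```python
import math


def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover standard error, z statistic, and two-sided p-value.

    Assumes a symmetric 95% Wald interval for log(OR).
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return se, z, p


# Reported postline vs. baseline contrasts: (OR, lower CI, upper CI)
contrasts = {
    "all schools": (1.78, 1.39, 2.29),
    "completers": (2.41, 1.65, 3.51),
    "certified": (1.76, 1.19, 2.60),
    "non-completers": (0.83, 0.44, 1.58),
}

for label, (or_, low, high) in contrasts.items():
    se, z, p = wald_from_ci(or_, low, high)
    print(f"{label}: SE = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Small deviations from the published p-values are expected, since the inputs are rounded to two decimals.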

In a third step, we separately integrated the interaction of measurement point (baseline vs. postline) with each of our four school characteristics (school board, school size, number of inhabitants of the school location, and baseline level of bullying) as covariates in our regression models to check whether these school characteristics would influence our outcome variables even more than the level of implementation of the OBPP. Our model proved to be stable with regard to the covariates since the same relationships remained.

For our continuous scales school climate and job strain, we used robust standard error estimates, as (even when transforming the variables) the assumption of normally distributed residuals was violated. For job strain, our main regression model could not achieve a significant model fit (F(5,997) = 1.58; p = .163), and therefore we did not interpret the individual coefficients.

For school climate, our main regression model achieved significant fit (F(5,997) = 3.33; p = .005). Linear regression revealed no significant change between baseline and postline over all schools (p = .448; B = 0.62; 95% CI = −0.98 to 2.21). However, examining the development of school climate separated by level of implementation, a significant improvement was found for certified schools (p = .003; B = 3.46; 95% CI = 1.14 to 5.78). For completer schools, no significant change was observed (p = .713; B = 0.42; 95% CI = −1.82 to 2.67), while for non-completers, the school climate even changed for the worse (p = .035; B = −5.51; 95% CI = −10.64 to −0.38). In the final successive integration of the school characteristics as covariates in our regression models, only school size changed the observed relationships. The main effect of school size, as well as its interaction with the measurement point, was not significant (both p > .07). However, when integrating school size as a covariate, the previously significant deterioration of the school climate for non-completer schools became non-significant (p = .054; B = −5.16; 95% CI = −10.41 to 0.09). That is because non-completer schools tend to be smaller, and smaller schools showed weak evidence for better school climate at baseline (p = .077; B = −0.89; 95% CI = −1.87 to 0.10) and postline (p = .051; B = −1.34; 95% CI = −2.68 to 0.01). Aside from this finding, our model proved to be stable with regard to the covariates baseline level of bullying, number of inhabitants of the school location, and school board.

The results of the logistic and linear regressions described are summarized in Table 3.

Table 3 Postline T1 vs. baseline T0 contrasts for intention to intervene and school climate over all schools and separated by implementation level

Discussion

To gain insight into the perspective of teachers, we integrated an online teacher survey into our evaluation study of the OBPP in Germany. As teachers are key figures in the reduction of bullying at school, we aimed to investigate teachers’ responses to bullying after 2 years of work with the OBPP. Since past research mostly used students’ self-reports to answer this question, the current study helps to fill this gap in the literature. The second aim of this study was to check whether the implementation of the OBPP is related to further secondary benefits, namely a reduction of the self-reported level of job strain and an improvement of the school climate. We expected positive changes on all three outcome variables at postline compared to baseline over all schools. Additionally, research has demonstrated that implementation fidelity affects the outcomes obtained (Durlak & DuPre, 2008). We therefore expected that the reported changes would be associated with the level of implementation, being largest for certified schools, intermediate for completer schools, and absent for non-completer schools.

First, our data revealed a significant increase in the intention to intervene in bullying over all schools, which represents a primary goal of the OBPP and confirms our first hypothesis. After 2 years of work with the program, teachers reported intervening significantly more often when witnessing a bullying situation. This also corresponds to the meta-analysis of Van Verseveld et al. (2019), which showed a significant small to moderate effect of anti-bullying programs on teachers’ responses to bullying across 13 studies. When examining the contrast between baseline and postline for the different implementation levels in a second step, we found this significant increase in activity for completer and certified schools only, but not for non-completers. This relationship underscores the importance of implementation fidelity for achieving positive effects and also supports the assumption that the increase in active responses to bullying was likely a result of OBPP efforts (Haataja et al., 2014). Contrary to our second hypothesis, no significant decrease in job strain could be found at postline, as our model was not able to significantly predict the data for this outcome. This suggests that although teachers acquire skills in dealing with bullying within the OBPP, their general stress level did not change. Moreover, a significant improvement in school climate as a secondary goal of the OBPP could not be shown over all schools, contradicting our third hypothesis. However, looking at the development of school climate separated by the level of implementation, a significant improvement was found for certified schools, which again supports our fourth hypothesis. It would be interesting to see whether a school climate improvement in completer schools could be achieved at a later date, as any change might take longer there due to the lower level of implementation. Aside from the already described influence of school size on school climate, our models proved to be stable with regard to the covariates of baseline level of bullying, number of inhabitants of the school location, and school board.

In summary, our teacher-related primary and secondary outcomes were associated with the level of implementation within the OBPP implementation process. Certified Olweus schools, i.e., the group with the highest level of implementation, achieved an increase of teachers’ intention to intervene as well as an improvement of school climate. Completer schools, which met the minimum requirements of the program, only showed an increase of teachers’ intention to intervene, but no improvement of school climate, while non-completer schools showed no improvements at all over the course of 2 years. This relationship highlights the importance of recording implementation data through surveys or interviews with teachers when implementing a school-wide anti-bullying program. As Durlak and DuPre (2008) stated:

It is important that the potential value of new interventions is adequately tested, and this is impossible without attending carefully to the process of implementation. […] There is extensive and persuasive evidence that confirms the powerful impact of implementation on outcomes. A major implication emanating from these findings is that the assessment of implementation is an absolute necessity in program evaluations. (p. 328 & 340)

Collecting implementation data is important for several reasons: (a) the overall degree of program delivery informs program developers of whether the program is feasible, (b) monitoring of implementation can reveal problems in program use, which can then be solved quickly, and (c) a significant association between the level of implementation and outcome provides further support that the effects obtained are a result of the program rather than of other factors (Haataja et al., 2014). Unfortunately, the implementation process is currently rarely taken into account in the field of bullying prevention, although implementation fidelity and program commitment have been found to be important moderating factors for program outcomes (Van Verseveld et al., 2019).

Strengths, Limitations, and Future Directions

The current study has several strengths that are worth mentioning. First, teacher outcomes were measured as part of an evaluation of an anti-bullying program, and these outcomes were not rated by students but were obtained directly from teachers through regular teacher surveys. Not only should improvements in the behavior of teachers be an important measure of the success of a program, but secondary positive changes for teachers might also be an important source of motivation to implement a laborious whole-school anti-bullying approach. Second, the integration of teacher data also provides insight into implementation fidelity, which has emerged in research as a key moderator of program effects. Surprisingly, the teachers’ perspective has barely been considered thus far, and our findings contribute to the literature on this important component.

Nevertheless, findings from this study should be interpreted with some caution for several reasons. The current quasi-experimental study design only allows the conclusion that teachers’ intention to intervene in bullying situations as well as their rating of the school climate improved over time in the schools with high implementation fidelity. Even though the longitudinal design of the study implies a directionality of the observed effects, only a randomized controlled trial would permit causal attributions. For this reason, we cannot be sure that program implementation is really responsible for the change. Besides, the participation rate of the study was moderate at best and dropped from 68.26% at baseline to 47.32% at postline; therefore, our sample might not be representative of the teachers taking part in the OBPP, and our results might be distorted by self-selection bias. Next, due to the anonymization of the data, baseline and postline data of individual teachers could not be matched, which prevents any analysis of individual trajectories. Another, rather minor methodological limitation concerns the sixth category of the variable intention to intervene (“I didn’t notice that students were bullied at school”). This category could be interpreted positively as the optimum of the scale, which preserves the order of the scale. However, it could also be interpreted negatively in the sense of not noticing existing bullying. In the latter case, we would have had to recode this category from 6 to 0, which would mathematically violate a model assumption and impair the model fit. As a sensitivity analysis comparing the models with and without this category showed no difference in results, we decided to retain the category in its positive expression. 
The observed main shift within this scale was from categories 3 and 4 (“I sometimes do anything” and “I often do something”) to category 5 (“I almost always intervene”), which is unambiguous in its interpretation in any case. In addition, our results are based on teachers’ self-reports only and might therefore be limited by common-method bias. Future research should complement teachers’ self-reported data with students’ reports and observations. In the present analyses, the baseline rates of bullying victimization reported by pupils were the only student data used, which we integrated as one of the control variables in our regression models. A further limitation of the study relates to the measures used, as the sum scales of job strain and school climate were each composed of only three self-created items. Future studies should include more robust, validated measures. In addition, the single item on teachers’ responses to bullying does not provide insight into the kind of intervention, i.e., which available strategies were used and with what success. As De Luca et al. (2019) stated, very few studies examine how teachers respond in bullying situations, and even fewer analyze the impact of those interventions, although it appears that teachers’ responses (or non-responses) to bullying vary considerably. This is especially relevant as the choice and success of a strategy might be influenced by student characteristics (e.g., gender, popularity, or social skills), teacher characteristics (e.g., gender, beliefs, empathy, self-efficacy, teaching experience, or job satisfaction), as well as school characteristics (e.g., response of other teachers, or support from the principal) in a complex model (De Luca et al., 2019; Farley, 2018). Finally, no data on the determinants of teachers’ responses to bullying were collected within our study to further investigate how specific program elements or training activities are associated with teacher outcomes. Van Verseveld et al. (2019) found the largest effects on determinants of bullying intervention that were directed at improving teachers’ self-efficacy and knowledge. Previous research has also shown that teachers’ empathy toward the victim, beliefs about bullying, their perceptions of the seriousness of bullying incidents, and school support are related to teachers’ responses to bullying (Dedousis-Wallace et al., 2014; Novick & Isaacs, 2010; Yoon et al., 2014). Knowledge regarding the moderators of this relationship should be expanded to further improve the effectiveness of preventive programs, since changes on the behavioral level of teachers should be a central goal and criterion of success of bullying prevention.

Conclusion

Our study based on teachers’ self-reports indicates that the OBPP is associated with a positive change in the intention of teachers to intervene in bullying, and not only with a change in determinants of intervention such as attitude or knowledge. A change in actual behavior represents an important success indicator of a preventive program, as, according to the ideas of Dan Olweus, responsibility for acting against bullying lies with the school staff. Furthermore, teachers in the certified Olweus schools reported an improvement in school climate after 2 years, representing a further benefit of the OBPP. The teacher plays an important role in the management of classroom bullying; teacher outcomes should therefore be part of future program evaluations, and research should focus more on the effects of anti-bullying programs on teachers. As Ahtola et al. (2012) said:

Throughout the years, students are replaced, but teachers, more or less, remain. When we are looking for ways to change the students’ environment permanently in order to increase well-being, we are likely to rely heavily on teachers’ commitment and activity. Their knowledge, attitudes, and skills have an important role when the school’s position in the promotion of well-being and the prevention of problems is negotiated. (p. 858)

Besides, it is not only important that teachers intervene more frequently in bullying situations, but also that teachers use strategies that have proven to be effective. To that end, more research is needed in order to support teachers and provide them with useful strategies for noticing, terminating, and preventing bullying.

The positive changes reported in our study are associated with implementation fidelity, as our secondary hypothesis stated. It is therefore essential to record data about the extent of program activities when evaluating a program. Otherwise, it remains unclear whether an absence of effects is due to conceptual deficiencies of the program or simply to inadequate implementation. This makes teachers an important source of information, and thus far, their contribution has received too little research attention. Furthermore, dose-response effects might also be responsible for the wide range of effects of anti-bullying programs in the past, as fidelity of implementation is a critical factor influencing a program’s success.