Introduction

The wellbeing of teachers is an important element of effective education systems. Yet worldwide, educators are leaving the profession due to stress and burnout, leading to warnings of teacher shortages (Greenberg, 2016; OECD, 2019). When teachers with poor wellbeing remain in the profession, there are negative consequences for students, as teacher burnout is associated with poorer teacher-student relationships, student behavior, and academic achievement (Hoglund et al., 2015; Reyes et al., 2012). Consequently, understanding how to improve teacher wellbeing is critical to ensuring positive outcomes for students, including student wellbeing (Shirley et al., 2020), as well as addressing concerns about the sustainability of the profession (McCallum & Price, 2016; OECD, 2019).

In this article, we present a systematic review that critically synthesizes literature on interventions to improve educator wellbeing. We followed the key idea summarized by the Cochrane organization, that “systematic reviews seek to collate evidence that fits pre-specified eligibility criteria in order to answer a specific research question” (Chandler et al., 2022, “Key Points” section). Following the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines (Liberati et al., 2009), empirical studies of interventions that met pre-determined inclusion criteria were reviewed for key data, assessed for quality, and synthesized to answer a set of research questions. In answering our research questions, we summarize: the intervention content, theories underpinning these interventions, how wellbeing and factors affecting wellbeing are measured, and the effectiveness of the interventions. We conclude by illustrating key considerations when designing and implementing an educator wellbeing program, offering suggestions on the most effective category of interventions, and highlighting areas in need of more research.

Our introduction outlines wellbeing through a positive psychology lens, before discussing teacher wellbeing and interventions to improve wellbeing.

Wellbeing

Although wellbeing has been referenced in relation to health for many years (e.g., World Health Organization, 1948), there has been a rapid increase in interest in wellbeing since 2000 when the field of positive psychology was established. Positive psychology—which has been described as the science of wellbeing—seeks to “understand and build the factors that allow individuals, communities, and societies to flourish” (Seligman & Csikszentmihalyi, 2000, p. 5). Although there is no one agreed upon definition of wellbeing within the field, a commonly used definition is “feeling good and functioning well” (Huppert & Johnson, 2010, p. 264). Wellbeing is often conceptualized as multidimensional; for example, Ryff’s six dimensions of wellbeing (Ryff, 1989), Seligman’s (2011) PERMA model (Positive Emotion, Engagement, Relationships, Meaning, and Accomplishment), and Huppert & So’s 10 features of flourishing (Huppert & So, 2013). A broad array of self-report measures of wellbeing exist, with one review identifying 99 measures (Linton et al., 2016). The self-report measures identified by Linton et al. (2016) clustered around six key themes, with most measures capturing a few of the six types of wellbeing: mental well-being, social wellbeing, physical well-being, spiritual well-being, activities and functioning, and personal circumstances. A different approach to categorizing types of wellbeing is Goodman et al.’s (2021) hierarchical model of wellbeing with general wellbeing at the top, and four levels underneath: “lenses (perspectives from which well-being is conceptualized), contents (homogeneous topic areas that make up each lens), characteristics (clearly defined components of well-being that offer practical value in dissecting human experiences), and contexts (characteristics that arise in particular situations or contexts and/or a narrow aspect of a particular characteristic)” (p. 833). 
Examples at each level are subjective wellbeing as a lens, affect as a content area, positive affect as a characteristic, and happiness at work as a characteristic in a particular context. As well as different types of wellbeing, wellbeing can be thought of as a continuum ranging from low wellbeing (or languishing) to high wellbeing (or flourishing) (Keyes, 2002). It is important to note that low wellbeing is not synonymous with mental illness, as a small percentage of people flourish despite mental illness such as depression (Keyes, 2002; Peter et al., 2011). The numerous definitions, measures, and models illustrate the many ways to conceptualize wellbeing.

Research in the field of positive psychology has identified a number of factors that allow individuals to flourish and developed many positive psychology interventions (PPIs) that individuals can use to boost their wellbeing (Carr et al., 2021). Some examples of PPIs are: gratitude journalling, acts of kindness, and identifying strengths. However, there has been critique about positive psychology’s focus on the individual, given the limited evidence about how much individuals can influence their wellbeing through intentional activity compared to the influence of their life circumstances (Brown & Rohrer, 2019). Scholars have called for the field to develop beyond the focus on the individual, acknowledge the influence of context on wellbeing, and embrace greater complexity (Kern et al., 2020; Lomas et al., 2020).

Beyond positive psychology there is already research that highlights the influence of context on wellbeing. For example, individuals’ happiness is influenced by the happiness of the people they interact with (Fowler & Christakis, 2008), and the levels of trust in a society are related to wellbeing (Helliwell & Wang, 2010). Socio-ecological views of wellbeing consider the influence of relationships and social contexts on an individual’s wellbeing, along with individual factors that influence wellbeing. Bronfenbrenner’s (1979, 1999) ecological model of human development uses a socio-ecological approach and has been used as a lens to explore wellbeing, or aspects of wellbeing, in schools, such as student belonging (Allen et al., 2018), pastoral care (Barber, 2016), and teacher wellbeing (Ainsworth & Oldfield, 2019; Hofstadler et al., 2021; Price & McCallum, 2015). Teacher wellbeing has been shown to be influenced at all four levels of Bronfenbrenner’s (1979) model: “the microsystem (individual and collective capacities); mesosystem (interrelationships between contexts); exosystem (organisational); and macrosystem (societal and legislative influences), compounded by the influence of time at the chronosystem level” (Price & McCallum, 2015, p. 195). Studies that compare different factors that influence teachers’ wellbeing have shown contextual factors explain more variance in teachers’ wellbeing than individual factors (Ainsworth & Oldfield, 2019; Hobson & Maxwell, 2017). Contextual factors include support from school leadership, which is associated with greater teacher wellbeing and lower burnout (Ainsworth & Oldfield, 2019; Cann et al., 2021; Janovská et al., 2016). Collaborative and trusting school cultures are also strongly associated with greater teacher wellbeing (Cann et al., 2022). 
Teachers cite collaborative working relationships with colleagues as contributing to developing confidence, reducing anxiety and stress, and contributing to their wellbeing (Collie et al., 2012; Paterson & Grantham, 2016), and report that collaborative relationships can help to offset some of the negative aspects of teaching (Wigford & Higgins, 2019). Teachers' trust in colleagues is associated with increased enthusiasm, teaching satisfaction, and contentment, and with reduced anxiety, burnout, and depression (Huang et al., 2019; Yin et al., 2016, 2018). Other contextual factors are negatively associated with teacher wellbeing. Student misbehaviour is associated with lower levels of teacher wellbeing (Aldrup et al., 2018; Kaynak, 2020). Workload and a lack of work-life balance are associated with lower wellbeing and higher burnout (Ainsworth & Oldfield, 2019; Hobson & Maxwell, 2017; Kaynak, 2020). It is therefore important to take into account contextual factors when designing and assessing interventions to improve wellbeing.

Teacher wellbeing

Much of the research into teacher ‘wellbeing’ has actually focused on poor mental health, stress, and burnout (Bricheno et al., 2009). For example, teachers experience higher stress levels than the national population in the United Kingdom (Kidger et al., 2010; Travers & Cooper, 1993), China (Yang et al., 2009) and New Zealand (Milfont et al., 2008). Stress is linked to negative impacts on school performance and attrition (Greenberg, 2016; Ingersoll, 2001) and negatively affects teachers who remain in the profession through reductions in teacher-efficacy, job satisfaction (Collie et al., 2012), optimism and motivation (Desrumaux et al., 2015). Given the prevalence of stress and its consequences for schools and teachers, it is an important area of research, but is more focused on ‘illbeing’ rather than ‘wellbeing’.

There is no agreed-upon definition of teacher wellbeing to guide research towards more positive aspects. McCallum et al. (2017) note that wellbeing is seldom defined specifically in relation to teachers, and the few definitions that exist vary widely. For example, McCallum and Price (2016) define teacher wellbeing as “diverse and fluid… something we all aim for…. yet is unique to each of us” (p. 17). In contrast, the OECD uses a definition of teacher wellbeing that clearly outlines four dimensions used to underpin measurement and analysis: cognitive wellbeing, subjective wellbeing, physical and mental wellbeing, and social wellbeing (Viac & Fraser, 2020).

Further variability is introduced because concepts that some researchers consider to be a dimension of teacher wellbeing are defined by others as factors that impact teacher wellbeing. For example, Viac and Fraser (2020) include self-efficacy within the cognitive dimension of teacher wellbeing, whilst McCallum et al. (2017) list self-efficacy as a factor that impacts teacher wellbeing. Regardless of the variations in definitions, there is general agreement that the wellbeing of many teachers is in need of improvement.

Improving wellbeing in schools

In recent years there has been a greater focus on wellbeing in schools, yet efforts are mainly focused on improving students’ wellbeing—school staff are usually overlooked. The concept of positive education emerged shortly after positive psychology was founded as schools became increasingly aware of the need to improve the wellbeing of their students (Seligman et al., 2009), prompted by data on the poor state of mental health for young people around the world (Slemp et al., 2017). Positive education uses theories and models from positive psychology, such as Seligman’s (2011) PERMA theory of wellbeing, and applies them in the form of interventions at educational institutions ranging from primary schools to universities (Russo-Netzer & Ben-Shahar, 2011; Waters & Loton, 2019). However, positive education interventions rarely focus on improving teacher wellbeing alongside that of students. A review of 75 studies of positive education interventions revealed only 2% used whole-school approaches that sought to improve staff wellbeing (Waters & Loton, 2019). Considering that addressing teacher wellbeing has been identified as an important first step in implementing whole school programs to promote student wellbeing (Quinlan, 2017; Slemp et al., 2017), this is an area in need of more systematic research.

Lack of evidence for school wellbeing programs

Concerns have been raised about the quality of evidence that supports positive education interventions. Dix et al. (2020) reviewed the evidence for 200 positive education programs available in Australia and concluded that over half were based on low quality evidence. There may be a similar concern for interventions focused solely on teacher wellbeing. While there are some reviews focused on particular interventions to improve teacher wellbeing (for example mindfulness: Hwang et al., 2017; Lomas et al., 2017; Zarate et al., 2019), reviews that examine the broad range of teacher wellbeing interventions offered are lacking. McCallum et al. (2017) review a range of different initiatives that enable teacher wellbeing; however, they refer mostly to observational studies, books, and literature reviews and include few references to intervention studies. Whilst this provides a useful starting point to consider possible teacher wellbeing interventions, a more systematic review of the range of different intervention studies is needed.

Evaluating the implementation of wellbeing programs

When evaluating the evidence to support wellbeing programs, it is also important to consider the implementation of the program, especially given the implementation gap, where evidence-based practices do not lead to changes when implemented in schools (Hagermoser Sanetti & Collier-Meek, 2019). Halliday et al. (2019) draw on the implementation science literature to identify the following factors that influence the implementation of student wellbeing programs: provider (e.g., teacher), organization (school), intervention (the wellbeing program), recipient (student), and context. For example, organizational factors include the school’s readiness for the program, the social climate, and alignment of the program with school goals (Halliday et al., 2019). These factors are useful to consider when evaluating the implementation of educator wellbeing programs.

Research questions

This review draws on research evidence to provide insights into how interventions might improve educator wellbeing. We purposely use the term educator wellbeing, rather than teacher wellbeing, to encompass studies that seek to improve the wellbeing of all school staff. We investigated the following questions:

  • Q1: What interventions are used to improve educator wellbeing, what is the content of these interventions, and how are they implemented?

  • Q2: What theories underpin educator wellbeing interventions?

  • Q3: What study designs and measures are used to explore educator wellbeing?

  • Q4: What are the impacts of educator wellbeing interventions?

Methods

We identified studies of interventions to improve educator wellbeing that took a broad view of wellbeing, meaning we included studies that purport to examine wellbeing, flourishing, or thriving, rather than one specific subdomain of wellbeing (e.g., self-efficacy). This allowed us to explore the range of conceptualizations of educator wellbeing, interventions used, and approaches to measurement of their impact. The review was conducted using the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines (Liberati et al., 2009). A protocol was developed a priori for the search method, inclusion and exclusion criteria, screening methods, data management, and coding of articles. The protocol was not registered with a review database.

Search methods

The databases Education Research Complete and PsycINFO were used to search for potentially eligible articles from 2000 to November 2020; the start date was chosen because the field of positive psychology was established in 2000 (Seligman & Csikszentmihalyi, 2000). The search terms related to broad concepts of wellbeing (e.g., wellbeing, wellness, thriving), educators, and interventions. Potential search terms were checked against known articles about interventions to improve educator wellbeing (Jennings et al., 2019; Taylor, 2018) in order to refine the list of search terms. The term ‘intervention’ was not used in many titles or abstracts of intervention studies, so search terms to identify empirical studies were used instead. The final search terms were:

  1. Title: Educator related terms: Teach* OR Educator* OR instructor* OR (employee AND school) OR (staff AND school)

  2. Title: Wellbeing related terms: wellbeing OR well-being OR "well being" OR wellness OR flourish* OR "psychological need* satisfaction" OR thriv*

  3. Title: Exclusion terms: "pre-service" OR preservice

  4. Abstract: Empirical study related terms: qualitative OR quantitative OR mixed OR n* = OR survey OR questionnaire OR interview OR sample*

  5. Articles from peer reviewed journals, in English.
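
The first four search components above can be combined into a single Boolean query. The following Python sketch is illustrative only: the `TI` and `AB` field tags, and the use of `NOT` to apply the exclusion terms, are assumptions about how such a search might be entered in a database interface, not a record of the exact string used. The peer-review and language limits are assumed to be applied as separate database filters.

```python
# Illustrative sketch: assemble the reported search terms into one Boolean string.
# The field tags ("TI" = title, "AB" = abstract) and the NOT clause are assumptions.

EDUCATOR_TERMS = ["Teach*", "Educator*", "instructor*",
                  "(employee AND school)", "(staff AND school)"]
WELLBEING_TERMS = ["wellbeing", "well-being", '"well being"', "wellness",
                   "flourish*", '"psychological need* satisfaction"', "thriv*"]
EXCLUSION_TERMS = ['"pre-service"', "preservice"]
EMPIRICAL_TERMS = ["qualitative", "quantitative", "mixed", "n* =", "survey",
                   "questionnaire", "interview", "sample*"]


def any_of(field, terms):
    """Join terms with OR and scope them to a single (assumed) field tag."""
    return field + "(" + " OR ".join(terms) + ")"


def build_query():
    """Require the three inclusion groups and exclude the pre-service terms."""
    included = " AND ".join([
        any_of("TI", EDUCATOR_TERMS),
        any_of("TI", WELLBEING_TERMS),
        any_of("AB", EMPIRICAL_TERMS),
    ])
    return included + " NOT " + any_of("TI", EXCLUSION_TERMS)


if __name__ == "__main__":
    print(build_query())
```

Keeping each term group in its own list makes it straightforward to check candidate terms against known articles and to adjust one group without touching the others.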

Article screening

Titles and abstracts were screened by the first author to determine suitability for inclusion in this review. Inclusion criteria were empirical studies that included an intervention and collected data before and after the intervention to measure its impact on educators. Studies using either quantitative or qualitative data were included as were studies with or without a control group (Fig. 1).

Fig. 1 Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) flow diagram of the selection process

Of the 429 unique articles identified, 381 were excluded for reasons such as: focusing on student wellbeing rather than educator wellbeing, focusing on educators who were not in the early childhood to secondary sectors (K-12 educators), or not including an intervention. Where insufficient information was given in the abstract to determine if the article met the inclusion criteria, full-text screening was conducted. Forty-eight articles were identified for full-text screening, after which we determined 23 articles met the eligibility criteria for inclusion (Fig. 2).

Fig. 2 Number of teacher intervention articles in the period 2000 to 2020
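
The screening counts reported above can be checked with simple arithmetic. The short Python sketch below is only a consistency check under stated assumptions: the function and key names are hypothetical, and the number of articles excluded at full-text screening is derived by subtraction rather than reported in the text.

```python
# Consistency check on the reported screening counts (names are illustrative).

def screening_flow(identified, excluded_at_abstract, included):
    """Return the counts at each screening stage, raising if they are inconsistent."""
    full_text_screened = identified - excluded_at_abstract
    if not 0 <= included <= full_text_screened <= identified:
        raise ValueError("screening counts are inconsistent")
    return {
        "identified": identified,
        "excluded_at_abstract": excluded_at_abstract,
        "full_text_screened": full_text_screened,
        # Derived value: not stated in the article text.
        "excluded_at_full_text": full_text_screened - included,
        "included": included,
    }


if __name__ == "__main__":
    for stage, count in screening_flow(429, 381, 23).items():
        print(f"{stage}: {count}")
```

With the reported figures, 429 identified minus 381 excluded leaves the 48 articles that went to full-text screening, of which 23 were included.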

Coding and analysis

Qualitative coding of the final sample of 23 articles was conducted using NVivo version 12 (QSR International Pty Ltd., 2018). The research questions were used to decide on a priori codes. Codes included aspects of the intervention (content, provider, school, intervention participants, and context), wellbeing theories used, quantitative measures used, and descriptions of qualitative approaches. For example, coding of the intervention included descriptions of activities, dosage (i.e., frequency and timing, including time of day or time of school year), control condition, and the location of the sessions. As coding progressed, a few additional codes were created in order to refine existing codes.

We used a codebook to record codes and their descriptions, along with inclusion and exclusion notes and examples. Items coded to the same node were regularly compared to each other and to the codebook to avoid ‘definitional drift’, and to refine the codes to reflect the themes (DeCuir-Gunby et al., 2011; Gibbs, 2007). After the first pass of coding, we analyzed intervention descriptions to look for themes related to intervention content, and similar studies were grouped and coded (see Table 1). Quantitative data were entered into Excel to produce tables listing all scales used, noting any significant changes post intervention, and effect sizes where available. This enabled trends to be identified across the different types of interventions.

Table 1 Summary of articles

Results

These results are from 23 articles examining the impact of 22 interventions on educator wellbeing (one intervention was reported in two articles—one at one year follow up). Of the 23 articles, 17 were published between 2016 and 2020.

Key information about the 22 interventions is presented in Table 1. This is followed by a series of sections that answer the research questions.

RQ1: what interventions are used to improve educator wellbeing, what is the content of these interventions, and how are they implemented?

The articles were categorized by the intervention content:

  • Multi-foci interventions, which included several content areas within the intervention, such as mindfulness, coping, emotion regulation, exercise, and time management (9 articles)

  • Gratitude interventions (2 articles)

  • Mindfulness-based interventions (6 articles)

  • Teacher professional development that mainly focused on teacher practice (5 articles on 4 interventions)

  • Physical environment (1 article)

The categories above can be grouped into two broad content areas: (1) direct wellbeing content, where educators directly engaged with wellbeing content during the intervention (this included multi-foci programs, gratitude, and mindfulness); and (2) indirect content, where improved wellbeing was proposed to be an indirect result of content such as educators learning about behavior management (this included teacher professional development and physical environment).
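
The categorization above amounts to a two-level grouping, which can be tallied as follows. This Python sketch uses only the article counts listed above; the dictionary and function names are illustrative, and the totals per content area are derived by addition rather than reported directly.

```python
# Tally articles per broad content area from the category counts listed above.
# Names are illustrative; per-area totals are derived, not quoted from the text.

ARTICLES_PER_CATEGORY = {
    "multi-foci": 9,
    "gratitude": 2,
    "mindfulness": 6,
    "teacher professional development": 5,
    "physical environment": 1,
}

CONTENT_AREA = {
    "multi-foci": "direct",
    "gratitude": "direct",
    "mindfulness": "direct",
    "teacher professional development": "indirect",
    "physical environment": "indirect",
}


def articles_by_content_area():
    """Sum article counts within each broad content area."""
    totals = {}
    for category, n_articles in ARTICLES_PER_CATEGORY.items():
        area = CONTENT_AREA[category]
        totals[area] = totals.get(area, 0) + n_articles
    return totals


if __name__ == "__main__":
    totals = articles_by_content_area()
    print(totals, "total:", sum(totals.values()))
```

The category counts sum to the 23 articles in the final sample, with most articles falling under direct wellbeing content.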

Direct wellbeing content

Multi-foci interventions.

Multi-foci interventions (the largest category, n = 9) included several different content areas within an intervention program. Content was drawn from a range of approaches such as cognitive behavioural therapy, acceptance and commitment therapy, positive psychology, and mindfulness. Content areas commonly included across these studies were building social support and positive relationships, and managing thoughts and emotions (see Table 2 for details). Less frequently used content included identifying and using strengths, practicing gratitude, and planning for regular exercise and good sleep. Generally, these multi-foci interventions focused on teaching skills (e.g., emotion regulation) and practices (e.g., gratitude journalling) to educators. However, a few interventions provided external resources and environments to support wellbeing, including elements such as aromatherapy, comfortable chairs, stress balls, food and drink (Sharrocks, 2014), and visits from occupational physiotherapists (Saaranen et al., 2007).

Table 2 Summary of content areas in the multi-foci interventions

Gratitude interventions.

The two gratitude interventions used the ‘three good things’ intervention (Seligman et al., 2005), which is well known in the field of positive psychology for improving happiness and reducing depression. Both studies involved a weekly 15 min reflection over eight weeks (Chan, 2010, 2013).

Mindfulness interventions.

Mindfulness has been described as developing a greater sense of awareness of the present moment, and bringing acceptance, rather than judgement, to any thoughts and feelings (Frank et al., 2015; Harris et al., 2016). The mindfulness interventions (n = 6) achieved this through activities such as meditation, body scan (a method of focusing attention on the body), mindful walking or eating, mindful colouring, and yoga. Of the six mindfulness interventions, four also specifically stated that the intervention included learning to develop self-compassion (Beshai et al., 2016; Harris et al., 2016; Hwang et al., 2019; Jennings et al., 2019), as mindfulness is a core component of self-compassion (Neff, 2015).

Indirect wellbeing content

Teacher professional development.

The teacher professional development interventions comprised three interventions focused on teachers improving their practice, and one focused on implementing a positive education program to increase student wellbeing (Bradley et al., 2018). The first three interventions theorised that teacher wellbeing would increase due to providing support and increasing teacher efficacy. In contrast, Bradley et al. (2018) proposed that teachers’ wellbeing would increase due to training about a wellbeing curriculum that they implemented within their classrooms to teach students about wellbeing.

Physical environment intervention.

One study used an intervention that changed only the physical environment, by adding filters to the electrical supply (Havas & Olstad, 2008). Havas and Olstad (2008) suggested that electrical supply filters could improve teachers’ wellbeing by reducing physical and cognitive symptoms reported by people who identify as having ‘electrical hypersensitivity’.

Other intervention features

The length of the interventions varied considerably, from five days to one year. The teacher professional development interventions were generally the longest in duration, ranging from seven months to one year. Apart from a three-year-long collaborative project to develop a multi-foci wellbeing intervention (Saaranen et al., 2007), all other interventions were 16 weeks or less, with the majority being eight weeks or less, and the shortest being five days for a mindful colouring intervention (Czerwinski et al., 2020). Sessions also varied in duration, for example: 15 min for Chan’s (2010) gratitude intervention, 2.5 h sessions in Cook et al.’s (2017) multi-foci wellbeing program, and a five-day training course as part of teacher professional development (Wolf et al., 2019).

The interventions were mainly delivered in person, but also included three delivered online (Chan, 2010, 2013; Czerwinski et al., 2020). The online interventions used either written instructions (Chan, 2010, 2013), or an instructional video (Czerwinski et al., 2020), whilst the in person interventions were generally group training sessions. The facilitators for in person sessions depended on the type of intervention used. The mindfulness interventions used facilitators who were trained in mindfulness practice, yoga, or meditation. The teacher professional development interventions used facilitators with educational backgrounds, with specific training in the intervention being delivered. Five of the nine multi-foci interventions were delivered by the authors of the studies. Two studies used a range of professionals as needed for different aspects of the intervention, for example psychologists, an architect, physiotherapist, and speech therapist (Saaranen et al., 2007; Sottimano et al., 2018).

RQ2: what theories underpin educator wellbeing interventions?

In many articles, the wellbeing theories used were implicit in the authors’ definitions of wellbeing. We expected that the studies might apply a variety of definitions of wellbeing, but it was somewhat surprising that some articles lacked any definition at all.

Positive psychology theories

Nine of the 22 interventions drew on positive psychology to define wellbeing: two gratitude interventions, six multi-foci interventions, and one professional development intervention (on implementing positive education). These studies generally conceptualized improving teacher wellbeing as both reducing negative indicators (or threats to wellbeing), such as stress, and increasing positive indicators of wellbeing. For example, Cook et al. (2017) assert that “efforts targeting teacher well-being should aim for more than simply reducing stress and burnout—they should also strive to cultivate positive patterns of thinking and feeling” (p. 15).

Work-specific theories

One study developed a theoretical framework which emphasized the need for holistic development of occupational wellbeing (Saaranen et al., 2007). Other articles did not explicitly state any theories underpinning their conceptualization of teacher wellbeing, but implied work-based definitions of wellbeing through their measurement approach, for example “teacher professional well-being (measured through teachers’ levels of motivation, burnout, and job satisfaction)” (Wolf et al., 2015, p. 14).

Other theories

Two studies of multi-foci interventions drew on more than one theoretical framework in order to design the intervention. Cook et al. (2017) and Taylor (2018) draw on a combination of cognitive behavioral therapy, acceptance and commitment therapy, positive psychology, and mindfulness. Ecological theories of wellbeing were only considered in Cigala et al.’s (2019) study of an intervention in which teachers collaborated on professional development, and Fernandes et al.’s (2019) study of a multi-foci intervention.

Implicit definitions of wellbeing in the measures used

The articles about mindfulness interventions tended to lack an explicit definition of wellbeing, and often included implicit definitions inferred by the measurements used. For example, Czerwinski et al. (2020) refer to “three subscales of the wellbeing scale (depression, anxiety and stress)” (p. 6). Only two of the six studies included a measure of positive functioning—teacher efficacy (Harris et al., 2016; Jennings et al., 2019). All six studies measured participants’ mindfulness, and three studies measured self-compassion (often a feature of mindfulness training). In general, the mindfulness studies referred to the reduction of stress and increasing of mindfulness as characteristic of increased wellbeing.

Havas and Olstad’s (2008) intervention of fitting filters to the electrical supply in schools was founded on the theory that adverse symptoms of electrical hypersensitivity could be alleviated by reducing exposure to electromagnetic fields, a theory that has been severely critiqued (Dieudonné, 2020). In terms of wellbeing, they did not present any underlying theory of wellbeing, but used a range of physical symptoms such as shortness of breath and skin rashes, and cognitive symptoms such as memory loss and frustration, to assess wellbeing.

RQ3: what study designs and measures are used to explore educator wellbeing?

Five different study designs were identified within these studies: randomized controlled trials (RCTs; n = 5), cluster randomized controlled trials (where the condition was assigned by school, n = 4), controlled trials using non-random assignment (n = 5), studies with no control group using a pre-test/post-test design (n = 4), and qualitative case studies with no control group (n = 4). The designs that had control groups used waitlist controls (n = 9), controls with no intervention (n = 2), or active controls (n = 3).
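
The design and control-group counts above can be cross-checked against each other. The following Python sketch is a simple consistency check using only the figures in the preceding paragraph; the labels and variable names are illustrative.

```python
# Cross-check the reported study-design counts against the control-type counts.
# Labels are illustrative shorthand for the designs named in the text.

DESIGN_COUNTS = {
    "randomized controlled trial": 5,
    "cluster randomized controlled trial": 4,
    "controlled trial, non-random assignment": 5,
    "pre-test/post-test, no control group": 4,
    "qualitative case study, no control group": 4,
}

CONTROL_TYPE_COUNTS = {
    "waitlist": 9,
    "no intervention": 2,
    "active": 3,
}


def controlled_design_total():
    """Sum the counts for designs that include a control group."""
    return sum(n for design, n in DESIGN_COUNTS.items()
               if "no control group" not in design)


if __name__ == "__main__":
    print("all designs:", sum(DESIGN_COUNTS.values()))
    print("designs with a control group:", controlled_design_total())
    print("control types reported:", sum(CONTROL_TYPE_COUNTS.values()))
```

The three controlled designs account for 14 studies, which agrees with the 14 control conditions reported across waitlist, no-intervention, and active controls.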

The numbers of participants and schools in each study varied widely, from one to 240 schools, and from five to 444 individual participants. The qualitative studies tended to have the fewest participants. Even excluding the qualitative case studies (n = 5 to n = 16), the number of participants ranged from five to 444 with a mean of 109 (SD = 113).

Quantitative measures

The quantitative measures used were almost exclusively self-reported psychometric scales. Exceptions to this were measures of blood pressure and stress hormone levels. Table 3 summarizes the measures used across the 18 quantitative studies. The measures have been sorted into broad categories that do not always have clear distinctions. For example, resilience has been categorized as a positive wellbeing measure given that motivation and self-efficacy are important factors in determining resilience (Beltman et al., 2011). However, resilience also encompasses ideas about how long it takes a person to recover after a setback (Huppert & So, 2013), so it could be argued it is more related to coping with negative conditions than being a measure of a positive state.

Table 3 Quantitative measures used

The majority of measures were focused on wellbeing, with other measures focused on intervention content and implementation, school and classroom climate, and student outcomes. Approximately one third of the wellbeing measures used were context specific to work or schools, including the work-related measures category (such as job satisfaction and teacher efficacy), the burnout measures (all referred to work in most or all items), and one instance of the perceived stress scale that was modified for work in schools in Cook et al.’s (2017) study. The other wellbeing measures used scales which were context free, for example Watson et al.’s (1988) positive and negative affect scale, which asks how frequently respondents have experienced particular emotions within a certain timeframe without specifying any context. In general, studies tended to assess educator wellbeing using work-focused measures (such as job satisfaction or teacher efficacy) and negatively focused measures (such as stress, depression, and burnout). Only three studies did not include negatively focused measures: two gratitude interventions (Chan, 2010, 2013), and one multi-foci intervention (Fernandes et al., 2019). The studies in the teacher professional development and the multi-foci intervention categories tended to have a balance between negative wellbeing measures, and positive or work-related measures. However, the mindfulness interventions tended to focus more on the negative aspects of wellbeing (stress, depression, anxiety, and burnout), and none used measures of positive aspects such as flourishing.

Surprisingly, measures of physical wellbeing were rarely included in studies. Researchers measured physical symptoms, such as sleep quality, blood pressure, and cortisol levels (a stress hormone), in three studies of mindfulness interventions (Frank et al., 2015; Harris et al., 2016; Jennings et al., 2019), and one study of filters fitted to the electrical supply (Havas & Olstad, 2008). All but one of these studies relied on self-report measures, such as sleep quality or medication use, whilst Harris et al. (2016) also used physical measures of blood pressure and tested the cortisol levels of educators via collecting saliva samples.

Measures of relationships were also surprisingly under-represented. This is especially remarkable given that an explicit focus on building positive relationships was part of seven of the nine multi-foci interventions, but only two of these studies measured the quality of teachers’ relationships via social support and vertical trust scales (Fernandes et al., 2019; Sottimano et al., 2018). Apart from these two multi-foci intervention studies, four studies measured relationships: teacher-student relationships (Hwang et al., 2019; Wolf et al., 2015) and teacher relationships in general (Bradley et al., 2018; Harris et al., 2016).

Finally, few studies included measures of the intervention itself. The only instances of this were where participants: rated the intervention in terms of acceptability (Beshai et al., 2016; Cook et al., 2017), their intentions to implement the training they received (Cook et al., 2017), and how they perceived they benefitted from the intervention (Rahm & Heise, 2019).

Qualitative methods

Qualitative methods were infrequently used: four studies used solely qualitative approaches (Cigala et al., 2019; Sharrocks, 2014; Turner & Theilking, 2019; Wessels & Wood, 2019), and two studies used mixed methods approaches (Hayes et al., 2020; Hwang et al., 2019). The methods for collecting qualitative data included: interviews with teachers (Hwang et al., 2019; Turner & Theilking, 2019), focus groups (Hayes et al., 2020; Sharrocks, 2014), field notes and recordings of teacher meetings (Cigala et al., 2019; Wessels & Wood, 2019), video recording of teachers presenting their learning about wellbeing (Wessels & Wood, 2019), teachers’ written observational reports of classroom activities (Cigala et al., 2019), and teacher written reflections (Turner & Theilking, 2019). Four studies analysed the qualitative data using inductive methods, where the data were used to identify themes; one identified themes a priori through a review of the literature (Cigala et al., 2019). One study only provided a brief summary of qualitative findings (as they were published in full in another article) and did not identify the coding method used (Hayes et al., 2020).

RQ 4: what are the impacts of educator wellbeing interventions?

In order to answer this research question, we report separately on the quantitative and the qualitative results reported in articles. For the quantitative results we focus on statistically significant changes in measures, and their effect sizes, and for the qualitative results we summarize the key themes identified.

Summary of quantitative results

Table 4 gives an overview of the main effects of the interventions that were studied using quantitative measures.

Table 4 Summary of statistically significant changes by intervention category

For each type of intervention, we summarize statistically significant changes detected between pre-test and post-test measures, and the effect sizes of any changes. The effect sizes have been categorized into small, medium, large, or no effect size given. Cohen’s d was the most frequently used effect size statistic, and we used Hattie’s (2009) interpretations of the size: 0.20 ≤ |d| < 0.40 are small, 0.40 ≤ |d| < 0.60 are medium, and |d| ≥ 0.60 are large (ranges for other effect size statistics are given in Table 4, note 5). As Kirk (1996) notes: “statistical significance is concerned with whether a research result is due to chance or sampling variability; practical significance is concerned with whether the result is useful in the real world” (p. 746). Only three of the 18 quantitative studies which reported statistically significant changes did not report effect sizes (Havas & Olstad, 2008; Saaranen et al., 2007; Sottimano et al., 2018); therefore, the practical significance of results can be evaluated for most studies. However, only two studies reported confidence intervals for the effect sizes (Harris et al., 2016; Hayes et al., 2020), meaning that for most studies the precision of the effect size estimates cannot be evaluated.
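The thresholds above can be expressed as a short function. This is only an illustrative sketch of the stated cut-offs for Cohen’s d; the function name and the label for values below 0.20 are assumptions, as the text does not name a category for such values.

```python
def categorize_effect_size(d: float) -> str:
    """Label a Cohen's d value using the cut-offs quoted in the text.

    The sign is ignored because the thresholds apply to |d|.
    """
    size = abs(d)
    if size >= 0.60:
        return "large"
    if size >= 0.40:
        return "medium"
    if size >= 0.20:
        return "small"
    # Assumption: the text gives no label for |d| < 0.20.
    return "below small"


if __name__ == "__main__":
    for value in (0.15, 0.25, -0.45, 0.72):
        print(value, categorize_effect_size(value))
```

Because the function uses the absolute value, a reduction in stress of d = -0.45 and an increase in life satisfaction of d = 0.45 would both be labelled medium.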

Multi-foci interventions.

The multi-foci interventions (n = 6) were the category with the largest percentage of measures showing improvement (71%). These were distributed between work-related measures of wellbeing (such as teacher efficacy), positive measures of wellbeing (such as resilience), and negative measures of wellbeing (such as stress). Five of the six multi-foci interventions showed significant improvements to the majority of measures (all with medium to large effect sizes where these were provided), but Saaranen et al.’s (2007) study generally showed no significant improvements, apart from a few subscales of occupational wellbeing. The five studies showing consistent improvements all used interventions where participants attended weekly sessions over a period of four to nine weeks. However, Saaranen et al.’s (2007) two-year long intervention targeted all staff in participating schools, with a small group of staff in each school promoting occupational wellbeing through changes to conditions, such as meeting practices, and providing optional opportunities such as exercise groups and IT training. This suggests multi-foci interventions may be most effective when educators regularly attend wellbeing training sessions, rather than relying on a small group of staff to influence the wellbeing of others in their school.

Gratitude interventions.

The two gratitude intervention studies show statistically significant changes in five out of eight measures (63%), with small to large effect sizes (Chan, 2010, 2013). The largest effect sizes were for satisfaction with life (medium and large effect size increases in the two studies) and positive affect (large effect size increase in one study). In one study, satisfaction with life and gratitude only had statistically significant increases for people with low gratitude at baseline (Chan, 2010), suggesting this intervention is more effective for some participants than others.

Mindfulness interventions.

The mindfulness interventions (n = 6) tended to focus on negative wellbeing measures such as stress, anxiety, burnout, and depression. Although 47% of measures for the mindfulness interventions showed improvements, only 6 of the 18 (33%) negative measures of wellbeing improved. Four interventions measured depression, anxiety, and/or burnout, but only one showed any statistically significant reductions—reducing all three measures by a large effect size (Czerwinski et al., 2020). The results for reducing stress were slightly more consistent, with three out of five interventions reducing stress with medium to large effect sizes (Beshai et al., 2016; Czerwinski et al., 2020; Hwang et al., 2019).

Other measures, such as affect, self-compassion, and mindfulness tended to show improvement—for example, all three studies that measured affect showed improvements. All six studies measured mindfulness, with four showing large effect size increases, but the other two showing no changes. For the four studies that increased mindfulness, three showed medium to large effect size reductions in stress. Thus, whilst the mindfulness interventions did not always improve mindfulness, it appears that where an intervention did improve mindfulness, it also tended to reduce stress.

Teacher professional development.

The three teacher professional development interventions only showed improvement on a few measures (33%), such as teacher efficacy, positive emotions, and turnover. Surprisingly, one intervention did not show significant changes for any measures (Hayes et al., 2020). All three studies measured burnout, but only one intervention successfully decreased burnout (Wolf et al., 2019). There were no changes to job satisfaction (measured in two studies) and stress (measured in one study). Overall, the teacher professional development interventions were not effective at improving teacher wellbeing.

Physical environment intervention.

In their study of an intervention fitting filters to the electrical supply in schools, Havas and Olstad (2008) present a list of physical and cognitive symptoms and calculate statistics for each individual teacher, for each symptom, to determine whether there was a statistically significant difference between the days with dummy filters and the days with the actual filters. They classify a symptom as being improved due to the intervention if the number of teachers reporting improvements is greater than the number of teachers reporting worsening symptoms, and report that “of the 38 symptoms 79% were better” (p. 158), thereby obscuring the result that for 87% of symptoms the filters led to both significantly improved symptoms for some teachers and significantly worse symptoms for other teachers. Some of their conclusions are inconsistent with the results that they present. For example, they state that “teacher health and sense of well being improved with enhanced power quality” (Havas & Olstad, 2008, p. 158), yet their figure of results (p. 159) shows that teachers’ sense of wellbeing improved for 0% of teachers at p < 0.05, and sense of wellbeing was worse for approximately 5% of teachers at p < 0.05. Given the inconsistencies between the statements made and the results reported, this study is treated as quite unreliable.

Effects for interventions using control groups.

It is claimed that high-quality studies that use random assignment to intervention and control groups generally show lower effects (Rahm & Heise, 2019). We explore this for quantitative studies in this review that used control groups, but exclude Havas and Olstad’s (2008) study due to poor reliability. Five studies in this review used a randomized controlled trial (RCT) design. Of the 28 scales across the five studies, 15 (54%) showed a statistically significant change (effect sizes: 7 = large, 5 = medium, 3 = small). For the studies using control groups, but not an RCT design, 36 out of 61 measures (59%) across 8 studies showed statistically significant changes (effect sizes: 20 = large, 11 = medium, 0 = small, 5 = no effect size reported). Therefore, the RCTs in this review showed a similar proportion of scales with significant improvements, and a slightly lower proportion of large effect size changes compared to non-RCT studies.
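The comparison above rests on simple proportions, which can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted in the paragraph; the dictionary layout and the helper name are illustrative assumptions, and the share of large effects is taken over all significant measures (including the five with no effect size reported), which is one possible reading of the text.

```python
def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Counts quoted in the text for studies with control groups.
rct = {"measures": 28, "significant": 15, "large": 7}
non_rct = {"measures": 61, "significant": 36, "large": 20}

for label, group in (("RCT", rct), ("non-RCT", non_rct)):
    share_significant = percent(group["significant"], group["measures"])
    # Assumption: large effects are expressed as a share of all
    # significant measures, whether or not an effect size was reported.
    share_large = percent(group["large"], group["significant"])
    print(f"{label}: {share_significant}% significant, "
          f"{share_large}% of those large")
```

Under this reading, the two groups differ by about five percentage points in the share of significant measures and by about nine percentage points in the share of large effects.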

Sustained impacts.

Four of the studies collected data at follow up points between six weeks and one year after the intervention ended in order to explore the sustained impacts of the intervention (Hwang et al., 2019; Jennings et al., 2019; Rahm & Heise, 2019; Wolf & Peele, 2019). Many measures that had shown statistically significant improvement immediately post intervention were no longer significant at follow up: time urgency after six months (Jennings et al., 2019), burnout and teacher–child interaction quality after one year (Wolf et al., 2019), and positive and negative affect after six months (Rahm & Heise, 2019). However, other changes were sustained, or only became significant at follow up. In Hwang et al.’s (2019) study, all of the statistically significant changes observed at the end of the intervention were sustained at the follow up six weeks later, and some measures, such as teachers’ use of questions with students, became significant at the follow up point. Rahm and Heise (2019) observed that increases in flourishing and stress reduction were sustained at all follow up points, but general self-efficacy and internal locus of control only showed significant increases at the five month follow up point. The four studies that used follow up measures are distributed across different categories of interventions, so we are unable to draw conclusions about particular categories of interventions. However, these findings illustrate the potential for follow up data to provide further insights into how interventions work.

Dropouts.

Some studies have high dropout rates—for example, eight out of 43 teachers (19%) dropped out of Czerwinski et al.’s (2020) mindful colouring intervention. Over half of the 18 quantitative studies did not report on dropouts, but those that did had an average dropout rate of 10%. Only a few studies reported reasons for dropping out such as being too busy with work (Chan, 2013), or data about the dropouts, such as having significantly lower life satisfaction at baseline (Rahm & Heise, 2019).

Dosage.

The interventions varied considerably in terms of their duration, from five days to three years, and the length of sessions, from 15 minutes to a five-day training programme. This wide variation in duration and session length means that it is not possible to determine any patterns across the studies in relation to the effects of dosage on the results obtained.

Questioning the reliability of the quantitative data

The majority of studies in this review (18 out of 22) evaluated the effectiveness of their interventions using quantitative data—indeed, many view it as a more robust approach to evaluation than using qualitative approaches (Shuval et al., 2011). However, most of the quantitative results reported by these studies have poor reliability due to small sample sizes, which are associated with low statistical power. This low power leads to a larger variability in results and can lead to inflated effect size estimates (Button et al., 2013). Of the 18 quantitative studies, three reported low statistical power as a limitation, but did not calculate the power (Frank et al., 2015; Harris et al., 2016; Hayes et al., 2020), and three calculated power (Jennings et al., 2019; Rahm & Heise, 2019; Wolf et al., 2019). Only Wolf et al. (2019) reported effect sizes that could be detected with > 80% statistical power. As an example, to detect a medium effect size with a statistical power of 80% for a t-test comparing means for matched pairs, a sample size of 52 is required—but 13 out of the 18 studies have a sample size less than 50. Although most studies did not report statistical power, the problem of low statistical power is likely to be widespread across all studies.
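To make the sample size reasoning concrete, the sketch below estimates the number of matched pairs needed for a two-sided paired t-test using the standard normal approximation. Several points are assumptions rather than details taken from the studies: the function name is hypothetical, a significance level of 0.05 is assumed, and d = 0.40 (the lower bound of a medium effect in the scale used here) is assumed as the target effect size. The normal approximation slightly underestimates the exact t-based requirement, so it appears consistent with, but does not reproduce, the figure of 52 quoted above.

```python
import math
from statistics import NormalDist


def approx_paired_sample_size(d: float, alpha: float = 0.05,
                              power: float = 0.80) -> int:
    """Approximate number of pairs for a two-sided paired t-test.

    Uses the normal approximation n = ((z_(1 - alpha/2) + z_power) / d)^2,
    where d is the standardized mean difference of the paired scores.
    An exact calculation based on the t distribution gives a slightly
    larger n.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_power) / d) ** 2)


if __name__ == "__main__":
    # Assumed target: lower bound of a "medium" effect (d = 0.40).
    print(approx_paired_sample_size(0.40))
    # A smaller effect needs many more pairs for the same power.
    print(approx_paired_sample_size(0.20))
```

The required sample size grows with the inverse square of the effect size, which is why studies with fewer than 50 participants are poorly placed to detect anything smaller than a medium effect.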

Summary of qualitative results

The key findings identified in the qualitative studies are summarized in Table 5.

Table 5 Summary of qualitative findings

A key trend in the findings in five of the six studies using qualitative data was the improvement to teachers’ relationships with colleagues and students. Teachers “valued time to develop better relationships with colleagues rather than just ‘working relationships’” (Sharrocks, 2014, p. 19). The interventions led to improvements in teachers’ sense of belonging (Cigala et al., 2019), and in one intervention the improvement of relational wellbeing was described as “the most striking difference experienced by the participants” (Wessels & Wood, 2019, p. 4). Relationships with students also improved in one of the interventions, which was ascribed to teachers spending more one-on-one time with students, greater use of student voice, and recognition of student needs (Turner & Theilking, 2019).

Three of the studies described positive impacts that related to teachers’ practice. Cigala et al. (2019) noted improvements to teachers’ self-efficacy, and Turner and Theilking (2019) reported that teachers were more positive and calm in the classroom and more engaged with teaching. Sharrocks (2014) described how teachers reported that after focusing on their own wellbeing they felt able to handle incidents in the classroom more effectively.

Two studies described how school cultures could be a barrier to interventions. Teachers described how attitudes within a school could be a barrier to trying to improve wellbeing at the whole-school level, as “colleagues with poor mental well-being were ‘pathologised’, with ‘learning’ to cope and maintaining positive well-being perceived as the sole responsibility of the staff member” (Sharrocks, 2014, p. 19). In Hayes et al.’s (2020) study teachers described having difficulty implementing classroom management strategies as colleagues did not agree with them.

Discussion and conclusion

Our review reveals studies with a wide variety of: conceptualizations of wellbeing, content included in interventions, and measures used to assess the impact of the interventions. This variety highlights some important considerations when evaluating the evidence to support educator wellbeing programs. We use the findings of this review to illustrate key considerations for educators, schools, policy makers, and researchers when designing and implementing an educator wellbeing program.

Interventions may reduce negative outcomes, increase positive outcomes, or both

Studies in this review conceptualized wellbeing in various ways, such as reducing negative outcomes, increasing positive outcomes, or both. These conceptualizations of wellbeing drove the content of the interventions and the outcomes obtained. For example, mindfulness interventions tended to measure effects on stress, depression, anxiety, and burnout, but none measured flourishing or general wellbeing. Therefore, we can identify that mindfulness interventions reduce educator ill-being, as they generally reduced stress, but no conclusions can be drawn about their effects on educator well-being. In contrast, gratitude interventions showed improvement to positive outcomes such as satisfaction with life and positive affect, but did not measure their impact on stress. This difference is important, as Shirley et al. (2020) note, “increasing well-being and removing ill-being are two different things” (p. 2). Multi-foci interventions were the one category that tended to both reduce negative outcomes such as stress, and improve positive outcomes such as satisfaction with life. When an intervention claims to improve educator wellbeing it is important to understand whether it reduces the negative, increases the positive, or both. This understanding will help schools to determine if an intervention will meet the needs of the educators in their context. For example, if stress is a main concern then schools should select an intervention that specifically addresses reducing stress.

Interventions work better for some people than others

Two findings showed that some people experience more successful outcomes than others. Some studies had high numbers of dropouts, while in other studies, positive outcomes only occurred for individuals with particular baseline measures. Interventions are successful when they have a good ‘person-activity fit’ (Lyubomirsky & Layous, 2013), a concept that describes how successful wellbeing interventions are influenced by the fit between features of the intervention or activity (e.g. dosage or social support) and the person (e.g. motivation, beliefs, personality). Variation in person-activity fit is evident when people who drop out of interventions have different characteristics to those that complete the intervention. Only a few studies in this review analysed dropouts, with one finding that dropouts had significantly lower life satisfaction at baseline (Rahm & Heise, 2019). Other research also shows that dropouts have different baseline measures, such as lower internal locus of control, less accurate self-evaluation, and low expectations of an intervention (Davis & Addis, 1999; Geraghty et al., 2010). Poor person-activity fit may explain why some people choose not to participate in the intervention at all. For example, in research not in this review, Woodward et al. (2007) found voluntary participants in an intervention to treat post-traumatic stress were more likely to have less severe symptoms than non-participants. As all studies in this review recruited volunteers, there may be a similar phenomenon, with participants suffering from less severe symptoms of low wellbeing than non-participants.

As well as poor person-activity fit causing participants to drop out (or not to participate at all) it may lead to less successful outcomes for some participants who remain. For example, one gratitude intervention in this review only led to increases in satisfaction with life for individuals with low gratitude scores at baseline (Chan, 2010). When assessing which interventions are likely to improve educator wellbeing, rather than look for the ‘best’ intervention, it is worth considering which intervention will work best for a particular person. For schools to determine which program would be the best fit for which staff, they could survey staff to assess their needs and determine fit with a program. For example, a school system might match people with high stress levels with mindfulness interventions that are targeted at reducing stress, or match people motivated to learn about wellbeing theory to programs that include theory modules. As providing such individualised access to different wellbeing programs is resource intensive, education systems could support this by funding programs that can be accessed by many schools, rather than schools having to fund a program individually. Alternatively, where schools are seeking to implement one program within their school, multi-foci interventions may be the most promising, as they generally showed good improvements to educator wellbeing. Multi-foci programs included a variety of content; therefore, considering the idea of person-activity fit, it is likely that most participants encountered an activity that improved their wellbeing.

School culture and relationships are important in determining wellbeing

School culture impacts educator wellbeing and is an important factor in determining the success of interventions. In general, the studies in this review focused on individuals and did not consider the influences of school culture. Few studies described the school culture in which the intervention was conducted, especially as many studies recruited teachers from a number of different schools. However, studies that did report on school culture provided valuable insights. For example, where teachers in a school participated in sessions focused on improving group climate, trust and social support increased, and stress and burnout decreased (Sottimano et al., 2018). In contrast, two of the qualitative studies highlight that, when a school’s culture is not taken into account, it can negatively impact an intervention’s success. Teachers were deterred from participating in a wellbeing intervention as teacher stress had been pathologized in a school (Sharrocks, 2014), and teachers had difficulty in deploying the classroom management strategies they learnt as colleagues did not agree with them (Hayes et al., 2020). Beyond these studies, most did not attend to school culture, a large gap in the research given its impact on intervention success (Halliday et al., 2019) and educator wellbeing (Cann et al., 2022; Paterson & Grantham, 2016; Van Maele & Van Houtte, 2015). In order to take school culture into account, wellbeing interventions developed in experimental settings will often need a degree of adaptation (Halliday et al., 2019). Adapting interventions through processes such as co-design can lead to interventions that are more engaging, satisfying, and useful for those taking part (Boyd et al., 2012; Thabrew et al., 2018). If schools can ensure that educators have an input into the intervention design there may be a greater increase in wellbeing.

Positive relationships with colleagues and students are associated with greater educator wellbeing (for example: Seligman, 2011). Seven multi-foci interventions included an explicit focus on an aspect of building positive relationships (such as establishing social support or doing good deeds for others) and showed improved relationships with colleagues and/or students at the end of the intervention. As the multi-foci interventions were the most effective at improving wellbeing, and were the only interventions to include activities focused on improving relationships, this may be one factor explaining their success. Other studies reinforce the importance of relationships; for example, the quality of individuals’ social interactions has been linked to their wellbeing (Park, 2004; Sun et al., 2019). However, as only seven of the 22 interventions included a focus on improving relationships, this important influence on wellbeing is often overlooked. Schools and policy makers should look for interventions that aim to improve relationships in order to maximise the chances of improving educator wellbeing.

Qualitative data and follow up data provide insights into how interventions work

Studies that collected qualitative data, or follow up data, provided some insights into how the interventions worked. Only six of the 22 studies in this review collected qualitative data, yet these data provided insights that quantitative data did not. For example, Hayes et al.’s (2020) mixed methods study found no significant improvement in quantitative measures, but the qualitative data revealed that some teachers created more positive cycles of behavior in their classes that helped them to feel more confident. Other insights are provided by follow up data that show some interventions may take a while to take effect. For example, it was not until five months after Rahm and Heise’s (2019) wellbeing education sessions that teachers’ general self-efficacy and internal locus of control showed an improvement. This may be because teachers need time to regularly practice strategies learnt in an intervention before an impact is seen. Understanding how and when an intervention works is important for identifying situations in which it is likely to successfully improve educator wellbeing. In order to develop this understanding, researchers should make greater use of qualitative methodologies, which are useful for exploring the causal processes of an intervention (Maxwell, 2004). This would also align with the call for greater use of qualitative data within positive psychology to extend our understanding of wellbeing (Lomas et al., 2020).

Limitations

Limitations to our review include the likelihood that our search did not capture all research on interventions to improve educator wellbeing. Our literature search focused specifically on the broad term wellbeing, rather than specific components of wellbeing (such as positive affect) or interventions commonly purported to improve wellbeing (such as mindfulness). Therefore, the search method excluded some articles that focused on a narrower aspect of educator wellbeing (e.g., mindfulness to reduce stress; Flook et al., 2013). Moreover, we did not search the ‘grey’ (i.e., unpublished) literature, such as studies that have not been written up, or were written up but not published (e.g., dissertations). This matters because negative or non-significant results are far less likely to be published (Franco et al., 2014).

Another limitation is the validity of the conclusions that can be drawn, as we were unable to compare the same quantitative measure across different categories of interventions because they rarely used the same measures. For example, across the 18 interventions using quantitative data the Perceived Stress Scale (Cohen et al., 1983) was used in seven studies, and was therefore one of the most used scales. However, it was not used consistently enough to compare all different categories of interventions, being used in four of the six mindfulness intervention studies, two of the six multi-foci intervention studies, and one of the three teacher professional development studies. The validity of comparisons between different types of interventions was further limited as each group of interventions included only a small number of quantitative studies, from one to six. In order to fully evaluate the potential for a particular type of intervention to enhance teacher wellbeing it may be necessary to draw on meta-analyses from a range of contexts outside of education (for example: Davis et al., 2016). These factors mean that caution should be applied when attempting to draw conclusions about the effectiveness of particular interventions.

Further research

Many studies did not clearly identify a wellbeing theory or definition. Further research should provide clarity on the theoretical approaches being used, in particular whether wellbeing is conceptualized as reducing negative phenomena (such as stress), increasing positive phenomena (such as satisfaction with life), or a combination of both. In terms of the methods used, further research could focus on: higher quality reporting of attrition (including baseline data and reasons for attrition), data about the characteristics of people for whom an intervention works, follow up data collection a while after the intervention has been completed, and more use of qualitative data. In terms of intervention design, areas in need of further research are the role of educators’ relationships with colleagues, and the role of school culture in terms of improving educator wellbeing.

Conclusion

When considering the balance of evidence, we identify multi-foci interventions (those that included several content areas within the intervention program) as the most promising approach to improving educator wellbeing. Multi-foci interventions tended to both reduce negative outcomes such as stress, and improve positive outcomes such as satisfaction with life, making them broadly applicable to a range of educator needs. As these interventions also provided a variety of content, they have an increased likelihood of producing a good person-activity fit that leads to a successful outcome for each participant. Part of their success may be the explicit focus on improving educators’ relationships with others, a factor already shown to be associated with wellbeing. However, these multi-foci interventions, along with the majority of interventions in this review, tended to ignore the influence of school culture on educator wellbeing, leading to a significant gap in the research. Far more research is needed into practices that improve educator wellbeing given its vital importance for education systems. Improving educator wellbeing is important to support the sustainability of the profession, improve student outcomes, and value educators.