Key points

  • School-based physical activity interventions may have small beneficial effects on anxiety, resilience, well-being and positive mental health, while the effect of reduced sedentary behaviour on mental health is unclear.

  • Future studies should more clearly report on implementation, describe the activities of the control group and whether the activity is added to or replacing ordinary physical education lessons in order to facilitate the interpretation of results.

  • It is unclear what type of interventions provides the best effect on mental health and by which mechanisms they work.

Introduction

Mental health problems have increased among children and adolescents over a number of years in high-income countries, especially in Northern Europe [1], but the reasons for this remain elusive. According to figures from the 2017 Global Burden of Disease study, anxiety and depressive disorders are among the top four leading causes of the disease burden among young people in Western Europe and top six in Sweden [2]. In general, a larger proportion of girls and young women report mental health problems as compared to boys and young men but they all follow the same patterns of increase over time [3]. The National Board of Health and Welfare in Sweden reported that the number of children and adolescents who have received healthcare for depression or anxiety has increased during the period 2006–2016 [4]. An analysis of factors associated with this apparently increasing trend of mental health problems did not specifically point out changes in family or socioeconomic factors, but instead highlighted the issue of increasing stress in school and worries related to further education and career opportunities in the longer perspective as possible factors behind this development [5]. This raises the question of whether schools can intervene to prevent or delay the onset of mental ill-health and/or promote the development of positive mental health defined as a state of well-being where individuals can cope with the normal stresses of life and successfully participate in everyday life [6]. Schools are an effective setting to reach children at no extra cost to the participants and their families. Several school-based psychological universal prevention programmes have been carried out with modest but significantly positive effects on depression in younger children [7] and on depression and anxiety in older children [8].

One type of intervention which has received attention in recent years is physical activity. Physical activity is defined as any bodily movement that gives rise to increased energy expenditure above resting level [9]. Few children and youth reach recommended levels of physical activity worldwide and specifically in high-income countries [9,10,11], including Sweden [3]. Physical activity can differ according to type of activity (e.g. yoga or football), frequency (times per day or week), duration (minutes or hours) and intensity, measured by age-related maximum heart rate. Previous reviews have demonstrated psychological benefits of physical activity, such as reductions in levels of depression among children and adolescents [12,13,14], in addition to its general health-promoting effects. Moreover, strong and consistent relationships have been found between sedentary time using screens for leisure and depressive symptomatology and psychological distress, respectively [15]. Prevention programmes can be universal, reaching all children, or targeted at groups with elevated risk or with clinical symptoms [16]. Targeted interventions usually result in larger effect sizes [17]. Systematic reviews of universal or targeted interventions not restricted to the school setting have concluded that physical activity has beneficial effects on psychosocial outcomes such as externalising [17] and internalising mental health problems [17], self-concept [17, 18], self-esteem [19], academic achievement [17] and overall mental health [14]. Liu et al. [18] reviewed the effects of physical activity interventions mainly involving children with obesity, disability or very inactive children in the school setting. These authors concluded that physical activity had a positive effect on self-concept and self-worth and that the effect was stronger in school-based settings compared to other settings.

To the best of our knowledge, no systematic review has yet been conducted focusing on school-related interventions increasing physical activity or decreasing sedentary behaviour with the aim of improving mental health or reducing mental ill-health in general populations of school children. Therefore, there is a need to systematise current knowledge regarding the effectiveness of school-based physical activity interventions on mental health, to specify the optimal type of interventions and to clarify mechanisms of action. Such knowledge can be used by policy-makers and schools as a basis for actions to promote positive mental health and prevent mental ill-health in school-aged children. The aims of the systematic review were as follows:

  1. To study the impact of school-related physical activity interventions or interventions to reduce sedentary behaviour on symptoms of mental health in terms of internalising mental health problems and positive mental health in children aged 4–19 years

  2. To investigate possible moderators of these effects such as age, sex, socioeconomic status, family structure, geographical location, focus of the intervention, type of control group, level of implementation and study quality

Methods

Study registration and protocol

This review adheres to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement for reporting systematic reviews and meta-analyses [20]. It was registered with the International Prospective Register of Systematic Reviews (PROSPERO; registration no. CRD42018086757) available from: https://www.crd.york.ac.uk/prospero/.

Search strategy

A literature search was conducted on March 16, 2018, with an updated search on October 24, 2019, using the following databases: MEDLINE (Ovid MEDLINE(R) Epub Ahead of Print, In-Process & Other Non-Indexed Citations, Ovid MEDLINE(R) Daily and Ovid MEDLINE(R)); PsycINFO (Ovid); Web of Science Core Collection; ERIC (ProQuest); and Sociological Abstracts (ProQuest). Search terms were used to describe the population (e.g. school student), intervention (e.g. physical activity), outcomes (e.g. mental health) and study design (e.g. RCT). The search was limited to studies in English and Swedish (see Online resource 1 for the full search strategy). Reference lists from studies meeting inclusion criteria as well as recent reviews in the field were hand-searched.

Inclusion and exclusion criteria

The criteria for inclusion in the review were peer-reviewed original empirical studies published between January 2009 and October 2019. Studies were included if the population consisted of general population samples of children in preschool, primary school and secondary school, aged 4 to 19 years.

All types of school-related or school-initiated interventions were included. This could be single- or multicomponent interventions, conducted in- or outside school, with a component aiming to increase physical activity or decrease sedentary behaviour. Examples were active breaks during the school day, policies, regulations or environmental changes that can promote physical activity or reduce sedentary behaviour. We included only randomised controlled trials (RCT), cluster-RCTs (cRCT), quasi-experimental or longitudinal observational study designs with a control or comparison group. The comparison group had to come from the same base population or should be matched on key factors and could be a non-exposed group, a physical education-(PE)-as-usual group, a waitlist control group or other-intervention-without-physical-activity group.

Studies were included if they reported any one of the following primary or secondary outcomes at both baseline and post-intervention: positive mental health, defined by well-being, health-related quality of life, happiness, self-esteem, self-confidence, self-compassion, self-efficacy, resilience, positive affect and coping; or internalising mental health problems, defined by emotional problems, worries, anxiety, negative affect and depressive symptoms. Only studies using established, valid and reliable rating scales suitable for children and adolescents were included. When more than one relevant outcome was described in the same study, the overall concepts were given priority over subdomains of the concept.

Studies were excluded if they targeted purely clinical populations, if the intervention was not school-related, or the aim was not to increase physical activity or reduce sedentary behaviour. For pragmatic reasons, studies were also excluded if they solely addressed the following aspects of positive mental health and internalising mental health problems (outcomes): self-realisation, working ability, the ability to contribute to society, self-destructive behaviour, problematic eating behaviour and psychosomatic disorders such as recurring pain, sleep problems or stress. Interventions not requiring additional energy expenditure such as mindfulness were also excluded.

Data extraction

Two authors (S.A. and S.J.) independently screened the titles and abstracts of the identified articles. Articles judged as potentially eligible by at least one author were imported into EndNote Reference Manager, version X6 (Thomson Reuters, Philadelphia, PA) and retrieved for full-text review. Both authors independently read the full text of these articles using the established inclusion and exclusion criteria. Disagreements were resolved through discussion with a third author (L.S.E.). From the included studies, two authors (S.A. and S.J.) independently extracted relevant information into a spreadsheet in Excel with the help of a standardised checklist. Extracted items included main author, year of publication, study design, population characteristics and sample size, characteristics of the intervention, type of control group, relevant mental health outcomes (mean scores and standard deviation (SD) or difference in mean scores and standard error (SE) at baseline and at end of intervention), instruments used, time of measurement and main findings. The extracted data were compared and in case of disagreement, a third author (L.S.E.) checked the data. If relevant data were not included in the article, the corresponding author was contacted and asked to supply the data. If no answer was received after 1 month, a reminder was sent. If no answer was received after additional 2 weeks or if authors were unable to provide the requested data, the paper was excluded and the reason documented. Finally, data were transferred into the Comprehensive Meta-Analysis Software (CMA version 3.0, Biostat. Inc., Englewood NJ, USA) for the meta-analysis. A p value of < 0.05 was used to indicate statistical significance.

To capture the implementation of the intervention, the following data were extracted: implementation fidelity, dose, quality, responsiveness, reach and adaptation. If the information provided in the included articles was not sufficient, the literature was searched for additional publications containing this information. Despite these efforts, the only aspect with enough data to allow for comparisons across studies was reach, i.e. the proportion of children reached by the intervention. Implementation reach was categorised on a scale from 1 to 4, where 1 = 80–100% (high); 2 = 60–79% (moderate); 3 = < 60% (low); and 4 = unknown.
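The reach categorisation above can be sketched as a small lookup function. This is only an illustration: the function name `reach_category` is hypothetical, and the handling of fractional values between the stated bands (e.g. 79.5%) is an assumption, since the text gives whole-number ranges.

```python
from typing import Optional


def reach_category(reach_percent: Optional[float]) -> int:
    """Map the proportion of children reached (0-100) to the 1-4 scale.

    1 = high (80-100%), 2 = moderate (60-79%), 3 = low (< 60%), 4 = unknown.
    Assumption: band edges are treated as ">= 80" and ">= 60", so a value
    such as 79.5 falls into the moderate band.
    """
    if reach_percent is None:
        return 4  # reach not reported
    if not 0 <= reach_percent <= 100:
        raise ValueError("reach_percent must be between 0 and 100")
    if reach_percent >= 80:
        return 1
    if reach_percent >= 60:
        return 2
    return 3
```

For example, `reach_category(72)` falls in the moderate band and `reach_category(None)` is coded as unknown.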

Data preparation

Before the meta-analysis could be conducted, a number of decisions were made regarding which scales and instruments to combine in each outcome and the appropriate method to achieve this. If a study reported results of comparisons for multiple intervention groups with one control group, the combined mean and SD of the intervention groups was calculated before calculating the effect sizes [21]. Likewise, if results were reported separately for boys and girls, we calculated a combined mean and SD. If two relevant scales were used simultaneously in a study population to capture different aspects of the same outcome, a merged mean and SD was calculated for the two outcomes, given the scales had the same metrics [21]. Otherwise, one of the scales was chosen in order to avoid multiple dependent effect sizes within studies, which would assign more weight to studies with more outcomes. The selection of relevant outcomes from each study was done by a consensus procedure between three of the authors (S.A., M.H. and L.S.E.) based on theoretical grounds.
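As an illustration of how two subgroups (for example, boys and girls, or two intervention arms) can be merged into one group, the sketch below uses the standard formula for combining sample sizes, means and SDs. Whether this is exactly the formula given in [21] is an assumption, and the function name `combine_groups` is hypothetical.

```python
import math


def combine_groups(n1, m1, sd1, n2, m2, sd2):
    """Combine two subgroups into a single group.

    Returns (n, mean, sd) of the merged sample. The SD formula adds the
    within-group sums of squares to the between-group sum of squares and
    divides by (n - 1), which reproduces the sample SD of the pooled data.
    """
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    within = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    between = (n1 * n2 / n) * (m1 - m2) ** 2
    sd = math.sqrt((within + between) / (n - 1))
    return n, mean, sd
```

More than two subgroups can be handled by applying the function repeatedly, merging the result with the next subgroup.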

Meta-analysis

Owing to the anticipated heterogeneity across studies, we conducted a random effects meta-analysis. From each included study, unadjusted mean scores and SD at baseline and follow-up were entered for the intervention and control groups. For studies that did not report unadjusted mean scores, adjusted mean scores or differences in means and SEs were entered. If a study reported results from multiple follow-ups (e.g. post-intervention, 6 months, 12 months), the first follow-up point (post-treatment) was chosen to compare with the baseline score. None of the studies reported the within-group correlation (i.e. pre- to post-intervention), so we assumed a within-group correlation of 0.7 [21]. Where studies reported the standard error (SE) or confidence interval (CI) instead of the SD, the SD was calculated [21]. The effect size of each included study was calculated as the difference in mean change (post-test minus pre-test) between the intervention and the control group, divided by the pooled standard deviation.
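The calculation steps described above can be sketched as follows. This is a minimal illustration, not the procedure implemented in the CMA software: the function names are hypothetical, the CI conversion assumes a 95% interval around a mean based on the normal distribution, and standardising by the pooled SD of the change scores is one possible reading of "pooled standard deviation".

```python
import math


def sd_from_se(se, n):
    """Recover the SD of a group from the standard error of its mean."""
    return se * math.sqrt(n)


def sd_from_ci(lower, upper, n, z=1.96):
    """Recover the SD from a confidence interval around a group mean.

    Assumption: a 95% interval based on the normal distribution (z = 1.96).
    """
    return math.sqrt(n) * (upper - lower) / (2 * z)


def change_sd(sd_pre, sd_post, r=0.7):
    """SD of the pre-to-post change, given the within-group correlation r."""
    return math.sqrt(sd_pre ** 2 + sd_post ** 2 - 2 * r * sd_pre * sd_post)


def smd_change(intervention, control, r=0.7):
    """Standardised difference in mean change between two groups.

    Each group is a dict with keys n, mean_pre, sd_pre, mean_post, sd_post.
    Assumption: the denominator is the pooled SD of the change scores.
    """
    change_i = intervention["mean_post"] - intervention["mean_pre"]
    change_c = control["mean_post"] - control["mean_pre"]
    sd_i = change_sd(intervention["sd_pre"], intervention["sd_post"], r)
    sd_c = change_sd(control["sd_pre"], control["sd_post"], r)
    n_i, n_c = intervention["n"], control["n"]
    pooled = math.sqrt(
        ((n_i - 1) * sd_i ** 2 + (n_c - 1) * sd_c ** 2) / (n_i + n_c - 2)
    )
    return (change_i - change_c) / pooled
```

With invented numbers, an intervention group improving from 10 to 12 and a control group improving from 10 to 11 (all SDs equal to 2, r = 0.7) gives a standardised difference of about 0.65.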

The pooled standardised mean difference (SMD) was then calculated by combining the study-level effect sizes across studies under the random effects model. As the SMD is subject to bias due to small sample size [21], we report the corrected SMD (Hedges’ g) together with 95% confidence intervals (CIs) and p values. A positive value of Hedges’ g indicates a positive effect of the intervention, while a negative value indicates the opposite. Values of Hedges’ g of 0.2, 0.5 and 0.8 represent a small, medium and large effect size, respectively. The I² statistic is the proportion of the observed variance that is due to true between-study variance, i.e. heterogeneity. Values in the order of 25%, 50% and 75% might be considered low, moderate and high, respectively [22]. The significance of heterogeneity can be inferred from the p value of the Q statistic. A significant value for Q supports the hypothesis that the true effect size differs across studies.
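To make these quantities concrete, the sketch below applies the usual small-sample correction factor for Hedges’ g and pools study effects with a random effects model, returning Q and I². The DerSimonian-Laird estimator of the between-study variance is used here as an assumption; the text does not state which estimator the software applied, and the function names are hypothetical.

```python
import math


def hedges_g(d, n1, n2):
    """Apply the small-sample correction J = 1 - 3 / (4 * df - 1) to an SMD."""
    df = n1 + n2 - 2
    return (1 - 3 / (4 * df - 1)) * d


def random_effects(effects, variances):
    """Pool study effects with a DerSimonian-Laird random effects model."""
    k = len(effects)
    w = [1 / v for v in variances]  # fixed effect (inverse variance) weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Q: weighted sum of squared deviations from the fixed effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_re = [1 / (v + tau2) for v in variances]  # random effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    # I2: share of observed variance attributed to true heterogeneity (%)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "q": q,
        "df": df,
        "tau2": tau2,
        "i2": i2,
    }
```

As a check on the definitions, two equally precise studies with effects 0.0 and 1.0 and variance 0.01 each give Q = 50 on 1 degree of freedom and therefore I² = 98%.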

Quality assessment of studies

Two authors (S.A. and L.S.E.) independently assessed study quality using the Effective Public Health Practice Project (EPHPP) Quality Assessment Tool for Quantitative Studies [23]. The EPHPP has a rating scale of 1–3 (1 = strong, 2 = moderate and 3 = weak). Quality was assessed on selection bias, study design, confounders, blinding, data collection methods and withdrawal and dropouts. Selection bias was scored based on population representativeness and percentage agreeing to take part. The EPHPP tool does not mention cluster RCT studies but we decided also to award the score ‘strong’ for this study design. Confounders were scored based on reported differences with regard to relevant confounders between groups at baseline and on the percentage of reported confounders controlled for. Blinding was scored based on whether the participants were blinded to the research question, and the assessors were blinded to the group allocation. Data collection was scored based on the evidence reported for validity and reliability of the measurement tools used. Finally, withdrawal and dropout were scored based on the percentage of participants completing the study. A global rating was then determined based on the ratings of the above constructs. A ‘strong’ global rating was awarded if no weak ratings were present, a ‘moderate’ global rating if there was only one weak rating and a ‘weak’ global rating if there were two or more weak ratings. Intervention integrity (assessed by whether the intervention consistency was measured; what percentage received the intervention; was there potential for contamination) and appropriate analysis in relation to the research question(s) (unit of analysis; unit of allocation; statistical analysis; intention to treat) were also assessed. However, the scoring of these constructs did not contribute to the overall rating score.
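The global rating rule described above reduces to counting weak component ratings. A minimal sketch, with a hypothetical function name and hypothetical dictionary keys for the six components:

```python
def ephpp_global_rating(component_ratings):
    """Derive the global rating from the six component ratings.

    component_ratings maps a component name to 1 (strong), 2 (moderate)
    or 3 (weak). The rule follows the text: no weak ratings gives
    'strong', exactly one gives 'moderate', two or more give 'weak'.
    """
    for name, rating in component_ratings.items():
        if rating not in (1, 2, 3):
            raise ValueError(f"invalid rating for {name}: {rating}")
    weak_count = sum(1 for rating in component_ratings.values() if rating == 3)
    if weak_count == 0:
        return "strong"
    if weak_count == 1:
        return "moderate"
    return "weak"
```

For example, a study rated weak only on blinding would receive a moderate global rating, while one rated weak on both blinding and selection bias would receive a weak global rating.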

Risk of publication bias

To detect the risk of publication bias across studies, we used funnel plots to examine the asymmetric distribution of studies around their mean effect size in the outcome variables, and Egger’s tests for the association between sample sizes and effect sizes that were included in the meta-analysis for each outcome (i.e. tests for asymmetric funnel plots). To quantify the effect of potential publication bias on meta-analytic summary effects, Duval and Tweedie’s trim and fill method was applied if there was significant risk of publication bias. This procedure estimates the summary effect after adjusting for potential publication bias.
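For readers unfamiliar with Egger’s test, the sketch below shows the classic regression formulation: each study’s standardised effect (effect divided by its standard error) is regressed on its precision (one divided by the standard error), and an intercept far from zero suggests funnel plot asymmetry. This is an illustration under that assumption, not the routine used in the analysis software; the function name is hypothetical, and the p value is omitted because it requires a t distribution with k − 2 degrees of freedom, which the Python standard library does not provide.

```python
import math


def egger_regression(effects, std_errors):
    """Ordinary least squares fit of (effect / SE) on (1 / SE)."""
    k = len(effects)
    if k < 3:
        raise ValueError("at least three studies are needed")
    x = [1 / se for se in std_errors]                     # precision
    y = [g / se for g, se in zip(effects, std_errors)]    # standardised effect
    mean_x = sum(x) / k
    mean_y = sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance and the standard error of the intercept
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (k - 2)
    se_intercept = math.sqrt(s2 * (1 / k + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else math.nan
    return {
        "intercept": intercept,
        "slope": slope,
        "se_intercept": se_intercept,
        "t": t_stat,
        "df": k - 2,
    }
```

When all studies share the same true effect regardless of precision, the fitted intercept is close to zero and the slope estimates that common effect.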

Moderator analysis

Moderator analysis was done narratively. For the narrative analysis, studies were grouped into three categories for each outcome, those with a statistically significant negative (not desired) effect, those with a null effect and those with a statistically significant positive (desired) effect. These groupings were then compared to different levels of the potential moderator, e.g. focus of the intervention.
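The grouping step of the narrative analysis can be sketched as a simple cross-tabulation. Two assumptions are made here for illustration: statistical significance is read from whether the 95% CI of Hedges’ g excludes zero, and effects are coded so that positive values are desired. The function names and dictionary keys are hypothetical.

```python
from collections import Counter


def effect_direction(ci_lower, ci_upper):
    """Classify a study effect from the 95% CI of Hedges' g."""
    if ci_lower > 0:
        return "positive"   # significant, desired direction
    if ci_upper < 0:
        return "negative"   # significant, not desired
    return "null"           # interval includes zero


def cross_tabulate(studies, moderator):
    """Count studies per (moderator level, effect direction) pair."""
    counts = Counter(
        (study[moderator], effect_direction(study["ci_lower"], study["ci_upper"]))
        for study in studies
    )
    return dict(counts)
```

The resulting counts can then be inspected for patterns, for instance whether one age group accumulates in the negative or positive column.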

Results

Study selection

The search resulted in 14,821 hits and after removal of duplicates 10,265 unique titles remained. Duplicates were removed via the EndNote Reference Manager software.

The flowchart is shown in Fig. 1.

Fig. 1 PRISMA flow chart

Thirty-one articles were included in the analysis, representing 30 different intervention studies [24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54], all of which were published in English. There were three studies by Melnyk et al. and one by Ardic et al. [26, 44,45,46] representing the same intervention, namely Creating Opportunities for Personal Empowerment (COPE). Two of the studies by Melnyk et al. [44, 46] were from the same intervention, and therefore the long-term follow-up study [46] was not included in the meta-analysis. Studies read in full text and excluded were documented with reasons for exclusion and are shown in Online resource 2.

Study characteristics

Characteristics of included studies are shown in Table 1. The sample size varied from 19 [45] to 2797 participants. Mean age varied between 8 and 17 years, and the proportion of females varied from 31 to 100%. Socioeconomic status was mixed or low in most studies and unknown in ten studies [25, 28, 33, 35, 39, 40, 43, 50, 53, 54]. Included studies came from twelve countries. Eight studies were conducted in the USA [34, 36, 41, 44,45,46, 48, 53], and six were from Australia [30, 33, 40, 42, 47, 49]. The rest were conducted in Great Britain (n = 5) [24, 27, 32, 37, 38], Ireland (n = 2) [29, 52], Germany (n = 1) [39], China (n = 1) [35], South Korea (n = 1) [54], Canada (n = 1) [28], Denmark (n = 1) [31], Spain (n = 2) [43, 51], Norway (n = 1) [50] and Turkey (n = 2) [25, 26].

Table 1 Characteristics of included studies

The content of the interventions varied widely, from ordinary school physical exercise, sport and recreation, yoga and playground modifications, to more extensive programmes such as COPE. However, no study had the reduction of sedentary behaviour as a primary aim. We categorised the focus of the interventions into four different types: ‘body’ (N = 8) [27, 28, 33, 35, 40, 50, 51, 53], ‘body-education’ (N = 11) [24, 29,30,31,32, 38, 39, 42, 43, 49, 52], ‘body-mind’ (N = 6) [34, 36, 37, 41, 48, 54] or ‘body-education-mind’ (N = 5) [25, 26, 44,45,46,47]. By ‘body’ we mean interventions aimed at improving body strength or physical activity. By ‘education’ we refer to interventions containing learning elements, while ‘mind’ means efforts aimed at strengthening mental processes. See Table 2 for a categorisation of other potential effect moderators. The duration of the interventions varied from 4 weeks to 4 years. The level of implementation reach was low in six studies [24, 27, 28, 30, 38, 48], moderate in two studies [32, 41], high in eighteen studies [25, 31, 33, 34, 36, 37, 39, 40, 42,43,44,45,46,47, 49,50,51, 53] and unknown in five studies [26, 29, 35, 52, 54]. A description of qualitative implementation factors (fidelity, dose delivered or received, responsiveness, level of adaptation) is shown in Online resource 3.

Table 2 Potential effect moderators

The control groups received PE as usual (N = 21) [24, 25, 27, 28, 30, 31, 33, 34, 36,37,38,39,40,41, 47,48,49,50,51,52,53], attention control programmes without physical activity (N = 4) [26, 44,45,46], other physical activity (n = 1) [43] or were a waitlist control (N = 4) [29, 32, 35, 42] while for one study [54], the activity of the control group was not reported. The study designs were RCT (N = 9) [33, 34, 36, 37, 41, 47, 51, 53, 54], cRCT (N = 15) [24, 25, 29,30,31,32, 35, 38, 42,43,44,45,46, 48,49,50], quasi-experimental (N = 5) [26, 28, 39, 40, 52] and observational study (N = 1) [27].

In total, nine outcomes were identified, based on at least three studies each. These were symptoms of depression, anxiety, emotional problems, negative affect, well-being, health-related quality of life, self-esteem and self-worth, positive affect and resilience. In addition, two composite outcomes were defined: internalising mental health problems and positive mental health. Instruments measuring each outcome are presented in Online resource 4 and a definition of these concepts is given in the “Methods” section (inclusion and exclusion criteria).

Risk of bias within studies

Study quality was weak, moderate or strong (Table 2, details in Online resource 5). Four studies had strong quality [33, 35, 47, 50], 16 had moderate quality [24, 26, 30,31,32, 34, 36,37,38,39, 41, 42, 44, 46, 49, 51] and 11 had weak quality [25, 27,28,29, 40, 43, 45, 48, 52,53,54]. The main weaknesses were lack of blinding of participants and assessors, and selection bias.

Meta-analytic results

Results of the eleven meta-analyses are shown in Table 3. The number of studies included in each meta-analysis ranged from 4 for resilience to 26 for positive mental health. Figure 2 shows the forest plot of the composite outcome internalising mental health problems, and Fig. 3 for positive mental health.

Table 3 Meta-analysis

Fig. 2 The effects of physical activity interventions in school on internalising mental health problems. Horizontal lines represent standardised mean difference (Hedges’ g) and 95% CIs. The diamond represents the overall estimated effect. The size of the box represents the weight of each study

Fig. 3 The effects of physical activity interventions in school on positive mental health. Horizontal lines represent standardised mean difference (Hedges’ g) and 95% CIs. The diamond represents the overall estimated effect. The size of the box represents the weight of each study

Of the eleven outcomes measured, the effect of physical activity was significant (beneficial) for four outcomes: anxiety (Hedges’ g = 0.347, 95% CI = 0.072; 0.623, p = 0.013), resilience (Hedges’ g = 0.748, 95% CI = 0.326; 1.170, p = 0.001), well-being (Hedges’ g = 0.877, 95% CI = 0.356; 1.398, p = 0.001) and the composite outcome positive mental health (Hedges’ g = 0.405, 95% CI = 0.208; 0.603, p < 0.001). For the remaining outcomes, the meta-analysis showed no evidence of significant pooled effects of the interventions compared to controls (Table 3). A significant Q statistic and I² between 59% and 98% indicated moderate to very high heterogeneity across results for all outcomes. An exception was the result for positive affect, where heterogeneity was low (I² = 2%).

Moderator analysis

Several potential moderators were analysed narratively for their effect on the outcomes for which more than 10 studies were included: internalising mental health problems, positive mental health, self-esteem, well-being and HRQOL. The outcome of each study was tabulated (not shown) as significant negative effect, no effect or significant positive effect. Interventions were divided into the four types ‘body’, ‘body-education’, ‘body-mind’ and ‘body-education-mind’ (Table 2). The control groups could be divided into three categories: PE as usual, waitlist control, other physical activity or other activity but not physical. Other factors included in this analysis were sex distribution of the target group, age group (≤ 12 years or > 12 years), socioeconomic status (low, mixed, high), level of implementation reach (low, medium, high) and study quality (low, medium, high). Two factors showed a pattern for the outcome internalising mental health problems. One was age, where interventions in younger children showed a significant negative effect or no effect and those in older children showed a significant positive effect or no effect. Negative effects on younger children were found in three studies [36, 39, 49]. One involved ashtanga-informed yoga three times per week for 12 weeks, which led to significantly lower global self-worth and more internalising mental health problems compared to the control group [36]. Another intervention, containing weekly 90-min health-promotion PE lessons consisting of strength and endurance training, led to a significantly higher level of emotional problems compared to the control group [39]. The third study [49] involved specialist-taught physical education classes, which led to a significantly higher level of depression compared to the control group.

A common pattern for the three studies [36, 39, 49] with negative effects on internalising mental health problems was that they all addressed younger children, and all had high implementation reach, moderate quality and a control group that received PE as usual with the same frequency and duration as the intervention group. The second factor was implementation reach: the studies with a high reach showed a significant negative effect or no effect on internalising mental health problems, and those with a low reach showed no effect or a positive effect. No moderator pattern was identified for the outcomes self-esteem, well-being or positive mental health.

Effects of publication bias across studies

Evidence for risk of publication bias was found in the meta-analysis for depressive symptoms (Egger’s p value = 0.024), anxiety (Egger’s p value = 0.045), well-being (Egger’s p value = 0.040), health-related quality of life (Egger’s p value = 0.029) and positive mental health (Egger’s p value = 0.022) but not for the other outcomes (Table 4). Nevertheless, publication bias did not appear to affect the conclusion about the effects of physical activity in school on these five outcomes. For anxiety, well-being, health-related quality of life and positive mental health, the corrected standardised differences in means (Hedges’ g) were unchanged after adjustment by the random effects trim and fill method. For depression, adjustment by the random effects trim and fill method changed the corrected standardised difference in means (Hedges’ g) from − 0.006 to − 0.131, and the association remained non-significant (Hedges’ g adjusted 95% CI = − 0.330; 0.068). It should be noted that the power of statistical tests, especially Egger’s test, was low due to the small number of included studies, as shown by the wide confidence intervals.

Table 4 Analysis of publication bias

Discussion

Main results

To our knowledge, this is the first systematic review of school-based physical activity and sedentary behaviour interventions for children and adolescents in the general population, with self-reported mental health as the outcome. In total, 31 articles, describing 30 interventions were included. None of the included interventions were intended primarily to reduce sedentary behaviour. Out of eleven studied outcomes, we found beneficial effects of the interventions on positive mental health (Hedges’ g = 0.405), anxiety (Hedges’ g = 0.347), well-being (Hedges’ g = 0.877) and resilience (Hedges’ g = 0.748).

Relevance of results

The results of the current review are encouraging, since school-based interventions can be delivered to all children without costs being incurred by families. Such interventions have also been shown to have numerous cardio-metabolic health benefits, especially in high-risk youngsters with obesity or high blood pressure [55, 56]. A recent systematic review, not limited to the school context or to intervention studies, also concluded that physical activity has a beneficial role in mental health in pre-schoolers, school children and adolescents [14], but with smaller effect sizes than in the present review. However, that review had some weaknesses, as it identified only two of the intervention studies included in our review. Furthermore, the authors included multiple outcomes from the same study in the meta-analysis, which assigned too much weight to those studies and decreased heterogeneity considerably [21]. Therefore, their estimates should be interpreted with caution.

If the increase in mental health problems among children and youth is partly caused by increased school stress, as suggested in a newly published report from Sweden [5], this increases the pressure on schools to implement evidence-based initiatives to halt or reverse this negative trend. The results from this review are therefore very encouraging, because they indicate that schools can counteract this development by implementing initiatives to increase physical activity during the school day. However, the present results should be interpreted with caution as the number of studies for some outcomes was relatively small, and the meta-analyses showed high heterogeneity. The fact that some studies, using relatively intensive interventions, reported negative effects in younger children highlights the importance of monitoring mental health when introducing school-related physical activity interventions. This variation between results could be explained by at least three factors. First, the interventions themselves varied considerably in terms of content, duration, frequency and intensity. Some studies also combined physical and other activities, making it difficult to disentangle the effects of different programme components. Second, reporting of the implementation of the interventions was very heterogeneous or absent. Taken together, we noted a large variation in fidelity, and the only implementation factor we could compare between studies was reach. This means that differential effects of the interventions could also depend on how well they were implemented. Implementation of physical activity and other behavioural interventions in schools is a well-known challenge [57] and deserves greater attention and standardisation in future studies. Third, the control groups were mostly not inactive but frequently performed other activities, a design problem also shown to reduce the magnitude of effect sizes in studies of exercise for depression in adult populations [58, 59].

Therefore, as the narrative moderator analysis showed, it is not possible to recommend one specific intervention over the other. On balance, however, the results support previous findings that physical activity interventions implemented in diverse contexts have benefits for school-aged children and adolescents [14, 17].

For internalising mental health problems, age appeared to moderate programme effects, with older children over the age of 12 years experiencing favourable or no effects and younger children experiencing negative or no effects. Considering that the average age of onset for anxiety disorders is 11 years [60] and 11–13 years for depressive disorders [61], prevention effectiveness may vary depending on not only the type of intervention, but also the age or developmental stage of the child [7]. Except for age, we found no systematic pattern in effectiveness regarding the type of intervention, sex of the participants, socioeconomic status, implementation reach, study quality or the type of control group on internalising mental health problems. More studies are needed to investigate the influence of these variables on programme outcomes. For positive mental health, no systematic pattern was found regarding potential effect moderators.

Previous reviews examining the effect of different physical activity interventions on anxiety and depression in children and youth have shown varying results [7, 13]. In a systematic review from 2006, Larun et al. [13] examined the effect of exercise in prevention and treatment of anxiety and depression among children and young people and reported a statistically significant difference for depression but not for anxiety. Only one of the studies included in that review, by Bonhauser et al. [62] from 2005, would have qualified for the present review, but it was excluded due to year of publication (i.e. before 2009). In this cRCT targeting 15-year-old school children with an intervention involving extra physical exercise compared to PE as usual, a significant beneficial effect was reported on anxiety, but not on depression. The authors concluded that a school-based programme to improve physical activity in adolescents of low socioeconomic status achieved significant benefits in terms of physical fitness and mental health. This study supports our findings that physical activity in the school setting can reduce anxiety but not depression in adolescents.

Few previous reviews have investigated the effect of physical activity interventions on positive mental health, including resilience, in general populations of school children. Positive mental health is a multidimensional construct [63]. Factors that have been shown to be positively correlated with positive mental health include male sex, younger age, higher education, higher income and social relations [64]. Barry et al. [64] note that the concept is connected to socio-cultural norms. Resilience refers to a dynamic process encompassing positive adaptation within the context of significant adversity [65]. The beneficial effects on resilience in our review suggest that physical activity interventions in the school context may be important to help children cope with adversities. However, the result is based on only four studies and more research is therefore needed in this area. A review by Khalsa et al. [66] investigating the effect of yoga interventions in the school context on mental, emotional, physical and behavioural health characteristics concluded that yoga is a potentially effective strategy to improve child health in the school setting. The review included 47 yoga studies with different study designs. As in our review, the included studies were heterogeneous in terms of duration and frequency.

In order to deliver successful interventions, it is important to know by which mechanisms physical activity leads to changes in mental health. Based on the literature, Lubans et al. [67] developed a conceptual model in which physical activity affects mental health through three mechanisms: neurobiological, psychosocial and behavioural (e.g. by improving sleep). In our review, we only analysed self-reported psychosocial outcomes as indicators of mental health, which are also the most commonly reported. However, it is possible that some interventions may work through the other two mechanisms to improve mental health. As emphasised by Lubans et al. [67], improving our understanding of the mechanisms of how physical activity leads to better mental health may assist in the development of more specific and effective interventions.

Strengths and limitations of the review

The review has several strengths, such as the comprehensive literature search in nine databases, pre-registration of the study protocol in the PROSPERO database, and the fact that the search, data extraction and quality assessment were done by two researchers independently. The review also has some limitations. Although we searched nine databases, we might have missed some relevant articles in languages other than English. The interventions varied considerably, as did the control groups, resulting in high heterogeneity in effect sizes. The selection of instruments for each outcome and prioritisation among instruments involved some degree of arbitrariness, which led to slightly different pooled effect sizes depending on which instruments were included. Other investigators do not always describe which instruments are included under each outcome. We decided to include this information in Online resource 4 for transparency reasons. More research is required to reach consensus in the research community regarding how to combine instruments under different outcomes for meta-analytic purposes. The included studies were also of mixed methodological quality, and several of them were underpowered. Moreover, for pragmatic reasons we decided to exclude studies which solely included broader aspects of positive mental health and internalising mental health problems. We could thus have overlooked important findings. New studies are under way [68, 69, 70, 71] and results can be expected within a few years. These may show whether the current findings can be confirmed and, if so, what type of interventions give the best effects.

Conclusions

The results of this systematic review indicate that school-related interventions aiming to promote physical activity can reduce anxiety, increase resilience, increase well-being and improve positive mental health of children and young people. Considering the positive effects of physical activity on health in general, these findings may reinforce school-based initiatives to increase physical activity. Future studies should more clearly describe the activities of the control group and whether the activity is added to or replaces ordinary physical education lessons in order to aid the interpretation of results. Our findings also highlight the need for more high-quality universal physical activity interventions in the school context and standardised reporting of implementation. To further understand how such interventions work and can be used in practice, there is a need to focus on mechanisms of action and on evaluation of the implementation process.