Background

There is increasing focus on identifying effective strategies to improve physical activity (PA) among children and adolescents [1]. Most young people in Europe do not achieve the 60 minutes (min) of daily moderate-to-vigorous intensity physical activity (MVPA) [2] recommended by the World Health Organization (WHO) [3], in spite of the well-known health benefits [4]. Both PA-related health benefits, which can persist into adult life, and a variety of health problems in adulthood, including overweight or obesity [5], appear to have their origin in early life [5, 6]. Therefore, the low compliance with PA recommendations is alarming. Since PA habits are established early in life, promoting PA from an early age is required [7,8,9,10,11]. Active school travel (AST) is a source of habitual PA for students and therefore highly recommended [12]. AST is positively related to total daily PA [13,14,15,16], school day PA [14, 17] as well as PA before and after school [14, 15, 17]. Cycling is an important option for AST. In England, those who cycled for AST accumulated on average 1.4 h of cycling per week, which contributed 20% of recommended weekly PA [16]. As a result, a higher percentage of cyclists (36%) aged 5 to 15 years met the weekly WHO recommendation of PA compared to walkers (25%) and those who did not walk or cycle to/from school (22%) [16]. In particular, adolescent girls, who have lower levels of PA [18] and perceive more barriers to PA (e.g., lack of energy) [19], may benefit more from participating in AST than adolescent boys [14]. Previous research showed that adolescent girls from New Zealand who participated in AST were more likely to meet the PA recommendations compared to passive travelers [14]. This was not the case for boys [14].

In addition, AST has been positively associated with body composition [15, 20], positive emotions [21], and cognitive performance (only in adolescent girls) [22]. Compared to walking, cycling is generally of higher intensity [23]. Thereby, AST by bicycle contributes to cardiovascular fitness [23] and may reduce the future risk of cardiovascular diseases. In addition, AST has been positively associated with environmental factors, such as reduction of traffic [24, 25] which contributes to a minimization of air pollution [23, 25] and enhancement of road safety [24]. Furthermore, adopting a daily AST routine including journeys to and from school [26] as early as possible may lead to a potentially lifelong habit of active transport (AT) [16] including journeys to any other destination. Moreover, a study in Ireland showed that AST by bicycle increases the mobility of adolescents living further away from school [27]. Bicycles are also the fastest means of transportation for distances less than 5 km in cities, especially when car traffic is congested [28].

Studies in Germany showed that most children and adolescents aged up to 17 years own a bicycle (57 to 98%) [29]. However, only 8% [30] to 22.2% [31] cycle to/from school daily or usually. Additionally, more boys (23.8%) than girls (20.6%) cycle to school in Germany [31]. In the Czech Republic, the percentages of boys (5.7, 3.2, 2.2%) and girls (2.3, 0.5, 2%) aged 11 to 15 years who cycled to/from school in 2006, 2010 and 2014, respectively, decreased over time [32]. According to these data from Germany and the Czech Republic, cycling is a less common form of AST, cycling habits differ by gender in favor of boys, and there might be a declining trend in some European countries.

Following this, researchers have shown increasing interest in developing AST interventions in recent years [33]. A previous systematic review focused on the effects of AST interventions aiming to promote walking [24]. No previous systematic review dealt exclusively with the effectiveness of intervention strategies targeting cycling as a means of AST, which is required for adequate policy decisions in this field [1]. Thus, the aims of this systematic review were to summarize the evidence on strategies and effects of (randomized) controlled interventions that promote cycling to school as a mode of AST among primary and/or secondary school students.

Methods

The methodological procedure of this systematic review is described in detail elsewhere [34]. For drafting this systematic review, the checklist “Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement” [35] (see Additional file 1) was utilized.

Inclusion criteria

In this systematic review, (parallel-group or cluster-randomized) controlled trials (RCTs; CTs) were considered that described a school-based bicycle intervention fostering the use of bicycles in AST. Only samples that represented primary and/or secondary school students were included. The control group (CG) could be either active in terms of getting an alternative intervention program without strategies promoting AST or not receiving any kind of intervention. Only studies published in English and, due to current relevance, between 2000 and 2019 were included.

Search strategy

A comprehensive search formula with a combination of keywords in three different categories according to “PICo” (population, intervention, context) [36] was generated in collaboration with two specialists (see Additional file 2). The first literature search based on title and abstract was conducted on November 28th, 2018 and was updated on November 25th, 2019 in eight electronic databases (ERIC: EBSCO, PsycINFO: EBSCO, PSYNDEX: EBSCO, PubMed: NCBI, Scopus: ELSEVIER, SPORTDiscus: EBSCO, SURF: BISp, and Web of Science: Clarivate Analytics).

Study selection

Records were imported into and further managed with EndNote X7.4. The identified articles were screened independently by DS and TA/AM based on title, abstract, and full text in terms of their relevance and depicted in a flow chart (see Fig. 1). Any disagreements between the reviewers during these three steps of the selection process were resolved by discussion.

Fig. 1 Procedure of study selection

Data extraction

Data regarding general study details, characteristics of participants, theoretical background, intervention description, outcome variables, measuring instruments, statistical analysis, and results were extracted using a previously piloted data extraction spreadsheet. Due to relevance, only intervention components that directly targeted AST were extracted. The authors of the included studies were contacted via e-mail with a maximum of two reminders if relevant data was missing or a clarification of descriptions was required. Therefore, the data extractors (DS and TA/AM) were not blinded to authors and journals while extracting study information. Two evaluators (DS and TA) independently coded the behavior change techniques (BCTs) applied to intervention components using the “BCT Taxonomy v1” [37]. Intervention components which could not be assigned to the 93 BCTs originally clustered in 16 main groups were classified into two newly added strategies/groups (i.e., knowledge transfer and parental involvement) by the authors according to a previously-used procedure [38]. Therefore, strategies were classified within a taxonomy of 95 BCTs clustered in 18 main groups. Any discrepancies were resolved by discussion.

Quality assessment

For the assessment of the methodological quality of included studies, the quality assessment tool for quantitative studies “Effective Public Health Practice Project” (EPHPP) [39] was used. Where an explicit reference to a joint, more detailed article (e.g., study protocol) was mentioned, this article was additionally used to complete the assessment of the study’s methodological quality. Otherwise, articles in which the same intervention was analyzed with regard to different outcomes were assessed independently. A critical judgment was made for all items within the following eight quality sections/components (see Additional file 3): (A) Selection bias (two items), (B) Study design (four items), (C) Confounders (three items), (D) Blinding (two items), (E) Data collection methods (depending on the number of collected variables), (F) Withdrawals/Drop-outs (two items), (G) Intervention integrity (three items), and (H) Analyses (four items). Each item within the eight sections was assessed as strong, moderate or weak. The methodological quality of each item was rated independently by DS and TA/AM. Discrepancies between the evaluators despite discussions were resolved by consulting another independent evaluator (YD).

The following modifications to the EPHPP dictionary were made: Regarding the section (C) “confounders”, eight potentially relevant variables were chosen (i.e., age, sex/gender, previous AST experiences at baseline level, weight status, migration background, bicycle ownership, socio-economic status, distance from home to school) based on the “Model of Childrenʼs Active Travel” (M-CAT) [40]. When studies included five to eight of these potentially relevant variables as confounders, this item was rated as strong. It was rated as moderate when only three to four of the potentially relevant variables were included and it was rated as weak when fewer than two of the potentially relevant variables were included. In a further item of this section C, the quality of confounders was rated. If relevant to the study, the consideration of “sex/gender” [32] and “migration background” [31] led to a strong rating, whereas including “age” [32, 41] and “previous AST experience at baseline level” [42, 43] in the analysis was rated as moderate (weak: the rest). In the section (G) “intervention integrity”, the (unclear) presence of any kind of co-intervention or contamination led to a weak rating (strong: no co-intervention/contamination). Within the section (H) “analyses”, the item “unit of allocation” was rated as strong for “school”, as moderate for “class”, and as weak for “individual” based on the randomization level. The “unit of analysis” was appropriate and determined as strong when analyses were adjusted according to the “unit of allocation”. This means, for example, that analyses of a study cluster-randomized at school level had to be adjusted for schools. Otherwise, the “unit of analysis” was not appropriate and rated as weak.
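The item-level rules for sections (C) and (H) can be written down as a small lookup, shown here as a minimal sketch in Python. The function and constant names are hypothetical and not part of the EPHPP tool. The text leaves a count of exactly two confounders unassigned; treating it as weak is an assumption of this sketch.

```python
def rate_confounder_count(n_included: int) -> str:
    """Rate the confounder-count item from how many of the eight
    M-CAT-based variables a study included as confounders."""
    if not 0 <= n_included <= 8:
        raise ValueError("expected a count between 0 and 8")
    if n_included >= 5:
        return "strong"      # five to eight variables
    if n_included >= 3:
        return "moderate"    # three to four variables
    # The text assigns weak to fewer than two variables and does not
    # mention exactly two; treating two as weak is an assumption here.
    return "weak"


# Rating of the "unit of allocation" item by randomization level.
UNIT_OF_ALLOCATION_RATING = {
    "school": "strong",
    "class": "moderate",
    "individual": "weak",
}


def rate_unit_of_analysis(adjusted_for_allocation_unit: bool) -> str:
    """Strong only if the analysis was adjusted for the unit of allocation."""
    return "strong" if adjusted_for_allocation_unit else "weak"
```

For example, a school-randomized study that did not adjust its analyses for schools would receive strong for the unit of allocation and weak for the unit of analysis.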

After rating the individual items, each of the eight EPHPP quality components was assessed as a) strong (no weak ratings and more strong than moderate ratings), b) moderate (one weak rating), or c) weak (at least two weak ratings). Finally, a global quality rating based on the eight EPHPP components in each study was performed according to a common procedure [38]. When five or more components were assessed as strong and no components were assessed as weak, the global quality of a study was rated as strong [38]. The global quality of a study was rated as moderate when at least four components were assessed as strong and no more than one component was assessed as weak [38]. The global quality of a study was rated as weak when two or more components were assessed as weak [38].
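The two aggregation steps described above (items to component, components to global rating) can be sketched as follows. This is an illustrative reading of the stated thresholds, not the authors' tooling; all names are hypothetical. Combinations that the text does not cover are returned as "unspecified" rather than guessed.

```python
from collections import Counter

UNSPECIFIED = "unspecified"  # combination not covered by the stated rules


def rate_component(item_ratings):
    """Aggregate item ratings ('strong'/'moderate'/'weak') into one component rating."""
    counts = Counter(item_ratings)
    if counts["weak"] >= 2:
        return "weak"
    if counts["weak"] == 1:
        return "moderate"
    if counts["strong"] > counts["moderate"]:
        return "strong"  # no weak items and more strong than moderate
    return UNSPECIFIED   # no weak items, but not more strong than moderate


def rate_global(component_ratings):
    """Aggregate the eight component ratings into the global study rating."""
    counts = Counter(component_ratings)
    if counts["weak"] >= 2:
        return "weak"
    if counts["strong"] >= 5 and counts["weak"] == 0:
        return "strong"
    if counts["strong"] >= 4 and counts["weak"] <= 1:
        return "moderate"
    return UNSPECIFIED


# Hypothetical study: six strong components and two weak components.
example = ["strong"] * 6 + ["weak"] * 2
print(rate_global(example))  # weak
```

Under these rules a study with two weak components is rated weak globally regardless of how many components are strong.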

Data synthesis

All findings were summarized narratively by reporting effect sizes (ES), such as Cohen’s d, odds ratio, and partial eta-squared (\( \eta_{\mathrm{p}}^2 \)), and effect estimates, such as confidence intervals (CI) or p-values (significant: p ≤ 0.05). The various outcome variables were grouped and studies were marked according to their effectiveness in terms of changing the related outcome(s). Results were sorted by age group (children up to the age of 12 years; adolescents from 13 years of age) [2]. If data allowed, gender effects were reported. Contrary to our previous intention described in the published protocol [34], it was not possible to describe cultural dynamics based on regional differences. Given that only one outcome, i.e., body-mass-index (BMI), was considered in more than one intervention and measured/classified identically [44, 45], the heterogeneity of variables across reviewed studies did not permit meta-analyses of intervention effects.
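For readers less familiar with the effect sizes named above, the following sketch gives their standard textbook definitions in Python. The function names and all input numbers are illustrative assumptions. The reviewed studies may have computed these quantities differently, for example from adjusted models.

```python
import math


def cohens_d(mean_ig, mean_cg, sd_ig, sd_cg, n_ig, n_cg):
    """Standardized mean difference between IG and CG using the pooled SD."""
    pooled_var = ((n_ig - 1) * sd_ig**2 + (n_cg - 1) * sd_cg**2) / (n_ig + n_cg - 2)
    return (mean_ig - mean_cg) / math.sqrt(pooled_var)


def odds_ratio(ig_events, ig_non_events, cg_events, cg_non_events):
    """Odds of the event in the IG divided by the odds in the CG."""
    return (ig_events / ig_non_events) / (cg_events / cg_non_events)


def partial_eta_squared(ss_effect, ss_error):
    """Share of variance attributable to an effect, excluding other effects."""
    return ss_effect / (ss_effect + ss_error)


# Hypothetical inputs, chosen only to illustrate the calculations.
print(cohens_d(12.0, 10.0, 4.0, 4.0, 30, 30))   # 0.5
print(odds_ratio(20, 80, 10, 90))               # 2.25
print(partial_eta_squared(1.0, 99.0))           # 0.01
```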

Results

In total, 1711 publications were found in the first search and another 208 publications in the updated search. After removal of 776 duplicates, 1143 articles were screened. Nine relevant studies evaluating seven unique interventions were included in this review [44,45,46,47,48,49,50,51,52].
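The record counts reported above are internally consistent, as this short check shows. It is a sketch using only the numbers given in the text; the variable names are hypothetical.

```python
first_search = 1711     # records from the first search
updated_search = 208    # additional records from the updated search
duplicates = 776        # duplicates removed

identified = first_search + updated_search   # total records identified
screened = identified - duplicates           # records screened after de-duplication

print(identified)  # 1919
print(screened)    # 1143
```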

Intervention characteristics and BCTs

The characteristics of the seven included interventions, evaluated in nine studies between 2012 and 2018, were heterogeneous (see Table 1). Interventions were carried out either in Europe (n = 5) or the USA (n = 2). Four interventions were designed as RCT. Only in three interventions, a sample size calculation was performed. Six out of seven interventions reported a sample size at baseline and indicated a range from 53 to 2401 participants. The number of recruited schools ranged from 1 to 25 (1 to 5 schools: n = 4; 14 schools: n = 1; 25 schools: n = 1). Primary schools and two grade levels were the most frequently chosen settings. The age of participants was up to 17 years (children: n = 4, children and adolescents: n = 3). Only five interventions reported the gender ratio of girls and boys. Interventions lasted between 4 weeks and 1 year and were classified into short-term (≤3 months: n = 3) or moderate-term (4 to 12 months: n = 2). Only one intervention included two different intervention arms (with/without parental involvement). Five interventions clearly stated that they did not deliver any kind of intervention to the CG. Three of these interventions, however, described either a provision of information (n = 1) or some kind of contamination in terms of minor interventions or similar conditions between the intervention group (IG) and CG (n = 2). Two interventions did not clearly report the conditions of the CG but mentioned contaminations, such as minor interventions, or delivery of informational letters. Three interventions reported that components were based on established theoretical frameworks, including the “Conceptual framework of AT in children”, the “Active Living by Design: 5P model” and the “Social Cognitive Theory”. One intervention was inspired by several correlates of cycling to school. In three interventions, no theoretical model was mentioned as a basis. 
The interventions included different components, such as a cycle training course or a bicycle train (i.e., adult-guided group of cycling children). Six interventions used a multicomponent approach with a combination of environmental, informational and behavioral (n = 2), environmental and informational (n = 1) or informational and behavioral (n = 3) components. One intervention was based on a behavioral approach only. Each intervention component was at least linked to one BCT. In total, 19 different applied BCTs were identified across the seven interventions.

Table 1 Intervention characteristics and strategies sorted by age group

These 19 different applied BCTs were clustered in a total of 11 out of 18 main groups (see Table 2), which varied in their popularity: (1) Shaping knowledge (n = 6), (2) Comparison of behavior (n = 5), (3) Repetition and substitution (n = 5), (4) Antecedents (n = 4), (5) Social support (n = 3), (6) Parental involvement (n = 3), (7) Natural consequences (n = 2), (8) Knowledge transfer (n = 2), (9) Feedback and monitoring (n = 1), (10) Reward and threat (n = 1), (11) Goals and planning (n = 1). The seven interventions used on average 4.7 main groups.
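The reported average follows directly from the group frequencies listed above, where each n is the number of interventions using that main group. A minimal sketch with hypothetical variable names:

```python
# Number of interventions (out of seven) using each main group of BCTs.
interventions_per_group = {
    "Shaping knowledge": 6,
    "Comparison of behavior": 5,
    "Repetition and substitution": 5,
    "Antecedents": 4,
    "Social support": 3,
    "Parental involvement": 3,
    "Natural consequences": 2,
    "Knowledge transfer": 2,
    "Feedback and monitoring": 1,
    "Reward and threat": 1,
    "Goals and planning": 1,
}
n_interventions = 7

total_group_uses = sum(interventions_per_group.values())   # 33
mean_groups_per_intervention = total_group_uses / n_interventions

print(len(interventions_per_group))            # 11
print(round(mean_groups_per_intervention, 1))  # 4.7
```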

Table 2 Applied behavior change techniques in reviewed interventions sorted by age group

Study quality

All included studies were assessed as weak in the global rating but none of the nine studies had a weak rating in all eight sections (see Table 3).

Table 3 Sectional and global quality rating of reviewed studies sorted by age group

Figure 2 gives an overview of the study quality for individual sections across all reviewed studies. Due to the inclusion of RCTs and CTs only, the section with the strongest methodological quality was “study design” rated as strong in all nine studies. Additional strong ratings were found in the sections “confounders”, “data collection methods”, “withdrawals/drop-outs”, and “analyses”. In the section “confounders”, only one study [48] did not report adjustments. The other eight studies [44,45,46,47, 49,50,51,52] reported adjustments for at least two up to eight out of ten different covariates (i.e., age, distance from home to school, sex/gender, AST, BMI, race, bike score, neighbourhood disorder, attendance, accelerometer wear time). However, group differences at baseline were only absent in two studies [44, 51]. In the section “data collection methods”, three studies were rated as weak [45, 46, 51], three as moderate [47, 48, 52], and three as strong [44, 49, 50]. Referring to the section “withdrawals/drop-outs”, six studies declared drop-outs [44, 46, 47, 50,51,52] and five studies had low retention rates [45, 48, 49, 51, 52]. The sections “selection bias” and “blinding” were never rated as strong. Apart from two studies [46, 48], seven studies did not report the representativeness of the sample. Six studies [45,46,47, 50,51,52] reached a high recruitment rate. All but one study [44] either did not report blinding at all or reported unblinded conditions. In the section “analyses” ratings were either strong [46, 49, 50, 52] or weak [44, 45, 47, 48, 51] with strengths in the unit of allocation [45,46,47, 49,50,51,52] as well as statistical methods (including ES) [44,45,46, 48,49,50, 52] and deficits in the unit of analyses [45, 47, 48, 51] as well as usage of intention-to-treat analysis [44, 45, 47, 48, 51].

Fig. 2 Quality rating of sections across reviewed studies

The weakest section was “intervention integrity” rated as weak in all nine studies. Only one study [45] indicated the percentage of intervention delivery and measurement of consistency. Moreover, six out of nine studies [44,45,46, 48,49,50] described a potential contamination in the CG.

Intervention effects

Altogether, six studies reported proportionally more non-significant than significant intervention effects [44,45,46,47,48, 51]. One study found more adverse intervention effects in boys with larger improvements in the CG [52]. Only two studies – describing the same intervention in children: a “bicycle train” to actively travel to school – showed significant beneficial intervention effects in all their seven outcomes [49, 50] (see Tables 4 and 5).

Table 4 Outcome variables, measuring instruments, covariates and intervention effects in reviewed studies sorted by age group
Table 5 Overview of outcome variables and intervention effects across reviewed studies sorted by age group

In total, 35 different outcome variables were reported across the nine included studies. These 35 outcome variables were clustered in seven main outcome groups: (1) AST (n = 9), (2) Psychosocial factors targeting both parents or students (n = 9), (3) Physical fitness divided into cardiorespiratory/muscular fitness and speed agility (n = 8), (4) PA levels (n = 4), (5) Weight status (n = 3), (6) AT (n = 1), and (7) Cycling skills (n = 1).

A significant intervention effect was found on 13 different outcomes analyzed across five studies [45, 47, 49, 50, 52], whereas seven studies reported non-significant effects on 25 outcomes in total [44,45,46,47,48, 51, 52]. Within the outcome group “AST”, one study found a significant beneficial intervention effect on bicycle trips to school by boys [52] and another study on percentage of daily cycling trips to school (β = 44.9 [CI95: 26.8, 63.0]) [50]. One study, investigating psychosocial factors only, showed significant beneficial intervention effects on parental (β = 0.46 [CI95: 0.05, 0.86]) and child self-efficacy (β = 0.84 [CI95: 0.37, 1.31]) as well as parental outcome expectations (β = 0.47 [CI95: 0.17, 0.76]) [49]. Within the outcome group “physical fitness”, one study found a significant adverse intervention effect on aerobic capacity with an unfavorable development in the IG (β = −1.45 [CI95: −1.92, −1.00]) [45]. Another study found significantly higher values in the CG for boys only on VO2max (group main effect: \( \eta_{\mathrm{p}}^2 = 0.01 \)), 20-m shuttle run test (group main effect: \( \eta_{\mathrm{p}}^2 = 0.04 \)), and handgrip strength [52]. Within the outcome group “PA levels”, one study reported positive intervention effects on total MVPA (β = 21.6 [CI95: 8.7, 34.6]), MVPA from cycling (β = 23.0 [CI95: 10.7, 35.4]) and MVPA before/after school (β = 12.8 [CI95: 8.5, 17.2]) [50]. One study found a significant intervention effect on total basic cycling skills in both intervention arms (with/without parental involvement), which were taken together in this analysis [47].
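A quick way to read the regression coefficients above is to check whether each 95% CI excludes zero, which corresponds to significance at the 0.05 level for a coefficient whose null value is zero. The sketch below applies this check to the values reported in the text. The helper name is hypothetical, and whether a positive coefficient is beneficial depends on how each outcome was coded in the original study.

```python
# (outcome, beta, CI95 lower, CI95 upper) as reported in the text
REPORTED_EFFECTS = [
    ("daily cycling trips to school (%)", 44.9, 26.8, 63.0),
    ("parental self-efficacy", 0.46, 0.05, 0.86),
    ("child self-efficacy", 0.84, 0.37, 1.31),
    ("parental outcome expectations", 0.47, 0.17, 0.76),
    ("aerobic capacity", -1.45, -1.92, -1.00),
    ("total MVPA", 21.6, 8.7, 34.6),
    ("MVPA from cycling", 23.0, 10.7, 35.4),
    ("MVPA before/after school", 12.8, 8.5, 17.2),
]


def ci_excludes_zero(lower, upper):
    """True if the interval lies entirely above or entirely below zero."""
    return lower > 0 or upper < 0


for outcome, beta, lower, upper in REPORTED_EFFECTS:
    sign = "positive" if beta > 0 else "negative"
    print(f"{outcome}: {sign}, CI excludes zero: {ci_excludes_zero(lower, upper)}")
```

All eight intervals exclude zero, and only the coefficient for aerobic capacity is negative, matching the adverse effect described in the text.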

The sustainability of intervention effects was examined in only two studies, at 5-month [47] or 6-month follow-up [51]. After participating in a 4-week cycle training course, a significant intervention effect from pre to post to follow-up for both intervention arms (with/without parental involvement) was found on total basic cycling skills but neither on cycling to school (in min) nor on parental attitudes towards cycling [47]. Significant effects at 6-month follow-up were found on “mode of trips to school” in walking only and “frequency of active trips to school” (walking/cycling) even though non-significant intervention effects from pre to post after 6 months were shown on these variables [51].

Discussion

The aims of this systematic review were to provide an overview of existing school-based interventions focusing on the promotion of AST by bicycle in children and adolescents and their evidence on strategies and effects. Following our inclusion criterion for study designs, we only found a small number of (R)CTs in our literature search. This is consistent with the reported gap of strong study designs in this research field [53]. The included trials were predominantly not conducted in cycle-centric countries within Europe (exception: Belgium (n = 1) and Denmark (n = 2)) [54], showed a large variety of components and outcome measures, and were of weak quality. Three of the included trials did not differentiate between walking and cycling as two different types of AST in their analyses [46, 48, 51, 52]. Therefore, a final conclusion on cycling to school could not be drawn from these studies. Additionally, the reported interventions were designed for children only or both children and adolescents, implemented in primary schools. The lack of interventions for adolescents, implemented in secondary schools, is also in line with the current state of research [55]. In conclusion, the findings of our systematic review need to be interpreted with caution.

Promising intervention strategies

Overall, only one intervention using a single-component approach showed consistent positive effects on all measured outcome variables [49, 50] and provides first insights into an effective intervention strategy. For approximately 2 months, a voluntary and adult-guided bicycle train to/from school with pick up/drop off stops was provided for children on schooldays [49, 50] including the following main groups of BCTs: shaping knowledge, comparison of behavior, repetition and substitution as well as antecedents. The counterpart of a bicycle train, the “walking school bus” (WSB), is based on a similar approach for walking. In a previous review, the WSB was found to increase walking to school as well as general PA levels in children [24]. However, the bicycle train intervention effect on MVPA from cycling (23.0 min/day) was higher than the intervention effect on total MVPA (21.6 min/day) [50]. These accelerometer data might suggest a compensation in total MVPA due to the additional MVPA from AST by bicycle.

The only study that performed a sex/gender analysis reported increased bicycle trips to school in boys but not in girls [52]. As boys had higher levels of health-related fitness than girls despite comparably low rates of cycling to school at baseline [52], poor fitness could be a barrier to the uptake of AST by bicycle in girls. More research on the existence and explanation of gender differences in intervention effects is warranted in future studies to draw final conclusions.

Strengths/limitations

The major strengths of this systematic review are the specific focus on school-based interventions that promote cycling to school and including only (R)CTs, which provide a higher evidence level than other study designs [56]. Two researchers independently conducted the process of selecting studies, extracting data, evaluating methodological quality and BCTs. Furthermore, authors of included studies were contacted in case of missing data to avoid an underestimation of the methodological quality. Finally, findings were interpreted separately from the methodological quality rating in order to provide transparency.

A limitation is that the defined criterion of including only (R)CTs could have led to a selection bias [53]. The same applies to the restriction of studies published in English. At study level, there are several reasons for a lack of effectiveness. One reason could be the complete absence of intervention periods longer than 13 months [57]. According to the “Transtheoretical Model of Behavior Change”, “individuals may need to go through a number of stages associated with the formulation and implementation of attitudes and beliefs before actually undertaking changes, and this whole process takes some time” [58] (p. 68). This is why a lack of immediate success in short- or moderate-term interventions might not necessarily indicate a failure of the intervention [59]. The adoption and integration of cycling to school into the daily routine could have happened after the observed period. Another reason for a lack of effectiveness could be that different local needs in terms of barriers to cycle to school were not sufficiently addressed in interventions [60]. In a previous study, barriers of AST in general were categorized according to the “Social-ecological model of the correlates of AT” [61]: intrapersonal/individual (i.e., child factors), interpersonal (e.g., parental factors), community (e.g., school policy), and environment (e.g., traffic) [62]. One multicomponent intervention among children was inspired by correlates of cycling to school considering such barriers (e.g., intrapersonal/individual including motivation by competitions and safety by cycle training, interpersonal including parental involvement, community including school policies, and environmental changes including traffic regulation) and used almost the same BCTs as the effective bicycle train intervention (apart from repetition and substitution including behavior substitution, habit formation, habit reversal) [45]. Despite this, this intervention was not effective on any outcome in favor of the IG [45]. 
Furthermore, the improvement of basic cycling skills in a cycle training program among children (examined in only one intervention) without practicing traffic-related skills in the natural environment may be insufficient to impact AST by bicycle [47]. Moreover, a cycle training program including parental assisted homework tasks (e.g., identification of the safest school cycling route and the most dangerous traffic spots close to the school) after each cycle training failed to find effective ways of involving parents as an intervention strategy [47]. The reason could be that the homework tasks insufficiently addressed or increased personal safety barriers in parents (e.g., fears, dangers, concerns about the child’s behavior in road traffic) [62]. This may have blocked behavior change in their child as the influence of parents on AST is higher among children than adolescents [40]. Therefore, adolescents may need different intervention strategies than children because all five studies that effectively influenced 13 of 24 examined outcomes included children only [45, 47, 49, 50, 52] and three of the four studies that were not effective in influencing any outcome included both children as well as adolescents [44, 46, 48]. To adequately tailor interventions to a specific population, we recommend following a systematic approach when developing interventions (e.g., the “Intervention Mapping Approach” including a comprehensive needs assessment and theoretical frameworks [63]). Moreover, we recommend conducting a process evaluation that provides insights into the implementation of the intervention (e.g., feedback on program and material, (dis)satisfaction). In addition, we recommend using a checklist when reporting the study. Adherence to the planned intervention (e.g., delivered intensity) was lacking in the majority of studies although “the dose of an intervention is a key predictor of behavior change” [58] (p. 68). 
Furthermore, contamination was quite common and could have caused an underestimation of effects. Finally, interpretations of findings could be biased due to group differences at baseline.

Conclusions

As a result of the heterogeneity and low methodological quality of included studies, we conclude that the evidence for the effectiveness of interventions promoting AST by bicycle is insufficient. Thus, our findings confirm that this research field is still in an early development stage [57]. Nevertheless, there is an indication that a bicycle train to/from school among children in primary school, including four clustered main groups of BCTs (shaping knowledge, comparison of behavior, repetition and substitution as well as antecedents), is a promising intervention. More research is needed to better understand strategies and effects of school-based interventions promoting AST by bicycle, especially among adolescents in secondary school.

Based on the findings of this systematic review, there is a need for high-quality intervention studies in this research field. This is why future studies are recommended to evaluate theory-based interventions in longer-term (R)CTs using relevant, valid and reliable outcome measures. Additionally, more research is warranted to examine the moderating effect of gender in AST interventions by bicycle and to prove long-term maintenance of behavior change.