Key Points

Findings from our umbrella review including 14 meta-analyses suggest that RT is an effective means to improve proxies of physical fitness in healthy children and adolescents beyond a level achievable from growth and maturation.

This umbrella review indicates that there are few consistent moderating effects of maturation, age, sex, expertise level, or RT type on muscle strength and muscle power across the included meta-analyses.

This umbrella review identified current gaps in the literature and suggests that future RT research should consistently report data on participants’ maturational status. Pre-pubertal children as well as girls irrespective of their maturational status should be specifically targeted in future research.

1 Introduction

Despite previous misconceptions on the effectiveness and safety of youth resistance training (RT), more recent studies show convincing evidence of beneficial effects of RT on markers of performance and health in healthy children and adolescents, if appropriately prescribed and supervised [1,2,3]. In a position statement on youth RT, Lloyd et al. [3] summarised findings from original research, systematic reviews, and meta-analyses and reported that different types of RT (e.g., plyometrics, machine-based RT) have the potential to improve health- (e.g., improved body composition, psychological well-being) and performance-related outcomes (e.g., gains in muscle strength and muscle power). Gathering information from original research in the form of controlled or even randomised-controlled trials is a first step to advance our knowledge in this field of research. Subsequently, findings from original research can be summarised in systematic reviews and statistically aggregated in meta-analyses. However, these publication types (i.e., meta-analyses) are limited inasmuch as they have a rather narrow focus on one specific outcome measure, a specific population, or a specific RT type. Given these methodological limitations, it is challenging to establish comprehensive recommendations as well as robust pooled results for the overarching topic of youth RT. Further, in-depth literature reviews reveal that there are conflicting results from meta-analyses on youth RT, most likely due to different methodological approaches (e.g., searched databases, search syntax, inclusion criteria, year of literature search) and applied methods (e.g., different statistical methods). For instance, while Lesinski et al. [4] found large effects of RT on muscle power, Collins et al. [5] reported small effects. Furthermore, RT-related effects reported by Lesinski et al. [4] were not moderated by the factor sex. In contrast, Collins et al. [5] observed that the factor sex moderated RT effects on muscle power.

One way to overcome the above-described methodological limitations of meta-analyses is to perform umbrella reviews [6]. Notably, umbrella reviews are on the highest level of the medicine evidence pyramid [7] and they summarise findings from already published meta-analyses to provide an overview on a given overarching topic. Thus, umbrella reviews help us to understand the current strengths and limitations of the entire body of literature on a specific topic, in this case youth RT.

To the best of our knowledge, there is no published umbrella review available that has examined the effects of RT on measures of physical fitness in healthy children and adolescents.

Therefore, the objectives of this umbrella review were (a) to systematically review the available meta-analytical evidence that has examined the effects of RT on proxies of physical fitness (e.g., muscle strength, muscle power, linear speed) in healthy children and adolescents; (b) to systematically report the effects of potential moderator variables, including maturation, age, sex, expertise level, and RT type (e.g., plyometric training); (c) to address the quality, strengths and limitations of the meta-analytical evidence; and (d) to identify current gaps in the literature and make suggestions for future research.

2 Methods

Our umbrella review was conducted in accordance with recommendations for umbrella reviews from Aromataris and colleagues [6] and addressed all items recommended in the PRISMA statement [8].

2.1 Literature Search

We performed a computerized systematic literature search in the databases PubMed, Web of Science, and Cochrane Library. A Boolean search syntax was used (Table 1). The search was limited to full text availability, publication dates from 01/01/1979 to 01/01/2020, ages from birth to 18 years, English and German language, and type of article (i.e., meta-analysis). The reference lists of each included meta-analysis were screened for titles to identify additional meta-analyses to be included in our umbrella review.

Table 1 Information on literature search, selection criteria, and considered moderator variables

2.2 Selection Criteria

Based on a priori defined inclusion/exclusion criteria (PICOS = population, intervention, comparator, outcome, study design; Table 1), two independent reviewers (MH, AS) screened potentially relevant articles by analysing titles, abstracts, and full texts of the respective articles to elucidate their eligibility. When MH and AS did not reach an agreement concerning inclusion of an article, ML adjudicated.

2.3 Data Extraction

The following data were extracted from the included meta-analyses: (1) first author and year of publication; (2) the number and type of primary studies included in the meta-analysis; (3) the study characteristics and the number of included participants; (4) the respective physical fitness outcome; (5) effect sizes and the equations used to compute effect sizes, the respective significance level, p values of Chi² tests, the 95% confidence intervals (CI), and I² values (i.e., study heterogeneity). Data were extracted and cross-checked for accuracy by ML, MH and AS. If relevant data were not available in the respective papers, we sent email inquiries to the corresponding authors. If the author did not reply or could not provide the missing data, we marked the missing information as “not applicable” (n.a.) or “non-calculable” (n.-c.). Our descriptive analyses focused on different outcome categories (i.e., muscle strength, muscle power, linear sprint speed, agility/change-of-direction speed, throwing performance, and sport-specific performance [e.g., kicking velocity]). Further, we searched the identified meta-analyses for the effects of moderating variables (Table 1).

2.4 Evaluation of the Methodological Quality

Meta-analyses of randomised controlled trials and controlled studies are subject to different sources of bias. Therefore, it is important that readers have the option to distinguish between low and high quality meta-analyses. The methodological quality of the included meta-analyses was independently assessed by three reviewers (ML, MH, and AS) using the validated AMSTAR2 (A Measurement Tool to Assess Systematic Reviews) checklist [11]. This checklist contains 16 items that include for instance the literature search procedure, data extraction, quality assessment, and statistical analyses of the meta-analyses (for more details see [11]). Each item on this checklist was answered with a ‘yes’ (1 point), ‘partial yes’ (0.5 points) or ‘no’ (0 points). Based on the summary point scores (i.e., maximum 16 points), the meta-analyses were categorised as high quality if ≥ 80% of the possible score was achieved, moderate quality if 40–79% of the possible score was reached, or low quality if < 40% of the possible score was achieved [12].
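The scoring rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the example ratings are hypothetical, and treating every percentage from 40% up to (but excluding) 80% as "moderate" is an assumption about how the 40–79% band is applied.

```python
# Illustrative sketch of the AMSTAR2 point scoring described in the text.
# Function names and example ratings are hypothetical, not the authors' code.
ITEM_POINTS = {"yes": 1.0, "partial yes": 0.5, "no": 0.0}
N_ITEMS = 16  # the checklist contains 16 items (maximum 16 points)


def amstar2_percentage(ratings):
    """Return the summary score as a percentage of the 16 possible points."""
    if len(ratings) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item ratings, got {len(ratings)}")
    total = sum(ITEM_POINTS[rating] for rating in ratings)
    return 100.0 * total / N_ITEMS


def quality_category(percentage):
    """Map a percentage score to high / moderate / low methodological quality."""
    if percentage >= 80:
        return "high"
    if percentage >= 40:  # assumption: covers the whole 40-79% band
        return "moderate"
    return "low"


# Hypothetical example: 11 'yes', 1 'partial yes', 4 'no' -> 11.5 of 16 points
example = ["yes"] * 11 + ["partial yes"] + ["no"] * 4
score = amstar2_percentage(example)
print(score, quality_category(score))  # 71.875 moderate
```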

2.5 Quality of Evidence

For the assessment of the quality of evidence, the modified Grading of Recommendations Assessment, Development and Evaluation (GRADE) principles were used [13]. Accordingly, the following GRADE aspects were assessed for each single outcome of the included meta-analyses: (1) study limitations were evaluated through the findings from scales that quantify the risk of bias in the primary studies of the included meta-analyses (e.g., PEDro); (2) inconsistency was assessed through the size of statistical heterogeneity (i.e., I² statistic); (3) indirectness was established through the evaluation of differences between study cohorts, intervention types, comparators, and outcome variables of the primary studies and those that were relevant for each included meta-analysis; (4) imprecision was determined using the width of the 95% CI of the pooled effect size of the included meta-analyses; and (5) publication bias was determined by examining the asymmetry of the funnel plots of the included meta-analyses. Each of these five aspects was evaluated for each single outcome as “not reported”, “neutral”, “serious”, or “very serious” [13]. Meta-analyses were downgraded from initially four points by one point for each “not reported” or “serious” and by two points for each “very serious” rating. Then, meta-analyses were rated as “high” (4 points), “moderate” (3 points), “low” (2 points) or “very low” (≤ 1 point) quality of evidence. The GRADE assessment was conducted independently by three authors (ML, MH, and AS), with discussion and agreement regarding any differences.
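The downgrading scheme can be made concrete with a short Python sketch. The aspect keys, function name, and example ratings below are hypothetical and only illustrate the point arithmetic stated in the text; they are not the authors' implementation.

```python
# Illustrative sketch of the GRADE downgrading arithmetic described in the text.
# Aspect keys, function name, and example ratings are hypothetical.
ASPECTS = (
    "study limitations",
    "inconsistency",
    "indirectness",
    "imprecision",
    "publication bias",
)
DOWNGRADE = {"neutral": 0, "not reported": 1, "serious": 1, "very serious": 2}
LABELS = {4: "high", 3: "moderate", 2: "low"}


def grade_quality(ratings):
    """Return (points, label) for one outcome given a rating per GRADE aspect."""
    missing = [aspect for aspect in ASPECTS if aspect not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    # Start at four points and subtract the penalty for each of the five aspects.
    points = 4 - sum(DOWNGRADE[ratings[aspect]] for aspect in ASPECTS)
    # Any score of one point or less is labelled "very low".
    return points, LABELS.get(points, "very low")


# Hypothetical outcome: heterogeneity rated serious, funnel plot not reported
example = {
    "study limitations": "neutral",
    "inconsistency": "serious",
    "indirectness": "neutral",
    "imprecision": "neutral",
    "publication bias": "not reported",
}
print(grade_quality(example))  # (2, 'low')
```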

2.6 Prediction Interval

We calculated the 95% prediction interval (PI) for all included meta-analyses using the number of included studies, the standardised mean difference (SMD), the upper limits of the 95% CI and the tau-squared values (spreadsheet available at: https://www.meta-analysis.com/pages/prediction.php) [14]. The PI represents the range in which the effect size of a future study will most likely fall [14].
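For orientation, one commonly used formulation of the 95% PI in random-effects meta-analysis is given below. Whether the cited spreadsheet implements exactly this expression is an assumption; the symbols are introduced here for illustration only: $\hat{\mu}$ is the pooled SMD, $k$ the number of included studies, $\hat{\tau}^2$ the between-study variance, and $\mathrm{UL}_{95\%}$ the upper limit of the 95% CI, from which the standard error of the pooled effect is recovered under a normal approximation.

```latex
\mathrm{SE}(\hat{\mu}) = \frac{\mathrm{UL}_{95\%} - \hat{\mu}}{1.96},
\qquad
\mathrm{PI}_{95\%} = \hat{\mu} \pm t_{0.975,\,k-2}\,
\sqrt{\hat{\tau}^{2} + \mathrm{SE}(\hat{\mu})^{2}}
```

Because the interval adds $\hat{\tau}^2$ to the squared standard error, it is always at least as wide as the corresponding 95% CI and widens as between-study heterogeneity increases.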

2.7 Data Interpretation

One of the main aims of umbrella reviews is to allow comparison of the magnitude of effects across all included meta-analyses. The use of one effect size measure makes this comparison straightforward. However, it is important to acknowledge that even if most of the included meta-analyses used SMDs as an effect size measure, differences were found in the respective equations that were used to compute SMDs. For instance, some meta-analyses weighted single studies and/or conducted sample size adjustment (e.g., Hedges’ g). Therefore, we extracted the equations used to compute effect sizes for each included meta-analysis (Table 2). According to Cohen [15] we classified SMD values < 0.20 as trivial, 0.20 ≤ SMD < 0.50 as small, 0.50 ≤ SMD < 0.80 as medium, and SMD ≥ 0.80 as large effects.
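The classification thresholds can be expressed as a small Python helper. This is a minimal sketch with a hypothetical function name; classifying on the absolute value of the SMD is an assumption, since the text only states thresholds for positive values.

```python
# Minimal sketch of the Cohen-based magnitude classification described in the text.
# The function name is hypothetical; using abs() is an assumption for negative SMDs.
def classify_smd(smd):
    """Return 'trivial', 'small', 'medium', or 'large' for a given SMD."""
    magnitude = abs(smd)
    if magnitude < 0.20:
        return "trivial"
    if magnitude < 0.50:
        return "small"
    if magnitude < 0.80:
        return "medium"
    return "large"


def classify_range(lower, upper):
    """Describe a range of SMDs, e.g. 'small-to-large'."""
    low_label, high_label = classify_smd(lower), classify_smd(upper)
    return low_label if low_label == high_label else f"{low_label}-to-{high_label}"


print(classify_range(0.41, 0.80))  # small-to-large
print(classify_range(0.30, 0.53))  # small-to-medium
```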

Table 2 Included meta-analyses that examined the effects of resistance training on physical fitness in healthy children and adolescents

3 Results

3.1 Search Results

A total of 146 potentially relevant studies were identified in the electronic databases (Fig. 1). Finally, 14 meta-analyses were eligible for inclusion in this umbrella review based on a priori defined selection criteria. We further separated the included meta-analyses into those that reported between-subject effect sizes (i.e., post-test comparison of the intervention versus control group) and those that reported within-subject effect sizes (i.e., pre- versus post-test comparison of the intervention group) (Table 2).
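The distinction between the two effect size types can be illustrated with a short Python sketch. The data are invented jump heights (in cm), the function names are hypothetical, and the equations are one simple variant each (pooled post-test SD for the between-subject SMD, pre-test SD for the within-subject SMD); as noted in Sect. 2.7, the included meta-analyses used differing equations.

```python
# Illustrative sketch: within- versus between-subject SMD on invented data.
# Function names, data, and the specific SMD equations are assumptions.
from math import sqrt
from statistics import mean, stdev


def between_subject_smd(post_intervention, post_control):
    """Post-test difference between groups divided by the pooled SD."""
    n1, n2 = len(post_intervention), len(post_control)
    pooled_var = (
        (n1 - 1) * stdev(post_intervention) ** 2
        + (n2 - 1) * stdev(post_control) ** 2
    ) / (n1 + n2 - 2)
    return (mean(post_intervention) - mean(post_control)) / sqrt(pooled_var)


def within_subject_smd(pre, post):
    """Pre-to-post change in one group divided by the pre-test SD."""
    return (mean(post) - mean(pre)) / stdev(pre)


# Hypothetical jump heights: controls gain 2 cm (growth only),
# the intervention group gains 4 cm (growth plus training).
pre = [20, 22, 24, 26, 28]
post_control = [22, 24, 26, 28, 30]
post_intervention = [24, 26, 28, 30, 32]

print(round(within_subject_smd(pre, post_intervention), 2))            # 1.26
print(round(between_subject_smd(post_intervention, post_control), 2))  # 0.63
```

In this invented example the within-subject SMD is twice the between-subject SMD because it also absorbs the growth-related gain seen in the control group.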

Fig. 1 Flow chart illustrating the different phases of the search and study selection

3.2 Characteristics of the Meta-analyses

The 14 included meta-analyses were published between 1996 and 2019. The number of included original studies ranged from nine to 43 with an average of 28 original studies. Sample sizes ranged from 252 to 1728 trained and untrained healthy children and adolescents (average: 847 participants). The chronological age of the included participants ranged from 6 to 18 years. Six meta-analyses investigated the effects of RT in trained and untrained girls and boys [5, 16,17,18,19,20], two meta-analyses in trained and untrained girls [21, 22], one meta-analysis in trained and untrained boys, three meta-analyses in trained boys and girls [4, 23, 24], and two meta-analyses in trained boys [25, 26]. Regarding the type of RT, eight meta-analyses [4, 16, 17, 19, 20, 22, 24, 25] included any type of RT, three meta-analyses [5, 18, 23] excluded plyometric training, and three meta-analyses [21, 26, 27] specifically focused on plyometric training.

The assessment of the methodological quality (AMSTAR2) of the included meta-analyses was summarised in Electronic Supplementary Material (Table S1). The included papers received scores ranging between 6 and 72% of the maximum score (16 points). Three meta-analyses [4, 5, 16] were of moderate quality and the remaining eleven meta-analyses of low methodological quality. The following criteria were not sufficiently addressed in the analysed meta-analyses: (2) establish methods prior to the conduct of the meta-analyses (written protocol); (3) explain the choice of study design for inclusion; (7) provide a list of excluded studies to justify the exclusion; and (10) report sources of funding for included studies.

The assessment of the quality of evidence (GRADE) of the included meta-analyses was summarised in Electronic Supplementary Material (Table S2). Two of the included meta-analyses [22, 23] presented evidence of low quality and eight meta-analyses [17,18,19, 21, 24,25,26,27] provided evidence of very low quality. The remaining four meta-analyses [4, 5, 16, 20] presented evidence of low to very low quality depending on the outcome measure under consideration.

3.3 Effectiveness of Resistance Training in Healthy Youth

To avoid bias due to growth and maturation-related performance enhancements, we focused only on the included meta-analyses that reported between-subject effect sizes (Table 1).

The included meta-analyses indicated medium-to-large effects (0.54 ≤ SMD ≤ 1.12) of RT on muscle strength [4, 17, 18, 22], small-to-large effects (0.41 ≤ SMD ≤ 0.80) on muscle power [4, 5, 16, 21, 23, 24, 26], small-to-medium effects (0.30 ≤ SMD ≤ 0.53) on linear speed [4, 5, 16], medium effects (SMD = 0.68) on agility/change-of-direction speed [4], small-to-large effects (0.41 ≤ SMD ≤ 0.99) on throwing performance [5, 16], and medium effects (SMD = 0.75) on sport-specific performance [4] in trained and untrained children and adolescents (Fig. 2).

Fig. 2 Summary of the effect sizes (between-subject standardised mean difference [SMD]), 95% confidence intervals (black lines), and 95% prediction intervals (grey lines) from the included meta-analyses, indicating the effects of resistance training (RT) versus control group on proxies of physical fitness in healthy children and adolescents. Bars indicate the magnitude of the effects of RT for each meta-analysis including the restriction regarding the included population or type of included RT

3.4 Maturation-, Age-, Sex-, Expertise Level- and Type-Specific Effects of Resistance Training on Muscle Strength and Muscle Power

Several of the included meta-analyses performed sub-group analyses of moderator variables which were summarised in Tables 3 and 4.

Table 3 Summary of the findings (effect sizes = standardised mean differences) of the sub-group analyses regarding resistance training related effects of the moderator variables age, maturation, sex, and expertise level on measures of muscle strength and muscle power in children and adolescents
Table 4 Summary of the findings (effect sizes = standardised mean differences) of the sub-group analyses regarding the effects of different resistance training types on measures of muscle strength and muscle power in healthy children and adolescents

In terms of maturation status, sub-group analyses indicated large effects of RT in prepubertal participants (0.81 ≤ SMD ≤ 0.91) and medium-to-large effects in mid-/postpubertal participants (0.61 ≤ SMD ≤ 1.91) on muscle strength and power (Table 3). For muscle strength, Behringer et al. [18] found that prepubertal children gained significantly (p < 0.05) less muscle strength following RT (SMD = 0.81) compared to mid-/postpubertal adolescents (SMD = 1.91). Nevertheless, for muscle power, Lesinski et al. [4] as well as Collins et al. [5] did not find significant maturity-specific RT effects.

In terms of chronological age, sub-group analyses indicated medium-to-large effects of RT in children (0.57 ≤ SMD ≤ 1.35) and adolescents (0.69 ≤ SMD ≤ 0.91) on muscle strength [4, 17, 19] as well as small-to-large effects of RT on muscle power [4, 23, 26] in children (0.41 ≤ SMD ≤ 0.91) and adolescents (0.47 ≤ SMD ≤ 1.02) (Table 3). With the exception of Slimani et al. [23], there is no meta-analysis available that reported statistically significant effects of chronological age (i.e., children versus adolescents) for measures of muscle strength or muscle power [4, 5, 18, 26]. Nevertheless, the analysis of continuous moderator variables as reported by Behringer et al. [16] revealed a statistically significant (p < 0.05) negative correlation (r = − 0.25) between chronological age and the magnitude of effect sizes for motor skills (i.e., combined jumping, running, and throwing). This indicates that RT could be more beneficial in younger participants.

In terms of the sex variable, sub-group analyses indicated medium-to-large effects of RT in boys (0.72 ≤ SMD ≤ 1.21) and girls (0.54 ≤ SMD ≤ 1.42) on muscle strength [4, 17, 18] (Table 3). Further, the effects of RT on muscle power [4, 21, 23, 26] turned out to be medium-to-large for boys (0.73 ≤ SMD ≤ 0.89) and medium for girls (0.57 ≤ SMD ≤ 0.61) (Table 3). Collins et al. [5] found that boys (SMD = 0.84) compared with girls (SMD = 0.21) gained significantly (p < 0.01) more muscle power following RT. It has to be noted though that no other meta-analysis reported a statistically significant sex-specific effect of RT on muscle strength or muscle power [4, 18]. Payne and colleagues [17] did not examine the level of statistical significance.

In terms of expertise level, the included meta-analyses indicated medium-to-large effects of RT on muscle strength and muscle power in young athletes (i.e., trained children and adolescents) (0.65 ≤ SMD ≤ 1.09) [4, 5, 23, 26] (Table 3). Collins et al. [5] conducted a sub-group analysis regarding participants’ expertise level. They found that trained (SMD = 0.95) compared to untrained children and adolescents (SMD = 0.25) gained significantly (p < 0.01) more muscle power following RT.

In terms of RT type, the included meta-analyses indicated large effects of traditional RT (SMD = 1.12) [18] as well as small effects of plyometric training (SMD = 0.39) [4] on muscle strength (Table 4). Other meta-analyses indicated small-to-large effects of traditional RT (0.41 ≤ SMD ≤ 0.80) [5, 23] and medium-to-large effects of plyometric training (0.57 ≤ SMD ≤ 0.81) [4, 21, 26] on muscle power (Table 4).

Moreover, regarding the type of traditional RT, sub-group analyses indicated small-to-large effects (0.36 ≤ SMD ≤ 0.93) of machine-based RT and large effects (1.31 ≤ SMD ≤ 2.97) of free weights RT on muscle strength [4, 18]. Even though Behringer et al. [18] did not find statistically significant differences between traditional RT types in trained and untrained children and adolescents, Lesinski et al. [4] reported that free weights RT (SMD = 2.97) resulted in statistically significantly larger gains in muscle strength (p < 0.001) compared to machine-based RT (SMD = 0.36) in trained children and adolescents.

4 Discussion

This systematic umbrella review aimed to provide an overview of the effects of RT on proxies of physical fitness in healthy children and adolescents. The main findings of this umbrella review are: (1) RT has medium-to-large effects on measures of muscle strength, small-to-large effects on muscle power, small-to-medium effects on linear sprint speed, a medium effect on agility/change-of-direction speed, small-to-large effects on throwing performance, and a medium effect on sport-specific performance; (2) there are few consistent findings from the included meta-analyses regarding the moderating effects of age, maturation, sex, expertise level, and/or RT type on muscle strength and muscle power, and (3) the included meta-analyses are of low-to-moderate methodological quality and the presented evidence is of low or even very low quality.

4.1 Effects of Resistance Training on Physical Fitness in Healthy Youth

This umbrella review indicates that RT interventions can enhance physical fitness in children and adolescents beyond the level achievable from growth and maturation alone. We found that the effects of RT on measures of muscle strength and power were small-to-large in magnitude, with small-to-medium effects for secondary outcomes including linear sprint speed, agility/change-of-direction speed, and sport-specific performance. Thus, effect sizes vary according to the respective outcome measure. The lower effects of RT on secondary outcomes can most likely be explained by the principle of training specificity [28], which suggests that the greatest strength gains occur at or near the training velocity.

4.2 Effects of Moderating Factors Such as Age, Maturation, Sex, Expertise Level, Resistance Training Type

In terms of chronological age (Table 3), the reported sub-group analyses of the meta-analyses of Moran et al. [26], Lesinski et al. [4], and Behringer et al. [18] were unable to show any statistically significant age-related effects of RT on measures of muscle strength and muscle power. Notably, Behringer et al. [16] observed a statistically significant negative correlation (r = − 0.25) between chronological age and the magnitude of effect sizes for motor skills performance (i.e., jumping, running and throwing) in their meta-analysis. These authors proposed that younger children might experience a greater effect of RT on motor skills. In contrast to this finding, Slimani et al. [23] observed in their meta-analysis that adolescents compared with children improved their squat jump performance significantly more following RT. The observed differences in findings between the included meta-analyses could be due to differences in literature search strategies (e.g., different search syntax, inclusion criteria, or year of literature search) and applied methods. Taken together, the included meta-analyses consistently reported no chronological age-related effects of RT on measures of muscle strength. For measures of muscle power, the included meta-analyses revealed no consistent findings with regard to the moderating effects of age on RT-related training effects.

Unlike chronological age, maturation is not a linear process. Skeletal, sexual and somatic maturation in children differ individually in timing and tempo which is why there is often a discrepancy between chronological and biological age (i.e., maturation) among youths [29,30,31,32]. In terms of the maturation status (Table 3), a meta-analysis [18] found that prepubertal children gained significantly less muscle strength following RT (SMD = 0.81) compared with mid-/postpubertal adolescents (SMD = 1.91). For measures of muscle power, two meta-analyses [4, 5] were unable to identify any maturation-related effects of RT. Taken together, maturity seems to be an important moderating variable with regards to RT-related effects on muscle strength. While strength gains in prepubertal children mostly occur due to neural adaptations, additional morphological adaptations may explain the increased effects of RT in mid-/postpubertal adolescents [3].

In terms of the moderating factor sex (Table 3), Behringer et al. [18] were unable to identify significant sex-related effects of RT on measures of muscle strength in their meta-analysis. Further, Lesinski et al. [4] could not find statistically significant sex-specific effects of RT on muscle power in their meta-analysis. Nevertheless, a recent meta-analysis of Collins et al. [5] found that boys gained significantly more muscle power following RT (SMD = 0.84) compared with girls (SMD = 0.21). This finding has to be interpreted with caution because only one original study with girls was included in the respective sub-group analysis. Taken together, our findings suggest that boys and girls show similar RT-related improvements on measures of muscle strength. The available meta-analyses report conflicting results on sex-related effects of RT on measures of muscle power. Therefore, this research question must be investigated in future studies. Over the past years, more RT studies with youth reported data on the maturational status of the included participants. However, there is currently no meta-analysis available that examined ‘biological maturity’ as a moderating variable in its overall and/or sex-specific sub-group analyses. Accordingly, we recommend pursuing such research in the future.

In terms of the moderating factor expertise level, Behm et al. [20] and Collins et al. [5] conducted sub-group analyses in their meta-analyses regarding the role of expertise level (i.e., trained vs. untrained) in RT-related performance gains in youth (Table 3). While Behm et al. [20] could not find a moderating effect of expertise level on RT-related performance improvements in muscle strength and power in youth, Collins et al. [5] observed significantly larger effects on muscle power (i.e., squat jump performance) in trained (SMD = 0.95) compared with untrained youth (SMD = 0.25). Yet, findings from Collins et al. [5] have to be interpreted with caution due to the limited number of included original studies which examined the effects of RT on muscle power in untrained children and adolescents (n = 3). In addition, it can be argued that the inconsistent findings of the two meta-analyses are due to differences in the applied methods (i.e., within- versus between-subject SMD). Taken together, the available scientific evidence showed no robust results for the role of expertise level in RT-related improvements in youth muscle power.

In terms of the moderating effects of the type of RT (i.e., traditional RT vs. plyometric training; Table 4), the included meta-analyses showed that traditional RT produced large effects on muscle strength [18], while plyometric training caused small effects on muscle strength [4]. Thus, it seems that traditional RT causes larger gains in muscle strength compared to plyometric training including high-velocity and muscle power exercises. This was confirmed by Behm et al. [20] who conducted a meta-analysis on the effects of traditional RT versus power training (i.e., plyometric training) on muscle strength in youth by aggregating within-subject SMDs. These authors found that traditional RT induced large effects while power training induced only trivial effects on measures of muscle strength. However, these findings are limited due to the low number of included studies that investigated the effects of power training on muscle strength (n = 3) as well as due to the applied statistical approach (i.e., calculation of within-subject SMDs). Of note, within-subject SMDs are biased because of regular growth and maturation-related performance enhancements in children and adolescents. In terms of muscle power, Behm et al. [20] reported slightly higher effect size magnitudes for jump performance following power training (within-subject SMD = 0.69) compared with traditional RT (within-subject SMD = 0.53). Notably, both pooled within-subject SMDs were of medium magnitude, which is why the evidence for larger effectiveness following power training is limited. Taken together, the above-mentioned meta-analyses indicate that with reference to the principle of training specificity [28], effect size magnitudes vary according to the respective outcome measure and RT type. This means that the greatest strength gains occur at or near the respective training velocity [28]. For instance, exercises with high-velocity movements such as plyometrics specifically enhance performances in movements with similar force–velocity profiles such as vertical and/or horizontal jumps.

Furthermore, in terms of the type of traditional RT (machine-based RT versus free weights), the meta-analysis of Behringer et al. [18] did not reveal statistically significant differences between the effects of free weight versus machine-based RT on measures of muscle strength in children and adolescents. However, another meta-analysis [4] found that free weights RT (SMD = 2.97) resulted in significantly larger gains in muscle strength compared with machine-based RT (SMD = 0.36) in young athletes. Each of the observed RT types has specific benefits and limitations [33, 34]. Supervised machine-based RT may allow a more stable performance of movements (e.g., lifts), which is why it can be considered an adequate learning tool for children and adolescents to start RT. Supervised RT using free weights allows exercises to be performed through the full range of motion, which better mimics sport-specific movements [33, 34]. It might be possible that children and adolescents who have reached a certain expertise level (i.e., trained youth) respond better to free weights RT than the general youth population. Taken together, the available scientific evidence shows no robust results for the factor type of traditional RT on muscle strength.

A clear limitation of meta-analyses is that they synthesize results from heterogeneous original studies but do not consider important differences across the included original studies in terms of exercise programme variables, testing methods, and other factors. Therefore, comparative intervention studies are needed that assess the effects of moderating factors such as age, maturation, sex, expertise level, or RT type on measures of physical fitness in children and adolescents while holding other variables constant. In this regard, Peitz et al. [35] recently conducted a systematic review of 75 comparative studies on the effects of traditional RT and plyometric training on physical fitness in youth aged 6–18 years. Their findings indicate that maturity-related effects are different following traditional RT versus plyometric training, with the former showing smaller and the latter showing larger effects in prepubertal children [35]. Further, there seem to be no sex-specific effects of traditional RT on physical fitness outcomes [35]. However, the impact of sex on plyometric training adaptations is unresolved [35]. Prepubertal boys and girls seem to respond similarly, while midpubertal boys show larger gains in jump performance compared with girls [35]. Finally, comparative studies [35] show that both traditional RT and plyometric training are effective. However, moderating factors such as maturity and sex appear to modulate the effects following traditional RT and plyometric training differently [35].

4.3 Quality of the Included Meta-analyses

The methodological quality of the included meta-analyses can be classified as moderate-to-low. For the assessment of the methodological quality, Shea et al. [11] recommend that individual AMSTAR2 item ratings should not be combined to create an overall score. Users should consider the potential impact of an inadequate rating for each item independently. With the exception of Collins et al. [5], none of the included meta-analyses registered their protocol. Furthermore, only Lesinski et al. [4] explained the choice of study design for inclusion. Finally, none of the included meta-analyses provided a list of excluded studies (that were read in full text form) to justify their exclusion or reported sources of funding for the original (primary) studies. It might be possible that due to word/table/figure restrictions and/or the absence of databases for supplement materials, authors were unable to submit all information they had extracted from the primary research. Nevertheless, it might also be possible that authors were unaware of the importance of these methodological quality characteristics.

All included meta-analyses were classified as presenting low or very low quality of evidence. This might partly be due to under-reported GRADE items that also downgraded the quality of evidence. More specifically, risk of bias and publication bias were often not reported. Because of the lack of meta-analyses with moderate or high quality of evidence, we are unable to draw conclusions as to whether future research (i.e., meta-analyses) with high quality of evidence might change the strengths of this recommendation.

4.4 Suggestions for Future Research

To strengthen preliminary findings regarding the effects of RT on a wide range of physical fitness outcomes, future research should also investigate the effects of RT on secondary outcomes (e.g., agility/change-of-direction, throwing, sport-specific performance). Given that the sub-group analyses of the included meta-analyses with regard to the moderators age, maturation, and sex are mostly based on a low number of included studies, future research should especially focus on examining the effects of RT in prepubertal children and in girls irrespective of their maturational status. Furthermore, future research should document participants’ biological maturity status and distinguish between the different types of RT. Of note, biological maturity can easily be assessed through the maturity offset method as introduced by Mirwald et al. [10] or by recording Tanner stages. These variables should be included as moderators in sex-specific sub-group analyses. Finally, research with high methodological quality and high quality of evidence should be conducted in the future.
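To illustrate how little information is needed to report maturity status, the following minimal sketch estimates the maturity offset (years from peak height velocity) from age, standing height, sitting height, and body mass. The coefficients are those commonly reported for the sex-specific prediction equations of Mirwald et al. [10] and should be verified against the original publication before use. The function name, argument names, and example values are illustrative assumptions and are not taken from this review.

```python
def maturity_offset(sex, age_years, height_cm, sitting_height_cm, body_mass_kg):
    """Estimate maturity offset in years from peak height velocity.

    Negative values indicate that peak height velocity has not yet been
    reached. Coefficients are assumed to match the sex-specific equations
    commonly attributed to Mirwald et al.; verify them before use.
    """
    leg_length_cm = height_cm - sitting_height_cm
    leg_sit = leg_length_cm * sitting_height_cm      # leg length x sitting height
    age_leg = age_years * leg_length_cm              # age x leg length
    age_sit = age_years * sitting_height_cm          # age x sitting height
    mass_height_ratio = body_mass_kg / height_cm * 100

    if sex == "male":
        return (-9.236
                + 0.0002708 * leg_sit
                - 0.001663 * age_leg
                + 0.007216 * age_sit
                + 0.02292 * mass_height_ratio)
    if sex == "female":
        age_mass = age_years * body_mass_kg          # age x body mass
        return (-9.376
                + 0.0001882 * leg_sit
                + 0.0022 * age_leg
                + 0.005841 * age_sit
                - 0.002658 * age_mass
                + 0.07693 * mass_height_ratio)
    raise ValueError("sex must be 'male' or 'female'")


# Hypothetical participant: 12-year-old boy, 150 cm tall, 78 cm sitting height, 40 kg
offset = maturity_offset("male", 12.0, 150.0, 78.0, 40.0)
print(f"Estimated maturity offset: {offset:.2f} years")
```

Reporting this value alongside chronological age would allow later meta-analyses to group participants by maturity status rather than by age alone.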

4.5 Strengths and Methodological Limitations

This umbrella review presents findings at the highest level of the evidence pyramid in medicine regarding the effects of RT on proxies of physical fitness in healthy children and adolescents. Furthermore, this umbrella review provides a high-level synthesis of potentially moderating variables and addresses both the methodological quality and the quality of evidence of the included meta-analyses. Finally, this umbrella review identified current gaps in the literature to make suggestions for future research.

A limitation of this umbrella review is the rather low number of meta-analyses (N = 14) that were eligible for inclusion. Another limitation is the low methodological quality and the (very) low quality of evidence of the included meta-analyses. Some of the assessed AMSTAR2 and GRADE criteria are under-reported or under-represented. In addition, it is important to acknowledge that even when the meta-analyses investigated similar research questions, they differed in search strategies, selection criteria, and the applied analytical approach. Moreover, some primary studies were included in multiple meta-analyses while others were not. Consequently, individual primary studies may carry different weight in the overall synthesis. Furthermore, the sub-group analyses of the moderating factors of RT mostly showed no consistent and robust results and, thus, must be interpreted with caution.
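One way to quantify how strongly primary studies overlap across meta-analyses is the corrected covered area (CCA), an index proposed in the overview-of-reviews literature. The sketch below is an illustration only: the CCA was not calculated in this umbrella review, and the meta-analysis labels and study identifiers are hypothetical. With N as the total number of study inclusions (counting repeats), r as the number of unique primary studies, and c as the number of meta-analyses, CCA = (N − r) / (r × c − r).

```python
def corrected_covered_area(included_studies):
    """Compute the corrected covered area (CCA) for a set of reviews.

    included_studies maps each meta-analysis label to the set of primary
    study identifiers it includes. A CCA of 0 means no overlap; a CCA of 1
    means every meta-analysis includes every primary study.
    """
    c = len(included_studies)                                 # number of meta-analyses
    unique_studies = set().union(*included_studies.values())
    r = len(unique_studies)                                   # unique primary studies
    n = sum(len(ids) for ids in included_studies.values())    # inclusions incl. repeats
    if c < 2 or r == 0:
        raise ValueError("need at least two reviews and one primary study")
    return (n - r) / (r * c - r)


# Hypothetical example: three meta-analyses sharing some primary studies
example = {
    "meta_analysis_A": {"study_1", "study_2", "study_3"},
    "meta_analysis_B": {"study_2", "study_3", "study_4"},
    "meta_analysis_C": {"study_3", "study_5"},
}
print(f"CCA = {corrected_covered_area(example):.2f}")
```

In this toy example, eight inclusions are spread over five unique studies and three meta-analyses, giving a CCA of 0.30. Higher values indicate that the same primary studies contribute repeatedly to the pooled picture.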

5 Conclusion

This systematic umbrella review indicates that RT has the potential to enhance proxies of physical fitness in healthy children and adolescents beyond a level achievable from growth and maturation. We found that the effects of RT on measures of muscle strength and muscle power were small-to-large in magnitude, with small-to-medium effects for secondary outcomes including linear sprint, agility/change-of-direction, and sport-specific performance.

Our findings further indicate that there are few consistent effects of potentially moderating factors such as ‘chronological age’, ‘maturation’, ‘sex’, ‘expertise level’, and ‘RT type’ on measures of muscle strength and muscle power in healthy children and adolescents across the included meta-analyses. Preliminary findings suggest that ‘maturation’ (i.e., prepubertal < mid-/postpubertal) as well as ‘type of RT’ (i.e., traditional RT > plyometric training) moderate the effects of RT on muscle strength while ‘chronological age’ and ‘sex’ appear not to. Whether the factors ‘expertise level’ and ‘type of traditional RT’ have an impact on muscle strength cannot be elucidated based on the available data. Furthermore, preliminary findings suggest that the potentially moderating variables ‘maturation’, ‘sex’, and ‘type of RT’ do not modulate RT-related adaptations in muscle power in youth. Whether ‘chronological age’, ‘expertise level’, and ‘type of traditional RT’ have an impact on muscle power is currently unresolved. Due to the limited amount of original research on specific sub-groups (e.g., girls, children, prepubertal youth), the findings of the included meta-analyses and, thus, of this umbrella review regarding the influence of moderating factors (e.g., sex, maturation) on the effects of RT on muscle strength and power have to be interpreted with caution. However, the benefits of safely performed and supervised RT are now irrefutable. RT should be used extensively in schools and should be embedded into PE curricula globally.

Data Availability

All data are provided in the article and the Electronic Supplementary Material.