Key Points

Strength training with high loads (≥ 80% of one repetition maximum) can improve time trial and time to exhaustion running performance.

Combining two or more strength training methods in a program (i.e., high loads, submaximal loads [40–79% of one repetition maximum], and/or plyometric training) may induce greater running performance improvements than one method alone.

Maximal oxygen consumption, velocity at maximal oxygen consumption, maximum metabolic steady state, and sprint capacity exhibited trivial changes after strength training.

The results are based on 38 studies and 894 (651 male individuals and 243 female individuals) middle-distance and long-distance runners, aged between 17 and 40 years, with a very low to moderate certainty of evidence.

1 Introduction

In middle-distance (800–3000 m) and long-distance running (5000 m to marathon) races, performance is determined by factors such as maximal oxygen uptake (VO2max), velocity at VO2max (vVO2max), maximum metabolic steady state (MMSS), running economy [1,2,3], and sprint capacity [4]. Indeed, VO2max has long been used as a primary measure of an individual’s cardiorespiratory fitness, and as a marker of training effect [5]. The interplay between VO2max and running economy determines vVO2max [2, 6], whereas the MMSS (e.g., second lactate threshold) establishes the limit of steady-state muscle metabolism [20]. Running economy, defined as the amount of energy required for running at submaximal speeds [7], may differentiate running performance in athletes with similar VO2max levels [8], and sprint capacity can influence races that require changes of pace [9] or a final sprint [4].

The implementation of strength training (ST) can improve performance in middle-distance and long-distance runners [10,11,12,13,14]. However, previous meta-analyses have focused mainly on running economy [11,12,13] and time trial running performance [13], without exploring the effects of ST on other determinants of performance (i.e., VO2max, vVO2max, and sprint capacity). For example, it has been found that ST could induce a trivial effect on VO2max in endurance athletes [15]. In addition, the incorporation of diverse ST methods has demonstrated improvements in running economy among endurance runners [10,11,12,13, 16]. Moreover, ST may improve anaerobic and neuromuscular characteristics (e.g., sprint capacity) [3]. These changes may be manifested in factors influenced by these variables, such as vVO2max [2, 6].

Strength training is a versatile method of exercise that can be customized by the manipulation of factors such as intensity, volume, inter-set rest, frequency, type and sequence of exercise, and speed of movement [17]. For instance, by manipulating the load (i.e., intensity) ST may be classified as high load training (HL, i.e., ≥ 80% of 1 repetition maximum [1RM]), submaximal load training (SubL, i.e., 40–79% 1RM), or plyometric training (PL, i.e., jump-based training with light or no loads) [18, 19]. Each of these ST methods targets a specific outcome such as maximal strength, strength at submaximal loads with higher speed of movements, or stretch–shortening cycle and muscle–tendon stiffness, respectively [18]. Therefore, the effect of ST on performance and its determinants may vary depending on the specific characteristics of each ST method [14, 19]. For example, ST has shown improvements in fixed blood lactate after PL [20] and blood lactate concentration at 16 km/h after a combined HL and PL intervention [21]. However, some studies have not shown any improvement in MMSS [22,23,24].

The concerns described above may be related to the small number of studies that have compared ST methods, with most studies simply comparing standard running training protocols to ST. A comparison of different ST methods can entail highly complex logistical planning for researchers, meaning it is not always feasible to carry out. However, a systematic review with meta-analyses may offer a viable alternative to addressing such methodological challenges by combining studies that utilise different ST methods, thus enabling their comparison. Although some systematic reviews with meta-analyses have been published involving runners [10,11,12,13,14], a more comprehensive understanding of the effects of ST methods on endurance running performance (i.e., time trial and time to exhaustion) and its other determinants (e.g., VO2max, vVO2max, MMSS, sprint capacity) is needed.

Based on the above, the aim of this systematic review with meta-analysis was to analyze the effect of different ST methods (e.g., HL, SubL, PL, combined training) on running performance (i.e., time trial and time to exhaustion) and its determinants (i.e., VO2max, vVO2max, MMSS, sprint capacity) in middle-distance and long-distance runners.

2 Methods

The 2020 PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines [25] were followed for this systematic review and meta-analysis. The original protocol was registered on the Open Science Framework before the data analysis (https://osf.io/gyeku).

2.1 Information Sources and Search Strategy

Multiple databases including PubMed, Web of Science (all databases), Scopus, and SPORTDiscus were searched using various search terms and Boolean operators (Table S1 of the Electronic Supplementary Material [ESM]). All articles indexed up to January 2022 were included for selection. The search was updated in November 2022 with notifications of new studies found in the previously searched databases. No restrictions were placed on databases regarding study design, date, language, age, or sex of the participants. Additionally, the reference lists of relevant reviews, systematic reviews, and meta-analyses were reviewed, as well as the reference lists of the articles included in the analysis.

2.2 Selection Process

Two reviewers (LL and SV) reviewed all titles and abstracts obtained from the databases. When the titles and abstracts suggested that the article might meet the inclusion criteria (Table 1), the full article was reviewed. In the case of disagreement between the two reviewers, a third reviewer (RC) was consulted.

Table 1 Inclusion and exclusion criteria for meta-analysis

2.3 Data Collection Process

Data were collected by an independent reviewer (LL), including subject characteristics, methodological data, endurance training, ST intervention, and main outcomes for further analysis. In those articles where data were presented only in the form of figures, the validated WebPlotDigitizer software (Version 4.5; Ankit Rohatgi, Pacifica, CA, USA) [26] was used to extract the data. The reviewers (LL, SV, and RC) collectively reviewed the extracted data and discussed any disagreements or controversial data after recoding.

2.4 Eligibility Criteria

Studies were eligible for inclusion according to the PICOS criteria (Participants, Intervention, Comparator, Outcome, and Study design; Table 1).

2.4.1 Participants

Subjects over 16 years of age were included in the study, as puberty may influence the adaptive response to training because of hormonal changes that occur during this period [27]. Participants were classified as either experienced or not experienced in ST based on the information provided by each study. The initial VO2max level was recorded and categorized by performance level into moderately trained (male individuals ≤ 55 mL/kg/min, female individuals ≤ 45 mL/kg/min), well trained (male individuals 55–65 mL/kg/min, female individuals 45–55 mL/kg/min), or highly trained (male individuals ≥ 65 mL/kg/min, female individuals ≥ 55 mL/kg/min) [28]. When male and female performance levels were not distinguished, ranges were established by averaging the values of both sexes for each respective performance level. If initial VO2max values were not recorded, performance level was determined according to the participant’s level of competition (moderately trained = recreational or local club level; well trained = collegiate or provincial level; highly trained = national or international level) [19].
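The VO2max categorization described above can be sketched in a few lines of Python. This is only an illustration: the function name `performance_level` and the handling of values that fall exactly on a shared boundary (55 and 65 mL/kg/min for male individuals, 45 and 55 mL/kg/min for female individuals) are assumptions, because the stated ranges overlap at those points.

```python
def performance_level(vo2max, sex):
    """Return the performance level for an initial VO2max in mL/kg/min.

    sex is "male", "female", or "mixed" (sexes not distinguished).
    Boundary handling is an assumption: values on a cutoff are assigned
    to the outer category (moderately or highly trained).
    """
    # (upper bound of "moderately trained", lower bound of "highly trained")
    cutoffs = {"male": (55.0, 65.0), "female": (45.0, 55.0)}
    # Mixed-sex samples: average the male and female cutoffs.
    cutoffs["mixed"] = tuple(
        (m + f) / 2 for m, f in zip(cutoffs["male"], cutoffs["female"])
    )
    low, high = cutoffs[sex]
    if vo2max >= high:
        return "highly trained"
    if vo2max <= low:
        return "moderately trained"
    return "well trained"
```

For a mixed-sex sample the averaged cutoffs are 50 and 60 mL/kg/min, so a group mean of 58 mL/kg/min would be labeled well trained under these assumptions.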

2.4.2 ST Intervention

Strength training methods were classified according to the training target and training load [18, 19] as follows: (1) HL, defined as programs aiming to improve maximal strength development by performing exercises with high loads (e.g., barbell squat at ≥ 80% 1RM or ≤ 7RM); (2) SubL, defined as programs aiming to improve strength development using exercises with moderate-to-low loads (e.g., maximal power load at 40–79% 1RM or 8–20RM; usually with maximal movement velocity intention); (3) PL, defined as programs aiming to improve stretch–shortening cycle functioning using exercises with light loads or body weight (e.g., jump-based training); and (4) combined training (Combined), defined as programs that included two or more ST methods. The groups that performed ST with very low loads (VL, < 40% 1RM or > 20RM) were considered as a control group. The duration of the intervention was recorded as total weeks, sessions per week, and total number of sessions.
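As an illustration of the load-based classification above, the following Python sketch maps a program to HL, SubL, PL, Combined, or control. The function names and the input format (a list of exercise loads in % 1RM plus a plyometrics flag) are hypothetical. Two rules are assumptions not stated in the text: a load between 79% and 80% 1RM is treated as SubL, and very low loads are ignored when the program also contains another method.

```python
def load_method(pct_1rm):
    """Classify a single exercise load (% of 1RM) by the load thresholds."""
    if pct_1rm >= 80:
        return "HL"
    if pct_1rm >= 40:
        return "SubL"
    return "VL"  # very low load, treated as a control condition


def program_method(loads_pct_1rm, has_plyometrics=False):
    """Classify a whole program from its exercise loads and plyometric content."""
    methods = {load_method(p) for p in loads_pct_1rm}
    if has_plyometrics:
        methods.add("PL")
    # Assumption: very low loads do not count as an ST method.
    active = methods - {"VL"}
    if not active:
        return "Control"
    if len(active) >= 2:
        return "Combined"
    return next(iter(active))
```

For example, a program with squats at 85% 1RM plus jump training would be labeled Combined, whereas one using only 30% 1RM would fall into the control condition.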

2.4.3 Outcome Measurements

Maximal oxygen uptake, vVO2max, MMSS, sprint capacity, and running performance were recorded before and after the ST interventions. Maximum metabolic steady state was considered if measured as: maximal lactate steady state, second lactate threshold, onset of blood lactate accumulation, lactate turn point, critical speed, or second ventilatory threshold. Sprint capacity was measured as the speed in meters per second (m/s) or the time to cover a distance (s), in efforts where energy is released mainly from glycolysis and phosphates [29]. Running performance was measured by a time trial or time to exhaustion in runs of more than 75 s, where aerobic metabolism predominates [30]. If running performance was measured in more than one test (e.g., 1500 m and 10,000 m), the most similar test between studies was selected. For all outcomes, where the study reported multiple timepoints (i.e., more than two data points), the first record and the last record immediately after the intervention were recorded.

2.5 Risk of Bias, Publication Bias, and Certainty Assessment

The risk of bias of the studies was assessed using the PEDro (Physiotherapy Evidence Database) scale [31, 32], with items 5–7 removed in consideration of the lack of blinding of subjects, assessors, and researchers in supervised exercise interventions [31, 33]. Based on previous criteria [33], the studies were categorized as low risk (≥ 6 points), moderate risk (4–5 points), and high risk (≤ 3 points). To assess the publication bias of the studies on each ST method, a funnel plot was constructed, indicating a publication bias if an asymmetry was observed.

The GRADE (Grading of Recommendations Assessment, Development and Evaluation) approach was used to evaluate the certainty of evidence [34,35,36]. High certainty of evidence was initially assumed and then downgraded based on the following criteria: risk of bias, downgraded by one or two levels if the median PEDro score was indicative of moderate risk (< 6 points) or high risk (< 4 points), respectively; inconsistency, downgraded by one level if the Cochrane Q test for heterogeneity was significant (i.e., p < 0.05); indirectness, considered low risk, as the PICOS criteria were ensured; imprecision, downgraded by one level if the combined number of participants in the control and ST groups was < 800 or if the confidence interval (CI) crossed a small effect size (ES) [i.e., − 0.15 to 0.15]; publication bias, downgraded by one level if an asymmetry was observed in the funnel plot.
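The risk-of-bias categories and downgrading rules above can be sketched as a small Python routine. This is a simplified illustration, not the assessment procedure actually used: the function names are hypothetical, "the CI crossed a small ES" is interpreted here as the CI overlapping the interval from − 0.15 to 0.15, and indirectness is omitted because it was judged low risk throughout.

```python
GRADE_LEVELS = ["high", "moderate", "low", "very low"]


def pedro_risk(score):
    """Risk-of-bias category from a PEDro score (items 5-7 removed)."""
    if score >= 6:
        return "low"
    if score >= 4:
        return "moderate"
    return "high"


def grade_certainty(median_pedro, q_p_value, n_participants,
                    ci_low, ci_high, funnel_asymmetry):
    """Start at high certainty and downgrade one level per triggered rule."""
    downgrades = {"low": 0, "moderate": 1, "high": 2}[pedro_risk(median_pedro)]
    if q_p_value < 0.05:  # inconsistency
        downgrades += 1
    # Assumption: "crossed by a small ES" means overlap with [-0.15, 0.15].
    ci_overlaps_small = ci_low <= 0.15 and ci_high >= -0.15
    if n_participants < 800 or ci_overlaps_small:  # imprecision
        downgrades += 1
    if funnel_asymmetry:  # publication bias
        downgrades += 1
    return GRADE_LEVELS[min(downgrades, len(GRADE_LEVELS) - 1)]
```

With a median PEDro score of 6, a non-significant Q test, fewer than 800 participants, and no funnel plot asymmetry, the sketch returns moderate certainty, since only imprecision is triggered.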

2.6 Effect Measures

A standardized mean difference between groups (i.e., control-experimental) was calculated as previously recommended [37]. Effect size was calculated as Hedges’ g corrected for sample size [38] to help deal with small samples [39], which are recurrent in the sport science literature [40]. Where studies reported data as mean and standard error (SE), the standard deviation (SD) was calculated from the SE [41]. The criteria for determining the ES magnitude were established as follows: 0.15, 0.45, and 0.80 for a small, moderate, and large effect, respectively [42].
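For readers who want to reproduce the effect measures, a minimal Python sketch of the SE-to-SD conversion, Hedges' g with the usual small-sample correction, and the magnitude thresholds is given below. The function names are hypothetical, the sign of g depends on which group is passed first (an assumption left to the caller), and values below 0.15 are labeled "trivial" as an assumption consistent with the wording used elsewhere in this review.

```python
import math


def sd_from_se(se, n):
    """Convert a standard error of the mean to a standard deviation."""
    return se * math.sqrt(n)


def hedges_g(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference (a minus b) with small-sample correction."""
    df = n_a + n_b - 2
    pooled_sd = math.sqrt(((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df)
    d = (mean_a - mean_b) / pooled_sd
    correction = 1 - 3 / (4 * df - 1)  # approximate Hedges correction factor
    return correction * d


def magnitude(es):
    """Label an effect size using the 0.15 / 0.45 / 0.80 thresholds."""
    size = abs(es)
    if size < 0.15:
        return "trivial"
    if size < 0.45:
        return "small"
    if size < 0.80:
        return "moderate"
    return "large"
```

With these thresholds, an ES of − 0.469 is labeled moderate and an ES of − 1.035 is labeled large.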

2.7 Synthesis Methods

A meta-analysis was performed for each ST method (i.e., HL, SubL, PL, or Combined) for each of the main outcomes (i.e., VO2max, vVO2max, MMSS, sprint capacity, and running performance) when at least three studies provided an outcome measure [16]. If a study had two or more comparison groups in the same analysis, the sample size of the control group was divided by the number of intervention groups [41]. Because of multiple sources of variation between studies (e.g., training and participant characteristics), a random-effects model was used, with the between-study variance (τ²) estimated by restricted maximum likelihood, which is recommended over the traditional DerSimonian and Laird method for continuous data [43]. The test statistic and CIs were based on the t-distribution with a Knapp and Hartung adjustment [44].
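The pooling steps described above can be illustrated with a compact Python sketch. It is not the implementation of the software used for the analyses: it assumes a standard fixed-point iteration for the restricted maximum likelihood estimate of the between-study variance and the basic Knapp and Hartung variance, and all function names are hypothetical. The CI is left to the caller, who would multiply the returned standard error by the t quantile for the returned degrees of freedom.

```python
import math


def split_control_n(n_control, n_intervention_groups):
    """Share one control group across several intervention arms."""
    return n_control / n_intervention_groups


def reml_tau2(effects, variances, tol=1e-10, max_iter=500):
    """Between-study variance by REML fixed-point iteration, truncated at 0."""
    tau2 = 0.0
    for _ in range(max_iter):
        w = [1.0 / (v + tau2) for v in variances]
        sw = sum(w)
        mu = sum(wi * y for wi, y in zip(w, effects)) / sw
        num = sum(wi ** 2 * ((y - mu) ** 2 - v)
                  for wi, y, v in zip(w, effects, variances))
        new_tau2 = max(0.0, num / sum(wi ** 2 for wi in w) + 1.0 / sw)
        if abs(new_tau2 - tau2) < tol:
            return new_tau2
        tau2 = new_tau2
    return tau2


def pooled_effect_hk(effects, variances):
    """Random-effects pooled ES with a Knapp-Hartung standard error.

    Returns (pooled ES, standard error, tau2, degrees of freedom).
    """
    k = len(effects)
    if k < 2:
        raise ValueError("at least two effect sizes are required")
    tau2 = reml_tau2(effects, variances)
    w = [1.0 / (v + tau2) for v in variances]
    sw = sum(w)
    mu = sum(wi * y for wi, y in zip(w, effects)) / sw
    # Knapp-Hartung: scale the variance by the weighted residual mean square.
    q = sum(wi * (y - mu) ** 2 for wi, y in zip(w, effects)) / (k - 1)
    return mu, math.sqrt(q / sw), tau2, k - 1
```

When all effect sizes are identical, the estimated between-study variance is truncated to zero and the pooled ES equals the common value, which is a quick sanity check for the iteration.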

To examine heterogeneity between studies, the Cochrane Q test was accompanied by the value of I2 to quantify the effect of heterogeneity, with values of < 25%, 25–75%, and > 75% indicating low, moderate, and high levels of heterogeneity, respectively [41]. Outliers were defined as an ES in which the upper limit of the 95% CI was lower than the lower limit of the pooled effect CI or the lower limit of the 95% CI was higher than the upper limit of the pooled effect CI [45]. A sensitivity analysis was then performed with and without the outlier ES to assess their impact on the analysis [45] (i.e., p value from Q test).
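The heterogeneity and outlier rules above translate directly into code. The sketch below uses the standard relation between the Q statistic and I2, I2 = 100 × (Q − df)/Q truncated at zero, together with the CI-based outlier definition; the function names are hypothetical, and assigning the boundary values of 25% and 75% to the moderate category follows the ranges given in the text.

```python
def i_squared(q, df):
    """I2 (%) from the Cochrane Q statistic and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def heterogeneity_level(i2):
    """Label I2 using the < 25%, 25-75%, and > 75% bands."""
    if i2 < 25:
        return "low"
    if i2 <= 75:
        return "moderate"
    return "high"


def is_outlier(es_ci, pooled_ci):
    """True if a study's 95% CI lies entirely outside the pooled effect CI."""
    es_low, es_high = es_ci
    pooled_low, pooled_high = pooled_ci
    return es_high < pooled_low or es_low > pooled_high
```

As a check, Q(5) = 15.373 gives an I2 of about 67.475%, which falls in the moderate band.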

A moderator analysis was performed to explore factors associated with ES (e.g., subject characteristics; ST intervention characteristics) if at least eight studies were pooled [46, 47], through meta-regression (i.e., age, body mass, height, initial VO2max, weeks, sessions per week, and total sessions) and sub-group analysis (i.e., sex, performance level, and ST experience). Alpha was set at 0.05. Comprehensive Meta-Analysis software (Version 3.3.0.70) was used for the analysis and GraphPad Prism 9 (Version 9.2.0) was used to generate the plots.

3 Results

3.1 Study Selection

The search strategy identified a total of 1749 records (Fig. 1). After removing duplicate records, records not retrieved, and documents excluded after reading the title and/or abstract, 73 studies were assessed for eligibility. Upon full-text reading, 35 studies were excluded because of the following reasons: participants aged under 16 years [48,49,50,51,52,53] or injured before the intervention [54,55,56]; no comparator group [57,58,59,60,61,62,63,64,65,66]; ST method considered not includable (e.g., core strength training; flywheel and isokinetic eccentric training; local muscular endurance training) [67,68,69,70,71,72]; no relevant outcomes included (i.e., VO2max, vVO2max, MMSS, sprint capacity, running performance) [73,74,75,76]; repeated outcome results derived from secondary analysis publications [77,78,79,80]; or cross-sectional study [81, 82]. As a result, 38 studies were included in the meta-analyses.

Fig. 1

Flow diagram of the study selection process. ST strength training, WOS Web of Science, *studies identified through database notifications of new records matching the search strategy, **studies identified in the reference lists of articles, reviews, systematic reviews, and meta-analyses retrieved by our search strategy

3.2 Study Characteristics

The characteristics of the participants and the interventions of the included studies are presented in Table 2, and the outcome results included in the meta-analyses are presented in Table 3. Thirty-eight studies were included in at least one analysis: 31 studies measured VO2max [3, 20,21,22,23,24, 83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107]; 15 studies measured vVO2max [23, 83,84,85,86,87, 91, 93, 94, 97, 102, 105,106,107,108]; 21 studies measured MMSS [3, 20, 22,23,24, 83, 84, 86, 87, 90, 94, 95, 99,100,101, 105,106,107,108,109,110]; eight studies measured sprint capacity [3, 21, 86, 92, 93, 106, 111, 112]; and 24 studies measured running performance [3, 20,21,22,23, 85, 89, 90, 92, 95, 97,98,99, 101, 103, 105, 106, 108,109,110,111,112,113,114]. The studies included 894 participants (651 male individuals [290 control and 361 treatments] and 243 female individuals [115 control and 128 treatments]), aged between 17 and 40 years, with a mean body mass and height of 68.70 kg and 174.50 cm, respectively. Participants were moderately trained (n = 324), well trained (n = 272), and highly trained (n = 298). The ST programs lasted between 6 and 40 weeks, with one to four sessions per week.

Table 2 Participant and strength training intervention characteristics of the included studies
Table 3 Analysis of the studies included in the meta-analysis of VO2max, vVO2max, maximum metabolic steady state, sprint capacity, and running performance

3.3 Risk of Bias, Publication Bias, and Certainty Assessment

The median of risk of bias was 6 (range from 4 to 7; moderate-to-low risk of bias; Table S2 of the ESM). Publication bias was found only in the analysis of running performance in the combined group (Fig. S1 of the ESM). The results of the certainty of the evidence for each outcome are presented in Table 4. The reasons for downgrading by one or more levels of certainty were (1) risk (moderate) of bias, (2) inconsistency (i.e., significant heterogeneity was found), (3) imprecision (i.e., low number of participants and/or CI crossing the small effect size), and (4) publication bias (i.e., asymmetry in the funnel plot was found). Certainty of the evidence was moderate for eight outcomes, low in four outcomes, and very low for one outcome.

Table 4 GRADE (Grading of Recommendations Assessment, Development and Evaluation) assessment for the certainty of evidence

3.4 VO2max

From the studies that measured VO2max, 11 studies implemented HL [21, 22, 24, 87, 89, 94, 98, 101, 105,106,107], two studies SubL [85, 97] (not included in the meta-analysis), 12 studies (involving 14 groups) implemented PL [20, 23, 83, 85, 90,91,92, 95, 96, 99, 100, 104], and ten studies (involving 11 groups) implemented Combined [3, 21, 84, 86, 88, 93, 97, 102, 103, 107]. Compared with control conditions, no significant effects on VO2max were found with HL training (ES [95% CI] =  − 0.014 [− 0.324 to 0.297], p = 0.924, I2 < 0.001%; Fig. 2), PL (ES [95% CI] = 0.075 [− 0.183 to 0.332], p = 0.541, I2 < 0.001%; Fig. 2) or Combined (ES [95% CI] =  − 0.095 [− 0.398 to 0.208], p = 0.499, I2 < 0.001%; Fig. 2). Meta-regressions and subgroup analyses showed no significant moderating variables for any ST method (all p > 0.166; Tables S3–5 of the ESM).

Fig. 2

Effect of strength training methods on maximal oxygen uptake. A positive effect size represents beneficial effects after strength training compared with control conditions. CI confidence interval, Combined high load training, plyometric training and/or submaximal load training, HL high load training, nES number of effect sizes, PL plyometric training

3.5 vVO2max

Among the studies that measured vVO2max, six studies applied HL [22, 87, 94, 105,106,107], two studies applied SubL [85, 97] (not included in the meta-analysis), five studies applied PL [23, 83, 85, 91, 108], and six studies (involving seven groups) applied Combined [84, 86, 93, 97, 102, 107]. Compared with the control group, no significant effects on vVO2max were found with HL training (ES [95% CI] = − 0.161 [− 0.662 to 0.341], p = 0.448, I2 < 0.001%; Fig. 3), PL (ES [95% CI] = 0.275 [− 0.269 to 0.818], p = 0.233, I2 < 0.001%; Fig. 3) or Combined (ES [95% CI] = 0.112 [− 0.311 to 0.534], p = 0.542, I2 < 0.001%; Fig. 3). The small number of studies (i.e., < 8) for each ST method precluded meta-regression and subgroup analyses.

Fig. 3

Effect of strength training methods on velocity at maximal oxygen uptake. A positive effect size represents beneficial effects after strength training compared with control conditions. CI confidence interval, Combined high load training, plyometric training and/or submaximal load training, HL high load training, nES number of effect sizes, PL plyometric training

3.6 Maximum Metabolic Steady State

From the studies that measured MMSS, nine studies implemented HL [22, 24, 87, 94, 101, 105,106,107, 109], ten groups from eight studies implemented PL [20, 23, 83, 90, 95, 99, 100, 108], and five studies implemented Combined [3, 84, 86, 107, 110]. Compared with the control condition, no significant effects on MMSS were found with HL training (ES [95% CI] = 0.049 [− 0.308 to 0.407], p = 0.760, I2 < 0.001%; Fig. 4), PL (ES [95% CI] = 0.017 [− 0.289 to 0.323], p = 0.902, I2 < 0.001%; Fig. 4) or Combined (ES [95% CI] =  − 0.026 [− 0.564 to 0.513], p = 0.902, I2 < 0.001%; Fig. 4). Meta-regressions and subgroup analyses showed no significant effects of possible moderators in the HL and PL methods (all p > 0.181; Tables S6 and S7 of the ESM).

Fig. 4

Effect of strength training methods on maximum metabolic steady state. A positive effect size represents beneficial effects after strength training compared with control conditions. CI confidence interval, Combined high load training, plyometric training, and/or submaximal load training, HL high load training, nES number of effect sizes, PL plyometric training

3.7 Sprint Capacity

Of the studies that measured sprint capacity, two studies applied HL [21, 106], two studies applied PL [92, 112], and five studies applied Combined [3, 21, 86, 93, 111]. Compared with the control condition, no significant effect on sprint capacity was found with Combined training (ES [95% CI] =  − 0.493 [− 1.057 to 0.070], p = 0.072, I2 < 0.001%; Fig. 5).

Fig. 5

Effect of strength training methods on sprint capacity. A negative effect size represents beneficial effects after strength training compared with control conditions. CI confidence interval, Combined high load training, plyometric training and/or submaximal load training, nES number of effect sizes

3.8 Running Performance

From the studies that measured running performance, eight studies implemented HL [21, 22, 89, 98, 101, 105, 106, 109], two studies implemented SubL [85, 97], 14 groups from 12 studies implemented PL [20, 23, 85, 90,91,92, 95, 99, 108, 112,113,114], and six studies implemented Combined [3, 21, 97, 103, 110, 111]. Compared with the control group, a significant moderate effect was found with HL training (ES [95% CI] = − 0.469 [− 0.872 to − 0.066], p = 0.029, I2 < 0.001%; Fig. 6) and a significant large effect was found with Combined training, although with a significant and moderate level of heterogeneity (ES [95% CI] = − 1.035 [− 1.967 to − 0.103], p = 0.036; Q(5) = 15.373, p = 0.009, I2 = 67.475%; Fig. 6). No significant effect on running performance was found with PL training (ES [95% CI] = − 0.210 [− 0.433 to 0.014], p = 0.064, I2 < 0.001%; Fig. 6). Meta-regressions and subgroup analyses showed no significant effects of possible moderators (all p > 0.211, Tables S8 and S9 of the ESM).

Fig. 6

Effect of strength training methods on running performance. A negative effect size represents beneficial effects after strength training compared with control conditions. CI confidence interval, Combined high load training, plyometric training, and/or submaximal load training, HL high load training, nES number of effect sizes, PL plyometric training

4 Discussion

The aim of this systematic review with meta-analysis was to analyze the effect of different ST methods (i.e., HL, SubL, PL, and Combined) on performance and its determinants (i.e., VO2max, vVO2max, MMSS, and sprint capacity) in middle-distance and long-distance runners. The analyses revealed that, compared with endurance training alone or with ST with very low loads, HL produced a significant moderate effect on running performance, whereas PL did not. Furthermore, when two or more ST methods (i.e., HL, PL, and/or SubL) were combined, a significant large effect on running performance was produced. In contrast, no effects on VO2max, vVO2max, MMSS, or sprint capacity were found for any of the ST methods analyzed. These results suggest that HL is an effective method for improving running performance without interfering with other physiological parameters (i.e., VO2max, vVO2max, and MMSS), and this effect may be enhanced when PL and HL and/or SubL are combined.

4.1 VO2max and vVO2max

Maximal oxygen uptake is defined as the highest rate at which oxygen can be taken up and utilized by the body during severe exercise [5] and is an important prerequisite for performance in middle-distance and long-distance running [115]. There was no significant effect of any ST method on VO2max (all p ≥ 0.499), which is consistent with previous meta-analyses in endurance athletes [10, 16]. An improvement in VO2max depends mainly on “upstream factors”, which include all physiological pathways that transfer oxygen from the environment to the blood, pumping it to the periphery and distributing it to and within muscle cells [116, 117]. The short duration of most ST efforts (i.e., exercise duration) probably did not induce an adequate stimulus to these factors. For example, traditional ST with variable resistance elevates oxygen uptake to approximately 45% of VO2max [118], which is not a sufficient aerobic stimulus to improve VO2max [119].

Although traditional ST methods may not stimulate VO2max, they can induce changes in neuromuscular function, musculotendinous stiffness, and muscle fiber type [120], factors that may aid endurance velocity and, by extension, vVO2max [3, 18]. The measure of vVO2max reflects the interaction between VO2max and running economy [2, 6], is influenced by anaerobic and neuromuscular characteristics [121], and can explain differences in performance that VO2max and running economy alone cannot [6]. Indeed, vVO2max has been shown to be a good predictor of performance in middle-distance [122, 123] and long-distance running [124,125,126]. However, we did not find a significant effect of any ST method on vVO2max (all p ≥ 0.233). The lack of an effect of ST methods on vVO2max may be related to the test protocol used to measure this outcome, particularly the duration of the stages [127]. Protocols with short duration stages and 1.00-km/h incremental changes every minute have been suggested for athletes to reach vVO2max, resulting in lower work and energy cost than longer duration stages [128]. From a total of 15 studies included in the meta-analysis for vVO2max, seven studies [22, 23, 84, 91, 97, 105, 106] used protocols with 1.00-km/h increments every minute, or in shorter stages (i.e., 30 s), and four of these studies showed significant effects after PL [91], HL [106], SubL [97], and Combined methods [84, 97]. From the seven studies that applied longer duration stages (i.e., 3 min or more), three found an effect on vVO2max after PL [83, 108] and HL [87], whereas others found no effect after Combined [86, 93, 102, 107] and HL [107]. These results are in line with the suggestion of a recent meta-analysis to use ramped protocols to elucidate the effects of plyometric jump training on vVO2max [16]. Considering the above, future research is needed to determine which protocol is valid for detecting vVO2max adaptations following different ST methods.

4.2 Maximum Metabolic Steady State

Maximum metabolic steady state dictates the boundary between heavy-intensity exercise and severe-intensity exercise [129, 130]. Below MMSS, exercise intensity can reach a steady state of muscle metabolism, whereas above MMSS this state is altered [131], which means increasing the MMSS through training would enable an athlete to achieve a steady state at higher running speeds. Our meta-analysis found no significant effect of any ST method on MMSS (all p ≥ 0.760). The findings suggest that the analyzed ST methods do not generate sufficient metabolic impact to improve the MMSS, which typically benefits from training at intensities around this threshold [132]. An alternative approach involving ST with low loads and high repetitions, aiming at a near-threshold intensity, has also been explored [133], showing different physiological and mechanical responses compared with aerobic training at these intensities. The absence of an effect of ST on the MMSS (and VO2max) implies that ST may not provide a sufficient stimulus to induce changes in metabolic factors, at least with traditional ST approaches.

4.3 Sprint Capacity

In our meta-analysis, we found no significant effect of Combined on sprint capacity (p = 0.072), and too few studies were available (i.e., fewer than three) to perform a meta-analysis for the other ST methods. Sprint capacity is an important variable because it allows runners to hold a favorable position at the start of a race and to sprint maximally towards its finish [4]. This may be especially relevant in middle-distance events in which the initial and final parts of the race have a higher proportion of sprinting than longer distances. Indeed, 800-m race time has been related to maximum sprint speed in elite male 800-m runners (R2 = 0.550) [134], with a near-significant relationship in sub-elite female 800-m runners (R2 = 0.380, p = 0.057) [135], but sprint capacity has not been correlated with changes in 5-km performance [3]. As the capacity to sprint is determined by the application of skeletal muscle force (and not necessarily by anaerobic metabolism) [136], improvements may result from improved neuromuscular capabilities [3]. However, it is important to mention that all [3, 86, 93, 111] but one [21] of the studies included sprint training in combination with ST, so these possible improvements may be due more to sprint training than to ST. Therefore, research is needed to examine the effect of ST on sprint capacity and its effect on middle-distance and long-distance races.

4.4 Running Performance

4.4.1 High Load Training

Interventions with HL improved running performance, including time trial and time to exhaustion measures (ES = − 0.469 [moderate], p = 0.029). In contrast, our meta-analyses indicated no effect of HL on VO2max, vVO2max, and MMSS. Given that no improvement in VO2max or MMSS was found with HL training, following a model that explains performance through metabolic (VO2max, MMSS, and running economy) and non-metabolic (running economy and sprint capacity) factors [137, 138], it is reasonable to argue that HL could improve performance through non-metabolic factors such as running economy [137]. Indeed, HL can induce non-metabolic (e.g., neuromuscular) adaptations [139] leading toward an improved rate of force development [101, 139]. A larger rate of force development may allow high force levels to be generated at shorter contact times (i.e., at a faster running pace) [140, 141], allow a faster transition from the braking phase to the propulsive phase [140], and allow for quasi-isometric muscle conditions that favor muscle energy costs [141, 142]. Indeed, the rate of force development has been correlated with running economy [101, 140, 142]. Additionally, HL can generate changes in lower limb stiffness [142,143,144], improving energy storage and release from the lower limbs during running, and this could lead to a reduction in the energy expenditure during running [145] and thus improved running economy [142,143,144]. Moreover, increased absolute strength may reduce relative effort at submaximal running speeds, activating a lower number of higher threshold motor units, resulting in a lower energy cost during running, and thus improved running economy [141]. Indeed, a secondary analysis of studies included in this systematic review revealed improved running economy after HL (ES = − 0.266, p = 0.039).

Furthermore, we included time to exhaustion as an indicator of running performance. Two studies [101, 106] measured time to exhaustion at vVO2max, showing an improvement after HL intervention. Potentially, these improvements are due to an improvement in running economy [94, 101, 106] and anaerobic capacity [106]. Given that the time to exhaustion at severe intensity (i.e., intensity between MMSS and VO2max) is constrained by a decline in force production [137] and reduced fiber recruitment [146], it is plausible that an enhanced rate of force development and maximal strength (i.e., 1 RM) could offset the effects of fatigue through enhanced activation of motor neurons and recruitment of muscle fibers [101]. Indeed, while running at vVO2max, athletes with a smaller decline in force production (i.e., greater muscular strength endurance) may limit the increase in energy cost [147]. Consequently, HL could contribute to the delay of muscle fatigue at this specific intensity [101]. Overall, considering the myriad of factors associated with running performance [137, 138, 148], future studies should elucidate the underlying mechanisms (particularly non-metabolic factors) of the improvement in running performance (and fatigue resistance) following HL interventions.

4.4.2 Plyometric Training

Plyometric training may induce neuromuscular adaptations, such as increased motor unit recruitment and improved intermuscular coordination [149]. These neuromuscular improvements have been shown to correlate with improved running economy and anaerobic capacity [137]. Additionally, PL can improve stiffness and compliance (e.g., muscle, tendon, joint) [99, 150]. This mechanism enables greater storage and release of elastic energy within the tendon [150], resulting in reduced muscle energy expenditure [141], and thus improved running economy. Indeed, recent meta-analyses [13, 16] found a significant improvement of running performance after PL. In contrast, our meta-analysis showed no significant improvement of running performance after PL (ES = −0.210, p = 0.064). One possible reason for the discrepancy is that we included a larger (more representative) number of studies in our analysis (n = 12) when compared with recent meta-analyses (n = 7–10) [13, 16]. However, most of the analyzed studies in previous meta-analyses (e.g., seven of ten) [16] were also included in this analysis. Further, our meta-analysis yielded a nearly significant effect for PL on running performance, with a larger ES than a similar meta-analysis that found a favorable running performance effect after PL (ES = −0.210 vs −0.170, respectively) [13]. The reason for the discrepancies between published meta-analyses and our meta-analysis is currently unclear. Possible methodological differences (e.g., inclusion–exclusion criteria; statistical [meta-analytical] approach) may have played a role.

4.4.3 Combined

Combined involves incorporating more than one ST method into a training program. Our study revealed that Combined produced a significant, large effect on running performance (ES = −1.035, p = 0.036), producing a greater effect than the use of a single ST method alone. Interestingly, the studies that included the Combined method utilized PL in combination with HL and/or SubL, confirming that including PL with resistance training has a favorable effect on running performance [16] and a greater effect than HL alone. In addition, in a secondary analysis, we found that the Combined method has a greater effect (ES = −0.426, p = 0.018) on running economy compared with the HL (ES = −0.266, p = 0.039), SubL (ES = −0.365, p = 0.131), and PL (ES = −0.122, p = 0.167) methods used individually. Therefore, we can assume that this improvement in running economy also translates to enhanced running performance. This can be observed in the study by Li et al. [21], which found that HL alone and HL combined with PL both improved running economy and 5-km running performance. However, despite no significant differences between the two groups, the HL with PL group exhibited a higher percentage improvement than HL alone in both running economy (7.68% vs 4.89% at 14.00 km/h) and running performance (2.80% vs 2.09%) [21].

One reason for the greater effect may be that the incorporation of different ST methods can generate a variety of overloads that challenge the neuromuscular system [19] and potentially enhance running performance by eliciting diverse neuromuscular mechanisms. Additionally, the sequence of exercises corresponding to different ST methods within the same training session or in separate sessions may serve different purposes for the force–velocity profile [151]. For example, contrast training (i.e., high load exercises followed by alternating plyometric exercises) could induce post-activation potentiation by improving the speed of plyometric exercises through enhancing both the force and velocity components, whereas traditional training (i.e., low load exercises followed by high load exercises) may primarily develop the force component and not be potentiated by exercises with low loads [151]. However, improvements have been observed in studies where both ST methods were included in the same session [21, 97, 103], as well as in separate sessions [3]. Of note, of the five studies that included SubL, four [3, 103, 110, 111] instructed athletes to perform exercises with maximal velocity intention, and one [97] described the intervention as explosive training. Maximal movement velocity intention at a given load can positively influence neuromuscular adaptations [152], and therefore running performance adaptations.

4.5 Limitations and Strengths

Some limitations of this meta-analysis should be mentioned. First, because of the different composition of each of the ST methods, we decided to perform a separate analysis of each ST method on each of the performance parameters (i.e., VO2max, vVO2max, MMSS, sprint capacity, and running performance), which resulted in the SubL group not reaching the minimum number of studies (i.e., three studies) for the main analysis in any of the performance parameters, while HL and PL did not reach the minimum number of studies for sprint capacity. In addition, in some cases, the minimum number of studies (i.e., eight studies) for a moderator analysis was not achieved. Second, high heterogeneity was found for Combined in the analysis for running performance (p = 0.009, I² = 67.475%), probably because different types of ST methods with varying effects were included in this group, and thus their effect on running performance should be interpreted with caution. Finally, in this study, we have focused mainly on aerobic parameters, but the anaerobic component is also a determinant of running performance [3], as is durability [153]. The strengths of our study are also important to note. To our knowledge, this is the first meta-analysis to analyze the effect of different ST methods on different parameters determining running performance specifically in middle-distance and long-distance runners. Furthermore, we included time to exhaustion as an indicator of running performance, allowing us to increase the number of studies in the analysis and to discuss durability.
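
For readers less familiar with the heterogeneity statistic reported above for Combined, the following is a minimal sketch of how I² is conventionally derived from Cochran's Q and its degrees of freedom (number of studies minus one). The function name and the input values are hypothetical and chosen for illustration only; they are not taken from this meta-analysis, and the software used here may apply additional adjustments.

```python
def i_squared(q: float, df: int) -> float:
    """Return I² (in %) from Cochran's Q and its degrees of freedom (k - 1).

    I² = max(0, (Q - df) / Q) * 100, i.e., the share of total variability
    attributed to between-study heterogeneity rather than sampling error.
    """
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical values for illustration only (not data from this review).
print(round(i_squared(q=20.0, df=5), 1))  # 75.0
```

Under this definition, an I² close to 67% would indicate that roughly two-thirds of the observed variability reflects between-study differences, which is consistent with the caution advised for the Combined estimate.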

5 Conclusions

In summary, this systematic review with meta-analysis suggests that ST with HL improves running performance measured by time trial and time to exhaustion. Combining the PL method with HL and/or SubL showed greater improvement in running performance compared with the ST methods alone, while the PL method alone did not enhance running performance. These improvements occurred without changes in VO2max, vVO2max, MMSS, or sprint capacity, suggesting that the adaptations are mainly due to non-metabolic factors. These results suggest that middle-distance and long-distance coaches and athletes should consider the inclusion of more than one ST method in their training plan. Future research should aim to analyze and compare the effect of different ST methods, combined and separately, on running performance, as well as the underlying mechanisms related to these effects.