Key Points

A graded relationship exists between the velocity loss (VL) experienced during a set and acute training volume as well as neuromuscular, metabolic, and perceptual responses to resistance training, with factors such as the type of exercise, loads used, and individual characteristics of a trainee seeming to modulate these relationships.

Factors that can specifically affect the consistency of VL determination include reference repetitions, velocity variables (e.g., mean or peak), and criteria for set termination after VL has been exceeded, all of which should be considered when implementing VL in practice.

The amount of VL experienced during resistance training does not seem to affect strength and muscle endurance gains, whereas higher VL may be superior when the aim is to induce hypertrophy. Allowing only low to moderate VL during resistance training seems to be a viable strategy for optimizing jumping and sprinting performance as well as velocity against submaximal loads.

As higher VL experienced during resistance training could interfere with the ability to rapidly produce force, cause a reduction in the expression of fast-twitch muscle fibers, and prolong recovery from resistance training, low to moderate VL could be recommended to optimize strength and power training adaptations as well as the performance of sport-specific tasks. However, if hypertrophy is also the goal, more of the prescribed sets could utilize moderate VLs, or more total sets with low to moderate VL could be performed.

1 Introduction

Resistance training (RT) can produce many adaptations including strength, power, hypertrophy, and endurance, and for this reason plays an integral role in many long-term athlete development programs. While these adaptations may improve performance of athletic tasks such as jumping, sprinting, and change of direction [1, 2], resistance training also plays an important role in injury prevention and rehabilitation and has numerous beneficial effects on health and quality of life [3,4,5,6]. Designing an effective RT program requires careful consideration of many training variables such as the choice and order of the exercises, load, repetition range, volume, rest, intended velocity, and set structure configuration. Among these, training load and volume appear to be the most important training variables dictating the type and extent of acute and chronic adaptations to RT [7,8,9]. Traditionally, load is prescribed relative to a one-repetition maximum (%1RM) while RT volume is manipulated by modifying the total number of sets performed and/or the number of repetitions performed per set. Although this approach is relatively simple and efficient, it does not account for physiological and psychological stressors that might affect an individual’s day-to-day RT performance as well as inter-individual variability in RT performance [10]. For instance, load prescription based on %1RM might be less accurate as maximal strength can fluctuate daily [11] when an individual is fatigued or significantly increase within a few weeks because of training adaptations [12]. Further, the number of repetitions that can be completed with a given %1RM is highly variable as it is both individual and exercise specific [13, 14]. In this regard, sport scientists have explored velocity-based training approaches to load and volume prescription as an alternative method that may circumvent some of these limitations [10].

Load and volume prescription with velocity-based training rests on the premise that there is an inverse linear relationship between barbell velocity and %1RM; heavier loads cannot be lifted with the same velocity as lighter loads [10]. Furthermore, if an exercise is performed with maximal concentric effort and fatigue ensues, barbell velocity inevitably decreases [14]. Indeed, very strong correlations exist between intra-set velocity loss (VL) and mechanical, perceptual, and metabolic markers of fatigue [14,15,16], as well as between VL and the number of completed repetitions relative to the maximum number of repetitions possible in a set [15, 17]. For instance, in the squat, terminating a set after reaching 20% VL would typically result in 50% of the possible repetitions being completed [14], whereas a 40 or 50% VL would result in repetitions performed to, or very near, muscle failure [18]. Therefore, VL may be used as an indicator of fatigue during RT, and thus, may be used to regulate volume and proximity to failure with reasonable precision [14,15,16,17, 19].

Indeed, several studies have been conducted to investigate the acute effects of different VL thresholds on various correlates and markers of fatigue and generally reported nearly linear increases in fatigue as VL increased across the sets [14,15,16, 20]. For instance, Rodríguez-Rosell et al. [16] observed a gradual increase in blood lactate accumulation as VL thresholds increased from 10 to 45% and from 15 to 55% during sets of back squat and bench press, respectively. Weakley et al. [21] observed the same trend with 10, 20, and 30% VL, while also reporting a gradual decline in countermovement jump height and gradual increases in perceived exertion of the lower limbs and breathlessness after each set. Finally, Pareja-Blanco et al. [22] reported that for a given %1RM, a higher magnitude of VL in a set results in greater impairment of neuromuscular performance immediately after the training session and slower post-exercise recovery 24 and 48 h later. While these findings illustrate the utility of monitoring VL for RT prescription, some researchers suggested that the effects of different VL experienced during a set on the magnitude of neuromuscular, metabolic, and perceptual fatigue accumulation might depend upon the exercise and load used [16, 23]. In addition, the magnitude of VL itself could be affected by the reference repetition for determining VL (i.e., first vs fastest) [24] and the criteria for set termination (e.g., terminating a set after one or more repetitions passed below a certain VL threshold) [24]. Finally, although VL is frequently used to prescribe RT volume, the exact number of repetitions performed before reaching certain VL thresholds is also likely affected by the load and exercise used, as well as inter-individual variability and perhaps the reliability of velocity monitoring devices. 
Despite these limitations, different VL thresholds are often used with the aim of creating more homogeneous RT stimuli among individuals, which in turn are thought to lead to more consistent and enhanced long-term adaptations [10], although more research is needed to confirm these speculations.
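To make the methodological factors above concrete, the following is a minimal Python sketch of how intra-set VL might be computed and how a set-termination rule might be applied. It is an illustration under stated assumptions, not the procedure of any reviewed study: the function names, the use of a running fastest repetition as the "fastest" reference, the "at or beyond the threshold" comparison, and the `consecutive` parameter are all hypothetical choices.

```python
from typing import List, Optional


def velocity_loss(velocities: List[float], reference: str = "fastest") -> List[float]:
    """Percent velocity loss of each repetition relative to a reference repetition.

    reference="first"   -> first repetition of the set
    reference="fastest" -> fastest repetition performed so far (assumption:
                           a running maximum, as a live monitoring device would see it)
    """
    if reference not in ("first", "fastest"):
        raise ValueError("reference must be 'first' or 'fastest'")
    if not velocities:
        return []
    ref = velocities[0]
    losses = []
    for v in velocities:
        if reference == "fastest":
            ref = max(ref, v)  # update the running fastest repetition
        losses.append(100.0 * (ref - v) / ref)
    return losses


def termination_rep(velocities: List[float], threshold: float,
                    reference: str = "fastest", consecutive: int = 1) -> Optional[int]:
    """1-based repetition at which the set would stop, or None if never reached.

    The set stops once `consecutive` repetitions in a row are at or beyond
    the VL threshold (hypothetical termination criterion).
    """
    run = 0
    for rep, loss in enumerate(velocity_loss(velocities, reference), start=1):
        run = run + 1 if loss >= threshold else 0
        if run >= consecutive:
            return rep
    return None


# Hypothetical mean concentric velocities (m/s) for one set
set_velocities = [0.80, 0.84, 0.78, 0.70, 0.66, 0.60]
stop_first = termination_rep(set_velocities, 20, reference="first")
stop_fastest = termination_rep(set_velocities, 20, reference="fastest")
stop_two_reps = termination_rep(set_velocities, 20, reference="fastest", consecutive=2)
```

With these hypothetical velocities, the same 20% threshold ends the set at a different repetition depending on the reference repetition and the termination criterion, which is the source of inconsistency described above.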

Considerable evidence is accumulating from longitudinal studies (≥ 4 weeks in duration) comparing the effectiveness of different VL thresholds to one another on muscular strength, hypertrophy, and endurance as well as the performance of athletic tasks. In this regard, it has been suggested that the selected VL threshold can modulate adaptations to training in a dose–response manner [18, 25,26,27]. For instance, Pareja-Blanco et al. [26] recently showed that there might be an upper and lower VL threshold that should be prescribed during RT to induce optimal training adaptations, indicating that the dose–response relationship might follow an inverted U shape. Thus, it was concluded that low to moderate VL thresholds (i.e., 10 and 20%) should be chosen to optimize adaptations to RT because VL thresholds lower than 10% induced levels of fatigue that were too low to maximize adaptations, whereas high VL thresholds (i.e., > 40%) did not promote further strength or hypertrophy, and negatively affected the improvement of athletic tasks compared with moderate VL thresholds [26]. However, not all studies support this as similar improvements in maximal strength [28, 29], hypertrophy [29], and sprinting and jumping performance [28] were observed between lower and higher VL thresholds. To further confound matters, other factors such as training duration, choice of exercise, load, and participant strength levels likely moderate the effects of VL thresholds on various training adaptations.

In light of these considerations and inconsistencies in the scientific literature, there is a clear need for a comprehensive review and synthesis of the available evidence. Therefore, the aim of this systematic review and meta-analysis was to synthesize the available evidence on (1) the acute effects of different VL thresholds on markers of fatigue and number of repetitions per set during RT and (2) the chronic effects of different VL thresholds on training adaptations. This review also aimed to provide an overview of the factors that might differentially influence the magnitude of acute and chronic responses to different VL thresholds, thus providing a more nuanced assessment of the dose–response relationship between VL, acute fatigue accumulation, and various training adaptations. Such information is important to inform RT prescription strategies based on VL thresholds, ultimately allowing for better fatigue management and attainment of intended training adaptations.

2 Methods

2.1 Registration of Systematic Review Protocol

A systematic review of the literature was performed according to the guidelines in the Cochrane Handbook for Systematic Reviews of Interventions (version 6.0) and following the 2020 checklist for the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [30]. The original protocol was prospectively registered at the Open Science Framework (https://osf.io/q4acs/). The protocol registration occurred after searches were conducted, but before screening was completed and data extraction started.

2.2 Eligibility Criteria

All studies included met the following inclusion criteria: (1) the study was published in English; (2) evaluated the acute effects of one or more VL thresholds during RT on neuromuscular, metabolic and perceptual markers of fatigue, and/or examined their chronic effects on muscular strength, hypertrophy, endurance or power adaptations; (3) RT was prescribed using VL thresholds; (4) intensity of load (%1RM) and frequency were matched between conditions; (5) participants had no known medical condition or injury; (6) in acute studies, neuromuscular, metabolic, or perceptual responses (and variability thereof) to these thresholds were considered; (7) in longitudinal studies, the outcomes were assessed pre-intervention and post-intervention for muscular strength with a repetition maximum component, or maximum voluntary contraction test, hypertrophy (lean body mass changes or changes at the muscle level), endurance (total repetitions performed or mechanical work), and power adaptations (jump height, sprint and change of direction times, or velocity at a fixed load); and (8) training interventions in longitudinal studies lasted a minimum of 4 weeks.

2.3 Information Sources and Search Strategy

A PICO strategy consisting of terms for different VL thresholds, RT, and neuromuscular, perceptual, and metabolic outcomes as well as muscular strength, endurance, hypertrophy, and power adaptations was used to build search criteria for electronic databases. To ensure the inclusiveness of the search terms, the Word Frequency Analyser tool (http://sr-accelerator.com/#/help/wordfreq) was used to suggest potentially relevant search terms [31]. In addition, the Research refiner tool (https://ielab-sysrev2.uqcloud.net/) was used to optimize the sensitivity and specificity of the search for PubMed, while the Polyglot Search Translator Tool (https://sr-accelerator.com/#/polyglot) was used to adapt the search to other databases [31, 32]. The search string used for MEDLINE/PubMed is reported in the Electronic Supplementary Material (ESM). The following bibliographic databases were searched from inception to 6 December, 2020: PubMed/MEDLINE, SCOPUS, CINAHL (Cumulative Index to Nursing and Allied Health), SPORTDiscus, and Web of Science. No year restrictions were applied. Secondary searches included: (a) screening the reference lists of all included studies and relevant review papers; (b) examining the studies that cited the included studies (i.e., forward citation tracking) through Google Scholar; and (c) search alerts to monitor any new search results after the date of the last search up to 21 June, 2022.

2.4 Study Selection

Duplicate references were first removed using the EndNote reference manager (version X9.0.3; Clarivate Analytics, Philadelphia, PA, USA). Two authors (IJ and AGR) then independently screened titles and abstracts to determine initial eligibility using the systematic review software Rayyan. Authors were blinded to avoid bias during this process. Thereafter, the authors (IJ and AGR) independently screened the full texts to determine inclusion eligibility. Disagreements over eligibility at any stage were resolved through discussion, or with a third reviewer (BVH) when required.

2.5 Data Extraction

The following data were extracted from the included studies into an Excel spreadsheet: (1) study design and identification information; (2) adherence and study duration; (3) sample size; (4) participants’ age, body mass, height, sex, strength levels, and training experience; (5) relevant information regarding VL thresholds used, including various methodological factors (e.g., reference repetition, velocity variable, prescription method); and (6) means and standard deviations as well as raw mean changes and standard deviations of changes for pre-intervention and post-intervention assessments of the relevant outcome measures. If insufficient data were reported, the authors of those studies were contacted by e-mail. Web Plot Digitizer software (Version 4.1; https://automeris.io/WebPlotDigitizer/) was used to extract data from figures when the authors did not report or provide the data. Data extraction was completed independently by three authors (IJ, AGR, and APC) using two forms (one for acute and one for longitudinal studies) that were pilot tested on five randomly selected studies and modified accordingly. Coding files were cross-checked between the authors, and any differences were resolved via discussion and agreement, or with a fourth reviewer (BVH).

2.6 Risk of Bias Assessment

Risk of bias assessment was performed using a modified Cochrane Collaboration tool for assessing the risk of bias in randomized trials [33]. Modifications included removal of the performance bias and blinding of outcome assessment bias criteria and adding effort bias, feedback bias, training prescription bias (for longitudinal studies only), outcome assessment bias, and familiarization bias. Blinding of outcome assessment bias was excluded as visual and verbal velocity feedback were used in the reviewed studies to ensure participants’ maximal intent, which improves the reliability of performance. Similar to previous systematic reviews and meta-analyses on exercise intervention studies [34, 35], the performance bias criterion was removed because it is impossible to blind participants and personnel in supervised exercise intervention studies. Assessments were completed independently by two reviewers (IJ and ERH) while any observed differences were resolved via discussion and agreement before merging the scores into a single spreadsheet.

2.7 Statistical Analysis

2.7.1 Acute Effects of Velocity Loss Thresholds

While we a priori planned to examine the acute effects of different VL thresholds during RT on repetition volume, neuromuscular, metabolic, and perceptual responses, and potential moderating effects of exercise, training prescription method, reference repetition for VL calculation, load, and strength levels of individuals, this was not done because of one or more of the following reasons: (1) a low number of studies reporting these outcomes; (2) a large amount of missing data; and (3) authors’ non-responsiveness to data request e-mails or refusal to provide data necessary for calculating effect sizes (usually baseline means and standard deviations, standard deviations of difference scores, or pre-post correlations). Attempts were made to circumvent these issues by making assumptions about baseline data based on other studies and by estimating missing data from the data that were available, following the procedures outlined by Elbourne et al. [36] and Borenstein et al. [37]. However, these procedures often produced impossible values (e.g., r > 1), which discouraged us from pursuing the meta-analysis. Nevertheless, to aid the interpretation of the findings, we used the data reported in the original studies and created visualisations that could be used to observe potential trends and interactions between the variables. Importantly, this was done only when a whole range of VL thresholds was investigated for a given outcome.

2.7.2 Chronic Effects of Velocity Loss Thresholds

The nature of our research question with regard to chronic effects of different VL thresholds on muscle strength, hypertrophy, and endurance, as well as sprint, countermovement jump, and velocity against submaximal load performance required the inclusion of a VL threshold, as a continuous moderator, in all meta-analytic models. This was needed as each study compared different VL thresholds to one another, rather than to no training at all (i.e., no control groups were included in the studies).

2.7.2.1 Calculation of Effect Size and Variance

Standardized mean changes were computed to quantify the effect of the intervention using different VL thresholds relative to the baseline, thereby permitting synthesis of the same outcome variable (e.g., strength, hypertrophy) from different procedures or scales. However, raw mean changes were computed and used as a summary measure of effect size when a given outcome was assessed using the same procedure or scale to aid the interpretation of the findings. The standardized mean change (SMC) for each group j was calculated as the difference between post-test and pre-test scores, divided by the pre-test standard deviation, with an adjustment (C_j) for small-sample bias based on the group sample size (n_j) [38,39,40]:

$$\mathrm{SMC}_{j} = C_{j}\left(\frac{M_{j,\mathrm{post}} - M_{j,\mathrm{pre}}}{\mathrm{SD}_{j,\mathrm{pre}}}\right); \quad C_{j} = 1 - \frac{3}{4\left(n_{j} - 1\right) - 1}.$$

The standardized mean change magnitude was interpreted as: small (0.20–0.49), moderate (0.50–0.79), and large (≥ 0.80) [41].
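The effect size calculation and its interpretation can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, the cut-offs are applied as continuous boundaries at 0.20, 0.50, and 0.80, and the label "trivial" for magnitudes below 0.20 is an assumption that is not taken from the text.

```python
def small_sample_correction(n: int) -> float:
    """Bias adjustment C_j = 1 - 3 / (4 * (n_j - 1) - 1)."""
    return 1.0 - 3.0 / (4.0 * (n - 1) - 1.0)


def standardized_mean_change(m_pre: float, m_post: float, sd_pre: float, n: int) -> float:
    """Pre-to-post change standardized by the pre-test SD, with bias adjustment."""
    return small_sample_correction(n) * (m_post - m_pre) / sd_pre


def interpret_magnitude(smc: float) -> str:
    """Qualitative label based on the absolute effect size."""
    size = abs(smc)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "moderate"
    if size >= 0.20:
        return "small"
    return "trivial"  # assumption: label not defined in the text


# Hypothetical group: 1RM of 100 kg (SD 20 kg) before and 110 kg after training, n = 10
example_smc = standardized_mean_change(m_pre=100.0, m_post=110.0, sd_pre=20.0, n=10)
example_label = interpret_magnitude(example_smc)
```

Standardizing by the pre-test standard deviation keeps the effect size independent of how much the spread of scores changes with training, which is why only the baseline SD enters the denominator.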

No studies reported the pre-intervention to post-intervention correlations required to determine the variance. Therefore, when the authors did not provide correlations upon our request, standard deviations of the pre-intervention to post-intervention change were used to calculate pre-to-post correlations using the following formula:

$${r}_{j}= \frac{{\mathrm{SD}}_{j,\mathrm{pre}}^{2} + {\mathrm{SD}}_{j,\mathrm{post} }^{2}- {\mathrm{SD}}_{j,\mathrm{ change}}^{2}}{2 \times {\mathrm{SD}}_{j,\mathrm{ pre}} \times {\mathrm{SD}}_{j,\mathrm{post}}}.$$

The corresponding authors were contacted when the standard deviations of the pre-intervention to post-intervention change were not reported. Of all the corresponding authors, one did not respond [42], whereas the corresponding author of the following studies included in this review [43,44,45] declined to provide the requested data. The other authors provided the necessary data to calculate the variance. For the missing standard deviation of the pre-intervention to post-intervention change, the median correlation using all other studies for a given outcome was imputed. This ensured that the maximum number of studies were included. The variability in designs among eligible studies required several decisions to ensure the data could be appropriately combined for the calculation of effect sizes. These decisions are detailed in the ESM.
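The correlation formula and the median imputation described above can be sketched in Python as follows. The function names are hypothetical, and the plausibility check is an added illustration of why inconsistent inputs can yield values outside the range of −1 to 1; it is not a step reported by the authors.

```python
from statistics import median
from typing import List, Optional


def pre_post_correlation(sd_pre: float, sd_post: float, sd_change: float) -> float:
    """r = (SD_pre^2 + SD_post^2 - SD_change^2) / (2 * SD_pre * SD_post)."""
    return (sd_pre ** 2 + sd_post ** 2 - sd_change ** 2) / (2.0 * sd_pre * sd_post)


def is_plausible(r: float) -> bool:
    """A correlation must lie between -1 and 1 (illustrative check)."""
    return -1.0 <= r <= 1.0


def impute_median(correlations: List[Optional[float]]) -> List[float]:
    """Replace missing correlations (None) with the median of the available ones."""
    known = [r for r in correlations if r is not None]
    if not known:
        raise ValueError("at least one known correlation is required")
    fill = median(known)
    return [fill if r is None else r for r in correlations]


# Hypothetical standard deviations for one group
r_example = pre_post_correlation(sd_pre=10.0, sd_post=12.0, sd_change=5.0)

# Inconsistent inputs (e.g., assumed rather than observed SDs) can give r > 1
r_spurious = pre_post_correlation(sd_pre=10.0, sd_post=20.0, sd_change=5.0)

# Hypothetical correlations for one outcome, with one study missing
imputed = impute_median([0.9, None, 0.7, 0.8])
```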

2.7.2.2 Statistical Synthesis of Effect Sizes

Most studies in the quantitative part of the synthesis (81.2%) provided two or more effect sizes while comparing the effects of different VL thresholds. Effect sizes from the same study are likely more similar than effect sizes from different studies [46]. Thus, the inclusion of multiple effect sizes from a single study violates the assumption of independence of effect sizes in traditional meta-analyses (e.g., [47, 48]). As such, a three-level meta-analysis (i.e., a multilevel model) was used to account for dependencies among effect sizes from the same study [49]. A multilevel meta-analysis accounts for the hierarchical nature of the data (e.g., effect sizes nested within studies) and, in so doing, allows multiple effect sizes to be extracted from each study, which preserves information and improves statistical power [46]. This approach also decomposes the variance components of the pooled effect into sampling variance of the observed effect sizes (level 1), and variance within (level 2) and between studies (level 3) [47]. A multilevel meta-analysis was conducted for every outcome separately except for velocity at submaximal loads. For velocity against submaximal (low and moderate) load outcomes, a multivariate mixed-effects meta-regression was performed. In addition, cluster-robust variance estimation methods [50] with small-sample adjustments [51] were implemented to calculate standard errors of the overall effect size estimates, with clustering at the study level. This was done because (1) most studies reported changes in velocity against low and moderate loads and (2) all these studies reported multiple effect sizes for both sub-outcomes (i.e., moderate and low loads), and different VL thresholds. Therefore, these two sub-outcomes were highly correlated as the data from the same participants were analyzed multiple times for both sub-outcomes, giving rise to both hierarchical and correlated effects for this outcome.
The correlation (ρ) between moderate and low loads was assumed to be 0.6. Observations were weighted by the inverse of the sampling variance, and all (final) model parameters were estimated by the restricted maximum likelihood estimation method. Tests of individual coefficients in all models, and their corresponding confidence intervals, were based on a t-distribution. Multilevel and multivariate models were fitted in R language and environment for statistical computing (version 4.0.5; R Core Team, Vienna, Austria) using the metafor package [52], while the cluster-robust variance estimation method was implemented using the clubSandwich package [53].
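To illustrate what assuming ρ = 0.6 between the two sub-outcomes implies, the following Python sketch builds the sampling variance-covariance matrix that a multivariate model would take as input. It is an assumption-laden illustration rather than the authors' code: the function names and the example variances are hypothetical, and each group is assumed to contribute exactly one low-load and one moderate-load effect size.

```python
from math import sqrt
from typing import List

Matrix = List[List[float]]


def within_group_block(v_low: float, v_moderate: float, rho: float = 0.6) -> Matrix:
    """2 x 2 sampling covariance block for the low- and moderate-load effect sizes
    of one group, with covariance = rho * sqrt(v_low * v_moderate)."""
    cov = rho * sqrt(v_low * v_moderate)
    return [[v_low, cov], [cov, v_moderate]]


def block_diagonal(blocks: List[Matrix]) -> Matrix:
    """Stack per-group blocks on the diagonal; effect sizes from different
    groups are treated as having zero sampling covariance."""
    size = sum(len(block) for block in blocks)
    full = [[0.0] * size for _ in range(size)]
    offset = 0
    for block in blocks:
        for i, row in enumerate(block):
            for j, value in enumerate(row):
                full[offset + i][offset + j] = value
        offset += len(block)
    return full


# Hypothetical sampling variances for two training groups
V = block_diagonal([
    within_group_block(0.04, 0.09),
    within_group_block(0.01, 0.01),
])
```

Repeating the construction with other values of `rho` changes only the off-diagonal entries within each block, so the fitted estimates can be compared across assumed correlations.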

2.7.2.3 Moderator and Sensitivity Analyses

All meta-analytic models (i.e., multilevel and multivariate mixed-effects meta-regressions) included VL as a continuous moderator. Further, other theoretically relevant moderators were included when (1) the number of effect sizes was sufficient (at least eight to ten per moderator) and (2) the range of observations (or levels in case of categorical predictors) was not very narrow or identical among the studies. These moderators included study duration (continuous predictor), exercise (upper or lower body exercise), loads (higher and lower than 70% of 1RM), and strength levels (continuous predictor). The exercise moderator was categorized because back squat and bench press were the most prevalent exercises among the studies. In addition, the loads moderator was categorized as the majority of primary studies used progressive overloads across the weeks and averaging these loads to a single number might not accurately represent the loads used in a given study. Because of the inclusion of both fixed and random effects, restricted maximum likelihood estimation was used to evaluate the final models for each outcome. Furthermore, the contribution of each moderator, and of modeled interactions among predictors, to the explanatory power of the explored models was examined using a likelihood ratio test, deviance statistic, and Akaike information criterion score corrected for small sample sizes before selecting the final model, to obtain the best fit while maintaining model parsimony. During this process, models were fitted and subsequently compared using the maximum likelihood method, as likelihood ratio tests cannot be used to compare models that differ in their fixed effects when the models are fitted with restricted maximum likelihood estimation [54].
Finally, a dose–response relationship considering (1) individual study effect sizes; (2) average effect sizes of individual VL thresholds; and (3) average effect sizes of low (≤ 15% VL), moderate (> 15% < 30% VL), and high (> 30% VL) grouped VL thresholds was also evaluated for each outcome to aid interpretation of the findings.
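The model-comparison quantities named in this section (likelihood ratio test and the Akaike information criterion corrected for small samples) can be sketched in Python as follows. This is a generic illustration, not the authors' code: the function names and log-likelihood values are hypothetical, and the p-value helper covers only the case in which the two nested models differ by a single parameter (one degree of freedom).

```python
from math import erfc, sqrt


def aicc(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """AIC with small-sample correction:
    AICc = -2*logLik + 2k + 2k(k + 1) / (n - k - 1)."""
    aic = -2.0 * log_likelihood + 2.0 * n_params
    return aic + (2.0 * n_params * (n_params + 1)) / (n_obs - n_params - 1)


def lrt_statistic(loglik_reduced: float, loglik_full: float) -> float:
    """Likelihood ratio statistic for nested models fitted by maximum likelihood."""
    return 2.0 * (loglik_full - loglik_reduced)


def lrt_p_value_df1(statistic: float) -> float:
    """Upper-tail chi-square probability for 1 degree of freedom."""
    return erfc(sqrt(statistic / 2.0))


# Hypothetical maximum likelihood fits: a model with VL only versus VL plus one moderator
loglik_reduced, loglik_full = -12.0, -9.5
stat = lrt_statistic(loglik_reduced, loglik_full)
p_value = lrt_p_value_df1(stat)
aicc_reduced = aicc(loglik_reduced, n_params=4, n_obs=30)
aicc_full = aicc(loglik_full, n_params=5, n_obs=30)
```

A lower AICc and a small likelihood ratio p-value both favor the fuller model, whereas similar values favor retaining the more parsimonious one.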

For all meta-analytic models, leverage, influential case, and outlier diagnostics were performed by calculating hat values, Cook’s distances, and studentized residuals, respectively [55,56,57]. Cases were flagged when their hat values and Cook’s distances were greater than three times the respective means, and when their studentized residuals were greater than 3 in absolute value. For the multivariate model investigating the effects of VL thresholds on velocity against submaximal loads, a range of correlations between the outcomes was imputed (ρ = 0.4–0.8) to ensure the robustness of the estimates.
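The flagging rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name and the diagnostic values are hypothetical, the diagnostics are assumed to have been computed already by the model-fitting software, and a case is reported if it meets any one of the three criteria, together with the criteria it met.

```python
from statistics import mean
from typing import List, Tuple


def flag_cases(hat: List[float], cooks: List[float], rstudent: List[float],
               multiplier: float = 3.0,
               residual_cutoff: float = 3.0) -> List[Tuple[int, List[str]]]:
    """Return (index, reasons) for every case exceeding at least one cut-off.

    leverage : hat value > multiplier * mean hat value
    influence: Cook's distance > multiplier * mean Cook's distance
    outlier  : |studentized residual| > residual_cutoff
    """
    hat_cut = multiplier * mean(hat)
    cook_cut = multiplier * mean(cooks)
    flagged = []
    for i, (h, d, r) in enumerate(zip(hat, cooks, rstudent)):
        reasons = []
        if h > hat_cut:
            reasons.append("leverage")
        if d > cook_cut:
            reasons.append("influence")
        if abs(r) > residual_cutoff:
            reasons.append("outlier")
        if reasons:
            flagged.append((i, reasons))
    return flagged


# Hypothetical diagnostics for five effect sizes
flags = flag_cases(
    hat=[0.1, 0.1, 0.1, 0.1, 1.0],
    cooks=[0.01, 0.01, 0.01, 0.5, 0.01],
    rstudent=[0.5, -3.2, 1.0, 0.2, 0.1],
)
```

Flagged cases would then be removed in a sensitivity analysis to check whether the interpretation of the model changes.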

Publication bias was not assessed as we were not interested in the effects of training interventions in individual studies, but rather in the moderating effect of the VL thresholds examined within those studies. In addition, there was no reason to expect that a certain training intervention would not result in a significant improvement over time in at least some of the outcomes given the absence of control groups (interpreted here as groups who would not train at all).

2.7.2.4 Statistical Heterogeneity

As all multilevel models included moderators (i.e., VL), statistical heterogeneity was evaluated using I² and τ², which represent the relative and absolute amounts of residual heterogeneity, respectively (i.e., the variability left unaccounted for by the moderators) [58]. This heterogeneity was then partitioned across two levels (i.e., within-study and between-study heterogeneity). Importantly, for all multilevel models, the estimated proportional reduction in the total variance was computed using the variance accounted for, a pseudo R² value (i.e., the amount of heterogeneity accounted for by the moderators) [59]. For the cluster-robust multivariate meta-regression, the amount of heterogeneity (τ²) for each outcome was calculated as well as the correlation between the outcomes (ρ).
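One common way to obtain these indices can be sketched in Python as follows. This is a simplified illustration and an assumption about the computation, not the authors' exact procedure: it uses the "typical" sampling variance of Higgins and Thompson to partition I² across levels, ignores the adjustment for moderators that dedicated software applies to residual heterogeneity, and defines pseudo R² as the proportional reduction in total heterogeneity relative to a model without moderators. All names and numbers are hypothetical.

```python
from typing import Dict, List


def typical_sampling_variance(variances: List[float]) -> float:
    """Typical within-study sampling variance:
    (k - 1) * sum(w) / (sum(w)^2 - sum(w^2)), with w = 1 / v."""
    weights = [1.0 / v for v in variances]
    k = len(weights)
    total = sum(weights)
    return (k - 1) * total / (total ** 2 - sum(w * w for w in weights))


def multilevel_i2(variances: List[float], tau2_within: float,
                  tau2_between: float) -> Dict[str, float]:
    """Percentage of total variability attributed to within-study (level 2)
    and between-study (level 3) heterogeneity."""
    v_typical = typical_sampling_variance(variances)
    total = v_typical + tau2_within + tau2_between
    return {
        "within": 100.0 * tau2_within / total,
        "between": 100.0 * tau2_between / total,
    }


def pseudo_r2(tau2_without_moderators: float, tau2_with_moderators: float) -> float:
    """Proportional reduction in heterogeneity, truncated at zero."""
    reduction = (tau2_without_moderators - tau2_with_moderators) / tau2_without_moderators
    return max(0.0, reduction)


# Hypothetical values: four effect sizes with equal sampling variance
i2 = multilevel_i2([0.04, 0.04, 0.04, 0.04], tau2_within=0.02, tau2_between=0.04)
r2 = pseudo_r2(tau2_without_moderators=0.10, tau2_with_moderators=0.06)
```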

3 Results

3.1 Search Results

The primary search yielded 545 results, of which 22 met the inclusion criteria. Forward citation tracking as well as monitoring the newly published relevant literature yielded an additional 15 studies, resulting in 37 studies included in this review. The stages of the search and study selection process are presented in Fig. 1.

Fig. 1 Literature search flow chart. n number of studies

3.2 Study Characteristics

Of the 37 studies included, 18 were randomized cross-over acute studies, and 19 were training intervention studies. The total number of participants pooled across studies was 846 (767 were male and 69 were female). However, upon inspection, it was clear that data from the same participants were used in multiple studies [20, 60,61,62]. This reduced the total number of participants to 735 (656 were male and 69 were female). Only five studies [29, 63,64,65,66] included both male and female participants and two included only female participants [67, 68], while the rest included only male participants. Back squat was the most frequently used exercise (26 studies), followed by bench press (12 studies), deadlift (two studies), bench pull, overhead press, leg press, loaded countermovement jump, and pull-up (one study each). Eleven studies used free-weight exercises, while the remaining studies used a Smith machine. A large range of VL thresholds was examined (0–55%) with 10, 20, 30, and 40% VL thresholds being the most frequent (ten or more studies each). In addition, participants with a large range of strength levels (1RM/body mass) were examined with the average lower and upper body maximal strength of participants being 1.48 (range 0.7–2.2) and 1.15 (range 0.65–1.56) times body mass, respectively. Velocity loss thresholds were prescribed using either the first repetition (14 studies) or the fastest repetition (23 studies) of the set as the reference point. Load was prescribed with percentage of 1RM (12 studies), generalized load-velocity profiles (22 studies), and individualized load-velocity profiles (four studies). For longitudinal studies, the median study duration was 8 weeks (range 4–12). A more comprehensive description of the participants and the included studies can be found in Tables 1, 2, and 3.

Table 1 Study characteristics
Table 2 Summary of the acute studies included in the review
Table 3 Summary of the longitudinal studies included in the review

3.3 Risk of Bias Assessment

Only three studies [64, 66, 69] provided sufficient information regarding the method of randomization and were therefore at a low risk of an order effect bias. The remaining studies were classified as an unclear risk as they did not provide sufficient information regarding the method of randomization. No studies provided information regarding allocation concealment. One study [65] was at a high risk of attrition bias, excluding randomized participants (or their data) from the analysis without sufficient reason. Six studies [16, 20, 21, 43, 62, 70] did not provide sufficient information on the number of participants assessed and included in the analysis after reporting that some of them did not complete the entire intervention or all procedures and hence, had an unclear risk of attrition bias. No studies pre-registered their protocols on a publicly available registry platform, thus it was unclear whether selective reporting bias was present. Two studies [65, 67] had an unclear risk of effort bias as they did not provide information regarding the instructions to perform the concentric actions as fast as possible. The remaining studies had a low risk of effort bias as the instruction to perform concentric actions as fast as possible was given. Ten studies [63,64,65,66, 68, 70,71,72,73,74] did not provide any information on the provision of velocity feedback and hence, had an unclear risk of feedback bias. The rest of the studies either provided feedback to all groups or standardized the conditions between groups by not providing any feedback. Seven studies [28, 29, 66, 67, 74,75,76] were at a high risk of training prescription bias because the participants performed other forms of training (additional non-standardized RT, endurance training, or playing sports), or because not all exercises used VL thresholds, but rather a combination of training prescriptions. 
Two studies [64, 65] used a linear encoder that was not, to our knowledge, validated in the peer-reviewed literature whereas all other studies used valid and reliable methods, equipment, or instruments to evaluate their outcomes of interest. Fourteen studies [18, 25, 26, 42,43,44,45, 60, 61, 70, 73, 77,78,79] were at a high risk of bias for not having a familiarization session. Four studies [69, 75, 76, 80] did not provide sufficient information regarding their familiarization sessions and hence, had an unclear risk of bias. The rest of the studies provided sufficient information about familiarization session procedures or specifically stated that all participants were accustomed to the study protocols (i.e., performed them in the past). The risk of bias assessment is also illustrated in Fig. 2.

3.4 Acute Studies

The following variables were visualized: (1) the mean and standard deviation of the number of repetitions performed in the set; (2) changes in countermovement jump height performance; (3) velocity against the load that can be lifted at 1 m·s⁻¹ in a rested state (V1); and (4) blood lactate concentration after training sets or the entire session (Figs. 3, 4). In addition, to examine the discrepancy between the VL threshold prescribed and the actual VL experienced by the participants in each study, standard deviations of the actual VL experienced were visually represented using density plots (Fig. 3).

3.5 Longitudinal Studies

For all multilevel models, significant moderators and sensitivity analyses are described in the text, whereas their output is presented in Table 4 and visualized in Figs. 5, 6 and 7. For the multivariate model, all information is described in the text, and model estimates are visualized in Fig. 6b. Dose–response relationships, as quantified by effect sizes, between VL and outcomes of interest are also illustrated in Figs. 5, 6 and 7.

3.5.1 Muscle Strength

The final multilevel model investigating the effects of different VL thresholds on maximal strength gains revealed exercise, strength levels, and study duration to be significant moderators (Table 4; Fig. 5a). Two individual groups from two different studies were identified as influential. Excluding these influential groups from the analysis affected the interpretation of the model, with exercise (b = − 0.163 [− 0.416, 0.094]; p = 0.206) and strength levels (b = − 0.181 [− 0.655, 0.293]; p = 0.444) no longer being significant moderators.

3.5.2 Muscle Hypertrophy

The final multilevel model investigating the effects of different VL thresholds on muscle hypertrophy revealed VL to be a significant moderator (Table 4; Fig. 5c). Two individual groups from two studies were identified as influential. Excluding these influential groups from the analysis affected the interpretation of the model, with VL no longer being a significant moderator (b = 0.005 [− 0.002, 0.013]; p = 0.144).

3.5.3 Muscle Endurance

The final multilevel model investigating the effects of different VL thresholds on muscle endurance did not reveal VL to be a significant moderator (Table 4; Fig. 7a). Two individual groups from two different studies were identified as influential. However, the overall results were robust to their exclusion from the model as the interpretation of the model did not change.

3.5.4 Countermovement Jump Height

The final multilevel model investigating the effects of different VL thresholds on the countermovement jump revealed VL and study duration to be significant moderators (Table 4; Fig. 6a). Three individual groups from three different studies were identified as influential. However, the overall results were robust to their exclusion from the model as the interpretation of the model did not change. In fact, the confidence in the estimate for both VL (b = − 0.048 [− 0.073, − 0.023]; p = 0.001) and study duration (b = 0.400 [0.105, 0.695]; p = 0.010) increased after their removal.

3.5.5 Sprint Time

The final multilevel model investigating the effects of different VL thresholds on sprint time revealed VL and study duration as significant moderators (Table 4; Fig. 6c). Three individual groups from three different studies were identified as influential. Excluding these influential groups from the analysis affected the interpretation of the model, with study duration no longer being a significant moderator (b = − 0.005 [− 0.031, 0.021]; p = 0.696).

Table 4 Moderator analyses (of multilevel models)

3.5.6 Velocity Against Submaximal (Low and Moderate) Loads

For the final multivariate model investigating the effects of different VL thresholds on velocity against low and moderate loads, seven groups from five studies were identified as influential. Because of the high number of influential groups, these were excluded, and estimates of the model without these influential groups were retained (Fig. 7c). This model revealed VL (b = − 0.018 [− 0.029, − 0.006]; t = − 3.69; p = 0.010) and load (b = 1.182 [0.342, 2.022]; t = 3.12; p = 0.011) as significant moderators (note that low load was a reference outcome). The interaction between the VL and outcome was not significant (b = 0.014 [− 0.007, 0.035]; t = 1.73; p = 0.146). Heterogeneity for the low load outcome was considerably lower (τ2 = 0.235) compared with the moderate load outcome (τ2 = 2.034) with the model-estimated correlation between the outcomes being high (ρ = 0.844). Imputing a range of different correlations between the low and moderate loads (ρ = 0.4–0.8) did not affect the interpretation of the model, confirming the robustness of the estimates.
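
As an illustration of the sensitivity analysis described above, the following minimal sketch shows one common way of constructing the within-group sampling covariance matrix for two correlated outcomes when their correlation has to be imputed. It assumes that the imputed ρ refers to the sampling correlation between the low-load and moderate-load effect sizes; the function name and the variances used are hypothetical and are not taken from the included studies or from the software used in the present review.

```python
import math
from typing import List


def within_group_vcov(v_low: float, v_mod: float, rho: float) -> List[List[float]]:
    """2 x 2 sampling covariance matrix for two effect sizes from one group.

    v_low, v_mod: sampling variances of the low- and moderate-load effect sizes
    rho: imputed (assumed) correlation between the two effect sizes
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie between -1 and 1")
    cov = rho * math.sqrt(v_low * v_mod)
    return [[v_low, cov], [cov, v_mod]]


# Hypothetical sampling variances; rho is varied over the range imputed in the text.
for rho in (0.4, 0.5, 0.6, 0.7, 0.8):
    print(rho, within_group_vcov(0.04, 0.09, rho))
```

Refitting the model once per imputed value and checking whether the moderator estimates keep their sign and significance is the kind of robustness check the text refers to.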

Fig. 2 Risk of bias assessment for all included studies. Na not applicable

4 Discussion

The present systematic review evaluated the acute effects of different VL thresholds on volume and fatigue during RT and meta-analyzed their chronic effects on training adaptations while considering several factors that might differentially influence the magnitude of these acute and chronic responses. Several interpretations stem from our findings: (1) while the number of repetitions per set generally increases as VL increases, the variability in repetitions performed is modulated by exercise choice and load; (2) because of these increases in repetitions per set, blood lactate concentration and rating of perceived exertion increase whereas countermovement jump, sprinting, and V1 performance decrease proportionally as VL increases, although the magnitude of these effects is highly influenced by exercise and load; (3) the specific VL threshold used does not have a profound effect on gains in strength and muscle endurance; and (4) selecting moderate to high VL thresholds for hypertrophy, and low to moderate thresholds for enhancing countermovement jump, sprint, and velocity against submaximal loads, may be a viable strategy to induce superior training adaptations. Therefore, many factors should be considered when prescribing RT using VL thresholds to create more homogeneous stimuli among individuals, thereby optimizing fatigue management and intended training adaptations.

4.1 Effects of Velocity Loss Thresholds on the Number of Repetitions Completed Per Set

Researchers have recommended RT prescription with VL thresholds over traditional methods owing to the strong relationship between the magnitude of VL and the number of repetitions performed with respect to the total number that can be completed before reaching failure [15, 17]. The argument is strengthened by the fact that the number of repetitions performed to failure with a given %1RM has a high inter-individual variability [13]. However, this argument does not discount that the number of repetitions performed before reaching different VL thresholds might also have a high inter-individual variability. Indeed, this contention seems to be empirically supported because data from two recent studies [21, 81] suggest that the number of repetitions performed until reaching 10, 20, and 30% VL in the free-weight back squat exercise is not only highly variable between individuals but is also unstable across sessions. In addition, this inter-individual variability may increase as the magnitude of VL increases [21]. Based on the studies included in the present review, it seems that exercise choice and load can further influence the actual number of repetitions performed and the variability thereof (Fig. 3). Specifically, both the actual number of repetitions and its variability seem to be higher in the back squat compared with the bench press exercise across VL thresholds. Furthermore, both factors tend to have a strong inverse relationship with load, as higher loads allowed for fewer repetitions and produced lower variability in repetitions across VL thresholds. This is a previously overlooked outcome as studies often focus on the ability of VL thresholds to modulate, with acceptable reliability, the percentages of the completed repetitions per set with respect to the maximum number of repetitions possible [15, 17] and kinetic and kinematic outputs [21, 62, 81]. 
Although these aspects of VL thresholds present an advantage over traditional methods for prescribing RT volume, the effects of the variability of the actual number of repetitions performed before reaching a certain VL threshold have not yet been empirically investigated. It is possible that individuals completing different numbers of repetitions using the same VL threshold might experience different degrees of neuromuscular, metabolic, and perceptual fatigue, potentially influencing resultant training adaptations. In this regard, it is unknown whether the specific VL threshold is a more important variable than the actual number of repetitions performed, as no studies to date have compared different VL thresholds matched for volume. Collectively, based on the studies included in the present review, it seems that the use of VL thresholds for RT prescription could result in considerable variability in the actual number of repetitions completed per set, which can be further confounded by other factors such as the choice of exercise and the load used. Whether this variability could modulate both the acute and chronic effects of VL thresholds presents an interesting avenue for future research.

Fig. 3 Visual representation of the mean number of repetitions performed per set by intensity of load (a) and exercise (b), as well as standard deviation of the number of repetitions performed per set by intensity of load (c) and exercise (d) across the velocity loss thresholds reported in the literature. Note, longitudinal studies were also included here when they reported number of repetitions per set for each training session. Note, one study outlier was removed from the figure as the participants completed more than 25 repetitions in a set

4.2 Acute Effects of Velocity Loss Thresholds on Neuromuscular, Metabolic, and Perceptual Markers of Fatigue

Fatigue is traditionally defined as a loss of force-generating capacity with the eventual inability to sustain exercise at the required or expected level [82, 83]. Muscle-shortening velocity decreases and relaxation time increases as fatigue ensues [84]. In this regard, velocity against a fixed load (e.g., V1) before and after RT is often used as a marker of neuromuscular fatigue in studies investigating the acute effects of different VL thresholds. Indeed, this marker has a high correlation (r > 0.9) with other markers of fatigue such as blood lactate and ammonia accumulation as well as countermovement jump height loss after RT [14,15,16, 20]. Therefore, it is not surprising that several studies reported an almost linear decrease in post-session V1 and countermovement jump height, as well as an increase in blood lactate accumulation, as VL increased [14,15,16, 21]. However, the dose–response relationship of VL with these markers of fatigue seems to be modulated by the exercise and load used (Fig. 4). For instance, as load decreases while using a given VL threshold, greater reductions in post-session V1 and countermovement jump height are observed [16]. Furthermore, Rodríguez-Rosell et al. [16] observed greater declines in post-session V1 in the bench press compared with the back squat, independent of load and VL. The authors attributed these V1 differences between exercises to the smaller muscles (with more type II fibers and a higher fatiguability index) involved in the bench press than in the squat exercise [85,86,87]. Rodríguez-Rosell et al. [16] also reported greater blood lactate accumulation during the back squat compared with the bench press, regardless of the load used and VL experienced. In addition, the rate at which metabolic stress increased as the VL increased was considerably lower with greater loads (i.e., 80% 1RM) during the back squat but not the bench press, for which metabolic stress uniformly increased as the VL increased regardless of the load used.
Therefore, it seems that VL thresholds induce differential neuromuscular and metabolic responses to RT depending on the exercise used.

Fig. 4 Visual representation of the variability of the actual velocity loss experienced in a set (a), post-session blood lactate accumulation across velocity loss thresholds by exercise and intensity of load (b), pre-post percent change in velocity against the load that can be lifted at 1 m·s−1 (V1) by exercise and intensity of load (c), and pre-post percent change in countermovement jump (CMJ) height (d) across velocity loss thresholds reported in the literature

One potential explanation for this phenomenon could lie in the actual number of repetitions performed before reaching different VL thresholds. Namely, while the RT protocols employing different exercises used the same VL threshold, it is plausible that performing more work (i.e., more repetitions) until reaching a given VL led to a greater blood lactate accumulation [88, 89]. This is supported by the findings of Weakley et al. [90], which showed greater metabolic responses accompany increases in work completed during RT. Studies included in this review generally show that a higher number of repetitions are completed with the back squat compared with the bench press (Fig. 3). Therefore, when completing more work with the back squat compared with the bench press for a given VL threshold, higher metabolic stress is a logical outcome. Thus, the actual training volume completed in a set with a given VL threshold is an important consideration when prescribing RT. Considering the above, it seems that neuromuscular responses are less sensitive to subtle changes in volume during a set compared with metabolic responses, whereas greater neuromuscular fatigue is induced when using exercises involving smaller muscle groups (greater localized fatigue) with greater percentages of type II muscle fibers (a higher fatiguability index). However, countermovement jump height, also a valid marker of neuromuscular fatigue [91], seems to be extremely sensitive to changes in load (Fig. 4d). As higher loads typically allow for less volume (i.e., repetitions) to be completed in a set, it is plausible that countermovement jump height would also be sensitive to subtle changes in training volume, highlighting that different neuromuscular fatigue assessments might differ in sensitivity. Nevertheless, future research should substantiate these contentions.

Based on the available literature, rating of perceived exertion also seems to increase as VL increases. For instance, Weakley et al. [21] found gradual increases in perceived exertion of the lower limbs and breathlessness after each set with 10, 20, and 30% VL. More specifically, the rate of increase in both perceptual measures seemed to be consistent for the 10% VL threshold, whereas perceived exertion of the lower limbs increased at a greater rate compared with breathlessness across sets with higher VL thresholds (20 and 30%), although the overall magnitude of both perceptual responses was similar. This finding is somewhat supported by Emanuel et al. [92] who reported that the most frequent cause of set termination during sets of back squats to volitional failure was perceived fatigue in the targeted muscles, whereas cardiovascular factors were not as frequent a cause. However, this likely depends on the training background of the individuals. Based on these findings, prescribing larger velocity loss thresholds (e.g., 20 and 30%) for back squats might lead to larger increases in perception of leg muscle exertion than breathlessness across repeated sets. Similar findings were reported by Dos Santos et al. [70] who found that both perceived exertion and discomfort linearly increased as the number of back squat sets increased with a 30% VL threshold. Although it has not been discussed in the literature, the intention of continuously performing repetitions as fast as possible might also impact perceptual responses, especially leg muscle exertion [21] and perceived discomfort [70]. Admittedly, this hypothesis is challenging to investigate as the provision of maximal intent is a prerequisite for reliable velocity outputs.

The time course of fatigue recovery following RT depends on a myriad of factors including training volume and load. Despite the proposed benefits of VL thresholds in the literature [10, 14, 15], only Pareja-Blanco et al. [22] examined the time course of recovery after using different VL thresholds and loads during RT. For this purpose, the researchers examined vertical countermovement jump height, 20-m sprint time, and V1 before RT, and immediately, 6, 24, and 48 h post-back squat training with a combination of 20 and 40% VL and 60 and 80% of 1RM. Interestingly, with 60% 1RM, regardless of the VL used (20 vs 40%), none of the performance tasks fully returned to pre-exercise values at 48 h post-RT. In contrast, the RT protocol using higher loads (80% 1RM) and lower VL (20%) resulted in less performance impairment immediately after RT, and greater sprint performance at 48 h post-RT compared with baseline. Sprint time also generally recovered faster than countermovement jump height and V1, suggesting that the latter two measures are more sensitive for detecting RT-induced neuromuscular fatigue and that recovery may be exercise dependent. Nevertheless, prescribing higher VL (e.g., 40%) and lower relative loads (e.g., 60% 1RM) could result in greater fatigue immediately after RT and a slower rate of recovery than lower VL (e.g., 20%) and higher relative loads (e.g., 80% 1RM). This finding is especially relevant for sports where RT precedes sport-specific training, in which case an appropriate VL may decrease interference with subsequent sports training.

4.3 Methodological Considerations When Implementing Velocity Loss Thresholds and Future Research Directions

Several research groups have suggested that implementing VL thresholds may allow for better fatigue management compared with traditional RT prescription methods [14, 15]. It has also been suggested that VL can serve as a valid indicator of fatigue because of its high correlation with other frequently used neuromuscular and metabolic markers of fatigue [14,15,16,17]. While this presents a considerable advancement for RT monitoring and prescription, there are a few methodological factors that could compromise the utility of VL thresholds both in research and practice. First, it is not clearly understood when exactly one should terminate a set after reaching a pre-determined VL threshold. In the literature, set termination after either one or two repetitions exceeding a VL threshold is common. The rationale for two repetitions is based on the fact that individuals can in some cases produce a velocity above a certain VL threshold, even after this threshold was exceeded for the first time [24]. On this note, some of the studies included in this review, all of which used VL to prescribe RT, reported considerable variability in the VL achieved at the end of a set (Fig. 4a). The magnitude of this variability reported in several studies [70, 80, 93, 94] ranged from 5 to 13%. At the extreme end of this range, one could theoretically expect an individual to reach 40% VL in a set when only 30 or 35% was intended. These limitations should be considered in practice and future research should investigate ways of reducing this variability. Second, the reference repetition from which the VL is calculated (i.e., the first or the fastest in the set) is an important consideration as it affects the VL achieved and subsequently the number of repetitions performed [24]. As the first repetition is not always the fastest [24, 95, 96], it is important to use the fastest repetition as the reference for VL calculations to ensure more precise RT monitoring and prescription.
Third, a reduction in the ability to accelerate the load at the beginning of the concentric phase will likely affect mean velocity more than peak velocity [97, 98]. In this regard, mean velocity should be used rather than peak velocity when implementing VL in training because of its higher sensitivity in detecting the fatigue progression during a set [24]. Fourth, while studies established a close relationship between VL and the percentage of the repetitions completed out of the maximum possible, these percentages may have a high inter-individual variability [24]. In this regard, future research should investigate whether prescribing individualized VL thresholds could circumvent these uncertainties associated with prescribing the same VL for all individuals in a training session. Finally, while the effects of load and exercise selection were thoroughly discussed in the present review, there are other potentially relevant factors such as strength and height of the individual that might affect the utility of VL in practice [99]. Therefore, future research should continue exploring factors that could affect the precision of VL thresholds and subsequent acute and chronic effects of their implementation.
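
To make the first two considerations concrete, the sketch below computes VL for each repetition relative to either the first or the fastest repetition and returns the repetition at which a set would be terminated. Several details are assumptions made for illustration rather than procedures taken from the cited studies: the function names, the use of the fastest repetition recorded so far as the reference (as would be required in real time), the requirement that the two repetitions exceeding the threshold be consecutive, and the mean velocities themselves, which are hypothetical.

```python
from typing import List, Optional


def velocity_loss(velocities: List[float], reference: str = "fastest") -> List[float]:
    """Percent velocity loss of each repetition relative to a reference repetition.

    reference="fastest" uses the fastest repetition recorded so far in the set
    (an assumption suited to real-time monitoring); reference="first" uses rep 1.
    """
    if reference not in ("fastest", "first"):
        raise ValueError("reference must be 'fastest' or 'first'")
    losses = []
    best = float("-inf")
    for v in velocities:
        best = max(best, v)
        ref = best if reference == "fastest" else velocities[0]
        losses.append(100.0 * (ref - v) / ref)
    return losses


def termination_rep(velocities: List[float], threshold: float,
                    consecutive: int = 1, reference: str = "fastest") -> Optional[int]:
    """1-based repetition at which the set would stop, or None if never reached.

    The set stops once `consecutive` repetitions in a row exceed `threshold` (% VL).
    """
    run = 0
    for rep, loss in enumerate(velocity_loss(velocities, reference), start=1):
        run = run + 1 if loss > threshold else 0
        if run >= consecutive:
            return rep
    return None


# Hypothetical mean velocities (m/s): rep 2 is the fastest, rep 6 rebounds after rep 5.
set_velocities = [0.80, 0.82, 0.78, 0.74, 0.65, 0.70, 0.63, 0.60]

print(termination_rep(set_velocities, 20))                     # 5
print(termination_rep(set_velocities, 20, reference="first"))  # 7
print(termination_rep(set_velocities, 20, consecutive=2))      # 8
```

In this hypothetical set, the choice of reference repetition and of termination criterion shifts the stopping point by up to three repetitions, which is in keeping with the variability in achieved VL discussed above.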

At least some of the limitations already described could potentially be alleviated by establishing the repetitions in reserve-velocity relationship, where repetitions in reserve is the specific number of repetitions that remain uncompleted at set termination. The rationale for establishing this relationship is that, despite the strong relationship between VL and the percentage of repetitions completed out of the maximum possible, the number of repetitions in reserve after the set remains unknown when using VL [19]. This is important because the last repetitions of a set contribute more to the alteration of muscle energy balance and the abrupt increase in metabolites such as ammonia [14, 100, 101]. In this regard, two studies attempted to establish the relationship between repetitions in reserve and velocity [19, 102]. Morán-Navarro et al. [19] examined the within-individual variability for the velocity associated with a given number of repetitions in reserve (i.e., 2, 4, 6, and 8) in the Smith machine bench press, shoulder press, bench pull, and back squat. The authors concluded that regardless of the load used, velocity at a given number of repetitions in reserve is very similar and highly reliable for a given exercise. However, within-individual variability was considerably higher for the bench press and shoulder press compared with other exercises, although this variability was lower among more RT-experienced participants. García-Ramos et al. [102] also examined the repetitions in reserve-velocity relationship, and while they found a high correlation for the Smith machine bench press (r = 0.88), they also reported large between-individual variability for velocity at a given number of repetitions in reserve (from 1 to 10). Based on these findings, it seems that a repetitions in reserve-velocity relationship, like a load-velocity profile, should be established for each exercise and for each individual. Doing so may alleviate many of the shortcomings identified for the VL prescription method.
With that said, the literature on this relationship is still scarce with no information available for free-weight exercises, nor on the potential moderating effects of strength, training background, or sex. Considering this, and the conflicting results already reported in the literature, future studies should be conducted to address the potential utility of this RT prescription method.
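
A minimal sketch of how such an individual relationship might be established is given below. It assumes a single set performed to failure by one individual in one exercise, a linear relationship between mean velocity and repetitions in reserve, and an ordinary least-squares fit; the function names and velocities are hypothetical, and the cited studies may have used different procedures.

```python
from typing import List, Tuple


def repetitions_in_reserve(n_reps: int) -> List[int]:
    """Repetitions in reserve after each repetition of a set taken to failure."""
    return [n_reps - i for i in range(1, n_reps + 1)]


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical mean velocities (m/s) from one set to failure.
velocities = [0.60, 0.55, 0.50, 0.45, 0.40, 0.35]
rir = repetitions_in_reserve(len(velocities))  # [5, 4, 3, 2, 1, 0]
slope, intercept = fit_line(velocities, [float(r) for r in rir])


def predict_rir(velocity: float) -> float:
    """Estimated repetitions in reserve at a given mean velocity for this individual."""
    return slope * velocity + intercept


print(round(predict_rir(0.45), 2))  # approximately 2 for these data
```

Because the fit is specific to one individual and one exercise, it would need to be repeated for each combination, in the same way as a load-velocity profile.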

4.4 Effects of Velocity Loss Thresholds on Muscle Strength, Hypertrophy, and Endurance Training Adaptations

Based on the results of the present meta-regression, the choice of VL during RT does not seem to affect the magnitude of strength gains when controlling for other factors such as choice of exercise, strength levels, and training duration (Table 4; Fig. 5). This is despite the fact that most studies reported considerable differences in training volume that linearly increased as the VL increased. These findings are somewhat in accordance with the meta-analysis by Ralston et al. [103] who found only trivial to small effects (effect size differences: 0.14–0.23) of higher (5+ sets) versus lower (1–4 sets) weekly set volumes on strength gains. However, it must be noted that participants in the majority of studies included in that meta-analysis performed sets to muscle failure. In contrast, different VL groups included in the present review differed not only in training volume, but also in proximity to failure in each set. For instance, performing repetitions until 10% VL would result in not only lower training volume, but also more repetitions left in reserve compared with performing repetitions until 30% VL with the same load and exercise. Therefore, the findings of the present review might be used to support the notions that training to failure can be avoided and that high-volume protocols are not needed when the aim is to optimize strength gains. Indeed, although the majority of studies included in the present review found no statistically significant differences in strength gains between different VL thresholds, the magnitudes of improvement (as quantified by effect sizes) seem to suggest a slight advantage of low to moderate over high VL thresholds (Fig. 5b). The authors of several studies [25,26,27, 60] suggested that an inverted U-shaped relationship might exist between VL experienced in a set and maximal strength gains. For instance, Pareja-Blanco et al. [25, 26] reported that once a moderate VL threshold was exceeded (e.g., 20 or 25% VL), further increases in strength gains were not observed. In addition, higher VL thresholds can cause a decrease in the early rate of force development [26] and a reduction in the expression of fast-twitch muscle fibers [18] following RT. Further, several researchers [25, 26] reported that a 0% VL, meaning performing only one repetition during a set, did not lead to optimal strength gains. Therefore, a minimal VL threshold (e.g., ≥ 5%) is needed to induce optimal strength gains. Considering all the above, low to moderate instead of high VL thresholds should be prescribed when the goal is to optimize neuromuscular adaptations to RT.

Fig. 5 Multilevel mixed-effects meta-regression illustrating the effects of velocity loss thresholds on muscle strength gains (also see Table 4) after controlling for exercise, study duration, and strength levels of the individuals (a), and the effects of velocity loss thresholds on muscle hypertrophy (c). Dose–response relationship considering (1) individual study effect sizes (green circles); (2) average effect sizes of individual velocity loss thresholds (red circles); and (3) average effect sizes of low (≤ 15%), moderate (> 15% < 30%), and high (> 30%) grouped velocity loss thresholds (purple circles and lines) between velocity loss and muscle strength (b) and hypertrophy (d) gains. Black (non-vertical) solid and dotted lines represent estimated relationships and corresponding upper and lower 95% confidence intervals, whereas vertical dotted lines represent boundaries between velocity loss thresholds. SMC standardized mean change

In contrast to gains in maximal strength, an increase in VL led to a somewhat linear increase in muscle hypertrophy (Fig. 5c, d). In this regard, a meta-analysis from Schoenfeld et al. [8] found a graded dose–response relationship between training volume and muscle hypertrophy. As training volume concomitantly increases with VL, it is not surprising that moderate, and especially high VL thresholds induced the most muscle hypertrophy. Volume, rather than the VL threshold itself, seems to be the factor driving differences in hypertrophy as illustrated by Andersen and colleagues [29] who observed no significant differences between 15 and 30% VL threshold groups in the only longitudinal VL study examining muscle hypertrophy with equated volume. However, this finding is not universal as some studies found moderate VL (e.g., 20–25%) thresholds to be equally effective as higher (e.g., > 40%) VL thresholds at promoting hypertrophy [25, 26]. These discrepancies were not discussed in the scientific literature but could at least partially be explained by the combination of the following factors: (1) training status of the participants (e.g., slight numerical differences in muscle cross-sectional area at baseline in favour of moderate thresholds) and (2) relatively low training frequency (~ 2×/week), study duration (~ 8 weeks; 16 sessions), and the number of sets (~ 6/week). Thus, moderate VL thresholds should be prescribed when the aim is to optimize hypertrophy without sacrificing neuromuscular adaptations.

Traditionally, performing many repetitions per set has been recommended when the goal is to induce positive muscle endurance adaptations during RT [104, 105]. Similar conclusions were drawn in a more recent meta-analysis [35]. Contrastingly, the results of the present meta-regression suggest that different VL thresholds, and thus varying number of repetitions performed per set, do not seem to modulate gains in muscle endurance during RT (Fig. 7a, b; Table 4). In fact, higher VL thresholds seemed to be slightly less effective at inducing muscle endurance gains (Fig. 7b). This is surprising given the observed differences in training volume that linearly increased as the VL increased. Moreover, one study [79] recently reported that the group who performed bodyweight pull-ups until reaching 25% VL improved muscle endurance in the same exercise (i.e., number of repetitions to failure) slightly more than the 50% VL group despite the differences in training volume. In this regard, studies [25, 26, 43, 44] often hypothesize that the superior gains in maximal strength observed for low to moderate compared to high VL thresholds might be responsible for these findings. This is a plausible explanation as the muscle endurance tests used a fixed load both at baseline and post-intervention, meaning that the group that experienced greater strength gains would perform the strength endurance test with a lower relative load compared with the group that experienced lesser strength gains, thus allowing more repetitions to be performed until failure. Indeed, high correlations (r = 0.63–0.71) have been reported between improvements in maximal strength and muscle endurance, which could support this contention [43, 44]. In addition, similar dose–response curves for muscle strength and endurance, but not hypertrophy, were observed in a recent study [106] investigating the effects of training volume on muscle adaptations, which aligns with the results of the present meta-regression. 
However, a training program with a repetition range that mimics the endurance test generally leads to greater improvements in muscle endurance [107]. In this regard, it is unclear why higher VL thresholds, which generally allow for greater repetitions per set and therefore more closely mimic muscle endurance tests, did not prove to be superior for this outcome. Perhaps the fact that most studies in the present review terminated their muscle endurance tests when the barbell reached ~ 0.50 m·s−1 could be responsible for these findings, thus making the test relatively more similar to low to moderate, but not high VL thresholds. Future studies are needed to investigate these possibilities.
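
The fixed-load explanation offered above can be illustrated with simple arithmetic. In the sketch below, all loads and 1RM values are hypothetical and the function name is introduced only for illustration: two groups are tested with the same absolute load before and after training, so the group with the larger 1RM gain performs the post-test at a lower relative load.

```python
def relative_load(test_load_kg: float, one_rm_kg: float) -> float:
    """Test load expressed as a percentage of the one-repetition maximum."""
    if one_rm_kg <= 0:
        raise ValueError("one_rm_kg must be positive")
    return 100.0 * test_load_kg / one_rm_kg


test_load = 70.0      # same absolute load at baseline and post-intervention (hypothetical)
baseline_1rm = 100.0  # identical baseline 1RM assumed for both groups
post_1rm = {"larger strength gain": 115.0, "smaller strength gain": 105.0}

print(f"baseline: {relative_load(test_load, baseline_1rm):.1f}% 1RM")
for group, one_rm in post_1rm.items():
    print(f"{group}: {relative_load(test_load, one_rm):.1f}% 1RM")
```

Here the post-test relative load falls from 70% to roughly 61% 1RM in the group with the larger gain, compared with roughly 67% 1RM in the other group, which would be expected to allow more repetitions to failure irrespective of any change in muscle endurance itself.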

4.5 Effects of Velocity Loss Thresholds on Performance of Athletic Tasks and Velocity Against Submaximal Loads

Based on the results of the present meta-regression, there is an inverse relationship between VL and subsequent improvement in countermovement jump and sprint performance. In addition, study duration also seems to modulate the gains in jumping and sprinting performance with longer training interventions leading to greater gains in performance. This finding was observed despite the fact that only two out of ten studies that investigated the effects of VL thresholds on jumping or sprinting performance incorporated sprinting or jumping in their training programs (either directly or through playing sport). Jumping and sprinting improvements were also unrelated to maximal strength gains, which were more similar between VL thresholds compared to athletic task performance. Therefore, some authors concluded the degree of RT transfer to actual physical performance was more dependent on the magnitude of VL attained in the set rather than gains in strength [43, 44]. This contention could be supported by the principle of training specificity [108]. In general, average velocity was higher for low to moderate than high VL thresholds. In this regard, significant correlations were reported between the velocity of the repetitions performed and changes in jumping and sprinting performance [43, 44], supporting the importance of repetition velocity for enhancing high-speed actions such as jumping and sprinting. The inverse could also explain these findings, as the number of repetitions performed at slower velocities was progressively greater as VL increased. Therefore, it could also be argued that the excessive amount of fatigue from high VL interferes with athletic task performance. However, more research is needed to determine the causal factor, as Pérez-Castilla et al. [28] found no significant differences in jumping and sprinting improvement between 10 and 20% VL threshold groups with equated volume, the only study to have controlled for volume. 
Admittedly, this study lasted only 4 weeks (below the average in the present review), compared only low to moderate VL thresholds, and included different jumping exercises in its training interventions, all of which could have affected the results.

The findings of the present meta-regression on the effects of different VL thresholds on velocity against submaximal loads might support the importance of actual repetition velocity during RT that is implemented with the intent of improving jumping and sprinting performance. Indeed, improvement in velocity against moderate (< 0.8 m·s−1) and especially low (> 1 m·s−1) loads progressively increased as the VL decreased (Fig. 7c, d). As lower VL thresholds allow for greater velocities and therefore higher velocity adaptations against low loads, these findings collectively support the training specificity concept in relation to RT transfer to the performance of athletic tasks such as jumping and sprinting. A large degree of variability in velocity against moderate loads was observed, which could probably be explained by the large range of loads that fell into the moderate loads category. Nevertheless, it seems that moderate VL thresholds (Fig. 7d) were slightly more effective than low and high VL thresholds at improving velocity against moderate loads, further supporting the principle of training specificity. Collectively, these findings support the idea that training should be informed by changes in an individual’s load-velocity profile, as doing so identifies the specific RT-induced adaptations along the load-velocity curve, thus providing a more comprehensive analysis of RT-induced changes than maximal strength changes alone.
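For illustration only, the velocity cut-offs used in this review to group submaximal loads (> 1 m·s−1 for low loads and < 0.8 m·s−1 for moderate loads) can be written as a small Python helper. The function name, the constant names, and the decision to leave velocities between 0.8 and 1 m·s−1 unclassified are assumptions made for this sketch and are not part of the original analysis.

```python
from typing import Optional

# Cut-offs taken from the review's classification of submaximal loads.
LOW_LOAD_MIN_VELOCITY = 1.0       # m/s; loads moved faster than this are "low"
MODERATE_LOAD_MAX_VELOCITY = 0.8  # m/s; loads moved slower than this are "moderate"


def classify_load_by_velocity(velocity: float) -> Optional[str]:
    """Return "low" or "moderate" for a lifting velocity in m/s.

    Velocities from 0.8 to 1.0 m/s fall between the two categories and
    return None (an assumption of this sketch, not a rule from the review).
    """
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    if velocity > LOW_LOAD_MIN_VELOCITY:
        return "low"
    if velocity < MODERATE_LOAD_MAX_VELOCITY:
        return "moderate"
    return None
```

Because these velocities are highly individual, such a rule should be read as a reporting convention rather than a physiological boundary.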

Fig. 6

Multilevel mixed-effects meta-regression illustrating the effects of velocity loss thresholds on countermovement jump (a) and running sprint time (c) after controlling for study duration (also see Table 4). For (a) and (c), larger data points received greater weighting than smaller data points. Dose–response relationship considering (1) individual study effect sizes (green circles); (2) average effect sizes of individual velocity loss thresholds (red circles); and (3) average effect sizes of low (≤ 15%), moderate (> 15% < 30%), and high (> 30%) grouped velocity loss thresholds (purple circles and lines) between velocity loss and countermovement jump (b) and running sprint (d) performance improvement. Black, solid, and dotted (non-vertical) lines represent estimated relationships and corresponding upper and lower 95% confidence intervals, whereas vertical dotted lines represent boundaries between velocity loss thresholds. MC mean change

Fig. 7

Multilevel mixed-effects meta-regression illustrating the effects of velocity loss thresholds on muscle endurance quantified by the number of repetitions performed in a fatigue test (a). Multivariate mixed-effects meta-regression illustrating the effects of velocity loss thresholds on velocity against low (> 1 m·s−1; red circles and lines), and moderate (< 0.8 m·s−1; green circles and lines) loads (c). For a and c, larger data points received greater weighting than smaller data points. Dose–response relationship considering (1) individual study effect sizes (green circles); (2) average effect sizes of individual velocity loss thresholds (red circles); and (3) average effect sizes of low (≤ 15%), moderate (> 15% < 30%), and high (> 30%) grouped velocity loss thresholds (purple circles and lines) between velocity loss and muscle endurance (b) and velocity against submaximal loads (d) performance improvement. Black, green, and red (solid and dotted) lines represent estimated relationships and corresponding upper and lower 95% confidence intervals, whereas vertical, dotted, and black lines represent boundaries between velocity loss thresholds. MC mean change, SMC standardized mean change

4.6 Implications for Training and Research Based on the Findings from Longitudinal Studies

Overall, based on the findings of the present review it can be concluded that (1) while the differences in strength and muscle endurance adaptations between VL thresholds are small, low to moderate VL thresholds may be slightly more effective for inducing these adaptations compared with higher VL thresholds; (2) moderate to high thresholds are likely more effective for muscle hypertrophy compared with lower thresholds; (3) jumping and sprinting performance improve the most following lower VL threshold training; and (4) low to moderate VL thresholds will improve velocity against low loads, whereas moderate thresholds more effectively improve velocity against moderate loads. Considering that less time is required when training with low to moderate VL thresholds, and given the potential reductions in early rate of force development [26] and percentage of fast-twitch muscle fibers [18], as well as the likely delayed time course of recovery [22], after RT with high VL thresholds, low to moderate VL thresholds should generally be prescribed when the goal is to optimize strength and performance adaptations. These findings are especially relevant for team sports where frequent matches throughout the season and extended competition periods alter the length of the preparatory period and its specific phases, but also for individual sports where athletes often train multiple times a day and need to manage RT fatigue for both event performance and sport-specific training sessions.

It must be noted, however, that it is presently unclear if differential effects of low to moderate and high VL thresholds are indeed due to differences in VL (and therefore repetition velocity and proximity to failure), training volume, or a combination of both. In this regard, only two longitudinal studies equated training volume between different VL thresholds, both of which found no significant differences between groups [28, 29]. Therefore, it may be that differences in training volume are the main drivers of differential adaptations following the use of different VL thresholds. In partial support of this, reductions in type IIx fibers and the rate of force development have been shown to be larger following higher as compared with lower volume training [109]. Nevertheless, future studies should equalize training volume between VL thresholds to isolate their effects from the influence of total volume load to support or refute this contention. Furthermore, no studies investigating the effects of different VL thresholds have manipulated the number of sets. Manipulating the number of sets could be a viable strategy to further increase the effectiveness of low to moderate VL thresholds. Increasing the number of sets while keeping VL low to moderate might yield additional muscle hypertrophy, comparable to higher VL thresholds with fewer sets. Choosing to perform more sets with low to moderate VL thresholds to increase volume, rather than use high VL thresholds, might avoid the aforementioned downsides of high VL thresholds (neuromuscular fatigue, poorer strength, and athletic task performance adaptations) while still producing (or perhaps even amplifying) the observed adaptations associated with low to moderate VL thresholds. Another area in need of study is the periodized use of VL thresholds over time (e.g., low to moderate VL phases following high VL in a linear manner, or used concurrently in an undulating design). 
Such a multifaceted approach to training does have merit, especially in high-performance settings where multiple training qualities often have to be considered throughout a microcycle or mesocycle. Importantly, for those who do not have access to velocity-tracking devices, cluster or rest-redistribution set structures may, in a similar manner to VL thresholds, be a viable alternative for maintaining high repetition velocity while minimizing neuromuscular fatigue during RT [35, 110, 111]. Indeed, Jukic and Tufano [96] recently reported that rest redistribution allowed almost all repetitions (~ 17.5 out of 18) in a clean pull exercise to be performed above 20% VL regardless of the load used across three sets and therefore suggested that rest redistribution could potentially serve as a free ad-hoc alternative to VL thresholds. However, future research is needed to explore these alternatives with a range of different exercises, loads, and athletic populations. Finally, acute responses to different VL thresholds discussed in the present review should also be considered when implementing them in RT programs as they are also likely to affect the magnitude of RT-induced adaptations.
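When VL thresholds are implemented with a velocity-tracking device, the choice of reference repetition and the criterion for set termination can change the number of repetitions performed in a set. The following Python sketch shows one way this could work; it is not the procedure of any particular study. The function names, the `reference` options ("first" or "fastest" repetition so far), the `extra_reps_allowed` parameter, and the use of "greater than or equal to" for the threshold check are all assumptions made for illustration.

```python
from typing import Sequence


def velocity_loss_percent(reference_velocity: float, repetition_velocity: float) -> float:
    """Velocity loss (%) of one repetition relative to the reference repetition."""
    if reference_velocity <= 0:
        raise ValueError("reference_velocity must be positive")
    return (reference_velocity - repetition_velocity) / reference_velocity * 100.0


def repetitions_performed(
    velocities: Sequence[float],
    vl_threshold: float,
    reference: str = "first",
    extra_reps_allowed: int = 0,
) -> int:
    """Count the repetitions completed before the set is terminated.

    velocities: repetition velocities (m/s) in the order they were performed.
    vl_threshold: velocity loss threshold in percent (e.g., 20 for 20% VL).
    reference: "first" uses the first repetition as the reference;
        "fastest" uses the fastest repetition performed so far.
    extra_reps_allowed: hypothetical tolerance for repetitions at or beyond
        the threshold before the set is stopped (0 = stop at the first one).
    """
    if reference not in ("first", "fastest"):
        raise ValueError("reference must be 'first' or 'fastest'")
    reps_over_threshold = 0
    for index, velocity in enumerate(velocities):
        if reference == "first":
            reference_velocity = velocities[0]
        else:
            reference_velocity = max(velocities[: index + 1])
        if velocity_loss_percent(reference_velocity, velocity) >= vl_threshold:
            reps_over_threshold += 1
            if reps_over_threshold > extra_reps_allowed:
                return index + 1  # the terminating repetition was still performed
    return len(velocities)  # threshold never triggered within the recorded set
```

For a hypothetical set with velocities of 0.95, 1.00, 0.90, 0.79, and 0.75 m·s−1 and a 20% VL threshold, this sketch stops the set after five repetitions when the first repetition is the reference, but after four when the fastest repetition is the reference, which shows why these details should be reported.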

4.7 Risk of Bias Assessment

Most of the studies included in this review did not provide sufficient information regarding the method of randomization. Further, no studies provided information regarding allocation concealment and no studies pre-registered their protocols on a publicly available registry. As a result, these studies were of unclear risk of order effect, allocation concealment, and selective reporting bias. Therefore, researchers should improve their reporting of this information in future studies. Importantly, some studies also had an unclear risk of attrition bias due to not providing sufficient information as to the number of participants included in the analysis after reporting that some did not complete the entire intervention or all procedures. Future studies should report the predefined criteria for participant exclusion from analysis, and clearly state how many were included. We recommend the use of the CONSORT flow diagram [112]. Almost half of the studies included in this review were at high risk of familiarization bias because the authors did not report or did not familiarize their participants with the testing procedures. This is especially important in the context of velocity-based training where participants need to provide maximal intent during all repetitions to ensure the reliability of velocity outputs. In addition, some studies failed to report details regarding the provision of velocity feedback or encouragement, both of which can affect the findings of a study. Therefore, future research should ensure that familiarization sessions are performed, the procedures are fully reported, and the provision of velocity feedback or encouragement occurs and is documented. Most studies were at a low risk of bias for other factors that could have affected their findings and used valid and reliable methods, equipment, or instruments to evaluate their outcomes of interest.

4.8 Limitations and Considerations

Several aspects of this review should be considered when interpreting the findings. First, the visualizations made from the acute studies and their interpretation are limited by the data reported in the original studies. While attempts were made to perform a meta-analysis of the acute studies, missing data and authors’ subsequent refusal to provide data prevented us from doing so. Second, there were considerably fewer female than male participants in both the acute and longitudinal studies, which reduces the generalizability of our findings to female participants, and more research on VL thresholds should include female individuals when possible. However, Rissanen et al. [74] recently reported robust and similar increases in strength and power performance in male and female individuals over 8 weeks while performing repetitions until 20% or 40% VL. This suggests that male and female individuals might respond similarly to different VL thresholds, although more research is needed to substantiate these claims. Third, while we attempted to consider the moderating effects of study duration, exercise, loads used, and strength levels of the individuals in all meta-analytic models, the number of studies and effect sizes per study meant this could only be performed for some outcomes. For instance, exercises in the vast majority of longitudinal studies were performed in Smith machines. In this regard, the effects of exercise mode (i.e., Smith machine vs free-weight exercises) have not been formally investigated. Therefore, it is presently unknown to what extent the findings of the present review can be translated to scenarios when only free-weight exercises are used, and thus, the findings of this review should be interpreted with this in mind. 
This also highlights a need for studies that directly compare the acute and chronic effects of different VL thresholds with exercises performed using free weights or using both free weights and Smith machines (while keeping exercises the same) in a cross-over manner. Fourth, some studies did not report all information required for meta-regressions; therefore, we extracted the required information from figures or made estimations (e.g., pre-post assessment correlations) based on other studies. This likely introduced some error and we therefore urge researchers to report standard deviations of differences (and/or pre-post assessment correlations) in training intervention studies. In addition, we also urge researchers to respond to data request e-mails and to provide data when there are no legal barriers to doing so. Fifth, a few longitudinal studies estimated 1RM rather than testing 1RM as a measure of maximal strength. Although not ideal, the fact that all these studies were consistent with their procedures before and after the intervention, used load–velocity relationships with high loads (up to 80–95% 1RM), and used Smith machine exercises to predict maximal strength should minimize the impact on their findings. Finally, as there is no consensus regarding the actual velocities attained against low, moderate, and high loads (because these velocities are highly individual), what is considered a “moderate” or “low” load is subjective. Therefore, when interpreting the velocity against submaximal loads outcome in the present review, it should be noted that loads associated with > 1 and < 0.8 m·s−1 were classified as low and moderate loads, respectively.
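Regarding the fourth limitation, the link between the standard deviation of pre-post differences and the pre-post assessment correlation follows a standard statistical identity, which is why reporting either one removes the need for estimation. The Python sketch below is a generic illustration of that identity. It is not a reproduction of the procedures used in the present review, and the function names are hypothetical.

```python
import math


def sd_of_change(sd_pre: float, sd_post: float, correlation: float) -> float:
    """Standard deviation of pre-post differences.

    Uses the identity SD_diff = sqrt(SD_pre^2 + SD_post^2 - 2 * r * SD_pre * SD_post),
    where r is the pre-post assessment correlation.
    """
    if not -1.0 <= correlation <= 1.0:
        raise ValueError("correlation must lie between -1 and 1")
    variance = sd_pre**2 + sd_post**2 - 2.0 * correlation * sd_pre * sd_post
    return math.sqrt(max(variance, 0.0))  # guard against tiny negative rounding error


def correlation_from_sds(sd_pre: float, sd_post: float, sd_diff: float) -> float:
    """Pre-post correlation recovered from the three standard deviations."""
    if sd_pre <= 0 or sd_post <= 0:
        raise ValueError("sd_pre and sd_post must be positive")
    return (sd_pre**2 + sd_post**2 - sd_diff**2) / (2.0 * sd_pre * sd_post)
```

When a study reports neither quantity, the correlation must be borrowed from other studies, and any error in that assumed value carries directly into the standard deviation of change and therefore into the effect size variance.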

5 Conclusions

Monitoring VL during RT may offer additional insights about training response not captured by more traditional methods of prescribing and monitoring RT. However, it is important to note that the acute neuromuscular, metabolic, and perceptual responses to different VL thresholds will likely depend upon the choice of exercise, loads used, number of sets performed, individual athlete characteristics, and more. In addition, factors that can specifically affect the consistency of VL determination such as reference repetition, use of peak or mean velocity, and criteria for set termination (repetitions allowed after the VL is exceeded) should all be considered when implementing VL in practice. Prescribing low to moderate VL thresholds during RT seems to be more time efficient and a generally advantageous strategy compared with higher VL thresholds for optimizing muscle strength and endurance, jumping and sprint performance, as well as velocity against submaximal loads. In contrast, higher VL thresholds may be more effective for promoting muscle hypertrophy. However, prescribing higher VL thresholds during RT can impair rapid force production capability, reduce the expression of fast-twitch muscle fibers, and prolong recovery from RT. In contrast, extremely low VL thresholds can sometimes lead to suboptimal training adaptations. Therefore, low to moderate VL thresholds may be a viable strategy for ensuring optimal performance improvement while preventing the potentially negative effects of fatigue. To conclude, the findings of this review indicate that the specific choice of VL threshold will influence the subsequent RT adaptations, highlighting that VL threshold selection is an important consideration in RT program design.