Key Points

In most cases in this systematic review, athlete-population–derived resting metabolic rate (RMR) prediction equations were observed to be among the top performing equations, demonstrating greatest accuracy and precision. The potential for bias when assessing the performance of an equation within the same cohort from which it was derived may inflate the reported efficacy of the equation. To mitigate this risk, validation of locally derived equations within a separate cohort(s) is recommended. Until externally validated, caution is warranted before using such equations in practice.

Where possible, when choosing an RMR prediction equation, ensure that the characteristics of the athlete of interest (height, weight, age, sex, fat-free mass [FFM]) are similar to those of the population in which the RMR equation was derived.

When using RMR ratio as a proxy indicator of low energy availability, use of an arbitrary or commonly used equation to detect suppression of RMR is not advised unless a specific equation has been shown to accurately predict an individual’s RMR. A more suitable use may be in longitudinal monitoring, and interpretation of directly measured RMR and body composition (i.e. RMR relative to body weight and/or FFM).

1 Introduction

Accurate determination of total daily energy expenditure (TDEE) is essential for athletic performance and health [1]. Knowledge of TDEE is fundamental for calculating appropriate energy intake for athletes. In the general population, resting metabolic rate (RMR) is typically the largest component, contributing 60–75% of total daily energy expenditure [2]. In athletes, the contribution of RMR to TDEE on training days can vary widely. However, it remains a key contributor to overall TDEE. Therefore, accurate calculation of RMR is essential for determining an athlete’s overall energy requirements.

Indirect calorimetry (IC) is often referred to as the gold standard of RMR measurement. However, it is time-consuming and requires trained personnel to operate specialised equipment. To overcome this, RMR prediction equations are frequently used as an alternative method. In addition, the ratio of measured to predicted RMR is increasingly being used as a proxy indicator of low energy availability (LEA), whereby a ratio below 0.90 is taken to indicate energy deficiency [3,4,5,6]. However, such a method depends on the accuracy of the prediction equation in the first place. This emphasises the importance of establishing accurate RMR prediction equations for use in athletes.
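The dependence of the RMR ratio on the chosen equation can be illustrated with a minimal Python sketch. It assumes the ratio is computed as measured RMR divided by predicted RMR and that values below 0.90 are flagged; the function names and the example values are hypothetical and are not taken from any included study.

```python
def rmr_ratio(measured_kcal, predicted_kcal):
    """Measured RMR divided by predicted RMR (assumed definition of the RMR ratio)."""
    if predicted_kcal <= 0:
        raise ValueError("predicted RMR must be positive")
    return measured_kcal / predicted_kcal


def flags_possible_lea(measured_kcal, predicted_kcal, threshold=0.90):
    """True when the ratio falls below the assumed 0.90 cut-off."""
    return rmr_ratio(measured_kcal, predicted_kcal) < threshold


# Hypothetical athlete: one measured RMR, two different prediction equations
measured = 1500.0
print(flags_possible_lea(measured, 1600.0))  # equation A predicts 1600 kcal/day
print(flags_possible_lea(measured, 1750.0))  # equation B predicts 1750 kcal/day
```

With the same measured value, the two hypothetical predictions fall on opposite sides of the cut-off, which is why the choice of equation matters before any interpretation of the ratio.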

Although population-specific equations are encouraged when possible, the American College of Sports Medicine [1] recommends the Harris-Benedict [7] and Cunningham [8] equations as giving reasonable estimates of RMR in athletes, with an activity factor (PAL) applied to estimate TDEE. These are among the most widely used equations. However, the Harris-Benedict equation was formulated over 100 years ago in a non-athletic population [7], and the Cunningham equation was similarly formulated in a population that omitted participants who were deemed ‘athletic’ [8]. These equations have regularly been shown to lack consistency when predicting RMR in both male and female athletes [9,10,11,12,13]. For example, the Harris-Benedict equation was found to underestimate RMR by approximately 500 kcal on average in a group of elite male rowers and canoeists [14]. Given these athletes were highly active (PAL of 2+), calculated total daily energy requirements could be inaccurate by > 1000 kcal/day for some. Although other factors could potentially contribute to the initial difference between measured and predicted RMR (such as LEA and suppressed RMR) [3,4,5,6, 15], this study highlights the errors that RMR prediction equations can introduce when predicting athlete energy requirements.
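How an RMR prediction error propagates to TDEE can be sketched as follows. The coefficients are the forms of the Harris-Benedict (1918) and Cunningham (1980) equations as they are commonly reported and should be checked against the original publications before use; treating TDEE as RMR multiplied by PAL is a simplifying assumption, and all function names are hypothetical.

```python
def harris_benedict_1918(sex, weight_kg, height_cm, age_y):
    """Harris-Benedict (1918) RMR in kcal/day, coefficients as commonly reported."""
    if sex == "male":
        return 66.4730 + 13.7516 * weight_kg + 5.0033 * height_cm - 6.7550 * age_y
    if sex == "female":
        return 655.0955 + 9.5634 * weight_kg + 1.8496 * height_cm - 4.6756 * age_y
    raise ValueError("sex must be 'male' or 'female'")


def cunningham_1980(lbm_kg):
    """Cunningham (1980) RMR in kcal/day from lean body mass, as commonly reported."""
    return 500.0 + 22.0 * lbm_kg


def tdee_error(rmr_error_kcal, pal):
    """Error in estimated TDEE when TDEE is taken as RMR multiplied by PAL."""
    return rmr_error_kcal * pal


# A 500 kcal underestimate of RMR at a PAL of 2.0 becomes a 1000 kcal/day TDEE error
print(tdee_error(-500.0, 2.0))
```

Because the activity factor multiplies the whole RMR estimate, any RMR error is scaled by the same factor, so athletes with a PAL above 2 or a larger-than-average RMR error exceed the 1000 kcal/day figure.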

Other commonly used RMR prediction equations include, but are not limited to, the Owen [16], Mifflin-St. Jeor [17], Nelson [18], De Lorenzo [19], Henry [20], Ten Haaf and Weijs [21], Jagim [22], Tinsley [23], Wang [24], Watson [25], and FAO/WHO/UNU [26] equations. Multiple studies have assessed the accuracy of these RMR prediction equations in various athletic populations [3, 9, 12, 14, 23, 27,28,29,30]. However, the results of these studies have shown significant variability between cohorts. The performance of a prediction equation is likely to depend heavily on how closely the physical characteristics of the athletes match those of the population in which the equation was derived, meaning that applying a prediction equation to groups that differ physically from the original group may be inappropriate.

Several studies have examined the accuracy of different RMR prediction equations in athlete groups such as collegiate athletes, recreational athletes, and female rugby players [9, 12, 21]; however, these studies have not been systematically reviewed and meta-analysed. The aim of this systematic review and meta-analysis was to determine which RMR prediction equations are (i) most accurate (average predicted values closest to measured values) and (ii) most precise (number of individuals within 10% of measured value) for predicting RMR in male and female athletes. It also explores differences between subgroups based on key characteristics, including sex, athlete training status, and body composition.

2 Methods

This systematic review and meta-analysis adhered to Preferred Reporting Items for Systematic Reviews and Meta-Analyses diagnostic test accuracy guidelines [31] and was prospectively registered in PROSPERO (CRD42020218212).

2.1 Inclusion Criteria

The inclusion criteria for the review were: randomised controlled trials (RCTs), cross-sectional observational studies, case studies or any other study wherein RMR, measured by IC, was compared with RMR predicted via prediction equations. Studies had to report the following outcomes (or have them extrapolatable): (i) accuracy of a prediction equation (i.e., predicted energy expenditure expressed as a percentage of the measured energy expenditure) and/or (ii) precision of a prediction equation (i.e., the percentage/number of participants predicted correctly within 10% of their measured values). An accurate RMR prediction equation has previously been defined as predicting within 10% of measured values in studies and reviews in athletes [9, 27, 28, 32, 33], and in the general population [34,35,36]. This has been justified as consistent with IC measurement errors of ≤ 5% [36, 37]. Studies had to include male and/or female competitive and/or recreational athletes who were > 18 years old. The study had to be a full-text article, in English.
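The two outcomes defined above can be computed as in the following minimal sketch. It assumes accuracy is taken at group level (mean predicted RMR as a percentage of mean measured RMR) and precision at individual level (predictions within ± 10% of the measured value); the function names and the four-athlete data set are hypothetical.

```python
from statistics import mean


def accuracy_percent(predicted, measured):
    """Group-level accuracy: mean predicted RMR as a percentage of mean measured RMR."""
    return 100.0 * mean(predicted) / mean(measured)


def precision_within_tolerance(predicted, measured, tolerance=0.10):
    """Count and percentage of individuals predicted within +/- tolerance of measured RMR."""
    hits = sum(1 for p, m in zip(predicted, measured) if abs(p - m) <= tolerance * m)
    return hits, 100.0 * hits / len(measured)


# Hypothetical kcal/24 h values for four athletes
measured = [1500.0, 1800.0, 2000.0, 1700.0]
predicted = [1600.0, 1750.0, 1700.0, 1720.0]

print(accuracy_percent(predicted, measured))
print(precision_within_tolerance(predicted, measured))
```

In this example the group mean is predicted to within a few percent, yet one of the four athletes is predicted outside ± 10%, which is why the two outcomes are reported separately.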

2.2 Exclusion Criteria

Studies were excluded if they were review articles, commentaries or animal studies, or if they were conducted in children only, pregnant/lactating women, hospitalised patients, individuals with physical disabilities/conditions and/or disease, individuals taking medications or with known stimulant or drug use, or older adults (≥ 65 years).

2.3 Search Strategy

A systematic search of the literature was initially conducted on 10 December 2020 and updated on 12 November 2021 (by JERO). Search terms included keywords and subject headings in the following areas: athletes, sports, exercise, resting metabolic rate, and resting metabolic rate prediction equations. The search strategy was applied across the following electronic bibliographic and grey literature databases: MEDLINE via PubMed; EMBASE via Ovid; CINAHL and SPORTDiscus via Ebsco; and Web of Science (see Supplementary Document 1 in the electronic supplementary material [ESM]).

2.4 Study Selection

Other potentially relevant studies were identified by hand-searching reference lists of included articles and reviews (JERO). All articles were uploaded for deduplication and title and abstract screening via Covidence (Veritas Health Innovation Ltd, Melbourne, Australia) (JERO). After deduplication, articles were screened based on title and abstract for eligibility independently by two authors (JERO, KH). Where discrepancies between reviewers were noted, eligibility was agreed on by discussion (JERO, KH). For the title and abstract screening stage, a kappa value of 0.77 was observed demonstrating substantial agreement between the reviewers, according to the guidelines by Landis and Koch [38]. Full-text articles were then independently reviewed by two authors for inclusion in this systematic review (JERO, KH). Where any discrepancies between reviewers were noted, eligibility was agreed on by discussion (JERO, KH). At the full-text screening stage, a higher kappa value was observed, 0.93, indicating almost perfect agreement [38]. For any potentially eligible articles with missing/unclear information, or where data were not possible to extract, corresponding authors were contacted, and data/information requested (JERO). If no response was received to this or a follow-up data/information request, the article was excluded. This resulted in the exclusion of two articles from 11 requested.
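The agreement statistic reported above can be reproduced from a two-reviewer screening table, as in this minimal sketch. The kappa formula is Cohen's standard definition and the descriptive bands follow Landis and Koch; the function names and the screening counts in the example are hypothetical and are not the review's actual counts.

```python
def cohens_kappa(both_include, a_only, b_only, both_exclude):
    """Cohen's kappa for two reviewers making include/exclude decisions."""
    n = both_include + a_only + b_only + both_exclude
    observed = (both_include + both_exclude) / n
    a_include = (both_include + a_only) / n
    b_include = (both_include + b_only) / n
    expected = a_include * b_include + (1 - a_include) * (1 - b_include)
    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Descriptive agreement band according to Landis and Koch."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical screening counts for 1000 records
kappa = cohens_kappa(both_include=40, a_only=10, b_only=5, both_exclude=945)
print(round(kappa, 2), landis_koch(kappa))
print(landis_koch(0.77), "/", landis_koch(0.93))
```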

2.5 Data Extraction

Pre-determined variables were independently gathered from each included study (JERO, KH). Variables included study characteristics (study title, study design, year of publication, authors, journal, funding sources), athlete characteristics (number of participants, nationality, sex, age, exercise training [hours and number of sessions per week], sport participated in, training and performance calibre, body weight [kg], height [m], body fat [%], fat-free mass [kg]), RMR prediction equation(s) used (name of equation, year equation was formulated, equation formula, performance of equation [accuracy and/or precision]), RMR measured by IC (name and brand of equipment used, position of participant, test duration, definition of steady state, measured RMR, respiratory exchange ratio), study limitations, and any other additional noteworthy points of information from the authors (such as conclusions, new hypotheses and/or recommendations for future research). Differentiation between levels of training and performance was determined using the athlete classification framework by McKay et al. [39] (see Supplementary Document 2 in the ESM). The outcomes extracted were mean (SD) predicted and measured RMR in kcal/24 h or converted to kcal/24 h where necessary and/or the precision of the prediction equation as the number/percentage of individuals predicted within 10% of measured values.

2.6 Meta-analysis Data Synthesis

Meta-analysis for accuracy was subsequently conducted comparing measured versus predicted RMR values where a prediction equation was compared against IC in at least three separate studies. RMR predicted by prediction equation, RMR measured via IC, SD, and sample sizes were used to calculate the effect size (ES) and 95% confidence intervals (CIs) for each equation using Revman software (version 5.4.1) [40, 41]. A negative ES represents an underestimation of the predicted RMR value relative to the measured value via IC, and a positive value represents an overestimation. Interpretation of ES was as follows: < 0.20 as trivial, 0.20–0.39 as small, 0.40–0.80 as moderate and > 0.80 as large [42]. A random effects model was employed for all analyses based on the assumption that heterogeneity would exist between included studies due to the variability in athlete characteristics and study design [43]. To determine heterogeneity, the I2 statistic was used. Depending on the magnitude and direction of effect, I2 values from 0 to 40% are likely to lack importance, values from 30 to 60% may represent moderate heterogeneity, values from 50 to 90% may represent substantial heterogeneity, and values from 75 to 100% may represent considerable heterogeneity [44].
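The quantities described above can be approximated outside Revman with the following sketch. It assumes a standardised mean difference with a small-sample correction (Hedges' g), a common large-sample variance approximation, and DerSimonian-Laird random-effects pooling; these are standard textbook forms and may differ in detail from Revman's internal calculations. All function names and the three example studies are hypothetical.

```python
import math


def hedges_g(mean_pred, sd_pred, n_pred, mean_meas, sd_meas, n_meas):
    """Standardised mean difference (predicted minus measured) and its variance."""
    df = n_pred + n_meas - 2
    pooled_sd = math.sqrt(((n_pred - 1) * sd_pred ** 2 + (n_meas - 1) * sd_meas ** 2) / df)
    g = (1 - 3 / (4 * df - 1)) * (mean_pred - mean_meas) / pooled_sd
    n_total = n_pred + n_meas
    variance = n_total / (n_pred * n_meas) + g ** 2 / (2 * n_total)
    return g, variance


def interpret_es(es):
    """Bands used in this review: trivial, small, moderate, large."""
    size = abs(es)
    if size < 0.20:
        return "trivial"
    if size < 0.40:
        return "small"
    if size <= 0.80:
        return "moderate"
    return "large"


def random_effects(effects, variances):
    """DerSimonian-Laird pooled effect, 95% CI and I2 (%)."""
    weights = [1 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2


# Hypothetical studies: (mean predicted, SD, n, mean measured, SD, n) in kcal/24 h
studies = [
    (1750, 180, 30, 1800, 200, 30),
    (1900, 210, 25, 1820, 190, 25),
    (1600, 150, 40, 1780, 170, 40),
]
pairs = [hedges_g(*s) for s in studies]
pooled, ci, i2 = random_effects([g for g, _ in pairs], [v for _, v in pairs])
print(round(pooled, 2), interpret_es(pooled), round(i2))
```

A negative pooled value corresponds to underestimation by the equation, matching the sign convention stated above.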

Meta-analysis for precision was conducted by pooling the number of participants whose RMR was predicted by RMR prediction equation within ± 10% of RMR measured via IC for each included equation. For an equation to be included in this meta-analysis, a minimum of three separate studies had to report precision values (either ratio or %) for the equation. Once pooled, the weighted mean (%) was calculated for each included equation.
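The pooling step for precision can be sketched as below, assuming that pooling counts across studies is equivalent to a sample-size-weighted mean of the study percentages. The function name and the three example studies are hypothetical.

```python
def pooled_precision(study_counts, min_studies=3):
    """Weighted mean precision (%) from (n_within_10_percent, n_total) pairs."""
    if len(study_counts) < min_studies:
        raise ValueError("at least three separate studies are required")
    hits = sum(within for within, _ in study_counts)
    total = sum(n for _, n in study_counts)
    return 100.0 * hits / total


# Hypothetical studies of one equation: 20/40, 15/20 and 30/40 within 10%
counts = [(20, 40), (15, 20), (30, 40)]
print(pooled_precision(counts))

# The unweighted mean of the study percentages gives a different answer
print(sum(100.0 * w / n for w, n in counts) / len(counts))
```

Weighting by sample size prevents a small study from contributing as much as a large one, which is the reason for pooling counts rather than averaging percentages.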

2.7 Subgroup and Sensitivity Analysis

For inclusion in the subgroup analysis, a subgroup had to contain at least three separate comparisons from three separate studies. Subgroup analysis was performed for the following categories which satisfied criteria: sex (females, males), body composition measurement method (DXA, BIA, Bodpod), energy availability (EA) status (non-LEA, LEA), training and performance calibre (Tier 1: recreational, Tier 3: highly trained/national level, Tier 4: elite/international level) as described by McKay et al. [39], and for heavy and light males and females. Participants of a single included study were classified as heavy if their mean body weight was greater than or equal to the mean body weight of all studies included in the review (≥ 78.9 kg males, ≥ 62.7 kg females). Participants of a study were classified as light if their mean body weight was less than the average body weight of all included studies (< 78.9 kg males, < 62.7 kg females).

In addition, a subgroup analysis was performed for best practice RMR measurement guidelines [45] such as the inclusion of prior-day physical activity abstinence (studies that imposed ≥ 24 h vigorous physical activity abstinence before testing versus those that did not impose such restrictions), adequate analysis to determine RMR (studies that discarded the first 5 min of testing and that utilised a validated RMR method to extract RMR versus those that did not), subject preparation protocols (studies that imposed a pre-test acclimation/rest period immediately prior to testing versus those that did not), the combination of physical activity abstinence and subject preparation protocols (studies that imposed ≥ 24 h vigorous physical activity abstinence and a pre-test acclimation/rest period immediately prior to testing versus those that did not), and subject fasting status (studies that imposed a ≥ 7 h fast, a ≥ 4 h abstinence from caffeine/stimulants, and a ≥ 2.5 h abstinence from nicotine before testing versus studies that imposed two out of three of these criteria versus studies that imposed one out of three of these criteria versus studies that imposed none of these criteria).
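The body-weight subgrouping and the three-study eligibility rule can be expressed as a short sketch. The thresholds are the review-wide mean body weights stated above; the function names and the example study records are hypothetical.

```python
HEAVY_THRESHOLD_KG = {"male": 78.9, "female": 62.7}
MIN_STUDIES = 3


def weight_subgroup(sex, study_mean_weight_kg):
    """Classify a study's participants as heavy or light by their mean body weight."""
    return "heavy" if study_mean_weight_kg >= HEAVY_THRESHOLD_KG[sex] else "light"


def eligible_subgroups(studies):
    """Keep subgroups represented by at least three separate studies."""
    members = {}
    for study_id, sex, mean_weight_kg in studies:
        key = (sex, weight_subgroup(sex, mean_weight_kg))
        members.setdefault(key, set()).add(study_id)
    return {key: sorted(ids) for key, ids in members.items() if len(ids) >= MIN_STUDIES}


# Hypothetical study-level records: (study id, sex, mean body weight in kg)
records = [
    ("A", "male", 85.0), ("B", "male", 92.3), ("C", "male", 78.9),
    ("D", "male", 70.1), ("E", "female", 58.0),
]
print(eligible_subgroups(records))
```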

For sensitivity analysis, the impact of each study on the combined effect was assessed by omitting one study at a time. Funnel plots were generated to investigate any differences in study effects and publication bias (see Supplementary Document 3 in the ESM). All analyses were completed using Revman software (Revman version 5.3.5; The Cochrane Collaboration) and forest plots produced using GraphPad Prism version 8.0 for Mac (GraphPad Software, San Diego, CA, USA).
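The leave-one-out procedure can be sketched as follows. For brevity the sketch pools with simple inverse-variance weights, which is a simplification of the random-effects model used in this review; the function names, study labels, and effect sizes are hypothetical.

```python
def pooled_effect(effects, variances):
    """Inverse-variance weighted mean effect (simplified, fixed-effect weights)."""
    weights = [1 / v for v in variances]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)


def leave_one_out(labels, effects, variances):
    """Pooled effect recomputed with each study omitted in turn."""
    results = {}
    for i, label in enumerate(labels):
        remaining_effects = effects[:i] + effects[i + 1:]
        remaining_variances = variances[:i] + variances[i + 1:]
        results[label] = pooled_effect(remaining_effects, remaining_variances)
    return results


# Hypothetical effect sizes: study C is an outlier
labels = ["A", "B", "C"]
effects = [0.1, 0.1, 1.0]
variances = [0.05, 0.05, 0.05]

print(pooled_effect(effects, variances))
for label, value in leave_one_out(labels, effects, variances).items():
    print("omit", label, round(value, 2))
```

A study is influential when omitting it shifts the pooled effect noticeably, as omitting the outlying study does in this example.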

2.8 Risk of Bias/Quality Assessment

According to best practice guidelines, risk-of-bias tools should be used in their original unmodified form, and authors should avoid developing their own critical assessment tool to assess risk of bias/study quality [46]. The existing risk of bias tools [47,48,49,50] are not appropriate or validated for the study designs included in this systematic review. Therefore, whilst risk of bias and quality assessment of included studies was considered, following best practice in this context [46], no risk of bias or quality assessment was performed.

2.9 Adherence to Best Practice Resting Metabolic Rate Measurement Guidelines

It is acknowledged that methodological differences in RMR measurement itself could have a significant impact on the accuracy of prediction equations. Therefore, during the review process, it was decided to quantify how many criteria for best practice RMR measurement [45] were fulfilled by each study. These guidelines consist of 10 criteria which include physical activity abstinence, fasting adherence, caffeine/stimulant and nicotine abstinence, pre-test acclimation/rest periods, body position, room/environmental conditions, and appropriate test analysis to extract a value for RMR. These data were extracted in a separate table (see Supplementary Document 4 in the ESM) and completed independently by two authors (JERO and KH), with any discrepancies subsequently discussed and resolved.

2.10 Locally Derived Equations

In this review, locally derived equations are defined as those formulated within the population of a particular study. These equations emerge from linear or multiple linear regression analyses that encompass all or a sub-sample of participants involved in the same study.

3 Results

3.1 Description of Included Studies

A total of 29 studies were deemed eligible and included in the systematic review (Fig. 1).

Fig. 1 PRISMA flowchart. RMR resting metabolic rate

An overview of the study characteristics, methods and outcomes of all included studies is provided in Supplementary Document 5 in the ESM. Study publication dates ranged from 1993 to 2021.

3.1.1 Athlete Characteristics

Across all included studies, there was a total of 1430 participants (mean [SD]: age 24.2 [7.0] years, height 1.72 [0.10] m, weight 69.3 [15.8] kg). Of these participants, 822 were female (age 24.3 [6.9] years, height 1.67 [0.07] m, weight 62.7 [11.4] kg) and 608 were male (age 24.1 [7.1] years, height 1.78 [0.08] m, weight 78.9 [15.5] kg).

3.1.2 Athlete Status

Using the athlete classification framework by McKay et al. [39], the number of athletes per tier (including means and standard deviations for age, height, and weight) included across all studies is shown in Table 1.

Table 1 Athlete characteristics and status according to the athlete classification framework by McKay et al. [39] across all included studies

3.1.3 Sport Type

Athletes of individual studies were classified as endurance (n = 5 studies), team sport (n = 5 studies), recreational exercisers (n = 5 studies), combat (n = 2 studies), weightlifting (n = 1 study), bodybuilders (n = 1 study), and dancers (n = 1 study). Eight studies included a variety of athletes from multiple different sporting backgrounds (such as basketball, baseball, track and field, dancing, archery, diving, gymnastics, American football, water polo, volleyball, and fencing). Further details on sports included are reported in Supplementary Document 6 in the ESM.

3.2 Equations Included

One hundred different prediction equations from forty-six separate original articles were investigated across all included studies. Studies ranged from comparing the accuracy of 30 different equations [51] to examining the accuracy of a single equation [4, 10, 22, 52,53,54,55,56]. The top five most included equations were the Cunningham (1980) (lean body mass [LBM]) (n = 21 studies), the Harris-Benedict (1918) (age, weight, height) (n = 21 studies), the Mifflin St. Jeor (1990) (age, weight, height) (n = 11 studies), the De Lorenzo (1999) (age, weight, height) (n = 8 studies), and the FAO/WHO/UNU (1985) (age, weight) equations and the FAO/WHO/UNU (1985) (age, weight, height) equations (both n = 7 studies). Most equations (n = 43) were only used in single studies. An overview of all equations that have been developed in athletes, and results regarding mean bias and precision is provided in Table 2. For a complete list of all prediction equations that have been studied in athletes in chronological order, see Supplementary Document 7 in the ESM.

Table 2 List of equations developed in athletes and mean bias and precision results reported from studies in athletes

3.3 Narrative Synthesis

Although the athlete populations of studies included in the meta-analysis were heterogeneous, five similar athlete demographic groups were identified for the purpose of grouping studies for a narrative synthesis (recreational athletes, endurance, rugby, mixed sports, and other).

3.3.1 Recreational Athletes

Five studies classified participants as recreational athletes (resistance, aerobic, and concurrent) [29, 30, 51, 57, 58]. In the original study in which the Koehler DXA (2016) (sum of organelle masses) equation was developed, it was the only equation examined and found to over-predict RMR by 4% in female recreational athletes [58]. For two other studies, the Koehler DXA (2016) (sum of organelle masses) was found to be most accurate (overestimate 7% [30], accurate within 1% [57]) compared with three other equations in females [30, 57]. Elsewhere in female athletes, Mackay et al. [29] found the Mifflin St. Jeor (1990) (age, weight, height) to be the most accurate out of three equations examined; however, RMR was overestimated by ~ 15%. In recreationally active males and females of varying BMI from Trinidad and Tobago, 30 equations were studied with several equations found to be the most accurate (range: within 1–2%) [51]. The most accurate equation for females with a BMI < 25 kg/m² was Johnstone (2006) (fat-free mass [FFM], fat mass [FM], age) and for BMI ≥ 25 kg/m² was Müller (MJ/d) (2004) (FFM, FM); and the most accurate equation for males with a BMI < 25 kg/m² was Owen (1988) (weight) and for BMI ≥ 25 kg/m² was Livingston and Kohlstadt (2005) (age, weight) [51].

3.3.2 Endurance

Five studies included endurance athletes. One study examined the accuracy of the Harris-Benedict equation only, finding it underestimated RMR by ~ 6% in male runners, tri- and bi-athletes [56]. Two studies examined the Cunningham (1980) (LBM) only, finding it overestimated RMR by ~ 10% in male cyclists [4] and underestimated RMR by ~ 16% in female elite endurance athletes (sport/s unspecified) [53]. Sjodin et al. [59] examined the accuracy of two equations—FAO/WHO/UNU (1985) (age, weight, height) and Westerterp (1995) (FFM, FM)—and found the Westerterp equation to be more accurate, but still to underestimate RMR by 12% in eight male and female cross-country skiers. Devrim-Lanpir et al. assessed both accuracy and precision of nine equations in 30 male and female triathletes and ultra-marathoners, finding most under-estimated RMR and the Mifflin-St. Jeor equation to be most accurate (underestimating by ~ 5%) [27]. However, this equation was only able to predict ~ 50% to within 10% of measured values [27].

3.3.3 Rugby

Three studies included rugby athletes [9, 10, 60]. In a study of six male rugby league athletes, only one equation (Cunningham [1980] [LBM]) was used [10]. It was found to overestimate RMR by 17%, with only one athlete estimated to within 10% of measured RMR [10]. In the two other studies, locally developed equations were reported to be the most accurate (to < 1%) in rugby league males compared with three previously published equations [60], and in rugby union females compared with seven previously published equations [9], respectively. The Ten-Haaf (2014) (FFM) equation was found to be most precise (predicting 31 out of 36 [86%] within 10%) in the latter study [9].

3.3.4 Mixed Sports

Eight studies included populations that were not specific to one sport and contained mixed disciplines (see Supplementary Document 6 in the ESM) [12, 19, 21, 22, 25, 32, 33, 61]. One study examined the accuracy of a locally developed equation finding it to be accurate to within < 2% in athletes from a range of NCAA sports [22]. All other studies compared measured values to a number of other equations ranging from five to twelve equations [12, 19, 21, 25, 32, 33, 61], with five of these studies also developing locally derived equations [19, 21, 25, 33, 61]. In all cases, those locally derived were found to be the most accurate (within < 1%) [19, 21, 25, 33, 61]. In two of the studies wherein precision was also reported, locally developed equations were also found to be the most precise [21, 33]. In the two other studies, the Mifflin St. Jeor (1990) (age, weight, height) equation was found to be the most accurate (< 1%) and precise (59%, 29/49) in male and female members of the Turkish Olympic National team [32], and the Cunningham (1980) (LBM) was found to be the most accurate both in male and female NCAA Division III athletes from mixed disciplines (within ~ 4%) [12].

3.3.5 Other

Eight studies included athletes that did not fit a specific grouping: ballet dancers (n = 1) [3], bodybuilders (n = 1) [23], karate (n = 1) [62], rowers and canoeists (n = 1) [14], soccer (n = 2) [52, 55], taekwondo (n = 1) [54], and weightlifters (n = 1) [28]. Three studies examined the accuracy of one equation only: the levels of recommended assumption of nutrients for the Italian population (LARN) (1996) equation underestimated RMR by ~ 13% in female soccer players [52]; the Cunningham (1980) (LBM) equation, in contrast, overestimated RMR by ~ 6% in female soccer players [55]; and the Cunningham (1980) (LBM) equation was accurate to < 1% in a single-athlete case study of a male taekwondo athlete [54]. Two other studies found the Cunningham (1980) (LBM) equation to be the most accurate in male and female rowers and canoeists (~ 15% underestimation in males, ~ 10% overestimation in females) [14] and in female karate athletes (~ 8% underestimation) [62], compared with two and three other equations, respectively. In male and female ballet dancers, the Koehler DXA (2016) (sum of organelle masses) was found to be the most accurate (within ~ 10%) out of three equations, with the Harris-Benedict (1918) (age, weight, height) most precise (26/40; 65%) [3]. In male and female bodybuilders, a locally developed equation was found to be the most accurate (within < 1%) when compared with 11 other equations [23]. In males from the Indian national weightlifting team, the FAO/WHO/UNU (1985) (age, weight, height) was reported to be the most accurate and precise out of eight equations but its performance was still very poor (accuracy: underestimating by ~ 18%, precision: 11/30 [37%]) [28].

3.4 Meta-analysis

Six articles could not be synthesised by meta-analysis for the following reasons: means and/or SDs for predicted or measured RMR were not provided (n = 3) [19, 51, 56], RMR prediction equation used by less than three separate studies (n = 2) [22, 52] and study sample size of n = 1 [54].

3.4.1 Meta-analysis Accuracy—Description of Included Studies

Twenty-three studies were included in the meta-analysis for prediction equation accuracy, involving 1058 participants (mean [SD]: age 23.4 [5.8] years, height 1.72 [0.09] m, weight 66.4 [13.5] kg). Of these participants, 671 were female (age 23.4 [5.2] years, height 1.67 [0.07] m, weight 61.0 [8.6] kg) and 387 were male (age 23.9 [6.7] years, height 1.78 [0.09] m, weight 76.6 [14.4] kg). A total of 1206 comparisons were made between predicted RMR by equation and measured RMR via IC.

3.4.2 Meta-analysis Accuracy—Results

A forest plot of results for individual equations across all studies is shown in Fig. 2. Tabulated results of the meta-analysis evaluating accuracy are shown in Supplementary Document 8 (see ESM).

Fig. 2 Forest plot containing all equations analysed in meta-analysis listed in chronological order. CI confidence interval, ES effect size, FFM fat free mass, FM fat mass, LBM lean body mass

Examining the equations individually, predicted values did not differ significantly from measured values for five equations: the Cunningham (1980) (LBM), the Harris-Benedict (1918) (age, weight, height), the Cunningham (1991) (FFM), the De Lorenzo (1999) (age, weight, height), and the Ten-Haaf (2014) (age, weight, height). All other equations significantly underestimated or overestimated RMR (p < 0.05) (Fig. 2). The Ten-Haaf (2014) (age, weight, height) equation showed the smallest effect size (0.04), indicating the greatest accuracy.

3.4.3 Meta-analysis Accuracy—Heterogeneity Summary

Significantly large heterogeneity was observed for most equations included in the meta-analysis (range: I2: 80–93%) [7, 8, 16,17,18,19, 26, 58, 63]. For example, although the Cunningham (1980) (LBM) and the Harris-Benedict (1918) (age, weight, height) equations showed trivial ESs (ES = 0.15 [95% CI − 0.26 to 0.57], 18 studies, 30 comparisons, 846 participants and ES = − 0.14 [95% CI − 0.52 to 0.25], 17 studies, 29 comparisons, 892 participants, respectively), heterogeneity was considerable with many comparisons in each equation under- and overpredicting RMR (both equations I2 = 93%; p < 0.0001). The Ten-Haaf (2014) (age, weight, height) equation showed a trivial ES (ES = 0.04 [95% CI − 0.16 to 0.23], 4 studies, 7 comparisons, 204 participants) and no heterogeneity (p = 0.48, I2 = 0%).

3.4.4 Meta-analysis Accuracy—Subgroup Analysis

Results of the accuracy meta-analysis and accuracy subgroup analysis for each equation separately are shown in Supplementary Document 9 (see ESM).

3.4.4.1 Sex

A significant difference was observed between male and female subgroups for the De Lorenzo (1999) (age, weight, height) equation (I2 = 95.4%; p < 0.0001), and a trend for subgroup differences was observed for the Harris-Benedict (1918) (age, weight, height) equation (I2 = 72.7%; p = 0.06). For the De Lorenzo (1999) (age, weight, height) equation, a small and large ES was observed for males and females, respectively (males: ES = − 0.31 [95% CI − 0.73 to 0.11], I2 = 81%, p < 0.0001, six studies, six comparisons, 266 participants; females: ES = 0.93 [95% CI 0.63 to 1.23], I2 = 0%, p = 0.91, four studies, four comparisons, 93 participants). For the Harris-Benedict (1918) (age, weight, height) equation, a moderate and trivial ES was observed for males and females, respectively (males: ES = − 0.53 [95% CI − 0.93 to − 0.13], I2 = 84%, p < 0.0001, 11 studies, 11 comparisons, 357 participants; females: ES = 0.11 [95% CI − 0.40 to 0.63], I2 = 93%, p < 0.0001, 14 studies, 18 comparisons, 535 participants). There were no significant differences observed between male and female subgroups for any other equations analysed [8, 16, 17, 21, 26].

3.4.4.2 Body Composition Measurement Method

The Cunningham (1980) (LBM) equation was the only equation that satisfied criteria for subgroup analysis. When subgrouping studies that used DXA, BIA, and Bodpod to determine body composition, there were no significant subgroup differences observed.

3.4.4.3 Energy Status

Only the Cunningham (1980) (LBM) equation satisfied criteria for subgroup analysis by energy status of participants, with no significant subgroup differences observed.

3.4.4.4 Athlete Status

Only the Cunningham (1980) (LBM) and the Harris-Benedict (1918) (age, weight, height) equations satisfied criteria for subgroup analysis. No subgroup differences were observed for the Cunningham (1980) (LBM) equation. However, significant subgroup differences were observed for the Harris-Benedict (1918) (age, weight, height) equation with differences observed between Tier 1 and Tier 3, and between Tier 1 and Tier 4 athletes (p < 0.0001 for both). A significant large ES for overestimation of RMR was observed for Tier 1: recreationally active female athletes (studies included in this subgroup were female only) and a significant moderate ES for underestimation was observed for Tier 3: highly trained/national level and Tier 4: elite/international level athletes (Tier 1: ES = 1.50 [95% CI 1.26 to 1.73], I2 = 20%, p = 0.29, three studies, four comparisons, 243 participants; Tier 3: ES = − 0.42 [95% CI − 0.82 to − 0.02], I2 = 86%, p < 0.0001, 11 studies, 17 comparisons, 444 participants; Tier 4: ES = − 0.38 [95% CI − 1.00 to 0.25], I2 = 81%, p = 0.0003, three studies, five comparisons, 115 participants).

3.4.4.5 Heavy and Light Males and Females

The Cunningham (1980) (LBM) and the Harris-Benedict (1918) (age, weight, height) equations satisfied criteria for subgroup analysis in both males and females, and the Mifflin St. Jeor (1990) (age, weight, height) equation satisfied criteria for subgroup analysis in females only. A trend for subgroup differences between heavy and light males for the Harris-Benedict (1918) (age, weight, height) equation was observed (I2 = 68.8%; p = 0.07). A small and large ES was observed in males on average < 78.9 kg and ≥ 78.9 kg, respectively (< 78.9 kg: ES = − 0.33 [95% CI − 0.85 to 0.20], I2 = 88%, p < 0.0001, seven studies, seven comparisons, 286 participants; ≥ 78.9 kg: ES = − 0.91 [95% CI − 1.25 to − 0.56], I2 = 0%, p = 0.68, four studies, four comparisons, 71 participants). No significant differences were observed between subgroups for any other equation and/or subgrouping.

3.4.4.6 ≥ 24-Hour Physical Activity Abstinence

The Cunningham (1980) (LBM), the Harris-Benedict (1918) (age, weight, height), the Mifflin St. Jeor (1990) (age, weight, height), and the De Lorenzo (1999) (age, weight, height) equations satisfied criteria for subgroup analysis. No significant subgroup differences were observed for any equation.

3.4.4.7 Discard Period, Steady State, and Validated Extraction Method

The Cunningham (1980) (LBM), the Harris-Benedict (1918) (age, weight, height), the Mifflin St. Jeor (1990) (age, weight, height), and the FAO/WHO/UNU (1985) (age, weight) equations satisfied criteria for subgroup analysis. No significant differences were observed; however, a trend for subgroup differences was observed in the Harris-Benedict (1918) (age, weight, height) equation (p = 0.06, I2 = 72%). A small ES for overestimation of RMR was observed in the subgroup when an RMR discard period, use of a steady-state model, and a validated RMR extraction method were omitted, and a moderate ES for underestimation of RMR was observed in the subgroup where these methodological criteria were present (Present: ES = − 0.47 [95% CI − 0.83 to − 0.12], I2 = 81%, p < 0.0001, eight studies, 15 comparisons, 400 participants; Omitted: ES = 0.25 [95% CI − 0.41 to 0.91], I2 = 95%, p < 0.0001, nine studies, 14 comparisons, 492 participants). No significant differences between subgroups were observed for any other equation.

3.4.4.8 Pre-test Rest Versus No Pre-test Rest

The Cunningham (1980) (LBM), the Harris-Benedict (1918) (age, weight, height), the Mifflin St. Jeor (1990) (age, weight, height), the De Lorenzo (1999) (age, weight, height), and the FAO/WHO/UNU (1985) (age, weight) equations satisfied criteria for subgroup analysis. No significant differences were observed between subgroups for any equation.

3.4.4.9 Fasting Status and Preparation

Only the Cunningham (1980) (LBM) and the Harris-Benedict (1918) (age, weight, height) equations satisfied criteria for subgroup analysis. For the Cunningham (1980) (LBM) equation, subgroup differences were observed between studies that implemented all three preparation procedures (fast, caffeine abstinence, and nicotine abstinence) versus those that implemented only one (1/3 vs 3/3: p < 0.0001, I2 = 95%). For the Harris-Benedict (1918) (age, weight, height) equation, subgroup differences were observed between studies that implemented only one preparation procedure versus those that implemented two and three preparation procedures (1/3 vs 2/3: p = 0.003, I2 = 89%); (1/3 vs 3/3: p < 0.0001, I2 = 95%). For subgroups where all three preparation procedures were implemented, significant overestimation was observed in both the Cunningham (1980) (LBM) equation (ES = 0.97 [95% CI 0.56 to 1.37], I2 = 78%, p = 0.0001, four studies, seven comparisons, 314 participants) and the Harris-Benedict (1918) (age, weight, height) equation (ES = 0.71 [95% CI 0.07 to 1.35], I2 = 93%, p < 0.0001, five studies, eight comparisons, 365 participants). For subgroups where two of three preparation procedures were implemented, no significant magnitudes of effect were observed in either equation. For subgroups where one of three preparation procedures were implemented, significant ES was observed for the Harris-Benedict (1918) (age, weight, height) equation only (ES = − 0.92 [95% CI − 1.23 to − 0.61], I2 = 32%, p = 0.18, four studies, seven comparisons, 134 participants).

3.4.5 Meta-analysis Precision—Description of Included Studies

Ten studies were included in the meta-analysis for prediction equation precision [3, 9, 10, 21, 27, 28, 32, 33, 51, 54], involving a total of 497 participants (mean [SD]: age 27.3 [9.4] years, height 1.72 [0.01] m, weight 72.8 [10.5] kg). Of these participants, 225 were female (age 28.4 [9.6] years, height 1.66 [0.07] m, weight 67.9 [15.4] kg) and 272 were male (age 26.1 [9.0] years, height 1.76 [0.09] m, weight 75.7 [12.8] kg).

3.4.6 Meta-analysis Precision—Overall Result

Tabulated results of the meta-analysis evaluating precision for individual equations are shown in Table 3. Overall, the Ten-Haaf (2014) (age, weight, height) equation was found to be the most precise equation, predicting 80.2% of participants to be within ± 10% of measured values, with all other included equations ranging from 40.7 to 63.7%. Subgroup analysis of precision was only possible for six equations, by sex and athlete status, with precision being poor across all subgroups (range 19–57%) (Table 3).
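As an illustration of how precision is defined here (the percentage of participants whose predicted RMR lies within ± 10% of the measured value), the calculation can be sketched as follows. This is a minimal sketch and not the analysis code used in this review; the function name and the example values are hypothetical.

```python
def precision_within_tolerance(predicted, measured, tolerance=0.10):
    """Percentage of participants whose predicted RMR falls within
    +/- tolerance (default 10%) of their measured RMR."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1 for p, m in zip(predicted, measured)
        if abs(p - m) <= tolerance * m
    )
    return 100.0 * hits / len(measured)


# Hypothetical values in kcal/day, for illustration only (not study data).
measured_rmr = [1500.0, 1800.0, 2000.0, 1650.0, 2200.0]
predicted_rmr = [1600.0, 1700.0, 2300.0, 1640.0, 1900.0]

precision = precision_within_tolerance(predicted_rmr, measured_rmr)
print(f"Precision: {precision:.1f}% of participants within 10% of measured RMR")
```

In this made-up sample, three of the five predictions fall within the ± 10% band.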

Table 3 Precision and subgroup analysis of RMR calculated by RMR prediction equation versus RMR measured via indirect calorimetry

3.5 Adherence to Best Practice Resting Metabolic Rate Measurement Guidelines

An overview of the adherence of each study to the Fullmer et al. [45] best practice guidelines is shown in Supplementary Document 10 (see ESM). In summary, of the 29 included studies in this review, the average number of criteria satisfied [mean (SD, range)] was 5 (2, 2–9) out of 10 criteria.

4 Discussion

Given the widespread use of RMR prediction equations in athletes, the present systematic review and meta-analysis aimed to investigate (i) the most accurate and (ii) the most precise equations for use in athletes, and (iii) whether differences exist based on factors such as athlete or methodological characteristics. Overall, it is evident that a variety of equations are being used in athletes with 100 different equations used across the included studies.

4.1 Accuracy

Regarding accuracy, several studies assessed and/or used only one equation to predict RMR, making it difficult to infer their overall accuracy and precision. However, narrative synthesis revealed that in studies in which multiple equations were evaluated, no single equation consistently performed better than any other. Despite this, the most consistently top performing equations were locally derived equations. In all cases where a population-derived equation was developed and compared with other equations (n = 8 studies), the derived equations were always the most accurate (within ± 1% of measured RMR). This is no surprise as these prediction equations were derived from subject characteristics such as age, weight, height, and sex from all participants in their study. Using regression analysis, a relationship is then established between these characteristics and measured RMR. Unless the subject characteristics of the athlete whose predicted RMR is required are very similar to the subject characteristics of the population in which the equation was derived, there is a high probability that the estimation of RMR will be inaccurate.

Meta-analysis for accuracy identified the Ten-Haaf (2014) (age, weight, height) equation as the best performing, showing the smallest ES and 95% CIs and no heterogeneity. This is most likely a result of the physical characteristics of the athletes within the included studies (e.g., age, height, weight) being similar to the population in which the Ten-Haaf equation was derived. In the one comparison where the Ten-Haaf equation tended to underestimate RMR [23], the athletes were much heavier than the individuals from which the equation was derived.

4.2 Precision

Precision was evaluated by only ten of all studies included in this review, with only two of these studies assessing precision of locally derived equations [9, 21]. In both studies, locally derived equations performed exceptionally well (83–93%), with one study’s locally derived equation being only slightly outperformed by the Ten-Haaf (2014) (FFM) equation and identical in precision to the Ten-Haaf (2014) (age, weight, height) equation [9]. It must be noted that in this study, the Ten-Haaf (2014) (age, weight, height) equation was also found to be highly accurate, predicting RMR within 1% [9]. Although some equations in other studies performed somewhat well, none exceeded the performance of the locally derived equations and the Ten-Haaf (2014) (age, weight, height) equation.

Similar to accuracy, meta-analysis for precision revealed the best performing equation to be the Ten-Haaf (2014) (age, weight, height), predicting 80% of participants within ± 10% of measured RMR. Once again, this may be explained by participants having similar physical characteristics to the original population in which the equation was derived. Indeed, one of the studies that found the Ten-Haaf (2014) (age, weight, height) equation to be the most precise was the original study proposing the equation [21].

4.3 Equations that Performed Poorly

In addition to examining the equations that performed best, it is worth noting that there were some equations that consistently underperformed (for accuracy and precision) in athlete populations and should be avoided. These equations include the Mifflin St. Jeor (1990) (age, weight, height), the Owen (1988) (weight), the FAO/WHO/UNU (1985) (age, weight, height), the FAO/WHO/UNU (1985) (age, weight), and the Nelson (1992) (FFM, FM). Although no one specific reason exists as to why these equations consistently underperformed, it is hypothesised that the physical characteristics of the athletes in the studies included did not match the physical characteristics of the populations in which the equations were derived.

4.4 Sex Differences

Significant heterogeneity was observed across all equations (except Ten-Haaf [2014] [age, height, weight]) included in the meta-analysis, which we attempted to explore when possible through subgroup analysis. Subgroup differences were found for sex for both the De Lorenzo (1999) (age, weight, height) and the Harris-Benedict (1918) (age, weight, height) equations. A large overestimation of RMR predicted by the De Lorenzo (1999) equation was observed in female-only comparisons. This could be expected, as the De Lorenzo equation was based on 51 males who on average were heavier and larger [19]. Interestingly, in males who closely matched the physical characteristics of those in the original study, trivial to small effect sizes were observed [21, 32, 33, 61], whereas a large underestimation was observed in the two studies in which males weighed on average ~ 17 kg heavier than those in the original De Lorenzo study [12, 23]. In the case of the Harris-Benedict (1918) equation, RMR was underestimated in the majority of male-only comparison groups, except for one (involving a group of Danish Royal ballet dancers) where it was overestimated [3]. In contrast, for females there was no overall difference between predicted and measured RMR, although heterogeneity remained high. This suggests that sex should be considered when using these equations in practice, whereas for the other equations analysed (FAO, Owen, Mifflin St. Jeor, Cunningham, Ten-Haaf), sex does not appear to influence accuracy.

4.5 Athlete Status

It is also possible that some equations may be more appropriate for athletes of a specific athlete status. We explored this using a recent classification framework [39] for the Harris-Benedict (1918) (age, weight, height) and the Cunningham (1980) (LBM) equations where sufficient data allowed. The trend for underestimation of RMR using the Harris-Benedict equation observed in Tier 3 and Tier 4 athletes may be explained by the fact that the equation was developed in a non-athlete population and does not include the measurement of FFM [7]. However, RMR was still significantly overestimated in Tier 1 recreational athletes. With possible differences in body composition between these groups, it is difficult to infer whether the variable performance of the Harris-Benedict (1918) (age, weight, height) equation is due to differences in competitive status or body composition alone. Nevertheless, the findings suggest athlete status may influence the accuracy of the equation. Interestingly, the Cunningham equation did not perform any differently in those of differing athlete status, suggesting athlete status is not a key factor to consider when using this equation in practice.

4.6 Body Weight

As previously highlighted, the performance of an equation can be considerably influenced by the physical characteristics of an individual in relation to those of the population from which the equation was derived. It is conceivable that equations formulated based on general population metrics may yield inadequate results when applied to athletes, who typically possess significantly larger statures (for instance, bodybuilders or rugby players). To assess whether the accuracy of equations may differ in heavier versus lighter athletes, studies were split into those with a mean body weight greater than or equal to the mean body weight of all studies included in this review (≥ 78.9 kg males, ≥ 62.7 kg females) and those below the mean.
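The subgroup allocation described above can be expressed as a simple rule. The following is a sketch only: the cut-offs are the sex-specific means reported in this review, whereas the function name and the example cohorts are hypothetical.

```python
# Sex-specific mean body weights (kg) across all included studies,
# as reported in this review.
WEIGHT_CUTOFF_KG = {"male": 78.9, "female": 62.7}


def weight_subgroup(sex, mean_body_weight_kg):
    """Label a study cohort 'heavy' if its mean body weight is at or above
    the sex-specific cut-off, otherwise 'light'."""
    cutoff = WEIGHT_CUTOFF_KG[sex]
    return "heavy" if mean_body_weight_kg >= cutoff else "light"


# Hypothetical cohorts (sex, mean body weight in kg), for illustration only.
cohorts = [("male", 93.0), ("male", 72.4), ("female", 62.7), ("female", 58.1)]
labels = [weight_subgroup(sex, weight) for sex, weight in cohorts]
print(labels)
```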

Although not statistically significant, a tendency toward subgroup differences was observed in males with the Harris-Benedict (1918) equation. While the performance of the equation was inconclusive in lighter males, there was a consistent trend towards underestimation in heavier males. On closer examination, studies in which this underestimation was prevalent included those featuring bodybuilders, rugby players, American footballers, baseball players, and heavyweight rowers. Mean body weight within these studies ranged from 93 to > 100 kg [12, 14, 23, 60]. Similarly, although meta-analysis was not possible, in the two studies that showed the De Lorenzo (1999) equation to underestimate RMR in males, mean body weight was higher than in the original De Lorenzo cohort. While further studies are necessary to make a definitive statement that the Harris-Benedict (1918) equation (and potentially other equations) tends to underestimate RMR at higher body weights, the trends observed should be considered. Hence, the application of the Harris-Benedict (1918) and De Lorenzo (1999) equations to heavier male athletes should be approached with caution.

4.7 Energy Availability Status, Body Composition Method and Prior Exercise Avoidance

Other characteristics were also explored to try to identify factors that may contribute to heterogeneity in results, including the mean body weight or energy availability status of included athletes, the method of body composition measurement, and how long participants were required to avoid exercise prior to RMR measurement. However, none of these factors showed significant differences. For some of these analyses, it should be noted that the number of equations that satisfied criteria for inclusion was limited and, therefore, further studies are required to explore the influence of these factors on the accuracy of different equations. For example, the body composition variables and measurement technique used to develop equations should be considered when interpreting the accuracy of equations, given the wide range of methods used in athletes. A common issue is the interchangeable use of LBM versus FFM in the Cunningham (1980) equation. This may influence the accuracy of calculations and warrants further study and consideration when applying equations in practice [64].

With regard to athlete energy status, the ratio of measured RMR to predicted RMR (RMR ratio) is being increasingly used as a proxy indicator of energy availability [3,4,5,6, 15, 53]. An RMR ratio of < 0.9 is considered indicative of low energy availability, meaning a difference between measured and predicted RMR of only 10% could be interpreted as the suppression of RMR [15]. The Cunningham (1980) (LBM) equation has been used to determine RMR ratio as a proxy indicator for LEA in numerous studies [3,4,5,6, 15, 53]. However, as evident in the results of this meta-analysis, observed ES of individual studies for the Cunningham (1980) (LBM) equation ranged from large overestimation (ES = 3.08) to large underestimation (ES = − 2.11), with 13 out of 30 comparisons having ESs greater or less than 1 and − 1, respectively. Furthermore, ESs varied widely between equations, highlighting that classification of suppressed RMR will depend on the equation used. Unless a specific RMR equation has been shown to accurately predict an individual’s RMR at different body weights (within 10%), it appears ill-advised to use an arbitrary or even ‘commonly used’ equation to detect the suppression of RMR from a single measurement. A more suitable use may be in longitudinal monitoring, and interpretation of directly measured RMR and body composition. RMR values relative to body weight and/or FFM can then be compared to detect the suppression of RMR. When adaptive thermogenesis is taken into account in cases where body weight is being lost or gained, several scenarios could be considered indicators of suppressed RMR, for example a decrease in RMR when body weight remains unchanged, or greater than expected reductions in RMR during weight loss.
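The dependence of the RMR ratio on the chosen equation can be illustrated with a short sketch. The coefficients used for the Cunningham (1980) equation (500 + 22 × LBM, in kcal/day) are the commonly cited form and are an assumption here, as they are not stated in this review. The athlete values, the function names, and the second "equation" (simply a prediction 10% higher) are hypothetical.

```python
def cunningham_1980_rmr(lbm_kg):
    """Predicted RMR in kcal/day. Assumption: commonly cited form of the
    Cunningham (1980) equation (500 + 22 x LBM), not given in this review."""
    return 500.0 + 22.0 * lbm_kg


def rmr_ratio(measured_kcal, predicted_kcal):
    """Ratio of measured to predicted RMR."""
    return measured_kcal / predicted_kcal


def below_threshold(ratio, threshold=0.90):
    """True if the ratio falls below the commonly used 0.90 cut-off."""
    return ratio < threshold


# Hypothetical athlete, for illustration only.
lbm_kg = 60.0
measured_kcal = 1700.0

predicted_a = cunningham_1980_rmr(lbm_kg)  # 500 + 22 * 60 = 1820 kcal/day
predicted_b = predicted_a * 1.10           # hypothetical equation predicting 10% higher

ratio_a = rmr_ratio(measured_kcal, predicted_a)
ratio_b = rmr_ratio(measured_kcal, predicted_b)

print(f"Equation A: ratio = {ratio_a:.2f}, flagged = {below_threshold(ratio_a)}")
print(f"Equation B: ratio = {ratio_b:.2f}, flagged = {below_threshold(ratio_b)}")
```

With these made-up numbers, the same measured RMR is classified as not suppressed with one equation and as suppressed with the other, which is the concern raised above about relying on a single cross-sectional RMR ratio.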

4.8 Measurement Methods and Test Preparation Procedures

Differences in RMR measurement methodologies and preparation procedures could also influence RMR results, with a wide variety of methods and procedures evident across included studies. Some methodologies include the use of a discard period, a steady-state model, and a validated RMR extraction method, whereas others do not. For the Harris-Benedict (1918) equation, a small overestimation was observed when these protocols were omitted, whereas a moderate underestimation was observed when these protocols were present. Although other equations (the Cunningham [1980], Mifflin St. Jeor and FAO/WHO/UNU [age, weight]) did not differ with the presence or lack of these protocols, these protocols should be employed to ensure accuracy of measurement according to best practice [45].

Furthermore, some studies additionally omitted subject preparation protocols whilst others included these protocols. Such protocols include fasting for at least 7 h and abstaining from both caffeine/stimulants and nicotine for at least 4 and 2.5 h, respectively, before the measurement of RMR. Subgroup analysis was possible only for the Cunningham (1980) (LBM) and the Harris-Benedict (1918) (age, weight, height) equations. Both showed differences between subgroups depending on whether the criteria were implemented. However, no clear pattern of results was evident, and accuracy was not shown to be improved by their implementation. Once again, large heterogeneity was evident for most subgroups. It is also unclear whether these protocols were employed in the original studies in which these equations were derived [7, 8]. Although there is mention of fasting and physical rest during testing, there is no information on any other methodological procedures that were employed [7, 8]. Therefore, it is inappropriate to deduce a true effect of the presence/lack of these preparation protocols on the accuracy of the equations, and they remain best practice.

4.9 Methodological Considerations

Some methodological aspects of the current review should be considered. Variability between studies is inevitable in a systematic review [44]. Similar to a meta-analysis of studies examining activity energy expenditure monitors [65], this review demonstrated large heterogeneity between and within RMR prediction equations. Taking the variability into account, a random effects model was employed, and narrative synthesis and pre-specified subgroup analysis were conducted to examine the role of participant and methodological diversity. In addition, most studies provided comparable data for meta-analysis similar to those examining the accuracy of RMR equations in other populations [66]. Therefore, the present analysis should contribute to any future research on the accuracy/precision of equations.

It is also acknowledged that meta-analysis results will be influenced by the number of comparisons made. For example, although the Ten-Haaf (2014) (age, weight, height) equation was found to be the best performing equation and had the least heterogeneity, only seven comparison groups from four separate studies contributed to this result. An increase in the number of comparisons and studies to approximately match the Cunningham (1980) (LBM) and the Harris-Benedict (1918) (age, weight, height) equations (29 and 30 comparisons, 17 and 18 separate studies, respectively) is required to better understand the performance of the Ten-Haaf (2014) (age, weight, height) equation relative to more frequently used equations.

For locally derived equations, it is important to note that potential for bias exists when assessing the performance of the equation within the same cohort from which it was initially derived. This may inflate the reported efficacy of the equation. To mitigate this risk, validation of these locally derived equations within separate cohorts is recommended to facilitate an impartial evaluation of their predictive performance. As shown in Table 2, the majority of locally derived equations were not cross-validated internally or externally in athletes. Therefore, until such validation studies are performed, caution is warranted before using these in practice. In addition, the criteria reported for appropriately validating an equation should be considered. This review focused on mean ± SD bias between predicted and measured values, and precision. These variables were selected for several reasons, including being (i) the most commonly reported or possible to derive from existing studies; (ii) able to provide insight into equation performance at the group and individual level; and (iii) in line with previous literature whereby accurate predicted values were defined as those falling within ± 10% of measured values [34,35,36]. Focusing on mean bias alone could mask important inter-individual differences. For example, Balci et al. [32] found no significant bias between measured values and those predicted by the Harris-Benedict (1918) equation, but only 40% of participants were calculated to be within 10% of measured values. Therefore, while mean bias may indicate direction of values on a group level, it should be considered alongside the limits of agreement and percentage precision of an equation when determining the most appropriate equation for an individual. While some studies reported root mean square error, it was not consistently reported or possible to derive. To standardise reporting in future studies, the root mean square error would be valuable to report alongside bias, limits of agreement and precision.
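To illustrate how the reporting metrics mentioned above relate to one another, the following minimal sketch computes mean bias, 95% limits of agreement (taken here as bias ± 1.96 × SD of the differences, which is an assumption about the convention used), and root mean square error for paired predicted and measured values. The function name and the data are hypothetical and are not drawn from any included study.

```python
import math
import statistics


def agreement_summary(predicted, measured):
    """Mean bias, 95% limits of agreement, and RMSE (all in the input units)
    for paired predicted and measured RMR values."""
    diffs = [p - m for p, m in zip(predicted, measured)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD of the differences
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    rmse = math.sqrt(statistics.mean([d * d for d in diffs]))
    return {"bias": bias, "limits_of_agreement": limits, "rmse": rmse}


# Hypothetical values in kcal/day, for illustration only (not study data).
measured_rmr = [1500.0, 1800.0, 2000.0, 1650.0, 2200.0]
predicted_rmr = [1600.0, 1700.0, 2300.0, 1640.0, 1900.0]

summary = agreement_summary(predicted_rmr, measured_rmr)
print(f"Mean bias: {summary['bias']:.1f} kcal/day")
print("Limits of agreement: "
      f"{summary['limits_of_agreement'][0]:.1f} to "
      f"{summary['limits_of_agreement'][1]:.1f} kcal/day")
print(f"RMSE: {summary['rmse']:.1f} kcal/day")
```

In this made-up sample the mean bias is close to zero, yet the limits of agreement and RMSE are wide, which reflects the point above that mean bias alone can mask large individual errors.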

The characteristics of athletes included in the present study should be considered when interpreting the generalisability of findings. Despite inclusion criteria spanning adult athletes aged 18–65 years, data extraction revealed that studies meeting inclusion criteria involved adults of 18–35 years. Studies in masters athletes aged 35–84 years inclusive [67] and in youth athletes [68, 69] have been conducted but did not provide a breakdown of equation performance by age category. Consequently, these studies were not eligible for inclusion but are of interest when considering RMR prediction for athletes in these categories. Further research is needed to validate equations in adult athletes aged 35–65 years to inform recommendations for this age group. Racial differences in RMR should also be considered when interpreting findings. The majority of studies included did not specify participant race, solely reporting nationality and only in some cases. Given evidence that race may influence RMR [70], further studies are needed to incorporate and compare athletes of different racial backgrounds to determine whether this may impact on choice of equation.

4.9.1 Emerging Research and Practical Implications

The increasing interest in identifying the best equations for predicting RMR in athlete groups is evident from the number of recent publications on this topic and the new equations proposed. As shown in Table 2, out of 14 studies proposing equations based on athlete populations, the first was in 1999 by De Lorenzo, and the majority were published in the last decade. It should also be noted that between the final search date (November 2021) and June 2023, five further studies were published that fit the inclusion criteria. These studies are not included in the narrative synthesis and meta-analyses presented. However, in order to compare with the overall findings of the current review, the key findings from these studies, along with the performance of the Ten-Haaf (2014) (age, weight, height) equation (found to be most accurate and precise overall), are discussed below and noted in Table 2.

Of these five studies, three [71,72,73] did not include the Ten-Haaf (2014) (age, weight, height) equation. In these studies, the key findings were as follows:

  • In NCAA collegiate men and women athletes, all prediction equations investigated (Cunningham, De Lorenzo, Freire, Harris-Benedict, Mifflin St. Jeor, Nelson, Owen, Tinsley, Watson, Schofield) were found to underestimate RMR [72].

  • In Korean collegiate soccer players, the Taguchi (2011) equation performed best out of five FFM-based RMR equations [71].

  • In groups of active resistance-trained females and males, the Cunningham (1991) and the De Lorenzo (1999) equations, respectively, were closest to measured values of the seven equations studied [73].

In the two studies that included the Ten-Haaf (2014) (age, weight, height) equation [74, 75], both showed that predicted RMR values did not differ significantly from measured values. Inclusion of these data in the overall meta-analysis for the Ten-Haaf (2014) (age, weight, height) equation resulted in an ES of − 0.02 (95% CI − 0.17 to 0.14, p = 0.83, I2 = 0%) and MD − 2.6 kcal/24h (95% CI − 29.6 to 24.4), compared with the original meta-analysis findings reported of ES = 0.04 (95% CI − 0.16 to 0.23, p = 0.70, I2 = 0%). Therefore, these data support the overall findings of our review.

In relation to the two latter studies, Freire et al. [74] also proposed two new equations derived from 71 ‘high-level’ Brazilian athletes, all minimum national level (majority Tier 4–5, 87% World Championship, 45% Olympic level) from 21 different sports. The equation was cross-validated in a further sample of 31 athletes in the same study. Elsewhere, Van Hooren et al. found the Ten-Haaf (2014) (age, weight, height) equation performed best, while the Oxford equation [20] underestimated RMR in 25 professional cyclists [75]. The authors also developed a new equation for use in professional cyclists. These equations require further validation in future studies.

To aid decision making regarding choice of equation, an overview of considerations based on the evidence in this review is presented in Fig. 3. Due to emerging studies and new equations, the best equations for use are likely to evolve. Therefore, when determining the equation to use, practitioners and researchers should first consider whether there is an appropriately validated equation developed in athletes of similar characteristics to the athlete/s of interest. If not available, equations that have been externally validated in studies of athletes of similar characteristics should be considered and reported accuracy and precision considered when interpreting results (Fig. 3, Table 2, Supplementary Document 7 (see ESM)).

Fig. 3

Flow chart to guide choice of equation for predicting resting metabolic rate (RMR) in an athlete. Ideally an equation developed in or validated in athletes of similar characteristics (considering age, sex, body composition) should be used. If not available, the equations shown (which demonstrated no overall mean bias in the present meta-analysis and have been validated for some athlete groups) could be considered. Under each equation, the key characteristics of athletes that have been studied (including sport and mean body weight), mean bias and precision (where available) are reported. This information should be considered when selecting an equation to best match the athlete/s of interest. Studies that met below average best practice RMR measurement guidelines are shown in red text and should be interpreted more cautiously. Full details of each equation, the population(s) it was developed and validated in along with references are shown in Table 2 and Supplementary Document 7 (see ESM). Created in Biorender.com. LBM lean body mass, RMR resting metabolic rate

5 Conclusion

Many different RMR prediction equations have been used in athletes. These can differ widely in accuracy and precision. Choosing a prediction equation based on an athlete population of similar characteristics (physical characteristics, sex, sport, athlete status) is the preferred option. While no single equation is guaranteed to be superior, the Ten-Haaf (2014) (age, weight, height) equation appears to be most accurate and precise in estimating RMR in general athlete groups. In addition, some equations have been observed to consistently underperform in athletes and should be avoided. Caution should be applied when utilising any prediction equation for the cross-sectional calculation of RMR ratio as a proxy indicator of LEA.