Study design
This was a cross-sectional study that involved three visits to the University of Bath, UK. All participants provided written informed consent prior to participating. The study was performed in accordance with the Declaration of Helsinki, was approved by the Research Ethics Approval Committee for Health at the University of Bath (REF: EP 16/17 141) and the South West-Bristol NHS Research Ethics Committee (17/SW/0269), and was registered on ClinicalTrials.gov (NCT03029364).
Briefly, participants completed two matched trial days (Trial A and Trial B) separated by 7–28 days that involved the assessment of anthropometrics, resting metabolic rate, a fasting venous blood sample and a FATMAX test. A third visit (Trial C), which involved a dual-energy X-ray absorptiometry (DEXA) scan to assess body composition, was organised 2–7 days after Trial B. Trials were completed after an overnight fast (10–12 h) and started at a similar time of day (± 1 h within participant; 0630–1230 h). Over the 48 h preceding each trial, participants were asked to: (a) abstain from alcohol and strenuous physical activity; and (b) wear a physical activity monitor and replicate their dietary intake and physical activity (all confirmed by verbal questioning). Additionally, over the 7 days before Trial A, participants recorded a self-weighed diet diary and wore a physical activity monitor. On the morning of each trial, participants minimised physical activity and consumed 568 mL of water upon waking [see the accompanying open access readme file for study protocol deviations (Chrzanowski-Smith et al. 2020)]. Participants also maintained their habitual lifestyle throughout their involvement in the study. All trials (within-subject) were performed under similar laboratory conditions [particularly for ambient temperature (CV = 4%) and barometric pressure (CV = 1%), with more variance in humidity (CV = 16%); p values for systematic differences between Trial A and Trial B > 0.187], and ad libitum water intake and use of fans were permitted.
Participants
Ninety-nine healthy male and female adults (aged 18–65 years) were recruited from the South West region of the UK. Exclusion criteria included: age < 18 or > 65 years; current or any history of cardio-pulmonary, metabolic or musculoskeletal disease; breastfeeding or being (or potentially being) pregnant; a body mass index < 18.5 or > 35 kg·m−2; not being willing to meet the demands of the study or maintain their habitual lifestyle during their involvement; not being weight stable (± 5% body mass; self-reported) for at least the 3 months prior to their involvement; or any conditions or concurrent behaviour (including medication) that may have posed undue personal risk to the participant or introduced bias to the study. Participant characteristics are presented in Tables 1 and 2. In female participants who were eumenorrheic and not on contraceptive medication, trials were scheduled (based on self-reported and predicted phases) to take place in the same phase of the menstrual cycle. The menstrual cycle was split into two broad phases: the follicular and the luteal (which included ovulation). The success in controlling for menstrual cycle phase between Trial A and Trial B (based on self-report and predicted phases) was then objectively verified by the analysis of oestradiol and progesterone concentrations. As oestradiol concentrations can vary widely across the menstrual cycle, the follicular and luteal phases were determined by a progesterone concentration of < and ≥ 5 nmol·L−1, respectively (Oosthuyse et al. 2005). As shown in Supplementary Table 1, success in controlling for menstrual cycle phase varied. In all females whose menstrual cycle phase was matched between Trial A and Trial B (i.e. were tested in the same phase), testing occurred in the follicular phase (a progesterone concentration of < 5 nmol·L−1). If Trial A and Trial B occurred in different phases of the menstrual cycle, participants were classed as non-matched.
Female participants for whom it was unknown what phase of the menstrual cycle Trial A and/or Trial B occurred in (e.g. progesterone concentrations were not available) were grouped as ‘unknown’. Female participants who self-reported the absence of a menstrual cycle for ≥ 365 days were classified as post-menopausal, where low concentrations of oestradiol and progesterone were apparent (Supplementary Table 1). Contraceptive use was categorised into four sub-groups: combined pill, progesterone-only pill, intrauterine system (IUS) or intrauterine device (IUD).
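The progesterone-based grouping described above can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and the only value taken from the text is the 5 nmol·L−1 cut-off; it is not the authors' own procedure or software.

```python
from typing import Optional

# Threshold stated in the text (nmol/L): < 5 follicular, >= 5 luteal.
PROGESTERONE_CUTOFF = 5.0


def phase(progesterone: Optional[float]) -> Optional[str]:
    """Return the broad cycle phase, or None if no concentration is available."""
    if progesterone is None:
        return None
    return "follicular" if progesterone < PROGESTERONE_CUTOFF else "luteal"


def match_status(prog_trial_a: Optional[float], prog_trial_b: Optional[float]) -> str:
    """Classify a eumenorrheic participant as matched, non-matched or unknown."""
    phase_a, phase_b = phase(prog_trial_a), phase(prog_trial_b)
    if phase_a is None or phase_b is None:
        return "unknown"  # phase could not be verified for one or both trials
    return "matched" if phase_a == phase_b else "non-matched"
```

For example, hypothetical concentrations of 2.1 and 12.0 nmol·L−1 in Trial A and Trial B would be classed as non-matched.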
Table 1 Participant demographic and lifestyle characteristics
Table 2 Participant metabolic characteristics and metabolite and hormone concentrations
Anthropometrics
Anthropometric measurements were performed upon participant arrival at the laboratory. Body stature was measured to the nearest 0.1 cm using a wall-mounted stadiometer (Holtain Ltd, Pembrokeshire, UK) alongside body mass to the nearest 0.1 kg using electronic weighing scales (BC-543 Monitor, Tanita, Tokyo, Japan). During Trial C, body stature and body mass were assessed in addition to waist and hip circumference [to the nearest 0.1 cm using a non-elastic measuring tape (SECA 201, Hamburg, Germany)] and a whole-body dual-energy X-ray absorptiometry scan was taken to quantify fat and fat-free mass (Discovery, Hologic, Bedford, UK).
Blood sample and analysis
After resting metabolic rate was assessed, a 10-mL whole venous blood sample was obtained from an antecubital vein (BD Vacutainer Safety Lok, BD, USA). Blood samples were divided equally between a 5-mL ethylenediaminetetraacetic acid-coated tube (K3 EDTA, Sarstedt, Germany) and a 10-mL serum/clotting activator tube (Serum Z/10 mL, Sarstedt, Germany) for plasma and serum separation, respectively. Samples for plasma were immediately centrifuged (1700g for 15 min at 4 °C), whereas serum tubes were left to clot for 20–30 min at room temperature prior to centrifugation (standardised within-participant; Heraeus Biofuge Primo R, Kendro Laboratory Products Plc., UK). The plasma and serum samples, alongside the buffy coat layer from the K3 EDTA tube, were dispensed equally into 0.5-mL aliquots and immediately frozen at − 20 °C, before longer-term storage at − 80 °C for later batch analysis. The plasma samples were analysed for concentrations of various metabolites and hormones according to manufacturer instructions. Total plasma non-esterified fatty acids (NEFA; Cat No: FA115; intra-assay < 5% and inter-assay < 5%), glucose (Cat No: GL3815; < 5% and < 6%), lactate (Cat No: LC3980; < 4% and < 5%) and triglycerides (Cat No: TR3823; < 4% and < 4%) concentrations were run in singlicate on a Daytona Rx Series (Randox Laboratories, Crumlin, NI, USA). Total 17β-oestradiol (Elecsys Estradiol III; < 7% and < 11%) and progesterone (Progesterone III; < 11% and < 23%) concentrations were run in singlicate on a Cobas 8000 (Modular analytics Cobas e 602, Roche Diagnostics, Rotkreuz, Switzerland). Total plasma insulin concentrations were analysed by an enzyme-linked immunosorbent assay (ELISA) kit in duplicate (Cat No: 900095, Crystal Chem, Illinois, USA) with absorption determined by a microplate reader (SPECTROstar Nano, BMG LABTECH, Ortenberg, Germany) at wavelengths specified by the manufacturer (intra-assay CV < 2%; inter-assay CV < 24%).
FATMAX test
After resting metabolic rate was assessed and a fasting venous blood sample was obtained, participants completed a FATMAX test. This test adopted a protocol previously validated in individuals who were trained (Achten et al. 2002) and in individuals who had low cardiorespiratory fitness (Chrzanowski-Smith et al. 2018). Briefly, the FATMAX test was an incremental graded cycling test to volitional exhaustion completed on a mechanically braked cycle ergometer (Monark Peak Bike Ergomedic 894E, Varberg, Sweden). The graded test comprised four-min stages for the first seven stages and two-min stages from the eighth stage onwards. The initial power output was ~ 30 or 40 W and increased by ~ 25 W (excluding the 10-W increment between first and second stages in the 30-W protocol) over the next five and six stages, respectively, and by ~ 50 W from stage seven onwards. One-min expired gas samples, heart rate and RPE were collected in the final min of the first seven stages and upon the participant’s signal of one min remaining before volitional exhaustion. The graded test was used to determine:
a) Peak fat oxidation (g·min−1);
b) FATMAX (expressed as a % of \(\dot{V}\)O2peak);
c) Peak power output (W; power output of the last completed stage, plus the fraction of time in the final non-completed stage, multiplied by the Watt increment of that stage);
d) An estimate of peak oxygen consumption (\(\dot{V}\)O2peak; mL·kg−1·min−1).
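The peak power output definition in item (c) reduces to a one-line calculation. The sketch below is illustrative only; the function and parameter names are hypothetical and the example values are invented rather than taken from the study.

```python
def peak_power_output(last_completed_w: float,
                      seconds_in_final_stage: float,
                      stage_duration_s: float,
                      increment_w: float) -> float:
    """Power of the last completed stage plus the pro-rata share of the
    increment for the final, non-completed stage (all powers in watts)."""
    if not 0 <= seconds_in_final_stage <= stage_duration_s:
        raise ValueError("time in final stage must lie within the stage duration")
    fraction_completed = seconds_in_final_stage / stage_duration_s
    return last_completed_w + fraction_completed * increment_w
```

For instance, a participant who completed a 205-W stage and then stopped 60 s into a two-min stage with a 50-W increment would be assigned 205 + 0.5 × 50 = 230 W.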
Three data analysis approaches were applied to determine PFO and FATMAX. These involved: (1) the measured values approach [MV; the stage with the highest recorded fat oxidation value and the corresponding \(\dot{V}\)O2 (Achten et al. 2002)]; (2) the fitting of a least squares second-order polynomial curve to the measured fat oxidation rates (P2) (Hansen et al. 2019; Stisen et al. 2006); and (3) the Sine model [SIN; a mathematical model that applies a sinusoidal equation to the observed fat oxidation rates and takes into account the dilation, symmetry and translation of the fitted curve (Chenevière et al. 2009). This model estimate was obtained using an Excel spreadsheet with a solver function, kindly provided by Dr Xavier Chenevière].
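To make the MV and P2 approaches concrete, the following sketch estimates PFO and FATMAX from stage-wise data. It assumes exercise intensity is expressed as % of \(\dot{V}\)O2peak and uses NumPy's `polyfit`; the function names are hypothetical, the SIN model is not reproduced, and how the authors handled a fitted peak lying outside the measured range is not stated in the text.

```python
import numpy as np


def measured_values_pfo_fatmax(intensity_pct, fat_ox):
    """MV approach: the stage with the highest measured fat oxidation rate."""
    x = np.asarray(intensity_pct, dtype=float)
    y = np.asarray(fat_ox, dtype=float)
    i = int(np.argmax(y))
    return float(y[i]), float(x[i])  # (PFO in g/min, FATMAX in %VO2peak)


def p2_pfo_fatmax(intensity_pct, fat_ox):
    """P2 approach: least-squares fit of y = a*x^2 + b*x + c, then take the
    vertex of the parabola as (PFO, FATMAX)."""
    x = np.asarray(intensity_pct, dtype=float)
    y = np.asarray(fat_ox, dtype=float)
    a, b, c = np.polyfit(x, y, deg=2)
    if a >= 0:
        # An upward-opening (or flat) curve has no maximum to report.
        raise ValueError("fitted curve has no maximum")
    fatmax = -b / (2 * a)          # x-coordinate of the vertex
    pfo = c - b ** 2 / (4 * a)     # y-coordinate of the vertex
    return float(pfo), float(fatmax)
```

With at least three stages both functions return a (PFO, FATMAX) pair; the two approaches can disagree when the true peak falls between measured stages, because MV is restricted to the sampled intensities.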
Metabolic measurements
Expired gas samples were collected into 100–150 L Douglas bags (Cranlea and Hans Rudolph, Birmingham, UK) via a mouthpiece connected to a two-way, T-shaped non-rebreathing valve (Model 2700, Hans Rudolph Inc, Kansas City, USA) and Falconia tubing (Hans Rudolph Inc, Kansas City, USA). Concentrations of O2 and CO2 were measured in a known volume of each sample via paramagnetic and infrared transducers, respectively (Mini MP 5200, Servomex Group Ltd., Crowborough, East Sussex, UK), until values were stable. The sensors were calibrated using a two-point (low and high) calibration with known gas concentrations (low: 99.998% nitrogen, 0% O2 and CO2; high: balance nitrogen mix, 20.06% O2, 8.11% CO2) (BOC Industrial Gases, Linde AG, Munich, Germany). Concurrent measurements of inspired air composition were made during collections of expired gas samples to adjust for changes in ambient O2 and CO2 concentrations (Betts and Thompson, 2012). Indirect calorimetry was used to determine: \(\dot{V}\)O2 (L·min−1); \(\dot{V}\)CO2 (L·min−1); and rate of fat oxidation [g·min−1; estimated by Frayn’s stoichiometric equations assuming urinary nitrogen excretion was negligible (Frayn, 1983)].
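As an illustration of the fat oxidation calculation, the sketch below uses the coefficients commonly quoted from Frayn (1983), with the urinary nitrogen term set to zero as described above. The coefficients are not printed in this section, so treat them as an assumption to be checked against the original reference; the function name is hypothetical.

```python
def fat_oxidation_g_per_min(vo2_l_min: float, vco2_l_min: float) -> float:
    """Rate of fat oxidation (g/min) from VO2 and VCO2 (both in L/min).

    Assumed form (Frayn 1983): fat = 1.67*VO2 - 1.67*VCO2 - 1.92*n,
    with urinary nitrogen excretion n taken as negligible (n = 0).
    """
    return 1.67 * (vo2_l_min - vco2_l_min)
```

For example, hypothetical values of 1.5 L·min−1 for \(\dot{V}\)O2 and 1.2 L·min−1 for \(\dot{V}\)CO2 give approximately 0.50 g·min−1.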
Resting metabolic rate (RMR; kcal·day−1) and resting rates of fat oxidation (g·min−1) were measured following guidelines for best practice (Compher et al. 2006): after 15 min of quiet rest in a semi-supine position, RMR was measured by indirect calorimetry from at least two expired gas samples of five-min duration that agreed within 100 kcal·day−1.
Habitual lifestyle assessment
Habitual physical activity levels were assessed by asking participants to wear a physical activity monitor (Actiheart™, Cambridge Neurotechnology, Papworth, UK) over the 7 days prior to Trial A. Ideally, a minimum of four valid days (monitor worn for ≥ 90% of time in a day and no heart rate signal for < 30% of the time) was required to determine habitual physical activity levels (with the exception of n = 5 participants for whom only three valid days were available). Additionally, energy expenditure and heart rate values from rest and the FATMAX test were entered in the Actiheart™ software to derive an individually calibrated model estimate of physical activity energy expenditure (kcal·day−1) and mins per day spent in different physical activity thresholds. To assess pre-trial physical activity standardisation, the monitor was also worn for the 48 h before Trial A and Trial B. Habitual energy and macronutrient intake were assessed by a self-weighed diet diary. Participants were provided with a set of scales (Pro Pocket ScaleTOP2KG, Smart Weigh Scales) and asked to keep a written record of their food and fluid intake for at least 4 days in the week preceding Trial A (including at least one weekend day). Additionally, the two days immediately prior to Trial A were recorded, so that participants could replicate this intake on the two days prior to Trial B. Diet records were analysed using Nutritics software (Nutritics Ltd., Dublin, Ireland).
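The valid-day rule for the activity monitor can be expressed as a simple filter. This is a sketch under the thresholds stated above (worn for ≥ 90% of the day, no heart rate signal for < 30% of the time, ideally at least four valid days); the function names and the fraction-based inputs are hypothetical and do not reflect the Actiheart™ software's own interface.

```python
def is_valid_day(wear_fraction: float, no_hr_signal_fraction: float) -> bool:
    """A day counts if the monitor was worn for >= 90% of the day and
    the heart rate signal was missing for < 30% of the time."""
    return wear_fraction >= 0.90 and no_hr_signal_fraction < 0.30


def has_enough_valid_days(days, minimum_valid_days: int = 4) -> bool:
    """days: iterable of (wear_fraction, no_hr_signal_fraction) per day."""
    valid = sum(1 for wear, no_hr in days if is_valid_day(wear, no_hr))
    return valid >= minimum_valid_days
```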
Statistical analysis
Assumptions (normality, heteroscedasticity, linearity and proportional bias) for the below statistical tests were explored by a combination of visual inspection (histograms, skewness and kurtosis values and scatter graphs) and quantitative statistical tests (Shapiro–Wilk test, correlations, Levene’s test, Mauchly's Test of Sphericity) on raw data and residuals of comparisons. Parametric statistical tests were conducted when assumptions were met; otherwise, either the data were transformed (natural logarithm followed by anti (inverse)-log to facilitate the interpretation of data in their raw units) or the appropriate non-parametric equivalent was performed. ANOVA models were conducted irrespective of normality due to robustness against violations of normality (Maxwell 1990).
A range of a priori statistical analysis tests were performed to assess the day-to-day reliability of PFO (g·min−1) and FATMAX (%\(\dot{V}\)O2peak) as advocated (Atkinson and Nevill 1998): (1) systematic bias was assessed by dependent sample t tests and mixed-design analysis of variance (within-subject: Trial A and Trial B; between-subject: group category as per below). Bonferroni-adjusted p values were applied to control for multiple comparisons when significant main or interaction effects were detected in the ANOVA models; (2) an index of relative reliability was obtained by bivariate correlation (Pearson correlation coefficient; r); (3) the absolute day-to-day reliability was investigated by within-subject coefficient of variation [CV; root mean square method (Bland 2006)]; typical error [TE; SD of difference between scores/√2 (Hopkins 2015a)]; and Bland–Altman plot with mean difference (bias) and 95% limits of agreement (LoA) (Bland and Altman 1986). Mean difference was calculated as Trial A minus Trial B; and (4) individual data were plotted on graphs (as shown in Supplementary figures).
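The reliability indices listed in (2) and (3) can be computed from paired Trial A and Trial B values as sketched below. This is an illustrative implementation under stated assumptions: the within-subject CV follows the root mean square approach for duplicate measurements (per-participant variance (A − B)²/2 divided by the squared pair mean), the LoA use the conventional ± 1.96 × SD of the differences, and the function name is hypothetical. It is not the authors' Excel or SPSS workflow.

```python
import numpy as np


def reliability(trial_a, trial_b):
    """Day-to-day reliability indices for paired measurements."""
    a = np.asarray(trial_a, dtype=float)
    b = np.asarray(trial_b, dtype=float)
    diff = a - b                          # Trial A minus Trial B
    bias = diff.mean()
    sd_diff = diff.std(ddof=1)            # sample SD of the differences

    # Root mean square within-subject CV for two measurements per person.
    within_var = diff ** 2 / 2
    pair_mean = (a + b) / 2
    cv_pct = 100 * np.sqrt(np.mean(within_var / pair_mean ** 2))

    return {
        "r": float(np.corrcoef(a, b)[0, 1]),            # relative reliability
        "bias": float(bias),
        "loa": (float(bias - 1.96 * sd_diff),
                float(bias + 1.96 * sd_diff)),           # 95% limits of agreement
        "te": float(sd_diff / np.sqrt(2)),               # typical error
        "cv_pct": float(cv_pct),
    }
```

Calling `reliability(pfo_trial_a, pfo_trial_b)` on two equal-length sequences returns all four indices in the units of the input (g·min−1 for PFO), with the CV in percent.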
These tests were performed on the whole sample and on a range of sub-group analyses:
i. Whole sample (n = 97). Systematic bias was assessed by dependent sample t tests. As PFO and FATMAX were not available for n = 2 participants in one or both trials (participant fainting and hyperventilation, respectively), these participants were excluded, leaving a maximum sample size of n = 97.
ii. Data analysis approach (MV, P2 and SIN; n = 72; n = 34 females). A two-way repeated measures ANOVA (within subject; Trial: Trial A and Trial B; Model: MV, P2 and SIN) was performed for this analysis. This analysis primarily investigated the day-to-day reliability of each individual data analysis approach rather than the level of agreement between modelling approaches. Mathematical modelling could not be performed for n = 25 participants due to a lack of fat oxidation data points or a plateau in the data.
iii. Sex (n = 50 males and 47 females). Participants were divided into male and female based on self-report from a participant questionnaire.
iv. Cardiorespiratory fitness (n = 97). Participants were categorised into three training classifications (untrained, recreationally trained, highly trained) based on the corresponding \(\dot{V}\)O2peak thresholds outlined for males and females (De Pauw et al. 2013; Decroix et al. 2016). Due to the low sample size (n = 2), the highly trained group was excluded from reliability statistics.
v. Fat Mass Index (n = 96). Participants were classified into four categories (fat deficient, healthy, excess adiposity and obese) as identified by Kelly et al. (2009). As only one participant was classified as obese, this individual was excluded from this sub-group analysis.
vi. Physical activity level (n = 94). Participants were categorised into four physical activity level classifications (sedentary, low active, moderately active, very active) as identified by Brooks et al. (2004). Physical activity data were not available for n = 3 participants and, due to the low sample size (n = 3), the sedentary group was excluded from reliability statistics.
vii. Menstrual cycle status and contraceptive use (females only, n = 47). Female participants were divided into eight categories [menstrual cycle matched (Trial A and Trial B occurred in the same phase of the menstrual cycle, verified by progesterone concentrations), menstrual cycle non-matched (Trial A and Trial B occurred in different phases of the menstrual cycle, verified by progesterone concentrations), unknown (eumenorrheic but stage of the menstrual cycle when Trial A and Trial B took place was unknown), contraceptive use combined pill, contraceptive use progesterone-only pill, contraceptive use intrauterine device (IUD), contraceptive use intrauterine system (IUS) and post-menopausal]. Due to the low sample sizes in the progesterone-only pill, IUD, IUS and post-menopausal categories (n = 4, 5, 3 and 3, respectively), these sub-groups were excluded from reliability analyses.
Additionally, the above statistical tests were also employed to explore the level of agreement between the three analysis approaches (MV, P2 and SIN) to determine PFO and FATMAX. Estimates of PFO and FATMAX represent the average of Trial A and Trial B, where a one-way ANOVA [within-subject (three levels): MV, P2 and SIN] was used to assess model differences and systematic bias. The sample size for this analysis was n = 72 (n = 34 females).
Log transformation and antilog were required for FATMAX analyses of: (1) whole sample, (2) data analysis approach (reliability of individual models), (3) sex, (4) cardiorespiratory fitness (\(\dot{V}\)O2peak), and (5) physical activity level. Readers should note that the interpretation of these analyses is distinctly different from when log-transformation was not performed (see Supplementary material 1A for a description). Pearson correlation coefficient, TE and CV were computed for logged data via analysis recommended by Hopkins (2015b). When transformation did not improve the proportional bias (differences plotted against mean) and/or heteroscedasticity (absolute differences plotted against mean) in the data (or did not do so consistently across sub-groups), the raw non-transformed data were used for analysis (as such, more caution is required for the interpretation of these results). This was apparent for FATMAX analysis of: (1) fat mass index, (2) menstrual cycle status and contraceptive use, and (3) level of agreement between data analysis approaches. Pearson correlation coefficients were interpreted by an r of < 0.40, 0.40–0.74 and ≥ 0.75 for poor, fair to high and excellent, respectively (Dandanell et al. 2017a, b). There is no consensus to date on what constitutes an acceptable level of reproducibility for CVs, TEs or 95% LoAs for PFO and FATMAX. However, a mean CV of 8% and 11% for the day-to-day reliability of PFO and FATMAX has previously been stated as acceptable (Hansen et al. 2019). Additionally, Nordby et al. (2015) and Rosenkilde et al. (2015) report an exercise training-induced increase in PFO and FATMAX of ~ 0.13 to 0.16 g·min−1 and 5–8%\(\dot{V}\)O2peak, respectively, compared to non-exercising control groups. Thus, these values were used to help interpret the day-to-day variability values produced for CVs and particularly 95% LoAs in PFO and FATMAX.
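To illustrate why log-transformed results are read differently, the sketch below computes reliability indices on natural-log data and back-transforms them, so that bias and LoA become ratios (Trial A / Trial B) and the typical error becomes a percentage CV via 100 × (e^TE − 1). This is a common back-transformation and is offered only as an assumption about the general approach; Supplementary material 1A and the Hopkins (2015b) analysis are not reproduced here, and the function name is hypothetical.

```python
import numpy as np


def log_reliability(trial_a, trial_b):
    """Reliability indices from natural-log-transformed paired data."""
    log_a = np.log(np.asarray(trial_a, dtype=float))
    log_b = np.log(np.asarray(trial_b, dtype=float))
    diff = log_a - log_b                   # log(A) - log(B) = log(A / B)
    mean_diff = diff.mean()
    sd_diff = diff.std(ddof=1)
    te_log = sd_diff / np.sqrt(2)          # typical error on the log scale

    return {
        # Antilog of the mean difference is a ratio, not a raw-unit difference.
        "ratio_bias": float(np.exp(mean_diff)),
        "ratio_loa": (float(np.exp(mean_diff - 1.96 * sd_diff)),
                      float(np.exp(mean_diff + 1.96 * sd_diff))),
        "cv_pct": float(100 * (np.exp(te_log) - 1)),
    }
```

A ratio bias of 0.91, for example, would mean that Trial A values were on average about 9% lower than Trial B values, rather than lower by a fixed number of %\(\dot{V}\)O2peak units.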
Additionally, prior to any of the above analyses, a sensitivity analysis performed in women found that the differences in concentrations of oestradiol and progesterone between Trial A and Trial B did not affect estimates of PFO and FATMAX (see Supplementary material 1B). This was performed due to the speculation that substrate utilisation during exercise may differ across the menstrual cycle only if concentrations of oestrogen differ by twofold or more between testing occasions (Oosthuyse and Bosch 2010). Consequently, a sensitivity analysis also found no differences in the interpretation of results from menstrual cycle status and contraceptive use when the above statistical tests were performed with and without individuals whose concentrations of oestradiol and progesterone were ≥ two- and < twofold between trials, respectively.
Descriptive and statistical analyses were run in Microsoft Excel (2013) and IBM SPSS Statistics version 25 for Windows (IBM, New York, USA), and graphs were created in GraphPad Prism 7 software (La Jolla, CA, USA). Data are presented as means ± SD (or 95% confidence intervals for r, CV and TE) unless otherwise stated, and statistical significance was accepted at p ≤ 0.05.