Introduction

Studies in sports science are typically conducted and reported at the group level yet applied at the individual level. However, it has been increasingly questioned whether group-level results can generalize to individuals [9, 13, 20, 33], as group-level findings could conceal relevant inter-individual variability in response to a training stimulus or intervention [17]. This could lead to sub-optimal training or nutrition prescriptions for an individual athlete. When the group-level variability of data does not resemble the individual-level variability, or when individual-level variability exhibits changing variance over time, the data are non-ergodic [29, 32]. Nonergodicity could lead studies to overestimate the accuracy of aggregated statistical estimates and, in turn, the generalizability of conclusions between the group and the individual. In light of this, nonergodicity has been suggested as a threat to human subjects research [13].

Sports nutrition guidelines recommend carbohydrate intake be modulated according to changes in exercise volume [44], with the intention of optimizing training adaptation while ensuring adequate recovery. Under-fueling can cause low energy availability and impaired recovery [30], while over-fueling can cause weight gain and potentially attenuate desired training adaptations [6]. It is commonly reported that perceived ratings of wellness and recovery are sensitive to fluctuations in training load [16, 45], and sleep duration [39]. However, the influence of dietary intake on daily recovery during endurance training is less understood. During short-term periods of intensified endurance training, increasing energy and carbohydrate intake may attenuate symptoms of overreaching [1, 19, 24, 42], although it is unclear if this relationship between carbohydrate intake and daily recovery extends over longer time periods and/or across a range of training volumes in a practical setting.

To determine ergodicity of a given data set, a Cattell data box can be used as previously described by Molenaar and Campbell [32] and Neumann et al. [33]. This can be visualized as a 3-dimensional box with time, measured variables, and individual subjects as the dimensions. For group-level analysis of a variable of interest, a single time point is pooled across all subjects (e.g., all subjects on day 1), repeated for each additional time point, and summarized [e.g., mean, standard deviation (SD), confidence intervals (CI), etc.]. For individual-level analysis the variable is analyzed across all time points separately for each subject and then summarized. If the structure of the group- and individual-level data differ (e.g., statistics of central tendencies, variations, and/or correlations of time series data), the process is considered non-ergodic and results obtained from standard analysis at the group level cannot be applied to the individual [32].
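The two slicing directions of the data box can be illustrated with a minimal sketch. The function names and the toy values below are hypothetical and are not taken from the study; the example only shows how the same data can give matching means but different SDs depending on the direction of analysis.

```python
from statistics import mean, stdev

def group_level(box):
    """Summarize across subjects at each time point, then average over time points."""
    n_days = len(next(iter(box.values())))
    day_means = [mean([series[d] for series in box.values()]) for d in range(n_days)]
    day_sds = [stdev([series[d] for series in box.values()]) for d in range(n_days)]
    return mean(day_means), mean(day_sds)

def individual_level(box):
    """Summarize across time points for each subject, then average over subjects."""
    subject_means = [mean(series) for series in box.values()]
    subject_sds = [stdev(series) for series in box.values()]
    return mean(subject_means), mean(subject_sds)

# Hypothetical recovery scores: one variable, three subjects, four days.
box = {
    "A": [50, 50, 50, 50],
    "B": [70, 70, 70, 70],
    "C": [90, 90, 90, 90],
}

group_mean, group_sd = group_level(box)        # between-subject spread on each day
indiv_mean, indiv_sd = individual_level(box)   # within-subject spread over time
```

In this toy case both approaches give the same mean, but the group-level SD reflects differences between athletes while the individual-level SD is zero, which is the kind of structural mismatch that indicates a non-ergodic process.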

Nonergodicity is relevant in the context of nutrition and training, as evidence-based practitioners and athletes often apply group-level research findings to the individual [15, 44]. Therefore, the purpose of this study was to examine the relationship between daily carbohydrate intake and perceived recovery status and determine if group-level statistics can generalize to individual athletes. To do so, 55 endurance athletes recorded daily measures of self-selected nutrition intake, exercise training, sleep habits, and subjective wellbeing for 12 weeks. We constructed linear models to measure the influence of daily carbohydrate intake on perceived recovery status the following morning while accounting for other factors such as training load, sleep, and muscle soreness. Using the model coefficient for carbohydrate intake, we tested whether the distributions (mean and SD) differed at the group and individual levels. As an exploratory analysis, we also created a decision tree model to understand general traits of athletes that would predict a positive, negative, or non-significant model coefficient for carbohydrate intake. This could serve as the next step in understanding individual-level differences and provide a direction for coaches and practitioners to make better decisions to support the individual athlete’s needs.

Methods

Study Design

Self-selected nutrition intake, exercise training, sleep habits, and subjective wellbeing of endurance athletes were monitored daily over a 12-week period. Throughout the study period, participants were free to perform any type of exercise and consume any type of diet. Results presented herein are from a wider study of endurance training and recovery. Data related to carbohydrate periodization [37] and machine learning predictions [38] have been reported elsewhere. The study was open to males and females aged 18 or older who trained at least 7 h per week, used a smartphone app to track their dietary intake at least 5 days per week, captured HRV daily, and tracked sleep using a wearable device. All study protocols and materials were approved by the Auckland University of Technology Ethics Committee (22/7), and all participants provided informed consent prior to starting the study.

Participants

Fifty-five endurance athletes (61.8% male, aged 42.6 ± 9.1 years, training 11.6 ± 3.9 h per week) took part in the study. The primary sports represented were triathlon (n = 37, 67.3%), running (n = 11, 20.0%), cycling (n = 6, 10.9%), and rowing (n = 1, 1.8%). The self-reported competitive level included professional (2.6%), elite non-professional (qualify and compete at the international level as an age-group athlete, 34.6%), high-level amateur (qualify and compete at National Championship-level events as an age-group athlete, 25.6%), and amateur (enter races but don’t expect to win, or train but do not compete, 37.2%) athletes.

Assessment of Self-Reported Exercise

All exercise was recorded in Training Peaks software (TrainingPeaks, Louisville, CO, USA). Each session was noted for modality (e.g., bike, run, swim), duration, and session rating of perceived exertion (sRPE [14]) using the Borg CR100® scale, which offers additional precision compared with the CR10 scale [10]. Participants were instructed to rate their perceived effort for the whole training session within 1 h of exercise, although sRPE scores are temporally robust from minutes to days following a bout of exercise [14].

Assessment of Self-Reported Dietary Intake

Participants were instructed to maintain their typical dietary habits and record all calorie-containing food and drink consumed for the duration of the 12-week study. Weighing of food was encouraged, but not mandated, and common issues such as underreporting were discussed before starting the study. Participants were not required to record non-caloric fluid ingestion, micronutrient content, or timing of meals. Dietary intake was self-reported using the MyFitnessPal application (www.myfitnesspal.com). Compliance with dietary tracking was monitored by connecting to participant food logs via MyFitnessPal, and enquiring about any unexpected values (determined both visually and using anomaly detection software [11]). Incomplete days of tracking (2.2% ± 4.6% of days per participant) were removed from the data. To aid compliance, participants were recruited who were already regularly tracking their diet (in several cases daily for 4+ years), and so all participants displayed strong intrinsic motivation for habitual diet tracking.

Assessment of Sleep and Subjective Wellbeing

Nightly sleep duration was recorded using wearable devices, which included Oura ring, Whoop strap, Apple Watch, Fitbit, and Garmin models as previously described [38]. These consumer-grade devices offer adequate accuracy in detecting sleep–wake times, but not sleep staging [7, 8, 31, 34, 46]. Each morning participants answered four questions related to subjective wellbeing based on the recommendations of Hooper and Mackinnon [21]. The perceived recovery status (PRS) scale [28] was used to measure overall recovery, with athletes manually typing a number into Training Peaks software. The 100-point version of the scale was used, which has been shown to provide more accurate measures of recovery than the 10-point scale [10]. In addition, ratings of life stress (1–7), sleep quality (1–7), and muscle soreness (1–10) were recorded in the software each morning. Participants were familiarized with all scales prior to starting the study.

Data Preparation

Training load was calculated for each workout as the product of sRPE and duration of exercise in minutes [18], divided by 10 to account for the 100-point scale, and summed into daily totals. External load metrics such as heart rate, power, or pace were not collected because many athletes undertake activities that cannot be quantified on a common scale (e.g., strength training, yoga, or swimming without a HR monitor), and because sRPE is considered a valid and reliable method for calculating training load across modalities [18]. Seven-day rolling measures for training monotony (a measure of day-to-day variability in the weekly training load, calculated as average daily load divided by the standard deviation) and training strain (product of total weekly training load and training monotony) were calculated [18]. A sleep index score was calculated as the product of sleep duration and sleep quality [40]. Dietary macronutrient intake was converted to a relative intake (g per kg body mass) to allow appropriate comparison between athletes.
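The derived measures described above can be sketched as follows. This is an illustration under stated assumptions rather than the study code: the function names and the example week are hypothetical, and the use of the sample SD (rather than the population SD) for monotony is an assumption, since the text does not specify which was used.

```python
from statistics import mean, stdev

def session_load(srpe_cr100, duration_min):
    """sRPE (CR100 scale) x duration in minutes, divided by 10 for the 100-point scale."""
    return srpe_cr100 * duration_min / 10

def training_monotony(daily_loads):
    """Mean daily load over the window divided by its SD (sample SD assumed)."""
    return mean(daily_loads) / stdev(daily_loads)

def training_strain(daily_loads):
    """Total load over the window multiplied by training monotony."""
    return sum(daily_loads) * training_monotony(daily_loads)

def sleep_index(duration_h, quality):
    """Product of sleep duration (h) and sleep quality rating (1-7)."""
    return duration_h * quality

# Hypothetical 7-day window of daily training loads, including two rest days.
week = [300, 0, 450, 300, 0, 600, 450]

one_session = session_load(50, 60)     # 60-min session rated 50 on the CR100 scale
monotony = training_monotony(week)
strain = training_strain(week)
index = sleep_index(7.5, 5)
```

A window with identical loads every day has an SD of zero, so monotony is undefined in that case; a practical implementation would need to decide how to handle it.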

Participants were excluded from the analysis if they were training on average less than 6 h per week (n = 8) or did not log at least 85% of the required data points (n = 3). Participants who did not complete the full 12 weeks due to illness, injury, or drop-out but completed at least 6 weeks of tracking were included in the analysis (n = 11). Among participants included in the analysis (n = 44), 2.4% ± 1.7% of data points were missing. Missing values were imputed at the individual level using multiple linear regression and nearest neighbor algorithms for diet and training measures and using median values for other variables [25].

Analysis

Following the recommendations of previous studies [13, 33], we extracted a subset of data that was symmetrical (i.e., an equal number of participants and observations per participant) to equalize statistical power for analysis at the group and individual levels. Because we had 44 participants in the final analysis, 44 consecutive days were chosen beginning with day 8 to allow for an accurate calculation of training strain (which reflects the previous 7 days of training). Repeated measures correlation [4] was used at the group level to examine the bivariate relationship between the morning (AM) PRS score and prior day carbohydrate intake. Pearson or Spearman correlations, depending on normality of the data as determined by the Shapiro–Wilk test, were used to examine the bivariate relationship between the AM PRS score and prior day carbohydrate intake for each individual.
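The symmetrical window and the prior-day pairing can be sketched as below. The helper names and the toy series are hypothetical; only the Pearson case is shown, and the Shapiro–Wilk check and Spearman alternative described above are omitted for brevity.

```python
from math import sqrt
from statistics import mean

def symmetric_window(n_participants, first_day=8):
    """1-indexed inclusive day range holding one observation day per participant."""
    return first_day, first_day + n_participants - 1

def lagged_pairs(carb_by_day, prs_by_day):
    """Pair carbohydrate intake on day d with the AM PRS score on day d + 1."""
    return list(zip(carb_by_day[:-1], prs_by_day[1:]))

def pearson_r(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

window = symmetric_window(44)   # 44 participants, starting on day 8

# Hypothetical data for one athlete: carbohydrate (g/kg) and AM PRS by day.
carb = [3, 4, 5, 6]
prs = [60, 62, 64, 66]
pairs = lagged_pairs(carb, prs)
r = pearson_r(pairs)
```

Starting on day 8 with 44 participants gives the window from day 8 to day 51, so each participant contributes as many days as there are participants.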

Previous studies of ergodicity have focused on comparisons of univariate distributions and bivariate correlations [13, 33]. However, the relationship between diet and recovery is likely also dependent on other factors relating to training and sleep. To account for this, linear regression models were constructed with AM PRS score specified as the dependent variable, and prior day carbohydrate intake (g/kg), prior day training load, training strain (encompassing the previous 7 days), muscle soreness, and sleep index specified as independent variables. These variables were chosen because they had the highest importance scores in our predictive modeling study [38]. The model coefficient for carbohydrate intake was the primary variable of interest. For group-level analysis, models were made for all 44 athletes together on day 1 and repeated for each of the 44 days, with the results summarized across days (as mean, SD, and 95% CIs). For individual-level analysis, a separate model was created for each athlete, and the results were then summarized. However, data at the individual level are a time series, which refers to a sequence of data points at equally spaced points in time and ordered chronologically [41]. Time series data should not be analyzed with common techniques such as linear modeling if the day-to-day observations are correlated with observations at previous time points (i.e., auto-correlated), because the observations are then not independent of each other and key assumptions of linear regression are violated [23]. Autoregressive Integrated Moving Average (ARIMA) models are commonly used in time series analysis to account for these issues [41]. Therefore, for individual-level analyses we obtained the model coefficient for carbohydrate intake by constructing ARIMA models using the Hyndman-Khandakar algorithm for automatic ARIMA modelling [22]. Ergodicity was considered confirmed if the mean and SD at the group and individual levels were not significantly different [32]. R-squared (R²) was also calculated as an overall measure of model fit.
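The autocorrelation concern that motivates the ARIMA approach can be illustrated with a lag-1 autocorrelation check. This sketch is not part of the reported analysis: the function names and the toy series are hypothetical, and the bound of 1.96/√n is the usual large-sample approximation for white noise, used here only as a rough screen.

```python
from math import sqrt
from statistics import mean

def lag1_autocorrelation(series):
    """Sample autocorrelation between each observation and the one before it."""
    m = mean(series)
    numerator = sum((series[t] - m) * (series[t - 1] - m) for t in range(1, len(series)))
    denominator = sum((x - m) ** 2 for x in series)
    return numerator / denominator

def exceeds_white_noise_bound(series):
    """Rough screen: is |r1| larger than the approximate 95% bound for white noise?"""
    return abs(lag1_autocorrelation(series)) > 1.96 / sqrt(len(series))

# Hypothetical daily scores that alternate strongly from one day to the next.
alternating = [1, -1, 1, -1, 1, -1, 1, -1]
r1 = lag1_autocorrelation(alternating)
flagged = exceeds_white_noise_bound(alternating)
```

A series flagged in this way has day-to-day dependence that an ordinary linear regression would ignore, which is the situation ARIMA models are designed to handle.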

To explore characteristics which might inform the individual responses to carbohydrate intake, a decision tree model was created to predict the classification of statistical significance for the model coefficient of prior day carbohydrate intake from the individual ARIMA models (non-significant, significantly positive, or significantly negative). To determine statistical significance, 95% CIs for the unstandardized regression coefficients were calculated, and values were considered significant if the CIs did not cross zero. The coefficients were organized into these three categories with practical application in mind. That is, coaches or nutritionists might benefit more from knowing if/how an individual responds to carbohydrate in this context, rather than getting a predicted model coefficient for the individual athlete.
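The three-way classification can be sketched as follows. The function name and example values are hypothetical, and the normal-approximation interval (estimate ± 1.96 × standard error) is an assumption; the text does not state how the 95% CIs were derived from the ARIMA fits.

```python
def classify_coefficient(estimate, std_error, z=1.96):
    """Label a coefficient by whether its approximate 95% CI excludes zero."""
    lower = estimate - z * std_error
    upper = estimate + z * std_error
    if lower > 0:
        return "positive"
    if upper < 0:
        return "negative"
    return "non-significant"

# Hypothetical carbohydrate coefficients (PRS points per 1 g/kg) and standard errors.
labels = [
    classify_coefficient(2.0, 0.5),    # CI roughly 1.02 to 2.98
    classify_coefficient(-2.0, 0.5),   # CI roughly -2.98 to -1.02
    classify_coefficient(0.5, 0.5),    # CI roughly -0.48 to 1.48, crosses zero
]
```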

Variables used in the decision tree model were age, training age, competitive level, primary sport, sex, BMI, percentage of training days performing fasted-state training, and average values of daily kcal intake (kcal/kg), daily carbohydrate, fat, and protein intake (g/kg), carbohydrate monotony (mean daily intake/SD), weekly training volume (h), training monotony, and training strain. To obtain the most accurate picture of each individual’s characteristics, all available data points were used for the decision tree models (n = 3588, 81.5 ± 10.4 days per participant), rather than the 44-day subset used to compare group vs. individual responses. Modeling was performed in R using the Tidymodels ecosystem [26]. Hyperparameters were tuned using 100 bootstrap resamples and model accuracy was established using 500 bootstrap resamples. Class imbalances were handled by up-sampling prior to tuning. Cohen’s Kappa (κ) was used as the primary accuracy measure due to the imbalanced, multi-class nature of the outcome variable. κ accounts for the accuracy that would be generated simply by chance, producing values between −1 and 1. We interpret these values using the guidelines of Landis and Koch [27], with values of 0–0.20 considered slight, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, and 0.81–1 almost perfect. In addition, we report positive predictive value and negative predictive value [2]. All analyses were carried out with R version 4.0.3 (The R Foundation for Statistical Computing, Vienna, Austria). Descriptive statistics are provided as mean ± SD.
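Cohen's κ and the interpretation bands listed above can be sketched in a few lines. The study used R and Tidymodels; the Python below is only an illustration, the function names and class labels are hypothetical, and the "poor" label for negative values comes from the Landis and Koch guidelines rather than from the text above.

```python
from collections import Counter

def cohens_kappa(actual, predicted):
    """Agreement between actual and predicted classes, corrected for chance."""
    n = len(actual)
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    actual_counts, predicted_counts = Counter(actual), Counter(predicted)
    expected = sum(actual_counts[c] * predicted_counts[c] for c in actual_counts) / n ** 2
    return (observed - expected) / (1 - expected)

def landis_koch(kappa):
    """Map a kappa value to the Landis and Koch descriptive bands."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"

# Hypothetical three-class example in which no positive cases are predicted.
actual = ["neg", "neg", "non", "non", "non", "non", "pos", "pos"]
predicted = ["neg", "neg", "non", "non", "non", "non", "non", "non"]
kappa = cohens_kappa(actual, predicted)
band = landis_koch(kappa)
```

Raw accuracy in this toy example is 75%, yet κ is lower because a model that favors the majority class would agree with the true labels fairly often by chance alone.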

Results

During the 44-day period selected for the primary analysis, average participant training volume was 11.9 ± 3.4 h per week. Mean daily dietary intake was 39.4 ± 9.0 kcal/kg, 4.0 ± 1.6 g/kg carbohydrate, 1.9 ± 0.4 g/kg protein, and 1.7 ± 0.6 g/kg fat. Average sleep duration was 7.5 ± 0.7 h per night. Bivariate repeated-measures correlation at the group level revealed a significant negative relationship between AM PRS and carbohydrate ingestion the prior day (r = −0.09, 95% CI −0.14 to −0.05, P < 0.001), but this relationship varied considerably among individuals (Fig. 1).

Fig. 1

Bivariate correlations between AM perceived recovery status (PRS) and carbohydrate ingestion the prior day (g/kg). Values given in (a) used repeated-measures correlation analysis. Panel (b) shows Pearson (circle) or Spearman (triangle) correlation values and 95% confidence intervals for each participant, colored based on statistical significance (P < 0.05). Panel (c) shows example scatterplots for the participants with the three highest and three lowest correlation values. Numbers at the top of each panel in (c) relate to the participant ID shown in (b). The light grey points in (c) depict all points for the six participants shown in (c), with each individual’s points shown in color (red for negative correlation values and green for positive)

After accounting for prior day training load, 7-day training strain, muscle soreness, and sleep index via linear modeling, model coefficients for carbohydrate intake were negative for three participants (7%), positive for four participants (9%), and non-significant for 37 participants (84%, Fig. 2). Mean values for model coefficients were similar between the group and individual levels (evidenced by overlapping CIs), whereas SDs were different (i.e., non-overlapping CIs), indicating nonergodicity (Fig. 3). Nonergodicity was also observed in the overall model accuracy. Mean R-squared values were 0.32 (95% CI 0.29–0.35) and 0.40 (95% CI 0.35–0.45) for the group and individual models, respectively, and SD values were 0.11 (95% CI 0.09–0.13) and 0.18 (95% CI 0.14–0.22) for the group and individual models, respectively.
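The overlap comparison can be made explicit using the R-squared intervals reported above. The helper function is hypothetical, and treating intervals that only touch at a rounded boundary as overlapping is an assumption of this sketch.

```python
def intervals_overlap(a, b):
    """True when closed intervals a = (low, high) and b = (low, high) share a point."""
    return a[0] <= b[1] and b[0] <= a[1]

# 95% CIs for R-squared as reported in the text (group model, individual models).
r2_mean_group, r2_mean_individual = (0.29, 0.35), (0.35, 0.45)
r2_sd_group, r2_sd_individual = (0.09, 0.13), (0.14, 0.22)

means_overlap = intervals_overlap(r2_mean_group, r2_mean_individual)
sds_overlap = intervals_overlap(r2_sd_group, r2_sd_individual)
```

Under this convention the mean intervals meet at 0.35, whereas the SD intervals are clearly separated, which matches the pattern seen for the carbohydrate coefficients.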

Fig. 2

Individual participant model coefficients with 95% confidence intervals for the effect of prior day carbohydrate (CHO) intake (g/kg) on AM perceived recovery status (PRS) score after accounting for prior day training load, 7-day training strain, muscle soreness, sleep index (product of sleep duration and sleep quality), using autoregressive integrated moving average (ARIMA) modeling. This can be interpreted as a change of AM PRS in the amount shown on the x-axis for every 1 g/kg increase in daily CHO intake, after holding everything else constant. Green indicates statistically significant positive values, red indicates statistically significant negative values, and grey indicates non-significant values (P > 0.05)

Fig. 3

Density plots of the model coefficients for the effect of prior day carbohydrate (CHO) intake (g/kg) on AM perceived recovery status (PRS) score after accounting for prior day training load, 7-day training strain, muscle soreness, and sleep index (product of sleep duration and sleep quality). Inset table shows mean, SD, and 95% confidence intervals for model coefficients for carbohydrate intake from group and individual level modeling

A decision tree was created to explore potential factors that could provide coaches or practitioners with an indication of an athlete’s relationship between carbohydrate intake and perceived recovery status (Fig. 4). The κ value was 0.52, indicating a moderate level of agreement. Positive predictive value was 0.44, and negative predictive value was 0.87. A confusion matrix of actual and predicted classes is shown in Fig. 5.

Fig. 4

Decision tree predicting the response to prior day carbohydrate (CHO) intake on AM perceived recovery status. Each node indicates the predicted class (negative, non-significant, or positive model coefficients). At each level, following the node to the left corresponds to yes, and following the node to the right corresponds to no

Fig. 5

Confusion matrix of actual and predicted classes for the decision tree model predicting the response to prior day carbohydrate ingestion on AM perceived recovery status. Values shown in the dark green boxes indicate the number of correct predictions for each class

Discussion

The aim of this study was to examine the relationship between daily carbohydrate intake and perceived recovery status and determine if group-level statistics can be generalized to individual athletes. The main outcomes are (1) the data are non-ergodic, meaning group-level findings cannot be generalized to the individual, (2) daily carbohydrate intake does not influence perceived recovery status the following morning for most athletes, after accounting for other influential variables such as training load, muscle soreness, and sleep, (3) for those that are affected the influence can be positive or negative, and (4) we build upon previous work using bivariate correlations to include linear model coefficients and offer a method for understanding the individual responses through a decision tree algorithm.

We observed a large discrepancy between inter- and intra-individual variation (i.e., nonergodicity), as SDs at the individual level were ~2.4 times larger than at the group level. This means there would be a difference when computing statistics by first averaging the data before the calculations versus first calculating the statistics for each individual before averaging these results [20]. Furthermore, mean values may be misleading when determining the influence of carbohydrate intake on AM PRS. At the group level, a traditional interpretation would suggest carbohydrate has minimal influence on AM PRS after accounting for the other variables. Although this would be true for most athletes (~84% of our participants), model coefficients were positive for 9% and negative for 7% of our participants (Fig. 2). This means the individual, rather than the group, should be the unit of analysis to avoid incorrect conclusions [20].

Previous studies have used bivariate correlations to explore ergodicity [13, 33]. In this context, bivariate correlations could be misleading because athletes often increase carbohydrate intake on days with higher training loads [37]. Because of the multifactorial nature of day-to-day recovery, we created linear models to account for these additional factors while focusing the analysis on daily carbohydrate intake. Subjective muscle soreness and sleep index were included because they are two of the most important factors predicting AM PRS scores, as reported by us [38] and others [16, 40, 45]. Training strain was included in the model to account for potential residual fatigue from the previous seven days of training. Training strain (the product of training load and training monotony) is high when high training loads are combined with low variability of load, and low when athletes complete either low training loads or have regular variation in training [18]. Together, these variables account for a substantial amount of the variance in PRS scores and allow a more focused look at the influence of carbohydrate intake.

The beliefs and practices surrounding nutrition and training vary widely among athletes [35, 36]. Although it could be tempting to try to find unifying answers to some of the contrasting beliefs held by athletes (e.g., the positive or negative influence of fasted-state training or increasing carbohydrate intake), the current study underscores the idea that what is best for one athlete may not be best for another. It is also noteworthy that athletes in this study were undertaking self-selected training programs, and results cannot be generalized to short-term periods of intensified training, where increasing energy and/or carbohydrate intake has been shown to attenuate symptoms of overreaching [1, 19, 24, 42]. Future research can examine the influence of carbohydrate intake on daily recovery during periods of prescribed training, as well as explore if/how the influence of carbohydrate changes depending on how closely an athlete matches their daily intake to their training volume and/or intensity, a practice recommended and followed across a diverse range of sports [3, 12, 43]. It would also be of interest to study whether the influence of carbohydrate on training adaptations has any relationship with the influence of carbohydrate on daily recovery.

As a way of translating the interindividual variability from a statistical concept to practical application, a decision tree model was created. Variables such as age, sex, BMI, competitive level, training volume, and habitual dietary patterns were included to better understand what traits or qualities might be related to a certain response to carbohydrate intake. Although interpretation of the decision tree is challenged by the small number of athletes presenting significant model coefficients for carbohydrate intake and the inability of the model to accurately predict positive coefficients (Fig. 5), it can serve as a starting point for understanding how an athlete might be expected to respond to carbohydrate intake. The most important variable was carbohydrate monotony, followed by average daily protein intake. Among athletes with low carbohydrate monotony scores (i.e., larger daily variations in carbohydrate intake), those with a lower average daily protein intake were likely to have a negative response to carbohydrate intake, whereas those with a higher daily protein intake were more likely to have a non-significant effect of carbohydrate intake on AM PRS score (Fig. 4). Athletes with a higher daily fat intake and those training less than 12 h per week were also less likely to be influenced by changes in daily carbohydrate intake (Fig. 4). The model displayed moderate accuracy (κ value of 0.52), although the ability of the model to learn from the data was challenged by the small and imbalanced data set. As shown in Fig. 5, negative and non-significant outcomes were predicted well, but the model did not accurately predict any positive responders. Nevertheless, we feel this approach can be adopted by others who wish to better understand individual responses to a given intervention or stimulus.

There are several limitations to this study, primarily related to the use of self-report measures. Data integrity was checked based on the number of missing values, and by looking for unexpected values. However, it is possible that participants did not always enter data as accurately as possible. There is also the risk of bias in reporting if an athlete is aware that a coach or a researcher will be seeing their data, answering based on what they think is desirable. Imputation of missing values could influence the findings, although this impacted a relatively small amount of the data (2.4% ± 1.7% of data points per participant). In addition, the limited number of data points, particularly with unbalanced classes, made training and interpreting the decision tree model challenging. Finally, alcohol is known to influence recovery [5], but was not recorded by participants in this study.

Conclusion

Our findings suggest the influence of dietary carbohydrate intake on daily recovery differs at the group and individual level. Therefore, inferences may not be generalized from the group to the individual, and practical recommendations should be based on individual analysis. Furthermore, at the group level, the previous day’s carbohydrate intake did not influence the perceived recovery status of athletes training ~12 h per week. This research also adds to the literature around ergodicity in sports science, an emerging concept that should be routinely considered as part of the statistical analysis process. Future research in athletes should focus on individual responses to better understand the relationship between nutrition, training, and recovery for each athlete.