Subjects
Eleven male participants (mean ± SD: age 22 ± 3 years, stature 1.74 ± 0.09 m, body mass 66 ± 8 kg) volunteered to participate in this study. They were well-trained competitive cyclists with a minimum of 3 years of cycling experience and 200 km of training per week. This study was approved by the Ethics Committee of the University of Rome Sapienza in compliance with the Declaration of Helsinki. Written informed consent was obtained from all participants.
Study protocol
Participants reported to the laboratory on six separate occasions over a 3-week period, with visits separated by at least 48 h. On the first visit, participants performed a ramp incremental exercise test, followed by a TTE test, with the two tests separated by 30 min of recovery. On the second visit, participants performed two TTE tests separated by 30 min of recovery. The 30-min recovery time used in both visits is in line with previous studies that performed multiple performance tests in the same visit with the aim of obtaining the power–duration relationship [30, 31]. The three TTE tests differed in exercise intensity and, therefore, in exercise duration. This allowed us to obtain the power–duration relationship for each individual, which was used to set the exercise intensity for the experimental TTE tests. Four identical experimental TTE tests were then performed on separate days (visits 3–6). Based on the power–duration relationship, the exercise intensity corresponding to a TTE of 1200 s was selected for these four tests. All protocols were performed on an electromagnetically-braked cycle ergometer (Lode Excalibur Sport, Groningen, The Netherlands). The positions of the ergometer seat and handlebar during the first visit were recorded for each participant and reproduced in the following visits.
Ramp incremental test
The ramp incremental test was preceded by a 5-min warm-up at 100 W, 3 min of rest, and 2 min pedalling at 20 W. The test consisted of a continuous ramped increase in work rate of 30 W·min⁻¹, starting from 20 W. Preferred pedalling cadence was selected by each participant and was kept constant throughout the test, which terminated when cadence fell by more than 10 rpm, despite strong verbal encouragement. The peak power output (PPO) was defined as the highest power output achieved at exhaustion, registered to the nearest 1 W, and the \(\dot{V}{\text{O}}_{{2\;{\text{peak}}}}\) as the highest value of a 30-s average.
Before the ramp incremental test, participants were given standard instructions for providing RPE using the Borg 6–20 scale [32]. During the ramp incremental test, participants were asked to rate their perceived exertion on the RPE scale every minute during exercise and, retrospectively, at exhaustion. This procedure served as a familiarization with the scale.
Preliminary TTE tests
Three preliminary TTE tests were performed during visits 1 and 2 to obtain the power–duration relationship for each individual. These three TTE tests were performed on average at 87 ± 0.3%, 76 ± 1.3%, and 70 ± 2.7% of the ramp PPO, to result in TTEs of approximately 4, 10, and 18 min, although between-subject variability was expected. These durations were chosen because they are suitable for a good prediction of the power output corresponding to a TTE of 20 min. The 76% PPO test was performed during the first visit after the ramp incremental test, while the 70% PPO test was performed before the 87% PPO test during the second visit. The three performance data points were then used to derive the power–duration relationship for each individual using a power law mathematical function with the following equation: \(Y = cX^{b}\), where Y (power output) and X (time) are the two variable quantities, c is the theoretical maximal power output at time zero, and b is the scaling exponent describing the decrease in power output over time. For more detailed information on the use of the power law function in endurance sports, see García-Manso et al. [33]. In the present study X was fixed to 1200 s, and the corresponding power output was obtained for each individual.
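As an illustration of the fitting step described above, the following minimal Python sketch fits the power law by ordinary least squares on log-transformed data and then evaluates it at 1200 s. The fitting approach (log-log regression), the function names, and the three data points are assumptions made for illustration only; they are not the data of the present study, and the fitting routine actually used may have differed.

```python
import math


def fit_power_law(times_s, powers_w):
    """Fit P = c * t**b by least squares on log-transformed data.

    Assumption: a log-log linear regression is an acceptable way to
    estimate c and b; the source does not state the fitting routine.
    """
    xs = [math.log(t) for t in times_s]
    ys = [math.log(p) for p in powers_w]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # scaling exponent (negative)
    c = math.exp(mean_y - b * mean_x)   # model value at t = 1 s
    return c, b


def power_at(c, b, t_s):
    """Power output predicted by the fitted power law at time t_s."""
    return c * t_s ** b


# Hypothetical preliminary tests: TTE (s) and power output (W).
times_s = [240.0, 600.0, 1080.0]
powers_w = [350.0, 305.0, 280.0]

c, b = fit_power_law(times_s, powers_w)
target_power = power_at(c, b, 1200.0)   # power prescribed for a 1200-s TTE
```

Because the longest preliminary test (about 18 min) is close to the 1200-s target, the prescribed power is obtained with only a small extrapolation beyond the measured range.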
Preferred pedalling cadence was selected by each participant before the first preliminary TTE test. The participant was asked to maintain the cadence within a range of preferred cadence ± 7 rpm during the test. This was done to reduce potential changes in physiological and psychological variables due to changes in pedalling cadence. The participant was informed that a 10-s countdown would start whenever pedalling cadence fell outside the predefined range. If the cadence returned to a value within that range before the countdown was completed, the test continued; otherwise, participants were judged to have reached exhaustion, which corresponded to the end of the 10-s countdown. This objective exhaustion criterion allowed us to register TTE to the nearest second. Preferred cadence was kept constant for a given participant throughout both preliminary and experimental TTE tests. The participant did not receive any feedback or encouragement during any of the TTE tests performed in the present study.
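The exhaustion criterion can be expressed as a simple rule, sketched below in Python. The sketch assumes that cadence is available as one sample per second (the sampling rate is not stated in the text), and the function name and example values are hypothetical.

```python
def time_to_exhaustion(cadence_rpm, preferred_rpm, tolerance_rpm=7.0, countdown_s=10):
    """Return TTE in seconds, or None if exhaustion was never reached.

    Assumption: cadence_rpm holds one sample per second of exercise.
    Exhaustion is the end of the first countdown during which cadence
    stayed outside preferred_rpm +/- tolerance_rpm for countdown_s seconds.
    """
    seconds_out_of_range = 0
    for second, cadence in enumerate(cadence_rpm, start=1):
        if abs(cadence - preferred_rpm) > tolerance_rpm:
            seconds_out_of_range += 1
            if seconds_out_of_range >= countdown_s:
                return second
        else:
            # Cadence recovered before the countdown ended: reset it.
            seconds_out_of_range = 0
    return None


# Hypothetical trace: a 5-s lapse that recovers, then a final 10-s lapse.
trace = [90.0] * 100 + [80.0] * 5 + [90.0] * 20 + [80.0] * 10
tte_s = time_to_exhaustion(trace, preferred_rpm=90.0)
```

In this hypothetical trace the first lapse does not end the test because cadence returns to the permitted range within the countdown; only the second lapse does.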
Every 2 min during the three preliminary TTE tests, RPE and affective valence (i.e., pleasure/displeasure experienced during exercise, measured using the Feeling Scale) were collected to allow the participants to become thoroughly familiar with the two scales. The Feeling Scale was first presented to participants at the first visit before the 76% PPO test, and standard instructions were provided [34]. Feeling Scale scores range from +5 (the exercise feels “very good”) to −5 (the exercise feels “very bad”).
Experimental TTE tests
On visits 3–6, participants performed an identical TTE test at a power output corresponding to a predicted TTE of 1200 s, as detailed in the previous section, with the aforementioned exhaustion criteria. Power output was prescribed based on the individual power–duration relationship in an attempt to reduce the relatively high between-subject variability in TTE that is commonly observed when other methods of exercise prescription are used. The four experimental TTE tests were preceded by a standardized warm-up. This consisted of 3 min at 100 W, 6 min at 50% of PPO, 1 min at 60% of PPO, and 1 min at 100 W. Tests were then preceded by 3 min of rest and 2 min pedalling at 20 W. During all the tests, fR, minute ventilation (\(\dot{V}_{\text{E}}\)) and heart rate (HR) were measured breath-by-breath using a metabolic cart (Quark b2, Cosmed, Rome, Italy). Appropriate calibration procedures were performed following the manufacturer’s instructions. RPE and affective valence were reported every 2 min.
Control of factors potentially confounding performance
To limit the within-subject variability in TTE, a number of potential confounding factors were controlled. All testing was completed in the laboratory with a room temperature of 20–22 °C and at the same time of day (± 1 h) within participants. Participants were asked to refrain from caffeine and alcohol for at least 3 h and 24 h, respectively, before each test. They were asked to record food intake on the day of the first experimental TTE test as well as the day before, and to replicate it before the subsequent experimental TTE tests. Participants were also asked to standardise their training routine and to avoid strenuous exercise the day before the test. At each visit to the laboratory, participants were asked to complete a pre-test checklist to verify that they had complied with the instructions given to them. They were also asked to confirm that they were not in a state of mental fatigue, physical fatigue, or sleep deprivation, and that they were free of injury and under no medical treatment. A single test was rescheduled because the participant failed to meet some of the requirements. To reproduce a competitive setting and favour the achievement of a maximal effort in all the tests, a performance-based prize (£200 voucher) was offered to the participant with the longest average TTE across all four tests. No feedback on performance was provided to participants until all the tests were completed.
Data analysis
Data were analysed with MATLAB (R2016a, The Mathworks, Natick, MA, USA). Before performing the three different TTE analyses, breath-by-breath ventilatory data were filtered for errant breaths (i.e., values resulting from sighs, swallows, coughs, etc.) by deleting values greater than 3 standard deviations from the local mean [35]. Subsequently, breath-by-breath ventilatory data were linearly interpolated and resampled at 1-s intervals. Data were then smoothed with a 60-s moving average. RPE and affective valence data collected every 2 min were linearly interpolated and resampled to obtain continuous values every second. For each individual, the four tests were rank ordered from the best to the worst based on TTE. This was done to obtain different levels of performance within individuals, reflective of within-subject variability in TTE.
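The preprocessing steps can be sketched as follows. This is an illustrative Python sketch and not the MATLAB code used in the study. Several details are assumptions because the text does not specify them: the width of the window defining the "local mean" (here five breaths on either side, excluding the breath under test), the alignment of the 60-s moving average (here centred and truncated at the edges), and all function names.

```python
import math
import statistics


def remove_errant_breaths(times_s, values, half_window=5, n_sd=3.0):
    """Drop breaths more than n_sd SDs away from the mean of their neighbours."""
    kept_t, kept_v = [], []
    for i, (t, v) in enumerate(zip(times_s, values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        neighbours = values[lo:i] + values[i + 1:hi]  # breath i itself excluded
        local_mean = statistics.mean(neighbours)
        local_sd = statistics.stdev(neighbours)
        if abs(v - local_mean) <= n_sd * local_sd:
            kept_t.append(t)
            kept_v.append(v)
    return kept_t, kept_v


def resample_1hz(times_s, values):
    """Linearly interpolate irregular samples onto whole seconds."""
    start, end = math.ceil(times_s[0]), math.floor(times_s[-1])
    seconds = list(range(start, end + 1))
    out, j = [], 0
    for t in seconds:
        while times_s[j + 1] < t:   # advance to the segment containing t
            j += 1
        t0, t1 = times_s[j], times_s[j + 1]
        v0, v1 = values[j], values[j + 1]
        out.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return seconds, out


def moving_average(values, window=60):
    """Centred moving average; the window is truncated near the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        segment = values[max(0, i - half):min(len(values), i + half)]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


# Hypothetical breath-by-breath fR trace: one breath every 3 s with one outlier.
breath_times = [3.0 * i for i in range(40)]
breath_fr = [30.0] * 40
breath_fr[20] = 90.0   # e.g. a cough

t_clean, fr_clean = remove_errant_breaths(breath_times, breath_fr)
seconds, fr_1hz = resample_1hz(t_clean, fr_clean)
fr_smooth = moving_average(fr_1hz, window=60)
```

Excluding the breath under test from its own local mean and SD prevents a single large outlier from inflating the SD enough to escape detection; this is a design choice of the sketch, not a detail reported in the text.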
Group isotime method
When data were processed with the “group isotime” method, the worst test of the participant with the shortest TTE (i.e., participant 2; Fig. 1) was selected to identify the timepoints at which to segment all the tests of all the participants. This test lasted 530 s. To obtain 10 equally spaced timepoints characterizing each test, the following timepoints were considered: 53, 106, 159, etc. up to 530 s. This method is termed “group isotime”, because all the tests of all the participants are analysed considering the same absolute timepoints. The extent of data loss (EDL) that occurs with this method was calculated as: EDL = (avgTTE − isotime duration)/avgTTE × 100, where avgTTE is the average TTE of the group in seconds (e.g., 1026 s in the present study; worst test), while isotime duration corresponds to the last timepoint (in seconds) at which all the participants were represented (e.g., 530 s in the present study; worst test). When data were available, the same formula was used to calculate the EDL from the previous studies that used the “group isotime” method, for a comparison with the present study.
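The segmentation and the EDL formula can be reproduced with a few lines of Python. The sketch below uses the values reported above (530 s and 1026 s); the function names are hypothetical, and rounding each timepoint to the nearest second is an assumption.

```python
def isotime_points(tte_s, n_points=10):
    """Return n_points equally spaced timepoints (s) ending at tte_s."""
    return [round(tte_s * k / n_points) for k in range(1, n_points + 1)]


def extent_of_data_loss(avg_tte_s, isotime_duration_s):
    """EDL (%) = (avgTTE - isotime duration) / avgTTE * 100."""
    return (avg_tte_s - isotime_duration_s) / avg_tte_s * 100.0


# Shortest worst test in the group (participant 2): 530 s.
group_points = isotime_points(530)

# Worst tests: group average TTE of 1026 s, isotime duration of 530 s.
edl = extent_of_data_loss(1026, 530)
```

With these values, (1026 − 530)/1026 × 100 gives an EDL of about 48.3% for the worst test.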
Individual isotime method
When data were processed with the “individual isotime” method, each participant was considered in isolation when segmenting the tests into timepoints, hence the name “individual isotime”. For each participant, the worst test was used to identify the ten timepoints at which the four tests were segmented. Considering again participant 2, exactly the same timepoints used for the “group isotime” analysis (53, 106, 159, etc. up to 530 s) were selected. However, the timepoints identified for the other participants differed from each other on the basis of the TTE of their worst test. For instance, the worst test of participant 5 had a TTE of 1173 s. Hence, the timepoints considered for the four tests of that participant were 117, 235, 352, etc. up to 1173 s. Importantly, with this analysis, the worst test of all participants did not result in any data loss. This means that a greater portion of data was included in the between-test comparison, relative to the “group isotime” analysis. For further clarification on this analysis, please note that each data point corresponds to different absolute time values between participants. This can be depicted graphically by adding horizontal error bars. However, horizontal error bars are identical across conditions when using the “individual isotime” method, apart from the test end value. Therefore, we opted for a graphical representation of the horizontal error bars that preserves the quality of the graph and avoids redundancy (Figs. 2, 3, 4). To promote a full understanding of the “individual isotime” analysis as well as the “relative isotime” analysis described below, we have made available the code used to run the two analyses as Supplementary material (Online Resource 1).
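A minimal Python sketch of the “individual isotime” segmentation is given below; it is not the code provided in Online Resource 1. Only the 1173-s worst test of participant 5 is taken from the text. The other three TTE values, the function name, and the per-test loss calculation (which applies the logic of the EDL formula to single tests) are assumptions made for illustration.

```python
def individual_isotime(ttes_s, n_points=10):
    """Segment all tests of one participant at timepoints set by the worst test.

    Returns the shared timepoints (s) and, for each test, the percentage
    of its duration lying beyond the last shared timepoint.
    """
    worst_tte = min(ttes_s)
    timepoints = [round(worst_tte * k / n_points) for k in range(1, n_points + 1)]
    loss_pct = [(tte - worst_tte) / tte * 100.0 for tte in ttes_s]
    return timepoints, loss_pct


# Participant 5: worst test of 1173 s (from the text); other TTEs are hypothetical.
participant_5_ttes = [1173, 1250, 1310, 1402]
timepoints_p5, loss_p5 = individual_isotime(participant_5_ttes)
```

The worst test always has zero loss under this scheme, while the longer tests of the same participant lose only the portion that exceeds the participant's own worst TTE.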
Relative isotime method
When data were processed with the “relative isotime” method, each test of each participant was segmented into ten timepoints on the basis of the TTE of the test analysed. Again, for participant 2, the worst test was segmented at the following timepoints: 53, 106, 159, etc. up to 530 s, as for the other two analyses. However, the best test of participant 2 was segmented at the following timepoints: 93, 187, 280, etc. up to 933 s because of the longer TTE. The same procedure was applied for the other tests of participant 2 as well as for all the tests of the other participants. With this method, there is no data loss; however, within participants, different tests are compared at the same percentages of TTE rather than at the same absolute timepoints. Therefore, this method is here termed “relative isotime”.
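The “relative isotime” segmentation can be sketched in Python as follows; this is not the code provided in Online Resource 1. The two TTE values for participant 2 (530 s and 933 s) are taken from the text, whereas the function names, the rounding to the nearest second, and the 1-Hz signal used in the example are assumptions.

```python
def relative_isotime(ttes_s, n_points=10):
    """Segment each test at 10%, 20%, ..., 100% of its own TTE."""
    return [
        [round(tte * k / n_points) for k in range(1, n_points + 1)]
        for tte in ttes_s
    ]


def sample_at(signal_1hz, timepoints_s):
    """Read a 1-Hz signal (first sample = second 1) at the given timepoints."""
    return [signal_1hz[t - 1] for t in timepoints_s]


# Participant 2: worst test 530 s, best test 933 s (from the text).
worst_points, best_points = relative_isotime([530, 933])

# Hypothetical 1-Hz signal for the best test, rising linearly over time.
best_signal = [30.0 + 0.02 * second for second in range(1, 934)]
best_values = sample_at(best_signal, best_points)
```

Each test contributes exactly ten values regardless of its duration, which is why no data are lost; the cost is that the tenth value of one test and the tenth value of another refer to different absolute times.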
Statistical analysis
An a priori power analysis was performed using G*Power (version 3.1.9.2; Kiel University, Kiel, Germany). Expecting a large effect size for the sensitivity of fR and RPE to within-subject performance ranking, a sample size of 7 was required based on 1 − β = 0.80 and α = 0.05. Eleven participants were recruited to account for potential dropouts.
Statistical analyses were conducted using IBM SPSS Statistics 20 (SPSS Inc, Chicago, IL, USA) unless otherwise stated. Data were checked for normality prior to analysis. The reliability in TTE was quantified by means of the log-transformed coefficient of variation (CV) with 90% confidence limits using a published open-source spreadsheet in Microsoft Excel (Microsoft Corp.) [36]. For TTE data, the within- and between-subject variance components were calculated as percentages of the total variance by means of a linear mixed model based on the restricted maximum likelihood estimates approach, where participants served as a random between-subjects factor and test as a fixed within-subject factor [7]. More specifically, the within- and between-subject variance components were summed to obtain the total variance, and their percentage contributions to the total variance were calculated. A one-way repeated-measures ANOVA was used to compare the end value of fR, \(\dot{V}_{\text{E}}\), HR, RPE, and affective valence across the four TTE tests. A two-way repeated-measures ANOVA (rank × time) was used to analyse the effect of rank on fR, \(\dot{V}_{\text{E}}\), HR, RPE, and affective valence responses. The same statistical analysis was used for the three methods of data processing under study, i.e., the “group isotime”, “individual isotime”, and “relative isotime”. When the sphericity assumption was violated, the Greenhouse–Geisser adjustment was performed. For the main effect of rank, the main effect of time, and the interaction, partial eta squared (\(\eta_{p}^{2}\)) effect sizes were calculated; \(\eta_{p}^{2}\) ≥ 0.01 indicates a small effect, \(\eta_{p}^{2}\) ≥ 0.059 a medium effect, and \(\eta_{p}^{2}\) ≥ 0.138 a large effect [37]. When a significant main effect of rank was found, the Bonferroni test was used as follow-up analysis. When a significant interaction was found, a one-way repeated-measures ANOVA was used to test the simple main effect of rank at different timepoints.
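For readers who wish to reproduce the derived quantities, the Python sketch below computes the percentage contributions of the variance components and classifies a partial eta squared value according to the thresholds given above. It uses the standard definition of partial eta squared as the effect sum of squares divided by the sum of the effect and error sums of squares. The function names, the label used for values below 0.01, and the numerical inputs are hypothetical; in the study these quantities were obtained from SPSS.

```python
def variance_percentages(within_var, between_var):
    """Express within- and between-subject variance as % of their sum."""
    total = within_var + between_var
    return within_var / total * 100.0, between_var / total * 100.0


def partial_eta_squared(ss_effect, ss_error):
    """Partial eta squared = SS_effect / (SS_effect + SS_error)."""
    return ss_effect / (ss_effect + ss_error)


def classify_effect(eta_p2):
    """Label an effect size using the thresholds stated in the text."""
    if eta_p2 >= 0.138:
        return "large"
    if eta_p2 >= 0.059:
        return "medium"
    if eta_p2 >= 0.01:
        return "small"
    return "below small"   # hypothetical label; not defined in the text


# Hypothetical inputs for illustration only.
within_pct, between_pct = variance_percentages(within_var=1.0, between_var=3.0)
eta_p2 = partial_eta_squared(ss_effect=10.0, ss_error=40.0)
label = classify_effect(eta_p2)
```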
Within-subject correlation coefficients (r) were computed for the correlations between RPE and fR, using the method described by Bland and Altman [38]. This method adjusts for repeated observations within participants, using multiple regression with “participant” treated as a categorical factor using dummy variables. A correlation coefficient and a P value were obtained considering the four tests together, as well as for each test considered separately. These correlations were computed using data analysed with the “relative isotime” method, because it is the only analysis method which results in no data loss for any test. A P value < 0.05 was considered statistically significant in all analyses. The results are expressed as mean ± SD in text and as mean ± SE in figures.
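A compact way to obtain the within-subject correlation coefficient is sketched below in Python. Instead of fitting the multiple regression with dummy variables explicitly, the sketch centres each participant's data on that participant's own means and correlates the deviations, which gives the same coefficient for a single predictor. The function name and the example data are hypothetical, and the P value is not computed here; only the residual degrees of freedom needed for it are returned.

```python
import math


def within_subject_correlation(subject_ids, x, y):
    """Correlation between x and y after removing between-subject differences.

    Returns (r, df), where df = N - number of subjects - 1 is the residual
    degrees of freedom of the corresponding regression model.
    """
    groups = {}
    for subject, xi, yi in zip(subject_ids, x, y):
        groups.setdefault(subject, []).append((xi, yi))

    sxx = syy = sxy = 0.0
    for pairs in groups.values():
        mean_x = sum(p[0] for p in pairs) / len(pairs)
        mean_y = sum(p[1] for p in pairs) / len(pairs)
        for xi, yi in pairs:
            sxx += (xi - mean_x) ** 2
            syy += (yi - mean_y) ** 2
            sxy += (xi - mean_x) * (yi - mean_y)

    r = sxy / math.sqrt(sxx * syy)
    df = len(x) - len(groups) - 1
    return r, df


# Hypothetical data: within each subject, y rises with x, although the
# subject with the higher x values has the lower y values overall.
ids = ["A", "A", "A", "B", "B", "B"]
fr_values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
rpe_values = [10.0, 11.0, 12.0, 1.0, 2.0, 3.0]
r_within, df_within = within_subject_correlation(ids, fr_values, rpe_values)
```

The example is constructed so that pooling all observations would suggest a negative association, whereas the within-subject coefficient is positive; this is the situation the adjustment for repeated observations is designed to handle.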