In the present study, we showed that (1) it is possible to predict the future performance development of a master athlete from a single measurement, and that prediction by an ML approach is superior both (2) to prediction by a naïve average approach and (3) to the application of a constant decline rate with individualized starting points. Interestingly, (4) the estimated performance decline rate was highest in athletes with a high starting performance and a low starting age, as well as in those with a low starting performance and a high starting age, while the lowest decline rate was found for athletes with a high starting performance and a high starting age. This tendency was the same for all disciplines, while the absolute values of the decline rate varied.
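To make the difference between the two traditional baselines concrete, the following minimal sketch contrasts them. It is an illustration under stated assumptions, not the implementation used in this study: `mean_curve` is a hypothetical callable returning the cohort-average performance at a given age, and the individual offset is assumed to be additive.

```python
def global_prediction(mean_curve, target_age):
    """Naive average approach: every athlete is predicted to follow the cohort mean."""
    return mean_curve(target_age)


def shifted_prediction(mean_curve, start_age, start_perf, target_age):
    """Constant decline rate with an individualized starting point.

    Assumption: the cohort curve is shifted additively so that it passes
    through the athlete's single measurement (start_age, start_perf).
    """
    offset = start_perf - mean_curve(start_age)
    return mean_curve(target_age) + offset


def toy_mean_curve(age):
    """Hypothetical cohort curve: 100% at age 35, losing 0.5 points per year."""
    return 100.0 - 0.5 * (age - 35)


# An athlete measured at 90% at age 40, predicted for age 50.
print(global_prediction(toy_mean_curve, 50))
print(shifted_prediction(toy_mean_curve, 40, 90.0, 50))
```

In this sketch both baselines share the same decline rate; only the ML model can vary the rate with starting age and starting performance.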
Performance prediction from a single value is of interest for clinical practice, frailty research, athletes, and insurance companies. Provided that an appropriate dataset is available, the ML model presented in this paper is potentially applicable to other scenarios in ageing, such as declines in hand grip strength [30], other measures of sarcopenia and frailty [31, 32], or bone density and fracture risk prediction [33, 34].
The present study tested a machine learning approach and showed its superiority to traditional approaches in the prediction of age-related master athletics performance decline trajectories. These differences, however, were small in absolute values. This comes as no surprise, given the seemingly impossible nature of the task of predicting performance years in advance from a single measurement without any additional information, such as the individual health status or training habits. Nevertheless, the differences in prediction accuracy proved to be statistically significant. Far more important, however, is the fact that we visualized the output of the ML model, thereby revealing the learned non-linear mapping between the three inputs “discipline”, “starting age”, and “starting performance” and the output of the predicted parameters of a performance decline curve. We believe that our approach showcases the possibility to learn from machine learning, i.e. using (at least seemingly) black-box systems to reveal aspects that may not be detected otherwise and that may be worth additional scientific attention and analyses.
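The idea of reading a learned mapping off a fitted model can be sketched as a simple grid evaluation. The snippet below is a hedged illustration: `predict_decline_rate` is a hypothetical stand-in for any fitted model that maps discipline, starting age, and starting performance to a decline rate, and the toy model and grid values are invented for demonstration only.

```python
from itertools import product


def probe_model(predict_decline_rate, discipline, ages, performances):
    """Evaluate a fitted model on a grid of (starting age, starting performance).

    Returns a dict keyed by (age, performance) that can be passed to any
    heat-map or contour plotting routine.
    """
    return {
        (age, perf): predict_decline_rate(discipline, age, perf)
        for age, perf in product(ages, performances)
    }


def toy_model(discipline, age, perf):
    """Hypothetical model used only to exercise the probing function."""
    return 0.01 * age + 0.001 * perf


grid = probe_model(toy_model, "100 m", ages=[35, 40], performances=[80, 90])
for (age, perf), rate in sorted(grid.items()):
    print(age, perf, round(rate, 3))
```

Regions of the grid with few or no training samples should be masked or flagged before interpretation, since the model output there is an extrapolation.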
The ML model delivered new insights into factors that determine performance decline trajectories in master athletes. Our findings on factors that influence a slower or faster performance decline and the identified criteria are entirely new and have, to our knowledge, not been previously reported. It has been speculated that regular physical activity is associated with a better general fitness [35], and it was suggested to flatten the physical performance decline trajectory, also with regard to the VO2max decline [13,14,15]. Our findings support this theory, as associations of the starting age and starting performance with the performance decline rate were identified. The estimated performance decline rate was highest in athletes with a high starting performance and a low starting age, as well as in those with a low starting performance and a high starting age. The phenomenon that a high starting performance between 35 and 40 years is connected with a high decline rate could potentially be explained by reductions in training volumes. Individuals in this age group often have less free time than before due to their family and career, potentially coming from very high training volumes in their 20s and early 30s. Injuries and degeneration may contribute to the performance decline [36]. Since the reported performance decline rate in this age group is usually very low [5, 10, 11, 15, 16], these findings were surprising and should be followed up in future research. A high decline rate in athletes with a low starting performance and a high starting age is in line with the theory that lifelong exercise helps to flatten the performance decline curve, which would in turn be steeper in those who have not exercised continuously [13,14,15]. In addition, we interpret the output of our model as an indication that individuals who start late can still achieve a lower performance decline rate.
This is in line with the finding that master athletes maintain better health than age-matched non-athletes [35]. We also found low decline rates for individuals with a very low starting performance and low starting age. However, these results need to be interpreted with extreme caution, because although seemingly reasonable (“whoever starts low can only decline so much”), there were virtually no data points in that area to allow for the model to properly learn about this group.
The lowest decline rate was found for athletes with a high starting performance and high starting age. A high performance at a high starting age may result from an ongoing, lifelong engagement in other sports and physical activities or from a combination of factors including nutrition and a good genetic constitution. Unfortunately, we do not have data on the amount of exercise and other biographic, genetic, or socioeconomic aspects of these athletes, and we can therefore only speculate on the underlying causes.
Our work has some limitations. First, the number of nodes in the model (n = 16) was determined via an initial grid search on the data. This can be regarded as information leakage and might be associated with overfitting. However, we re-evaluated the presented analysis with various values of n (n = 10, n = 12, n = 14, n = 18). In this analysis, we found that the model behaves essentially the same for all tested values of n. For example, all Bland–Altman plots (Fig. 3) showed an absolute value of the mean smaller than 0.06%, an upper bound smaller than 8.14%, and a lower bound greater than −8.13%. The difference between the ML model and the shifted model was always significant (p < 0.05, Fig. 4). Moreover, the four regions identified in Fig. 5 were visible for all variations of n. We thus conclude that the model is fairly insensitive to the exact number of nodes n. Nevertheless, the ML model was not validated against a completely independent hold-out test set due to the limited amount of data; instead, a cross-validation regime was used. We chose this approach because the main focus of the work was on the interpretation of the model output and not on finding the best possible or most robust ML model. In addition, we plan to test the model on additional independent datasets once they become available.
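For readers who wish to reproduce this kind of agreement summary, the mean and bounds of a Bland–Altman analysis can be computed as in the following minimal sketch. Two assumptions are made here that are not taken from the study: the differences are expressed in percent of the observed value, and the bounds are the conventional mean ± 1.96 standard deviations. The example values are invented.

```python
import statistics


def bland_altman_limits(predicted, observed):
    """Return (bias, lower, upper) of the relative prediction error in percent.

    Assumption: differences are normalized by the observed value and the
    limits of agreement are bias +/- 1.96 sample standard deviations.
    """
    diffs = [100.0 * (p - o) / o for p, o in zip(predicted, observed)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Invented example: four predictions against four observed performances.
bias, lower, upper = bland_altman_limits(
    predicted=[101.0, 99.0, 102.0, 98.0],
    observed=[100.0, 100.0, 100.0, 100.0],
)
print(f"bias={bias:.2f}%  limits=[{lower:.2f}%, {upper:.2f}%]")
```

Repeating this computation for each tested value of n gives a compact way to compare model variants on the same scale.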
Note that cross-validation keeps the training and test sets completely separate within each fold and thus reduces the risk of overfitting. Still, the ML model has far more parameters than the global model or the shifted global model, and we can therefore only speculate about its performance on data recorded under vastly different conditions, which remains to be evaluated. Thus, if only the prediction of the performance trajectory is of interest, the shifted global model might be the preferred option. If the focus, however, lies on the extraction of novel information (e.g. the identification of subgroups in this work), the ML model is more suitable.
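The separation of training and test data can be illustrated with a minimal sketch of a k-fold split. The sketch assumes that folds are formed per athlete, so that all measurements of one person fall on the same side of the split; this grouping, the function name `k_fold_by_athlete`, and the example identifiers are assumptions for illustration and are not details reported for this study.

```python
import random


def k_fold_by_athlete(athlete_ids, k=5, seed=0):
    """Yield (train_indices, test_indices) with no athlete on both sides.

    athlete_ids holds one identifier per measurement (hypothetical layout).
    """
    unique_ids = sorted(set(athlete_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    # Deal the shuffled athletes into k folds of near-equal size.
    folds = [unique_ids[i::k] for i in range(k)]
    for test_ids in folds:
        test_set = set(test_ids)
        train_idx = [i for i, a in enumerate(athlete_ids) if a not in test_set]
        test_idx = [i for i, a in enumerate(athlete_ids) if a in test_set]
        yield train_idx, test_idx


ids = ["a", "a", "b", "c", "c", "d", "e", "f"]
splits = list(k_fold_by_athlete(ids, k=3))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

Each measurement appears in exactly one test fold, so every prediction that enters the error statistics comes from a model that has not seen that athlete during training.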
The dataset that we have used here only contains the best performance of a person for each age at which this person has competed, but no further information. In the future, we hope to obtain longitudinal datasets with additional information, such as training volumes and diseases. Finally, due to the low number of women in the dataset, only men were analyzed, and it remains unknown whether the same findings apply to women [37]. Although this is a problem common to many medical AI studies, we acknowledge the severity of the implications and plan to follow up on this issue once more data are available.
In conclusion, for the prediction of performance decline trajectories of master athletes based on one measurement, an ML approach in the form of a multilayer neural network showed lower prediction errors and was thereby superior to traditional approaches. ML models should be explored further in big-data research on age-related performance decline rates, in particular in two ways: First, to optimize prediction results, the possibility of integrating more data (i.e. more measurement points, additional individual information, external factors, etc.) should be studied. Second, the potential of ML models to identify relevant factors can be explored in other ageing-research areas or with additional model inputs.