Background

Ageing is typically accompanied by physical and cognitive decline, leading to limitations in activities of daily living (ADL), which jeopardise older people’s functioning, independence and quality of life [1]. To prevent early decline and preserve functioning in older people, promoting an active lifestyle is recommended, and interventions with this aim are currently widely implemented [2]. Focusing on the young old enables such interventions to start well before the onset of decline in functioning. Identifying people at increased risk of early functional decline is therefore essential for the timely initiation of targeted preventive interventions to achieve the highest possible health gains [3,4,5,6].

Previously developed prediction models for the risk of decline in functioning in community-dwelling older people consistently identified age, sex [7,8,9], and arthritis-related complaints [7, 8] as independent predictors. Other predictors observed were low physical activity levels [8], impaired cognition, hypertension, higher body mass index (BMI), poor self-rated health [7], chronic diseases, reduced muscle strength and socioeconomic status [9]. However, these previous prediction models were developed in older populations with a wide age range. Major life events, such as retirement, have shown strong effects on physical activity behaviour [10, 11]; hence, people around the age of retirement could be an important group for promoting an active lifestyle [11, 12]. A specific focus on people around retirement age should reveal the predictors that are particularly relevant for this target group and so support timely preventive interventions. Furthermore, the follow-up period in previous studies ranged from six [7] to ten years [9], but a short-term risk prediction of limitations in (instrumental) ADL functioning is likely to be a more relevant timeframe for individuals to commit to lifestyle changes if needed [13].

In the present study we aimed to develop and validate a clinical prediction model for the onset of functional decline at three years of follow-up in people aged 65–75 years, based on four population-based cohorts across Europe. We used a broad range of predictors, including easy-to-measure physical performance variables, to identify the most sensitive parameters.

Methods

We developed and validated a clinical prediction model for the onset of functional decline at three years of follow-up and report the study in line with the TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis) statement [14].

Study population

This study included baseline data and data from the first follow-up measurement from four on-going population-based cohort studies across Europe: Germany, United Kingdom, Italy and the Netherlands. These cohorts were selected based on the availability of the data within the PreventIT consortium [15] and availability of relevant outcome and predictor variables. Data from the four cohort studies were harmonised to allow a pooled analysis.

The Activity and Function in the Elderly in Ulm study (ActiFE-ULM) is conducted in a representative sample of 1506 German community-dwelling older people (65–90 years old) living in the greater Ulm area [16]. Included measurement cycles were conducted in 2009–2010 and 2013–2014.

The English Longitudinal Study of Ageing (ELSA) is conducted in the United Kingdom and comprises a representative sample of 11,391 British older people (> 50 years old) [17]. Included measurement cycles were conducted in 2004–2005 and in 2008–2009.

The Invecchiare in Chianti study (InCHIANTI) is a cohort study from Italy. It comprises a representative sample of 1453 Italian people from two municipalities in Tuscany based on age strata [18]. Included measurement cycles were conducted in 1998–2000 and 2001–2003.

The Longitudinal Aging Study Amsterdam (LASA) is conducted in a representative sample of 3107 Dutch older people [19]. Participants were sampled from population registries in 11 municipalities in the Netherlands, based on age, sex, and level of urbanisation strata. Included measurement cycles were conducted in 1995–1996 and 1998–1999.

From all four cohort studies we included participants aged 65–75 years at baseline who reported no limitations in functional ability at baseline.

Functional decline

The outcome is the onset of functional decline at three-year follow-up (four years in ELSA), defined as any increase (worsening) in score on self-reported (instrumental) ADL. Following prior harmonisation guidelines [20], we selected only those items that overlapped across the four cohorts to create a comparable assessment of functional decline. This resulted in a selection of two items on basic ADL [21] and three items on instrumental ADL [22]: 1) dressing and undressing; 2) sitting down and standing up; 3) using own or public transportation; 4) walking up and down a flight of stairs without resting; 5) walking outside for 400 m/for five minutes without stopping (see Additional file 1: Table S1 for details). These items have been shown to be well associated with fractures [23] and recurrent falls [24]. All items were recoded into a uniform dichotomous score (0 = no limitations reported; 1 = at least some limitations reported). Participants who reported no limitations in functioning at three-year follow-up were classified as experiencing no functional decline. Participants who reported at least some limitations at three-year follow-up on any of the five items were classified as experiencing functional decline.
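
As an illustration of the classification rule described above, the following minimal Python sketch recodes five follow-up items and flags functional decline when any item shows a limitation. The item keys and the assumption that a raw score of 0 means "no difficulty" are hypothetical; the cohorts' actual variable names and coding are given in Additional file 1, not here.

```python
# Hypothetical item keys; the cohorts' actual variable names are not given in the text.
ITEMS = ("dressing", "sit_to_stand", "transportation", "stairs", "walking_400m")


def dichotomise(raw_score):
    """Recode a raw item score to 0 (no limitations) or 1 (at least some).

    Assumption: raw scores are coded so that 0 means no difficulty.
    """
    return 0 if raw_score == 0 else 1


def functional_decline(followup_scores):
    """Return True if any of the five follow-up items shows a limitation."""
    return any(dichotomise(followup_scores[item]) == 1 for item in ITEMS)


# Two made-up participants: one without limitations, one limited on stairs.
no_limits = {item: 0 for item in ITEMS}
some_limits = dict(no_limits, stairs=2)
print(functional_decline(no_limits), functional_decline(some_limits))
```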

Candidate predictors and missing data

Candidate predictors were measured at baseline and consisted of sociodemographic, lifestyle, clinical, and physical performance variables. We recoded variables to create uniform candidate predictors across the four datasets (see Additional file 1: Table S1 for details). Sociodemographic variables included sex, age, marital status, living status, and level of education. Lifestyle variables that were considered as candidate predictors were smoking behaviour, alcohol intake and self-reported physical activity levels. Clinical variables included BMI, mean arterial pressure (mmHg), self-reported chronic diseases, depressive symptoms (defined by the validated cutoff scores for the Center for Epidemiologic Studies-Depression scale, CES-D [25] or Hospital Anxiety and Depression Scale Depression subscale, HADS-D [26]) and cognitive status (assessed with Mini-Mental State Examination, MMSE [27] or Cognitive Function Index [28]). Physical performance variables comprised the tandem stance (seconds), five repeated chair stands (seconds), gait speed (m/s), handgrip strength (kg) and self-reported fall history in the previous year. As different test protocols were used across cohorts, values for gait speed were converted to Z-scores within each cohort before pooling the data to create comparable values.
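
The within-cohort standardisation of gait speed can be sketched as follows. This is an illustration only, with made-up values; it assumes the sample standard deviation is used, which the text does not specify.

```python
from collections import defaultdict
from statistics import mean, stdev


def zscore_within_cohort(records):
    """Convert (cohort, gait_speed) pairs to Z-scores computed per cohort.

    Assumption: the sample standard deviation (divisor n - 1) is used.
    """
    by_cohort = defaultdict(list)
    for cohort, speed in records:
        by_cohort[cohort].append(speed)
    # Mean and SD are estimated separately within each cohort.
    stats = {c: (mean(v), stdev(v)) for c, v in by_cohort.items()}
    return [(speed - stats[c][0]) / stats[c][1] for c, speed in records]


# Made-up gait speeds (m/s) for two hypothetical cohorts "A" and "B".
records = [("A", 1.0), ("A", 1.2), ("A", 1.4), ("B", 0.8), ("B", 1.0), ("B", 1.2)]
z_scores = zscore_within_cohort(records)
```

After this step each cohort has mean 0 and standard deviation 1 on the converted variable, so differences in test protocol between cohorts no longer shift the pooled distribution.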

Missing values on candidate predictors were handled by multiple imputation using the multivariate imputation by chained equations (MICE) procedure within each cohort [29], using information from all candidate predictors within the specific cohort. Based on the percentage of participants with missing data on at least one predictor (27% in ActiFE-ULM, 21% in ELSA, 23% in InCHIANTI and 14% in LASA) we created 27 datasets with missing values imputed [30]. Rubin’s rules were applied for pooling estimates across the imputed datasets [31].
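
For readers unfamiliar with Rubin’s rules, the sketch below shows the standard pooling of a single coefficient across m imputed datasets: the pooled estimate is the mean of the per-dataset estimates, and the total variance is the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance. The numbers are made up, and the function name is hypothetical; the study itself used the mice package in R.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across imputed datasets using Rubin's rules."""
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average squared standard error
    between = variance(estimates)                  # sample variance, divisor m - 1
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, sqrt(total_variance)


# Made-up log-odds estimates and standard errors from three imputed datasets.
pooled_beta, pooled_se = pool_rubin([0.5, 0.6, 0.7], [0.1, 0.1, 0.1])
```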

Statistical analysis

We combined the data from the ActiFE-ULM, ELSA, InCHIANTI and LASA cohorts in a pooled analysis to develop the prediction model (Fig. 1). For the analyses we used the rms and mice packages in R for Windows version 3.3.1 (R Development Core Team, Vienna, Austria: R Foundation for Statistical Computing).

Fig. 1 Flowchart of inclusion of participants across the four cohort studies

Model development

The onset of functional decline was treated as a binary outcome, and logistic regression models were used for the analysis. For each candidate predictor we fitted a logistic regression model including the candidate predictor and a dummy variable as cohort index to account for different baseline risks within each cohort [32]. First, continuous predictors were examined for linearity using restricted cubic splines [33]. If the spline function indicated a non-linear association, we modelled the variable with a spline function with three knots at the 10th, 50th and 90th percentiles [34]. Second, we assessed multicollinearity among candidate predictors with Spearman’s correlation coefficient and considered this present if r ≥ 0.40 [35]. In case of multicollinearity, the variable with the highest predictive value was included in the multivariable model. We excluded the variable living status (multicollinear with marital status).
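
A restricted cubic spline with three knots adds one non-linear term to the linear term of a predictor. The sketch below computes that term using the commonly used parameterisation in which the term is divided by the squared distance between the outer knots; whether the authors' software used exactly this scaling is an assumption, as are the percentile definition and the example knot values.

```python
from statistics import quantiles


def three_knots(values):
    """Knots at the 10th, 50th and 90th percentiles.

    Assumption: method="inclusive" is used here; the text does not say which
    percentile definition was applied.
    """
    deciles = quantiles(values, n=10, method="inclusive")
    return deciles[0], deciles[4], deciles[8]


def rcs_nonlinear_term(x, knots):
    """Non-linear basis term of a three-knot restricted cubic spline."""
    t1, t2, t3 = knots

    def cube_plus(u):
        # Truncated power function: u cubed if u is positive, otherwise 0.
        return max(u, 0.0) ** 3

    numerator = (
        cube_plus(x - t1)
        - cube_plus(x - t2) * (t3 - t1) / (t3 - t2)
        + cube_plus(x - t3) * (t2 - t1) / (t3 - t2)
    )
    return numerator / (t3 - t1) ** 2


# Hypothetical knots for chair-stand time in seconds.
example_knots = (10.0, 13.0, 18.0)
term_at_16s = rcs_nonlinear_term(16.0, example_knots)
```

The model then contains two coefficients for the predictor, one for x itself and one for this term. The term is zero below the first knot and linear in x above the last knot, which is what makes the spline "restricted".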

For developing the prediction model, we fitted a multivariable logistic regression model in the pooled dataset of four cohorts, including all candidate predictors and the dummy cohort variable. We applied a stepwise backward elimination procedure to exclude variables from the model that were not statistically significant (likelihood ratio test p > 0.05). Only variables with p < 0.05 after applying Rubin’s rules were considered significant predictors [36] and odds ratios (OR) and 95% confidence intervals (95%CI) were estimated. Performance of the developed model was assessed using the area under the receiver operating characteristic curve (C statistic; 0.50 represents no discrimination and 1.00 represents perfect discrimination) and the calibration intercept and slope (an intercept of 0 and a slope of 1 represent perfect calibration) [33]. Performance statistics are reported as median (interquartile range, IQR) across imputed datasets [37]. To assess the robustness of the findings of the stepwise backward elimination, we performed a sensitivity analysis by repeating the procedure in complete cases (80.6% of total).
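
To make the C statistic concrete: it is the proportion of all pairs consisting of one person with and one person without the outcome in which the person with the outcome received the higher predicted probability, with ties counted as one half. The following minimal sketch uses made-up predictions and is not the authors' implementation.

```python
def c_statistic(probabilities, outcomes):
    """Concordance (C) statistic for a binary outcome coded 0/1."""
    concordant = ties = pairs = 0
    for p_case, y_case in zip(probabilities, outcomes):
        if y_case != 1:
            continue
        for p_control, y_control in zip(probabilities, outcomes):
            if y_control != 0:
                continue
            pairs += 1
            if p_case > p_control:
                concordant += 1
            elif p_case == p_control:
                ties += 1
    return (concordant + 0.5 * ties) / pairs


# Made-up predicted risks and observed outcomes for four people.
example_c = c_statistic([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
```

In this toy example three of the four case-control pairs are ordered correctly, so the C statistic is 0.75.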

Internal-external cross-validation

To assess heterogeneity of findings across the cohorts and evaluate the external validity of the model, we performed an internal-external cross-validation [32, 38]. This is a novel strategy recommended for developing and validating prediction models in pooled data. Since all datasets are used for model development, all available information on the predictors is used and power is optimised [32, 38]. Through an iterative approach, this procedure assesses the external validity of the model across the four different datasets. In our study, the internal-external cross-validation consisted of the following steps: 1) Using three pooled datasets for developing the prediction model with the set of selected predictors from the stepwise backward elimination; 2) Using the remaining fourth dataset to validate the model; 3) Assessing model performance of the derivation dataset through the C statistic, calibration intercept and slope; 4) Rotating steps 1–3 across the four datasets. We compared model performance across the four iterations of the internal-external cross-validation [39].
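
The rotation described in steps 1 to 4 can be sketched as a leave-one-cohort-out loop. In this illustration `fit` and `evaluate` are hypothetical caller-supplied placeholders for model fitting and performance assessment, and the sketch assumes that performance is evaluated in the held-out cohort.

```python
COHORTS = ("ActiFE-ULM", "ELSA", "InCHIANTI", "LASA")


def iecv_splits(cohorts):
    """Yield (derivation cohorts, held-out cohort) for each rotation."""
    for held_out in cohorts:
        derivation = tuple(c for c in cohorts if c != held_out)
        yield derivation, held_out


def run_iecv(cohorts, fit, evaluate):
    """Run every rotation; fit and evaluate are hypothetical placeholders."""
    results = {}
    for derivation, held_out in iecv_splits(cohorts):
        model = fit(derivation)                         # develop on three cohorts
        results[held_out] = evaluate(model, held_out)   # assess on the fourth
    return results


for derivation, held_out in iecv_splits(COHORTS):
    print("develop on", derivation, "validate on", held_out)
```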

Internal validation, model performance and risk scores

We performed internal validation by applying bootstrapping techniques to address the possibility of overfitting [33]. Using 250 bootstrap samples we obtained shrinkage factors and we multiplied these with the original coefficients from the developed model. We fitted a new intercept to maintain overall calibration, which resulted in our final prediction model. Model performance of the final model was assessed with the C statistic and calibration intercept and slope. We developed a clinical prediction rule for the final model to calculate an absolute risk score, based on the procedures described by Sullivan and colleagues [40]. We estimated the sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of the clinical prediction rule.
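
Sensitivity, specificity, PPV and NPV at a given cutoff follow from the two-by-two table of predicted versus observed outcome. The sketch below assumes that a total risk score at or above the cutoff counts as a positive prediction; that convention and the example data are assumptions, not taken from the study.

```python
def predictive_values(scores, outcomes, cutoff):
    """Sensitivity, specificity, PPV and NPV for one cutoff.

    Assumption: score >= cutoff is treated as a positive prediction.
    """
    tp = fp = tn = fn = 0
    for score, outcome in zip(scores, outcomes):
        positive = score >= cutoff
        if positive and outcome == 1:
            tp += 1
        elif positive:
            fp += 1
        elif outcome == 1:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as None.
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Made-up risk scores and observed outcomes for six people.
example_values = predictive_values([10, 30, 50, 70, 90, 60], [0, 0, 1, 0, 1, 1], cutoff=50)
```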

Results

Study participants

A total of 2560 participants were eligible for inclusion in the pooled analysis (mean age 69.7 years; 47.4% female; see Fig. 1 for an overview). The largest share of included participants came from the ELSA cohort (n = 1136, 44.4% of total). Prevalence of functional decline at three-year follow-up was comparable across cohorts, with overall 572 (22.3%) participants showing functional decline at follow-up (22.2% in ActiFE-ULM, 23.9% in ELSA, 19.4% in InCHIANTI and 21.6% in LASA). Table 1 presents descriptive characteristics of the potential predictors in the four cohorts and the pooled database. Participants in the InCHIANTI study had lower education (13.4% with > 9 years of education compared to 55.8% overall) and had the fastest gait speed (mean ± SD, 1.29 ± 0.20 m/s compared to 1.01 ± 0.40 m/s overall).

Table 1 Baseline characteristics of people aged 65–75 years from the four European cohorts

Model development

Stepwise backward logistic regression showed that 10 of 22 potential predictors were significantly associated with functional decline at follow-up (Table 2). Time to complete five repeated chair stands showed a non-linear association with functional decline and was modelled using a spline function with three knots (at the 10th, 50th and 90th percentiles). Table 2 reports ORs and 95%CIs of the linear and converted variable for chair stands, as these were modelled simultaneously to account for the non-linear association. The sensitivity analysis in complete cases yielded similar results for the stepwise backward elimination (Additional file 1: Table S2).

Table 2 Final model developed in pooled data of people aged 65–75 years from the four cohorts (n = 2560)

Internal-external cross-validation

The ten significant predictors resulting from the model development were used in the internal-external cross-validation. Rotating the internal-external cross-validation across the four cohorts, performance of the developed models remained stable with a C statistic ranging from 0.691 to 0.740 (Table 3). Calibration-in-the-large was good overall, with calibration intercepts close to zero and ranging from −0.271 to 0.135 (Table 3). The calibration slopes remained close to one across all four cohorts and indicated a slight overfitting when LASA was the validation sample with a slope of 1.215 (Table 3).

Table 3 Model performance in the pooled dataset and after internal-external cross-validation

Model performance

Bootstrapping showed that a uniform shrinkage factor ranging from 0.946 to 0.951 across the imputed datasets was needed to adjust predictor coefficients for optimism. Table 3 shows the apparent performance of the unadjusted prediction model in the four cohorts and the performance after shrinking the coefficients. After adjusting for optimism, the final model was able to discriminate between people with and without functional decline with a C statistic of 0.719 (IQR, 0.716–0.720 across imputed datasets). Calibration of the final model was excellent with an intercept of 0.059 (IQR, 0.047–0.073) and calibration slope of 1.053 (IQR, 1.042–1.065, Table 3).
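
Applying a uniform shrinkage factor and then re-estimating the intercept can be sketched as follows. The sketch multiplies each coefficient by the shrinkage factor and then searches for the intercept at which the sum of predicted probabilities equals the observed number of events, which is the maximum-likelihood intercept when the shrunken linear predictor is held fixed. The data, the single coefficient and the bisection search are illustrative assumptions, not the authors' procedure in R.

```python
from math import exp


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + exp(-z))


def shrink_and_recalibrate(coefficients, shrinkage, rows, outcomes, tol=1e-10):
    """Shrink coefficients uniformly and refit the intercept by bisection.

    Assumes the outcomes contain at least one event and one non-event.
    """
    shrunk = [shrinkage * b for b in coefficients]
    linear_predictors = [sum(b * x for b, x in zip(shrunk, row)) for row in rows]
    target = sum(outcomes)  # observed number of events
    low, high = -30.0, 30.0
    while high - low > tol:
        mid = (low + high) / 2
        if sum(expit(mid + lp) for lp in linear_predictors) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2, shrunk


# Made-up data: one predictor, four people, original coefficient 1.0.
new_intercept, shrunk_coefs = shrink_and_recalibrate(
    [1.0], 0.95, [[0], [1], [2], [3]], [0, 0, 1, 1]
)
```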

Risk scores

Regression coefficients were converted to simple absolute risk scores to facilitate individual prediction of the risk of functional decline by summing risk scores for specific characteristics (Table 4) [39]. The total score has a possible range from 0 to 117. For example, consider a Dutch (+1) woman aged 74 years (+9) with a BMI of 23.9 (+0) who shows no symptoms of depression (+0) and has no cardiovascular disease or COPD (+0), but has been diagnosed with diabetes mellitus (+6) and arthritis (+5). Her handgrip strength is 18 kg (+7), her converted gait speed is 0.214 m/s (+9) and she performs five repeated chair stands in 16 s (+17, +0 for the non-linear term). Her total risk score would be 54. Figure 2 shows the distribution of the probability of functional decline across grouped risk scores and the prevalence of risk scores within the pooled database. Someone with a risk score of 54 is predicted to have a 32.7% risk of functional decline in three years, and this risk applied to 17.3% of the participants in the pooled database. Predictive values for specific cutoffs in the total risk score are presented in Table 5 and illustrate that specificity increases and sensitivity decreases as the cutoff increases.
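
The arithmetic of the worked example can be checked with a few lines of Python. The point values are those quoted in the example above; the dictionary keys are descriptive labels chosen for this sketch, not the row names of Table 4.

```python
# Points quoted in the worked example; keys are descriptive labels only.
example_points = {
    "cohort (Netherlands)": 1,
    "age 74 years": 9,
    "BMI 23.9": 0,
    "no depressive symptoms": 0,
    "no cardiovascular disease or COPD": 0,
    "diabetes mellitus": 6,
    "arthritis": 5,
    "handgrip strength 18 kg": 7,
    "converted gait speed 0.214": 9,
    "chair stands 16 s (linear term)": 17,
    "chair stands 16 s (non-linear term)": 0,
}

total_score = sum(example_points.values())
print(total_score)  # 54, within the possible range of 0 to 117
```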

Table 4 Score chart for calculating individual risk scores derived from the prediction model
Fig. 2 Predicted probability of functional decline by total risk scores and prevalence of the scores. Legend: Grey columns indicate the probability of experiencing functional decline at three-year follow-up with a specific risk score. Black columns indicate the prevalence of the scores within the database

Table 5 Predictive value of the prediction model for different cutoffs in the total risk score

Discussion

Based on four European cohort studies we showed that in people aged 65–75 years, the onset of functional decline in ADL over a short follow-up period of three years can be predicted by specific physical performance variables in combination with age, BMI, presence of depressive symptoms and four chronic conditions: cardiovascular disease, diabetes, COPD and arthritis. This multifactorial prediction model showed good discrimination and calibration, which both remained stable across the four cohorts in an internal-external cross-validation.

Few previous studies have developed a clinical prediction model for the risk of functional decline in ADL [7,8,9] and those which have done so, included wide age groups within the older population (resp. 55–90+ years old [7], 40–80 years old [9] and 60–79 years old, women only [8]). The present study focused on the specific age group of 65–75 years old, since recently retired people may be a particularly relevant target group for initiating behaviour change interventions [10,11,12]. In the previous studies, age was consistently reported as a significant predictor [7,8,9] and even within our narrow age range, we found age to be a significant predictor of the onset of functional decline.

The contribution of chronic conditions in our prediction model is in line with prior models, where a higher number of chronic conditions was associated with a higher risk [9]. Of the chronic conditions, particularly arthritis seems to be an important predictor, in the broader range of older age too [7, 8]. However, the predictive effects we observed for BMI and depressive symptoms in our specific cohort were not consistently found in the studies with a wider age range [7,8,9]. Our findings extend the evidence on the important role of depressive symptoms in age-related decline [41] and highlight the need to consider different characteristics when screening specific age groups for risk of functional decline.

Three of the predictors identified in our model (depressive symptoms, lower handgrip strength and lower gait speed) are part of the frailty concept [42]. Frailty (which next to those three factors also encompasses unintentional weight loss and low levels of physical activity [42]) has been shown to be predictive of functional decline and mortality [43]. The prior studies on prediction models for functional decline in older people did not specifically focus on the frailty concept in the variable selection [7,8,9]. Den Ouden and colleagues [9] did consider physical performance variables in their prediction model. These investigators included a composite score from the short physical performance battery (SPPB) and a composite score for handgrip strength and leg extensor strength in their analysis and found only the composite muscle strength to be predictive of functional decline at ten years [9]. From our analysis it seems that the easier measure of handgrip strength alone is sufficient in predicting functional decline. Our findings emphasize the importance of considering the tests for gait speed [44,45,46] and chair stands [44] separately instead of a composite score like the SPPB. Moreover, our findings confirm that frailty plays an important role in the prediction of functional decline, even in this group of young older people. Future studies should consider all variables of the frailty concept as candidate predictors.

Unlike the studies by Tas and colleagues [7] and Den Ouden and colleagues [9], sex was not a significant predictor in our prediction model. This might be explained by the strong associations we observed for the physical performance measures. As physical performance measures are found to differ substantially between sexes [47], the data for handgrip strength, gait speed and chair stands may already account for the variance between males and females.

Our prediction model provides clinicians with a small set of easy-to-measure variables that discriminate well in predicting functional decline in community-dwelling people aged 65–75 years. Clinicians can use this set of variables to screen individuals for their risk of functional decline in the coming three years. In the digital era, the presented prediction model can also be developed into an online tool that can estimate a more detailed risk score. Outcomes of the screening can help to decide whom to target for starting preventive interventions designed to reduce the risk of functional decline. Although a variety of behaviour change interventions have been developed and shown to be effective in promoting an active lifestyle in older adults [48], evidence from interventions specifically targeted towards people around retirement age is scarce [49]. Interventions for the general population of older adults might also be suitable for the subgroup of people aged 65–75 years [48], yet further investigation of interventions specifically designed for this age group is needed to optimise uptake by individuals and identify the best strategies for reducing the risk of functional decline.

This study used pooled data from four ongoing European cohort studies [16,17,18,19] to develop a prediction model specifically for a young older population. Our approach allowed the inclusion of a higher number of participants in the analysis (resulting in higher power) while at the same time assessing the generalisability of our findings across the four cohorts and enhancing external validity of the developed model in a new population [50]. Differences in baseline risk due to merging disparate samples were addressed by including cohort-specific intercepts in the model [32]. Yet, using existing data from different cohorts introduced some limitations. First, we were dependent on the data available in the four cohorts, and heterogeneity in measurements across the cohorts could have affected our results. Given the design of a pooled analysis, our outcome measure only included functional decline items that were available in all cohorts. Of the five included items, three addressed instrumental ADL. We expect that this might lead to a more sensitive measure in our specific cohort of adults aged 65–75 years, since it is likely that people experience a decrease in instrumental ADL prior to a decrease in basic ADL [9]. Although the items we used to define functional decline have been shown to be a valid measure of functional performance in prior studies [23, 24], a full comparison with validated instruments to assess (instrumental) ADL is needed. Similarly, variables that were not available in all cohorts were not considered in our analysis of potential predictors. Inclusion of more sensitive variables, such as walking fast or across obstacles [51, 52], might have altered the outcomes of the stepwise backward elimination or increased the discrimination of the prediction model. Second, we restricted our analysis to people from 65 to 75 years old to focus on a target group for initiating preventive interventions [10,11,12, 15]. One may ask whether risk identification of short-term functional decline should be extended to an even younger age group, since 39.1% of participants aged 65–75 years in the cohort studies reported limitations on at least one ADL item at baseline (Fig. 1). The same holds for older age groups, as a large proportion of the participants included in our study were not suffering from any functional limitations after three years of follow-up. There is a need to further investigate the onset of functional decline in adults below 65 and above 75 years of age to assess whether the current model can also be applied at an earlier stage in life or whether a tailored model is needed. Finally, the inclusion of one cohort that was about twice the size of the other cohorts (ELSA) might have biased the estimated predictors. To assess this potential source of bias, we applied a novel approach for developing and validating prediction models using multiple datasets, through internal-external cross-validation [32, 38]. Developing the model in three pooled datasets while externally validating its performance in the fourth dataset, and alternating this across the four datasets, showed consistent model performance of the predictors across the four cohorts. The small shrinkage factor further suggests that the coefficients from our prediction model are accurate in new participants. This provides strong evidence for the generalisability of our prediction model [50], although future validation in completely independent data is needed to confirm this.

Conclusions

In people aged 65–75 years, the onset of functional decline in ADL over a short follow-up period of three years can be predicted by specific physical performance variables and age, BMI, chronic conditions and depressive symptoms. The prediction model showed good discrimination and calibration, which remained stable across the four cohort studies, supporting the external validity of our findings.