Introduction

Pre-frailty is prevalent amongst older adults [1], and it reportedly imposes a socioeconomic burden on society, including healthcare costs [2, 3]. It is an early and reversible risk state preceding frailty, which can lead to negative health outcomes such as falls, cognitive decline, hospitalization or even death [4,5,6]. Thus far, interventions for pre-frailty have included physical activity, nutrition, and physical activity combined with nutrition [7].

With the ongoing rise in life expectancy worldwide [8], there is increasing public health focus, at least in Singapore, on healthy aging through physical activity such as community-based exercises to maintain independence among older adults [9]. In recent years, community-based exercises amongst older adults with pre-frailty have been increasingly studied [9,10,11]. Community-based exercises also provide an opportunity to stimulate social engagement amongst older adults [12], and their availability has made exercise more convenient and accessible for older adults [12]. To date, the average adherence rate to community-based exercise among older adults has been estimated at approximately 70% [13]. However, the evidence on the effectiveness of community-based exercises on clinical measures in pre-frailty appears mixed or unclear. First, significant changes in grip strength have been reported in two trials [14, 15], but not in two other trials [16, 17]. Second, of the systematic reviews which investigated the effects of exercise intervention on physical measures in pre-frailty, two were descriptive in nature [7, 18], whilst another did not investigate physical outcome measures such as strength, balance and walking speed [19].

Therefore, we aimed to review randomized controlled trials comparing the effects of community-based exercise (intervention) with minimal intervention on physical function, cognition and quality of life (outcome) in community-dwelling pre-frail older adults (participants). A secondary objective was to investigate the influence of parameters such as frequency of sessions per week, and total number of sessions on the effect size of statistically significant outcome measures.

Methods

The protocol of this study was registered with PROSPERO (http://www.crd.york.ac.uk/PROSPERO/; registration number CRD42022348556). This review was completed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [20].

Search strategy

We searched MEDLINE (1966-present), CINAHL (1966-present), Google Scholar and Web of Science for literature on the effects of community-based exercise on physical function, cognition, frailty status and quality of life in community-dwelling older adults with pre-frailty (Supplementary Fig. 1). The last search was run on Sep 16, 2023. The following search terms were used: group exercise; physical activity; community*; pre-frail*; randomized controlled trial. The same steps were repeated for each database. The reviewers followed a selection process, defined before the review began, which included a checklist of inclusion criteria. Articles were eligible for inclusion if they were randomized controlled trials in humans, included community-dwelling pre-frail older adults aged 60 years and above, assigned the experimental group to an intervention that included at least exercise, assigned the comparison group to an intervention other than exercise, and used outcome measures that included physical function, cognition, quality of life and/or frailty status. We also included trials in which at least 50% of participants were older adults with pre-frailty. Participants were considered pre-frail if pre-frailty was mentioned explicitly by the authors and/or determined with screening tools such as Fried’s frailty criteria [21], the FRAIL questionnaire [22], and the Cardiovascular Health Study criteria [23]. Pre-frailty is herein defined as meeting 1 or 2 criteria from an established set of indicators in the aforementioned screening tools, such as unintended weight loss, self-reported exhaustion, poor handgrip strength, slow walking speed, and low physical activity [24].
Articles were excluded if fewer than 50% of the participants were pre-frail community-dwelling older adults, if they did not include outcome measures such as physical function, cognition, frailty status and quality of life, and/or if the participants were hospitalized or institutionalized. Eligibility of studies was assessed by 2 reviewers (H.J.L. and E.C.W.L.). Disagreements between reviewers were resolved by consensus with another 2 independent reviewers (W.T.P. and N.D.J.).

Data extraction and quality of trials assessment

The methodological quality of the trials was assessed using the 11-item PEDro scale [25], which covers the domains of population, treatment allocation, blinding, prognostic comparability, and analysis. Using a standardized extraction form, information on the characteristics of trial participants (age and gender), details of the intervention (type of exercise, number of sessions per week, duration of session in minutes, and time span of exercise program in weeks), and outcome measures (pre- and post-intervention) was extracted from each included trial. The assessment of methodological quality and the extraction of data were performed and verified by 2 reviewers (H.J.L. and E.C.W.L.). Differences between reviewers were resolved by agreement with another 2 independent reviewers (W.T.P. and N.D.J.).

The outcome measures included in our review were: hand grip strength [22, 26]; functional lower limb strength measures such as timed 5-times sit-to-stand [27], 30 s chair rise test [28], and Short Physical Performance Battery (SPPB) chair rise score [29]; functional balance measures such as timed one-legged stance [30] and SPPB balance score [29]; gait speed measures such as 4- to 6-m walk test [31,32,33] and SPPB gait score [29]; Timed Up And Go test [34]; SPPB overall score [29]; functional exercise capacity measures such as 2-min walk test [35] and 6-min walk test [36]; cognitive measures such as Mini-Mental State Examination [37], Montreal Cognitive Assessment [38], Frontal Assessment Battery [39], and Repeatable Battery for the Assessment of Neuropsychological Status [40]; quality of life measures such as EuroQoL-5D [41], 36-Item Short Form Health Survey [42], Quality of life visual analogue scale [43] and Life Satisfaction score [44]; and the number of participants with reversal of pre-frailty status.

The risk of bias was assessed using the revised Cochrane risk-of-bias tool [45]. It evaluates risk of bias in 5 distinct domains, that is, the randomization process, deviations from intended interventions, missing outcome data, measurement of the outcome, and selection of the reported result [45]. If the outcome measures were reported for more than one side and/or multiple time points, then the pre- and post-intervention outcome measures which gave the worst mean difference (MD) were extracted [46]. Whenever outcome scores were not presented as mean and/or standard deviation, they were approximated from the available median, range, interquartile range, and standard error [47,48,49].
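The approximation step can be illustrated with a minimal sketch. The text does not state which formulas were taken from [47,48,49], so the conversions below are assumptions: the standard relation SD = SE × √n, a commonly used median and interquartile range approximation, and a commonly used median and range approximation (simplified here by ignoring the special case for very small samples). All function names are hypothetical.

```python
import math


def sd_from_se(se, n):
    """Standard deviation from a standard error of the mean and sample size."""
    return se * math.sqrt(n)


def mean_sd_from_median_iqr(q1, median, q3):
    """Approximate mean and SD from median and quartiles.

    Assumed large-sample approximation: mean ~ (q1 + median + q3) / 3,
    SD ~ IQR / 1.35. Not confirmed as the method used in the review.
    """
    mean = (q1 + median + q3) / 3.0
    sd = (q3 - q1) / 1.35
    return mean, sd


def mean_sd_from_median_range(minimum, median, maximum, n):
    """Approximate mean and SD from median and range.

    Assumed rule of thumb: mean ~ (min + 2 * median + max) / 4,
    SD ~ range / 4 for samples up to 70, range / 6 for larger samples.
    """
    mean = (minimum + 2.0 * median + maximum) / 4.0
    value_range = maximum - minimum
    sd = value_range / 4.0 if n <= 70 else value_range / 6.0
    return mean, sd


if __name__ == "__main__":
    # Hypothetical trial arm reporting median 10 (IQR 8 to 12).
    print(mean_sd_from_median_iqr(8.0, 10.0, 12.0))
```

Such approximations assume roughly symmetric data, so a strongly skewed outcome may be poorly represented by the derived mean and SD.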

Quantitative data synthesis and analysis

Reliability analyses of inter-rater agreement were performed with IBM SPSS Statistics for Windows, Version 21.0 (IBM Corp, Armonk, NY). Inter-rater reliability was reported for the total quality score with Kappa statistics [50], and was interpreted as poor (< 0.00), slight (0.00–0.20), fair (0.21–0.40), moderate (0.41–0.60), substantial (0.61–0.80), or almost perfect (0.81–1.0). Where appropriate and possible, the results were pooled with formal meta-analytical techniques using RevMan 5.4.1 (The Nordic Cochrane Centre, The Cochrane Collaboration, Copenhagen, Denmark). To account for the differing outcome scales used among studies, we calculated standardized mean differences (SMDs) for the outcome scores with their 95% confidence intervals (CIs), and performed tests of heterogeneity (χ2). The I2 statistic was used to measure the extent of between-trial heterogeneity. Fixed-effect or random-effects models were used as appropriate, based on our interpretation of commonality of effect size [51]. For example, data were pooled using a random-effects model if trials differed in ways that might plausibly have affected the pooled outcome [51].
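As an illustration of how a single-trial SMD and its 95% CI can be obtained, the sketch below uses the small-sample adjusted form (Hedges' adjusted g), which is our understanding of the SMD reported by RevMan; this is an assumption, and the function names and input values are hypothetical rather than taken from any included trial.

```python
import math


def hedges_g(mean_e, sd_e, n_e, mean_c, sd_c, n_c):
    """Standardized mean difference (Hedges' adjusted g) and its standard error.

    Inputs are the mean change, SD and sample size of the experimental (e)
    and control (c) groups. Positive values favor the experimental group.
    """
    n_total = n_e + n_c
    pooled_sd = math.sqrt(
        ((n_e - 1) * sd_e ** 2 + (n_c - 1) * sd_c ** 2) / (n_total - 2)
    )
    # Small-sample correction factor applied to Cohen's d.
    correction = 1.0 - 3.0 / (4.0 * n_total - 9.0)
    g = (mean_e - mean_c) / pooled_sd * correction
    se = math.sqrt(n_total / (n_e * n_c) + g ** 2 / (2.0 * (n_total - 3.94)))
    return g, se


def ci95(estimate, se):
    """Two-sided 95% confidence interval under a normal approximation."""
    return estimate - 1.96 * se, estimate + 1.96 * se


if __name__ == "__main__":
    # Hypothetical trial: 20 participants per arm, change scores 10 vs 9, SD 2.
    g, se = hedges_g(10.0, 2.0, 20, 9.0, 2.0, 20)
    print(round(g, 2), [round(x, 2) for x in ci95(g, se)])
```

Because the SMD divides by the pooled SD, trials that measure the same construct on different scales become comparable, at the cost of an effect size that is no longer in the original units.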

For continuous data, the differences between pre- and post-intervention outcome scores were calculated such that positive values indicated results favoring community-based exercises, whilst negative values indicated results favoring minimal intervention. We used odds ratios and 95% CIs to calculate the intervention effects for dichotomous data such as frailty status, as well as the number needed to treat [52]. Post-hoc sensitivity analyses were performed in the presence of apparent outliers.
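For dichotomous outcomes, the odds ratio, its 95% CI and the number needed to treat can be sketched as follows. The CI uses the standard log odds ratio approximation and the number needed to treat is taken as the reciprocal of the absolute difference in event proportions; whether the review used exactly these formulas is an assumption, and the counts in the example are hypothetical.

```python
import math


def odds_ratio_ci(events_e, total_e, events_c, total_c):
    """Odds ratio with a 95% CI from a 2x2 table (log-OR normal approximation).

    An 'event' here is a favorable outcome, e.g. reversal of pre-frailty.
    All four cells must be non-zero for this simple form.
    """
    a, b = events_e, total_e - events_e
    c, d = events_c, total_c - events_c
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log)
    return odds_ratio, lower, upper


def number_needed_to_treat(events_e, total_e, events_c, total_c):
    """Reciprocal of the absolute difference in event proportions."""
    difference = events_e / total_e - events_c / total_c
    return 1.0 / difference


if __name__ == "__main__":
    # Hypothetical counts: 20 of 30 reversals with exercise, 10 of 30 with control.
    print(odds_ratio_ci(20, 30, 10, 30))
    print(number_needed_to_treat(20, 30, 10, 30))
```

In practice the number needed to treat is usually rounded up to the next whole person when reported.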

Multivariable regression analyses were performed to investigate whether the commonly reported temporal parameters, that is, frequency, time span and duration, predicted the effect size for outcome measures which yielded a statistically significant pooled result and had at least 10 available trials [53]. The assumptions of the regression model were verified by examining the normal predicted probability plot, the scatterplot of predicted values versus residuals, and the variance inflation factor. For all analyses, significance was set at P < 0.05. To evaluate the risk of publication bias (due to non-publication of small trials with negative results), we plotted SMD versus SE and visually assessed the symmetry of this ‘funnel’ plot.

Quality of evidence assessment

The Grading of Recommendations Assessment, Development and Evaluation (GRADE) system was used to determine the overall quality of evidence for variables used in meta-analyses. GRADE considers five criteria (risk of bias, publication bias, imprecision, inconsistency and indirectness) to rate the quality of evidence as high, moderate, low or very low. In the GRADE approach, randomized controlled trials start as high-quality evidence and observational studies as low-quality evidence supporting estimates of intervention effects. The quality of evidence was rated up or down by two independent reviewers (H.J.L. and E.C.W.L.) for certain factors, and the lowest quality of evidence among the criteria was considered the overall quality of evidence.

Results

Study selection

A total of 293 articles emerged from the initial electronic database search; of these, 34 were assessed for eligibility, and 22 eligible papers were included in this review. Fig. 1 displays the flow of papers through the review. The reasons for exclusion of articles after retrieval and assessment of eligibility included non-relevance to pre-frailty [54], non-relevance to community-dwelling older adults [55,56,57,58,59], lack of outcome measures of interest [23, 60,61,62,63,64,65,66], failure to meet the desired representation of participants [67], lack of reporting on the proportion of pre-frail community-dwelling adults [68], lack of reporting on the pre- and/or post-intervention data [69], lack of a suitable comparator [70,71,72], and non-randomized controlled trial design [9, 16].

Fig. 1 Selection process for studies included in analysis

Methodological quality

There was substantial concurrence between the 2 reviewers (κ = 0.802, P < 0.001). Individual item agreement percentages ranged from 72.7% to 100%. The methodological quality assessment using the PEDro scale yielded a mean score of 6.45 (range = 3–9) out of a possible 10 points (Table 1). Criteria commonly not met were concealment of allocation, blinding of treating therapists or patients, and intention-to-treat analysis.

Table 1 Details of the included randomized controlled trials

Study characteristics

Twenty-two randomized controlled trials (900 participants in the experimental group and 1015 participants in the control group), which had data for physical measures, cognition, frailty status and/or quality of life, were available for pooling (Fig. 2). The criteria used for determining pre-frailty across the trials included Fried’s frailty criteria [10, 11, 14, 17, 21, 43, 76,77,78,79,80, 84], FRAIL questionnaire [22, 44, 74, 83], Cardiovascular Health Study criteria [23, 81], Frailty phenotype [26], Kaigo-Yobo Checklist [82], and Chinese Canadian Study of Health and Aging Clinical Frailty Scale Telephone Version [73]; the criteria were not mentioned in one of the trials [75]. Ten trials evaluated the effects of multi-component exercise [10, 17, 21, 22, 26, 44, 73, 75, 80, 84], 4 trials on multi-component exercise with nutrition [43, 77, 78, 83], 2 trials on multi-component exercise with nutrition and a cognitive component [81, 85], 2 trials on Tai Chi [11, 76], 1 trial on strengthening exercises with nutrition [82], 1 trial on strengthening exercises [79], 1 trial on elastic band [14], and 1 trial on stepping exercises [74]. Six trials were found to have high risk of bias [22, 26, 43, 74, 77, 85], whilst there were some concerns in 15 trials [11, 14, 17, 21, 44, 73, 75, 76, 78,79,80,81,82,83,84], and low risk of bias in 1 trial [10]. A symmetrical distribution of the studies about the combined effect size was visually confirmed in the funnel plot (Fig. 3).

Fig. 2 (A) Forest plot (standardised mean difference and 95% CI), and (B) forest plot (odds ratio and 95% CI) of outcome measures in randomized controlled trials. Pooled estimates of subgroup outcome measures are indicated by empty symbols

Fig. 3 Funnel plot of standardised mean difference (SMD) against standard error of SMD in outcome measures

Grip strength

Ten trials (378 participants in the experimental group and 482 participants in the control group) had data for grip strength [14, 17, 21, 22, 26, 43, 44, 78, 80, 82]. Data were pooled using a random-effects model; there was a non-significant pooled standardized mean difference in grip strength (0.22, 95% CI -0.07 to 0.50, P > 0.05) between exercise and minimal intervention, with a high level of heterogeneity (I2 = 74%, τ2 = 0.15, χ2 = 34.44, df = 9, P = 0.0001) (Fig. 2).
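The heterogeneity statistics reported with each pooled estimate are related: I2 follows from the χ2 statistic (Cochran's Q) and its degrees of freedom, and τ2 is the between-trial variance that a random-effects model adds to each trial's own variance. The sketch below assumes the DerSimonian-Laird estimator, which is our understanding of the random-effects method in RevMan; the function names and the three-trial example are hypothetical.

```python
import math


def i_squared(q, df):
    """I-squared (%) from Cochran's Q (the chi-squared statistic) and its df."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def pool_random_effects(effects, standard_errors):
    """Random-effects pooled estimate (DerSimonian-Laird, assumed)."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add the between-trial variance to each trial.
    re_weights = [1.0 / (se ** 2 + tau2) for se in standard_errors]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se_pooled = math.sqrt(1.0 / sum(re_weights))
    return {
        "pooled": pooled,
        "ci95": (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled),
        "q": q,
        "df": df,
        "tau2": tau2,
        "i2": i_squared(q, df),
    }


if __name__ == "__main__":
    # Reported grip strength heterogeneity: chi-squared 34.44 on 9 df.
    print(round(i_squared(34.44, 9)))
    # Hypothetical SMDs and standard errors from three trials.
    print(pool_random_effects([0.1, 0.5, 0.9], [0.2, 0.2, 0.2]))
```

Applied to the grip strength values above (χ2 = 34.44, df = 9), the I2 formula gives approximately 74%, matching the reported figure.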

Lower limb strength

Ten trials (384 participants in the experimental group and 482 participants in the control group) had data for lower limb strength [10, 11, 17, 22, 26, 44, 76, 77, 80, 84]. Data were pooled using a random-effects model; there was a significant pooled standardized mean difference in lower limb strength (0.67, 95% CI 0.29 to 1.04, P < 0.0001) between exercise and minimal intervention, with a high level of heterogeneity (I2 = 84%, τ2 = 0.29, χ2 = 55.3, df = 9, P < 0.00001) (Fig. 2). Although there were no apparent outliers, we performed post-hoc sensitivity analysis by removing 3 RCTs with high risk of bias [22, 26, 77]. The pooled effect size for lower limb strength (7 RCTs, 265 participants in the experimental group and 264 participants in the control group) remained significant (0.79, 95% CI 0.29 to 1.29, P = 0.002), with a high level of heterogeneity (I2 = 86%, τ2 = 0.38, χ2 = 42.07, df = 6, P < 0.00001).

Balance

Six trials (233 participants in the experimental group and 248 participants in the control group) had data for balance [17, 73, 78, 80, 82, 84]. Data were pooled using a random-effects model; there was a non-significant pooled standardized mean difference in balance (0.04, 95% CI -0.14 to 0.22, P = 0.69) between exercise and minimal intervention, with a low level of heterogeneity (I2 = 0%, τ2 = 0.0, χ2 = 2.36, df = 5, P = 0.80) (Fig. 2).

Gait speed

Thirteen trials (484 participants in the experimental group and 581 participants in the control group) had data for gait speed [14, 17, 22, 26, 43, 44, 76, 78,79,80,81,82, 84]. Data were pooled using a random-effects model; there was a significant pooled standardized mean difference in gait speed (0.37, 95% CI 0.09 to 0.64, P = 0.009) between exercise and minimal intervention, with a high level of heterogeneity (I2 = 77%, τ2 = 0.19, χ2 = 52.61, df = 12, P < 0.001) (Fig. 2). Similarly, in the absence of apparent outliers, we proceeded with post-hoc sensitivity analysis by removing 3 RCTs with high risk of bias [22, 26, 43]. The pooled effect size for gait speed (10 RCTs, 389 participants in the experimental group and 379 participants in the control group) was not statistically significant (0.25, 95% CI -0.06 to 0.55, P = 0.11), with a high level of heterogeneity (I2 = 76%, τ2 = 0.18, χ2 = 37.5, df = 9, P < 0.001).

Timed up and go (TUG)

Seven trials (343 participants in the experimental group and 344 participants in the control group) had data for TUG [11, 17, 43, 77, 79, 80, 82]. Data were pooled using a random-effects model; there was a significant pooled standardized mean difference in TUG (0.39, 95% CI 0.04 to 0.75, P < 0.0001) between exercise and minimal intervention. Due to the high level of heterogeneity (I2 = 80%, τ2 = 0.18, χ2 = 30.18, df = 6, P < 0.0001) (Fig. 2), we performed post-hoc sensitivity analysis by removing an outlier [11]. The pooled standardized mean difference (6 RCTs, 313 participants in the experimental group and 314 participants in the control group) remained significant (0.21, 95% CI 0.06 to 0.37, P = 0.008) with a low level of heterogeneity (I2 = 0%, τ2 = 0.0, χ2 = 3.38, df = 5, P = 0.64). However, when we repeated the analysis by removing 2 RCTs with high risk of bias [43, 77], the pooled effect size for TUG (5 RCTs, 197 participants in the experimental group and 184 participants in the control group) was not statistically significant (0.52, 95% CI -0.04 to 1.07, P = 0.07), with a high level of heterogeneity (I2 = 84%, τ2 = 0.33, χ2 = 25.64, df = 4, P < 0.0001).

Short physical performance battery (SPPB)

Five trials (120 participants in the experimental group and 219 participants in the control group) had data for SPPB [11, 17, 21, 22, 84]. Data were pooled using a random-effects model; there was a significant pooled standardized mean difference in SPPB (0.27, 95% CI 0.03 to 0.51, P = 0.03) between exercise and minimal intervention, with a low level of heterogeneity (I2 = 0%, τ2 = 0.0, χ2 = 1.8, df = 4, P = 0.77) (Fig. 2). Similarly, we performed post-hoc sensitivity analysis although there were no apparent outliers. We removed 1 RCT with high risk of bias [22], but the pooled effect size for SPPB (4 RCTs, 94 participants in the experimental group and 97 participants in the control group) remained significant (0.33, 95% CI 0.04 to 0.62, P = 0.02) with a low level of heterogeneity (I2 = 0%, τ2 = 0.0, χ2 = 1.23, df = 3, P = 0.75).

Functional capacity

Three trials (159 participants in the experimental group and 159 participants in the control group) had data for functional capacity [10, 74, 80]. Data were pooled using a random-effects model; there was a non-significant pooled standardized mean difference in functional capacity (0.52, 95% CI -0.02 to 1.06, P = 0.06) between exercise and minimal intervention, with a high level of heterogeneity (I2 = 80%, τ2 = 0.18, χ2 = 10.11, df = 2, P = 0.006) (Fig. 2).

Cognition

Five trials (225 participants in the experimental group and 325 participants in the control group) had data for cognition [11, 22, 44, 73, 85]. Data were pooled using a random-effects model; there was a non-significant pooled standardized mean difference in cognition (0.22, 95% CI -0.07 to 0.51, P = 0.14) between exercise and minimal intervention, with a moderate level of heterogeneity (I2 = 61%, τ2 = 0.07, χ2 = 10.18, df = 4, P = 0.04) (Fig. 2).

Quality of life

Eight trials (390 participants in the experimental group and 406 participants in the control group) had data for quality of life [21, 43, 44, 73, 75, 77, 78, 82]. Data were pooled using a random-effects model; there was a non-significant pooled standardized mean difference in quality of life (0.15, 95% CI -0.28 to 0.58, P = 0.50) between exercise and minimal intervention, with a high level of heterogeneity (I2 = 88%, τ2 = 0.33, χ2 = 60.67, df = 7, P < 0.00001) (Fig. 2).

Reversal of frailty status

Eight trials (351 participants in the experimental group and 363 participants in the control group) had data for reversal of frailty status [10, 11, 21, 26, 43, 44, 77, 83]. Data were pooled using a random-effects model; pre-frail older adults who received community-based exercises were more likely to revert from pre-frailty to a robust state (OR = 8.11, 95% CI 2.12 to 30.92, P = 0.002) than those who received minimal intervention. Due to the high level of heterogeneity (I2 = 81%, τ2 = 2.91, χ2 = 36.88, df = 7, P < 0.00001), we performed post-hoc sensitivity analysis by removing two outliers [10, 44]. The pooled odds ratio (6 RCTs, 263 participants in the experimental group and 281 participants in the control group) remained significant (OR = 2.74, 95% CI 1.36 to 5.51, P = 0.005) with a low level of heterogeneity (I2 = 23%, τ2 = 0.18, χ2 = 6.48, df = 5, P = 0.26). Thereafter, we repeated the analysis by removing 3 RCTs with high risk of bias [26, 43, 77]; the pooled odds ratio of pre-frailty reversal (5 RCTs, 197 participants in the experimental group and 195 participants in the control group) remained significant (OR = 14.01, 95% CI 1.89 to 103.58, P = 0.01), with a high level of heterogeneity (I2 = 84%, τ2 = 4.32, χ2 = 24.88, df = 4, P < 0.0001).

Parameters of community-based exercise as predictors of the effect size measures

The most commonly used parameters were 60-min duration [11, 17, 22, 43, 44, 73,74,75,76, 78, 79, 83], 3 sessions per week [10, 14, 26, 73,74,75,76,77, 79, 80] over a time span of 12 weeks [10, 17, 21, 26, 44, 73,74,75, 78, 80,81,82, 84, 85]. Our multivariable meta-regression analyses identified frequency (Beta = 0.6, 95% CI 0.084 to 1.115, P = 0.028) as an independent predictor of the effect size for gait speed amongst older adults with pre-frailty (Table 2). In other words, increased frequency per week was associated with greater effect size for gait speed. The model explained 56.6% of the variability in the effect size for gait speed. Normality of the residuals and homoscedasticity were confirmed visually from the plots. Based on the variance inflation factor value, there was no evidence of multicollinearity.

Table 2 Predictors of the effect size for gait speed in pre-frail older adults

GRADE

The strength of evidence is illustrated in Table 3 according to the GRADE criteria with an overall certainty of evidence ranging from very low to moderate.

Table 3 Quality of the evidence (GRADE) for SMD in significant outcome measures

Discussion

This systematic review has synthesized the evidence for the role of community-based exercises in improving lower limb strength and function (SMD = 0.27–0.67, P < 0.05) when compared to minimal intervention in pre-frail older adults (Supplementary Fig. 2Ai). In addition, community-based exercises are superior to minimal intervention in reversing pre-frailty to a healthy state in this population. The frequency, that is, the number of community-based exercise sessions per week, may be a predictor of the effect size for gait speed in pre-frail older adults. These findings have implications for the implementation of public health interventions, such as community-based exercises, targeted at pre-frailty.

We did not find a significant pooled SMD in grip strength between pre-frail older adults who received community-based exercises and those who received minimal intervention. In contrast, a recent review by Liu and co-workers (2022) [86] reported a significant pooled MD in grip strength, that is, a pooled MD of 1.36 from 4 studies which investigated exercise only, and a pooled MD of 2.71 from 2 studies which investigated the effects of exercise with nutrition (Fig. 2 therein, p1431.e5) [86]. We propose that the inconsistency in findings between reviews may be explained by a few methodologically plausible reasons, that is, the different types of dynamometer used across the studies in our review [21, 26, 44], and the different methods of assessing grip strength, with variation in protocol or body position [87, 88]. Greater methodological consistency in future studies may yield further insight. That said, we calculated the SMD, which should account for the variation in the spread of data arising from the different testing methods and exercise protocols.

Our review revealed a significant, moderate effect size in lower limb strength when comparing pre-frail older adults who received pre-frailty intervention with those who received minimal intervention (SMD = 0.67). In our post-hoc subgroup analysis of trials which used the timed 5-times sit-to-stand test (n = 4) [22, 26, 44, 77], the pooled effect on lower limb strength remained significant (SMD = 0.58, p = 0.04). This corresponds to a reduction of 2.25 s (Supplementary Fig. 4Aii), which exceeds the minimal clinically important difference of 0.5 to 1.7 s reported by a previous study [89]. When we analysed trials which used the 30 s chair rise test (n = 3) [10, 11, 76], the pooled effect on lower limb strength also remained significant (SMD = 1.35, p = 0.001), which corresponds to approximately 4 repetitions. In contrast, Liu and co-workers (2022) reported a lack of significance in pooled mean difference [86]. This discordance may plausibly be due to the trials included in the analysis. For example, we included trials which recruited mostly community-dwelling older adults with pre-frailty [10, 11, 76], whereas Liu and co-workers (2022) [86] included trials which recruited hospitalized pre-frail older adults [56], pre-frail older adults from residential living centres [90], pre-frail elderly women who visited the sport training centre [91], and community-dwelling pre-frail elderly people [76]. Future reviews may adopt more specific inclusion criteria to enhance comparison between studies, and to better understand the target population being studied.

We found a significant but small pooled effect size in gait speed (SMD = 0.37) with a more precise estimate, and this corresponds to a reduction of approximately 0.16 s in the time taken to complete the gait speed test. Liu and co-workers (2022) [86] also reported a significant pooled effect size in gait speed, but with a higher effect size (SMD = 1.06) and a less precise estimate. This difference in effect size could be attributed to the difference in method of data extraction, that is, we extracted change scores whilst Liu and co-workers (2022) extracted follow-up scores [92]. Secondly, Liu and co-workers (2022) included 4 trials with exercise-only intervention [14, 44, 76, 79] in their analysis of pooled SMD in gait speed (Supplementary Fig. 6 therein, p1431.e13). In contrast, we included 13 trials with diverse exercise protocols. Interestingly, our post-hoc subgroup analysis of 6 trials with multi-component exercises yielded a non-significant pooled SMD in gait speed (Supplementary Fig. 3). Overall, we believe that our estimated effect size in gait speed is conservative in view of the larger number of studies included in our analysis. Future reviews may consider extracting change scores, instead of follow-up scores, to yield a more precise estimate.

Our review found a significant but small effect size in timed up-and-go (SMD = 0.39), which corresponds to a reduction of 0.73 s. However, this is less than the minimum detectable change of 2.08 s reported by a previous study on community-dwelling adults aged 50 and above [93]. Our estimated minimal clinically important difference for timed up-and-go was 0.41 s [94]. We are unaware of any published minimal clinically important difference values for timed up-and-go in frail or pre-frail older adults for comparison. In contrast to our finding, Liu and co-workers (2022) reported a lack of significance in pooled effect size in timed up-and-go. The lack of significance remained in their subgroup analyses of trials which investigated the effect of exercise only, and trials which investigated the effect of exercise with nutrition (Supplementary Fig. 9 therein, p1431.e16). We were unable to replicate these subgroup analyses on a post-hoc basis due to the diverse exercise protocols. For a similar reason, we believe that the discrepancy in findings may be attributed to the difference in trials included in the analysis.

We found a significant pooled effect size in SPPB, and this concurs with the finding by Liu and co-workers (2022) [86]. Our pooled effect size in SPPB yielded an SMD of 0.27, which corresponds to 0.45 point and is considered clinically significant [95]. In addition, we estimated that the minimal clinically important difference in SPPB is 0.83 point [94]. In contrast, Liu and co-workers (2022) reported a much larger overall pooled mean difference in SPPB of 0.81 point; a pooled mean difference of 1.02 points from 4 studies which investigated exercise only, and a pooled mean difference of 1.2 points from 1 study which investigated exercise with nutrition (Fig. 1 therein, p1431.e4). As with gait speed, it is plausible that the difference in magnitude of effect size is due to the difference in method of data extraction. SPPB, which is a composite measure of balance, gait speed and lower limb strength, has been reported to be a protective frailty factor and can be monitored in pre-frail older adults [96]. These findings support the use of SPPB as a tool, at least in part, for evaluating the effectiveness of community pre-frailty intervention.

A recent study reported a lack of significant change in balance among retirees aged 60 years and above after participation in a 3-month community-based physical activity with fall prevention program [97]. By inference, detecting a significant change in balance with community-based exercises among pre-frail older adults may be just as challenging, as in our review. That said, it is noteworthy that studies which used the SPPB balance score [17, 84] had consistently larger effect size estimates than studies which used the one-legged stance test [73, 78, 80, 82]. This suggests that the SPPB balance score, which assesses the ability to assume normal, semi-tandem and tandem stance for 10 s, may be more sensitive in detecting changes than the timed one-legged stance test. Interestingly, some of the included trials in this analysis did not include balance exercises in their pre-frailty program [80, 82]. This may highlight the importance of incorporating balance exercises in community pre-frailty intervention. Future studies may consider the use of the SPPB balance score, instead of the one-legged stance test, in evaluating balance performance.

Notwithstanding the inclusion of aerobic exercises in the pre-frailty intervention across the included trials [10, 74, 80], the pooled effect size for functional exercise capacity was not significant. We are unaware of reviews which have investigated the effect of community-based exercise on functional exercise capacity in pre-frail older adults. The result might have differed if our analysis had included more trials with multi-component exercise protocols, that is, elements of resistance, aerobic, balance and flexibility training, to augment the effect on functional exercise capacity [98]. Our finding may also be explained by other reasons, that is, the pre-frail older adult participants were likely to be working at the limit of their physical capacity to carry out activities of daily living [99]. Lastly, in light of the result from a previous study [100], the training effect of cycling exercise [80] or stepping exercise [10, 74] may be inadequate to improve functional exercise capacity in pre-frail older adults. Future studies may consider multi-component exercise and the inclusion of outdoor or treadmill walking as part of the exercise protocol. Nonetheless, our result should be interpreted with caution given the relatively low number of included trials and the reduced statistical power to detect a difference in pooled functional exercise capacity. Hence, this merits further investigation.

We found no significant effect on pooled cognitive performance, and this finding did not agree with a previous review by Racey and co-workers (2021) [19]. This discrepancy may be ascribed to differences in how trials were included in the meta-analysis. For example, some of the trials were included more than once in their meta-analysis (Fig. 3B therein, pE740), which may have overstated the precision of their results [19]. Another plausible reason is the diversity of clinical outcomes used to measure different cognitive domains across our included trials [11, 22, 44, 73, 85]. Interestingly, the removal of trials which used the Mini-Mental State Examination during post-hoc subgroup analysis uncovered a significant effect (SMD = 0.39, 0.06 to 0.72, P = 0.02) (Supplementary Fig. 5). Based on the neuroanatomical correlates of the cognitive measures, that is, the Frontal Assessment Battery [101], the Repeatable Battery for the Assessment of Neuropsychological Status [102] and the Montreal Cognitive Assessment [103], it is appealing to consider that exercise may exert its effect, at least in part, through pathways involving the pre-frontal, medial temporal and/or subcortical areas respectively. Further studies are warranted to support this assertion.
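The point about precision can be illustrated with a minimal sketch. It assumes a simple inverse-variance (fixed-effect) pooling, which is not necessarily the model used in [19] or in our review, and all effect sizes and variances below are hypothetical. Entering the same trial twice as if it were independent adds weight and shrinks the pooled standard error without adding any new information.

```python
import math

def pooled_fixed(effects, variances):
    """Inverse-variance (fixed-effect) pooled estimate and its standard error."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    estimate = sum(w * e for w, e in zip(weights, effects)) / total
    return estimate, math.sqrt(1.0 / total)

# Hypothetical SMDs and variances for three trials (illustrative only).
effects = [0.30, 0.10, 0.50]
variances = [0.04, 0.05, 0.08]

est_once, se_once = pooled_fixed(effects, variances)

# Same data, but the third trial is entered twice as if it were independent.
est_twice, se_twice = pooled_fixed(effects + [0.50], variances + [0.08])

print(f"each trial once: SMD={est_once:.3f}, SE={se_once:.3f}")
print(f"one trial twice: SMD={est_twice:.3f}, SE={se_twice:.3f}")
```

In this sketch the duplicated entry always yields a smaller standard error, and therefore a narrower confidence interval, than the analysis in which each trial appears once.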

Despite the positive association reported between physical activity and quality of life [104], our review did not observe an enhancement in quality of life with pre-frailty intervention. Furthermore, our finding did not concur with previous reviews [19, 86]. This may be attributed to differences in methodology, such as the method of data extraction [86] and the inclusion of trials during meta-analysis [19]. It is also conceivable that the lack of significance in pooled quality of life amongst community-dwelling pre-frail older adults herein reflects the multidimensional construct of quality of life, which may be influenced by a plethora of factors such as financial resources, health and meaning in life [105]. This merits further investigation.

Despite the scarcity of information on the pooled odds ratio for pre-frailty reversal, our review has revealed that the pooled odds of reversal from pre-frailty to a robust state are about 3 times as high amongst older adults who received community-based exercises as amongst those who received minimal intervention. This finding concurs with other trials [26, 106], which have demonstrated similar results. Based on a proposed method to derive the number needed to treat [52], we estimated that 20 pre-frail older adults would need to participate in community-based exercises for one additional pre-frail older adult to achieve a healthy robust state. In comparison with findings from one of the included trials [10], we believe that our estimated number needed to treat is conservative, given the diverse pre-frailty interventions across our included trials. Nevertheless, our findings have implications for public health policy, that is, they underscore the benefit of public health interventions such as pre-frailty intervention in altering the frailty trajectory at the population level [22]. This would, however, call for action by both healthcare providers and policymakers. For example, healthcare providers could consider implementing more community-based exercise programs [107], whilst policymakers could consider integrating such programs into mainstream care for the pre-frail aging population [22].
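For readers unfamiliar with how an odds ratio translates into a number needed to treat, the following is a minimal sketch of one commonly used conversion. It is not necessarily the method proposed in [52], and it does not attempt to reproduce our estimate. The function name and the comparison-group reversal rates are hypothetical and chosen for illustration only; the result depends strongly on the assumed rate in the comparison group.

```python
import math

def nnt_from_odds_ratio(odds_ratio, control_event_rate):
    """Number needed to treat for a beneficial outcome, given an odds ratio
    and an assumed event rate in the comparison group."""
    cer = control_event_rate
    # Convert the odds ratio into the expected event rate under intervention.
    eer = (odds_ratio * cer) / (1.0 - cer + odds_ratio * cer)
    absolute_benefit = eer - cer
    if absolute_benefit <= 0:
        raise ValueError("no absolute benefit; NNT is undefined")
    return 1.0 / absolute_benefit

# An odds ratio of 3 echoes the pooled estimate in the text.
# The comparison-group reversal rates below are hypothetical.
for cer in (0.05, 0.10, 0.20):
    nnt = nnt_from_odds_ratio(3.0, cer)
    print(f"assumed comparison-group reversal rate {cer:.2f}: NNT ~ {math.ceil(nnt)}")
```

The number needed to treat is conventionally rounded up to the next whole person, which is why the sketch reports the ceiling of the computed value.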

Drawing on previous studies [10, 22, 54], it is tempting to speculate that exercise intervention modifies risk factors of frailty, such as reduced walking speed, by altering body composition and immune profile. For example, the reversal of pre-frailty was reportedly associated with reduced body fat mass, increased fat-free mass and improved fitness [10]. Similarly, Tan and co-workers (2023) reported an improved appendicular skeletal muscle index after 3–6 months of exercise with or without cognitive stimulation therapy amongst pre-frail older adults [22]. Other proposed mechanisms include the reduction in inflammatory biomarkers such as interleukin-6 and C-reactive protein after 6 months of exercise training amongst older adults [54]. From a social psychological perspective, the benefits of regular participation in community-based exercise events may be attributed to the participants’ positive and rewarding social behaviours and experiences, such as subjective enjoyment and energy level [108]. These may be mediated by a reduction in feelings of fatigue and cortisol level [109]. Nonetheless, these mechanisms merit validation in further studies.

Our review of the literature revealed variability in the temporal parameters of community-based exercise, and it remains uncertain how community-based pre-frailty intervention can be rolled out to optimize clinical benefits at the population level. A previous study identified weekly frequency as one of the predictors of SPPB in pre-frail and frail older adults (Table 5 therein, p11) [110]. Similarly, we identified frequency (number of sessions per week) as a significant predictor of the effect size for gait speed. However, this predictor was not significant when univariable regression analysis was performed (P > 0.05). We believe that further studies in this area would provide further insight into the predictive potential of pre-frailty intervention parameters for clinical outcomes.
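To make the idea of regressing effect size on an intervention parameter concrete, the sketch below fits a univariable, inverse-variance weighted regression of trial-level SMD on sessions per week. It assumes a fixed-effect meta-regression with a normal approximation for the P value, which may differ from the model used in our analysis. The function name and all trial-level values are hypothetical.

```python
import math

def weighted_meta_regression(x, y, variances):
    """Weighted least-squares fit of y = b0 + b1 * x with inverse-variance
    weights. Returns (intercept, slope, SE of slope, two-sided P value)."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # With known within-trial variances, var(slope) = 1 / weighted Sxx.
    se_slope = math.sqrt(1.0 / sxx)
    z = slope / se_slope
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided, normal approximation
    return intercept, slope, se_slope, p_value

# Hypothetical trial-level data (illustrative only, not taken from the review).
sessions_per_week = [1, 2, 2, 3, 3]
smd_gait_speed = [0.10, 0.25, 0.20, 0.45, 0.40]
smd_variances = [0.05, 0.04, 0.06, 0.05, 0.07]

b0, b1, se_b1, p = weighted_meta_regression(
    sessions_per_week, smd_gait_speed, smd_variances
)
print(f"slope per extra weekly session: {b1:.3f} (SE {se_b1:.3f}, P = {p:.3f})")
```

With only a handful of trials, the standard error of the slope is large relative to the slope itself, which is one reason a predictor can appear influential in one model and fail to reach significance in another.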

Limitations

One of the challenges encountered during this review was the variability in the pre-frailty intervention across the included trials. However, this was addressed by specifying random-effects models a priori. Secondly, different outcome measures were used across the included trials to measure the same construct. To circumvent this issue, we expressed our pooled results in units of standard deviation, that is, as standardized mean differences. Thirdly, we included trials with a mix of pre-frail and frail older adults. Nevertheless, we ran post-hoc sensitivity analyses excluding trials which included frail older adults, and the results were consistent for most of the outcome measures. Lastly, there was a high risk of bias in 6 out of 22 included RCTs. After removing RCTs with a high risk of bias, our post-hoc sensitivity analyses showed that the pooled results remained statistically significant for lower limb strength, but not for gait speed and TUG; hence, our data should be interpreted with caution.
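The two analytic choices described above can be sketched as follows. The sketch assumes Hedges' g as the standardized mean difference and the DerSimonian-Laird estimator for the between-trial variance; the specific estimators used in our analysis may differ. The function names and all trial summaries are hypothetical.

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference (Hedges' g) and its variance."""
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    )
    d = (m1 - m2) / pooled_sd
    j = 1.0 - 3.0 / (4.0 * (n1 + n2) - 9.0)  # small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2.0 * (n1 + n2))
    return j * d, j ** 2 * var_d

def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate, its SE, and tau^2 (DerSimonian-Laird)."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    # Re-weight each trial by its within-trial plus between-trial variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, math.sqrt(1.0 / sum(w_star)), tau2

# Hypothetical trials measuring one construct on different scales:
# (mean, SD, n) for intervention, then (mean, SD, n) for comparison.
trials = [
    (12.0, 3.0, 30, 10.5, 3.2, 30),   # e.g. repetitions
    (1.10, 0.20, 25, 1.00, 0.22, 24), # e.g. metres per second
    (55.0, 9.0, 40, 54.0, 8.5, 41),   # e.g. points on a scale
]
gs, vs = zip(*(hedges_g(*t) for t in trials))
pooled, se, tau2 = dersimonian_laird(list(gs), list(vs))
print(f"pooled SMD = {pooled:.3f} (SE {se:.3f}), tau^2 = {tau2:.3f}")
```

Because each trial is first converted to standard deviation units, outcomes measured on different scales can be pooled, and the between-trial variance term widens the pooled standard error when the interventions produce heterogeneous effects.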

Conclusion

In conclusion, this review highlights that community-based exercise is superior to minimal intervention for improving physical function and health in older adults with pre-frailty. The frequency of exercise sessions per week may influence the effect size for gait speed amongst pre-frail older adults. Further research is warranted to investigate responsive outcome measures and optimal parameters of community-based exercises for community-dwelling pre-frail older adults.

What is already known

  • Pre-frailty affects older adults and poses a large socioeconomic burden.

  • There is conflicting evidence on the effectiveness of community-based exercises in improving clinical outcomes amongst older adults with pre-frailty.

What are the new findings

  • Community-based exercise is superior to minimal intervention in improving physical function such as lower limb strength and gait speed in older adults with pre-frailty.

  • The odds of reversing pre-frailty to a robust state are about 3 times as high amongst those who received community-based exercises as amongst those who received minimal intervention. Out of 20 pre-frail older adults who participate in community-based exercises, one who would not otherwise have done so is expected to achieve a healthy robust state.

  • The frequency of exercise sessions per week may influence the effect size for gait speed in older adults with pre-frailty.