Introduction

Prostate cancer is the second most common cancer type among men worldwide and the fifth most frequent cause of cancer deaths in men [1]. Prostate tumour growth is dependent on stimulation from the male hormone testosterone [2]. Androgen deprivation therapy (ADT) significantly lowers the levels of testosterone or inactivates the function of the hormone. This leads to slower tumour growth or a reduction in tumour size [3]. Therefore, ADT is foundational in the treatment of metastatic prostate cancer and is used as an important adjuvant therapy in locally advanced prostate cancer [3,4,5,6]. Up to half of all patients with prostate cancer will undergo ADT at some point during their treatment course [7]. The benefits of ADT in delaying prostate cancer progression, relieving symptoms and prolonging survival for patients with advanced disease are widely documented [4,5,6]. Unfortunately, ADT can have several detrimental side effects, including increases in fat mass, low bone mineral density, loss of muscle mass and reduced muscle strength [8,9,10]. These side effects can lead to significant reductions in physical performance of everyday activities [8, 11] and reduced quality of life [12], and they can increase the risk of cardiovascular diseases, diabetes, fractures [5, 8, 9] and depression [8, 13, 14]. The management of these considerable negative effects is an essential part of the supportive cancer care for men receiving ADT [9, 12, 15].

Randomized controlled trials (RCTs) have shown that supervised exercise can ameliorate many of the debilitating effects of ADT [16,17,18,19,20,21,22], and supervised exercise is recommended as a strategy to manage these side effects [9, 15]. Some systematic reviews with meta-analyses have demonstrated beneficial effects of exercise therapy in reversing side effects of ADT [23,24,25,26], but several RCTs have been published since these reviews were conducted [20, 21, 27,28,29,30,31].

None of the previous reviews have used the GRADE-approach (Grading of Recommendations, Assessment, Development and Evaluation). The strengths of the GRADE-method include pre-specification of the inclusion criteria and of the outcomes, which are judged as critical or important to patients. This process ideally involves asking patients [32, 33].

The aim of this review was to systematically evaluate the effect of supervised exercise therapy compared with no exercise therapy in patients with prostate cancer undergoing ADT using GRADE. The effect was assessed using the patient critical outcomes ‘disease-specific quality of life’ and ‘physical performance’ measured by walking performance at end of treatment.

Methods

This systematic review and meta-analysis was conducted according to the guidelines of the Cochrane Collaboration [34] and based on the GRADE-approach [35]. The reporting adheres to the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) recommendations [36]. The work was conducted as part of updating a national clinical guideline on rehabilitation of patients with prostate cancer published by the Danish Health Authority in 2021 [37]. As such, the protocol was pre-specified, including detailed inclusion and exclusion criteria regarding populations, interventions, comparators and outcomes. The protocol was approved by the management of the Danish Health Authority before the literature search was conducted and is publicly available at NKR-38-Focused-questions-PICOs-for-updating1.ashx (sst.dk).

Data Sources and Search Strategy

First, to identify systematic reviews with relevant RCTs to be included in the synthesis, we performed a search for systematic reviews on 18th January 2016 including records published from 2005 to 2016. In this step, we included a systematic review by Bourke et al. [16]. Next, a systematic search for primary trials was conducted on 24th February 2016. This search was limited to records from the last search date of the identified review (January 2015) onwards, and it was updated on 16th June 2021. Searches were conducted in PubMed/MEDLINE, EMBASE, Cochrane Library, CINAHL and PEDro. No restrictions regarding publication status were applied. Language was limited to English, Danish, Swedish and Norwegian. See the full search protocols in the Supplementary appendix.

Trial selection

We used Covidence systematic review software (Veritas Health Innovation, Melbourne, Australia) for screening, data extraction and risk of bias assessments. One reviewer screened all titles and abstracts for eligibility (AU), and two reviewers independently assessed records selected for full-text review (AU and BV, JV or MLK). Disagreements were resolved through discussions. If necessary, a third reviewer (BV or JV) was consulted to reach consensus. Reference lists of the included trials and relevant identified systematic reviews were hand searched for more relevant trials.

Pre-specified eligibility criteria were based on the Population, Intervention, Comparison, Outcome, and Time (PICOT) framework [33, 38]. We included RCTs that investigated the effect of supervised exercise therapy compared to no exercise therapy in patients with prostate cancer receiving ADT. Both published and unpublished trials could be included as long as results were available in an abstract or at a website. The PICOT eligibility criteria for inclusion and exclusion of trials were as follows:

Population

Adult patients diagnosed with prostate cancer currently receiving ADT. All types of ADT could be included. Trials in which more than 60% of the population received ADT could also be included. When such trials were included in an analysis, we performed a pre-specified sensitivity analysis with exclusion of these trials. Trials with a subgroup of patients on ADT (<60%) were included as long as data were provided separately for the ADT-population.

Intervention

Supervised exercise therapy was defined as a regimen of physical exercises that was instructed, supervised, and monitored by a health care professional. We included trials with physical exercises involving the whole body at a moderate to high intensity, e.g. resistance training involving the upper and lower extremities at an intensity of at least 60% of one repetition maximum (RM) and/or aerobic (cardiovascular) exercise at a minimum of 60% of the estimated maximum heart rate. Supervision had to be given at least twice per week, and interventions had to last at least two months.

Comparator

We defined ‘no exercise therapy’ as: no treatment, usual care not including physical exercise therapy, waiting list, and sham training e.g. stretching or relaxation training. Home-based exercise after initial instruction was not included as comparator.

Outcome

According to GRADE, outcomes were predefined as critical or important to patients [33, 38]. Patient representatives from the Danish Association of Prostate Cancer contributed to the ratings of outcomes.

Critical (primary) outcomes were ‘disease-specific quality of life’ and ‘physical performance measured by walking performance’. The preferred measure for disease-specific quality of life was the Functional Assessment of Cancer Therapy–Prostate (FACT-P) (range 0–156; Minimum Clinically Important Difference (MCID) 6–10 points) [39]. For ‘walking performance’ we preferred the 400 m walking test when available (MCID: 20–30 s) [40].

Important (not critical, secondary) outcomes included ‘health related quality of life’, e.g. SF-36 or EQ-5D, ‘physical performance measured by sit to stand performance’, ‘muscle strength’, ‘VO2 peak’, ‘prevalence of depression, cardiovascular diseases and diabetes’, ‘fractures’, ‘exercise related injuries’ and ‘dropouts for all causes’.

Time point of interest

The primary time point of interest was end of treatment for all outcomes except for the outcomes ‘fractures’, ‘prevalence of depression’, ‘prevalence of cardiovascular diseases’ and ‘prevalence of diabetes’ for which the time point of interest was longest follow-up.

Data extraction and quality assessments

Two authors (AU and MLK) independently extracted data using a predefined extraction template in Covidence including information of trial design, trial population, baseline characteristics, interventions, comparators and outcome measures. Discrepancies were resolved through discussion.

We evaluated the internal validity of the included systematic review using the AMSTAR tool [41] and assessed risk of bias using Cochrane Risk of Bias tool [42], which evaluates random sequence generation, allocation concealment, blinding of personnel, patients and outcome assessors, incomplete outcome data, selective outcome reporting and other sources of bias. Two independent reviewers (AU and MLK) performed all quality assessments. Consensus was reached through discussions.

Certainty of evidence

The certainty of evidence per outcome across trials was assessed using GRADE [43]. According to GRADE, evidence from RCTs starts at “high certainty” and can be downgraded to “moderate”, “low”, or “very low” certainty based on limitations in trial design (risk of bias), indirectness, imprecision, inconsistency, and publication bias. The overall certainty of evidence was determined by the lowest certainty level for the critical outcomes [43]. All assessments were decided by consensus in the guideline panel.
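The rating rule described above can be sketched in a few lines. This is a minimal illustration under simplifying assumptions, not the guideline panel's actual procedure: the function names and the example judgements are hypothetical, and real GRADE assessments rest on structured judgement rather than a simple count.

```python
# Certainty levels ordered from lowest to highest, as named in the text.
LEVELS = ["very low", "low", "moderate", "high"]


def rate_outcome(downgrades: dict[str, int]) -> str:
    """Start at 'high' (RCT evidence) and step down one level per downgrade."""
    total = sum(downgrades.values())
    index = max(0, len(LEVELS) - 1 - total)  # never drop below 'very low'
    return LEVELS[index]


def overall_certainty(critical: dict[str, str]) -> str:
    """Overall certainty is the lowest rating among the critical outcomes."""
    return min(critical.values(), key=LEVELS.index)


# Hypothetical judgements, for illustration only.
critical = {
    "disease-specific quality of life": rate_outcome({"risk of bias": 1}),
    "walking performance": rate_outcome({"risk of bias": 1}),
}
print(critical, overall_certainty(critical))
```

Keeping the levels in an ordered list means both the downgrading step and the "lowest level wins" rule reduce to index arithmetic.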

Statistical analyses

Meta-analyses were performed using a random-effects model in Review Manager 5.3 (The Nordic Cochrane Centre, The Cochrane Collaboration, Copenhagen, Denmark), as variation between studies was anticipated. Missing values for standard deviations were calculated from the available data when possible. Continuous outcomes were reported as mean differences (MD) when data were reported on the same measurement scale; otherwise, standardised mean differences (SMD) were calculated. To support interpretation, SMD estimates for critical outcomes were transformed to MD estimates, using the method described by Thorlund et al. [44]. Dichotomous outcomes were reported as Risk Ratios. For all estimates, 95% confidence intervals (CIs) were provided. Absolute effect estimates per 1000 individuals and corresponding CIs were calculated based on the presumed risk in the control group and the estimated risk ratio. When at least one trial reported no events in one group for a dichotomous outcome, the absolute effect was based on a risk difference analysis. Heterogeneity was assessed by visual inspection of the forest plot, and by interpreting the I² statistic and Chi²-test. We generated funnel plots to judge publication bias when ten or more trials were included in an analysis.
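To make the pooling step concrete, the following sketch implements an inverse-variance random-effects model with the DerSimonian-Laird estimator of between-trial variance, together with Cochran's Q and the I² statistic. It assumes that this estimator corresponds to the random-effects option used in the analyses, which the text does not state explicitly. The function name and the input values are hypothetical and are not data from the included trials.

```python
import math


def dersimonian_laird(effects, std_errors):
    """Pool effect estimates with a DerSimonian-Laird random-effects model."""
    k = len(effects)
    w = [1.0 / se**2 for se in std_errors]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-trial variance
    w_re = [1.0 / (se**2 + tau2) for se in std_errors]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # I² in percent
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return {"pooled": pooled, "ci": ci, "tau2": tau2, "Q": q, "I2": i2}


# Hypothetical SMDs and standard errors for four trials (not data from this review).
smd = [0.50, 0.35, 0.45, 0.40]
se = [0.20, 0.15, 0.25, 0.18]
result = dersimonian_laird(smd, se)
print(result)
```

When Q does not exceed its degrees of freedom, the between-trial variance is truncated at zero and the random-effects estimate coincides with the fixed-effect estimate.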

To explore potential heterogeneity, the following subgroup analyses were pre-specified: group vs individual training, timing of exercise after starting ADT (>1 month vs <1 month) and modalities e.g. football (European soccer), resistance or aerobic exercise. In addition, pre-planned sensitivity analyses excluding trials where less than 90% of the patients received ADT were conducted.

Results

Selection of trials

In the search for systematic reviews we identified 732 records. We excluded 644 by screening of title and abstracts, and 68 records were assessed for inclusion by full-text review. One systematic review was included [16]. From this review, we included seven RCTs after full text review. In the search for primary studies (February 2016) we identified 149 records of which 17 were assessed for inclusion by full-text review and one RCT was included. The update of this search (June 2021) identified another 871 records, of which 109 were assessed by full-text review and 10 RCTs were included. In total 18 RCTs (25 publications) were included [17,18,19,20,21,22, 27,28,29,30,31, 45,46,47,48,49,50,51,52,53,54,55,56,57,58]. See PRISMA flow charts for the trial selection process in Supplementary Figure S1, S2. A complete list of excluded trials assessed in full-text with reasons for exclusion is given in Supplementary Table S1.

Trial characteristics

The 18 eligible trials included 1,477 men with prostate cancer for our comparisons of interest. For 16 of the 18 included trials a protocol was available [18,19,20,21,22, 27,28,29,30,31, 45,46,47,48,49,50, 52,53,54,55,56], but in two trials the protocol was registered retrospectively [18, 49, 50, 52, 54]. In the majority of the trials, all participants (100%) received ADT [18,19,20,21,22, 28, 30, 31, 45,46,47,48,49,50,51,52,53,54,55,56] or data were reported separately for the ADT population [27]. In one trial 95% received ADT [29] and in another trial 61% received ADT [17]. The included trials comprised patients with cancer stage T1-T4. Nine trials included patients with stage T1-T4 cancer [17, 21, 22, 27, 31, 47, 50,51,52,53,54, 56], five trials included stage T3-T4 [18, 19, 29, 30, 45], and in four trials cancer stages were unclear [20, 28, 46, 48].

Four trials included men who had started ADT within the last month [21, 22, 30, 46, 48]. In the remaining trials, the participants had been on ADT between two months and three years. One trial had no information on ADT duration [17]. The interventions comprised supervised exercise therapy with moderate to high intensity between 60–90% of 1 RM (repetition maximum) for resistance exercise and between 55–85% of the estimated maximum heart rate for aerobic exercise. The majority of the exercise programs were progressive in nature [17,18,19,20,21,22, 27,28,29,30,31, 45,46,47,48, 50, 52,53,54,55,56,57,58].

The included interventions consisted of a combination of resistance and aerobic exercise [18, 20,21,22, 29,30,31, 45,46,47,48,49,50, 52, 54], resistance or aerobic exercise [17], solely resistance exercise [19, 28, 51, 55, 56] and football training [27, 53]. For our comparisons of interest, the duration of the interventions was 12 weeks [20, 28,29,30, 45,46,47], 16 weeks [19, 48, 51, 53], 6 months [17, 21, 22, 27, 50, 52, 54] or 12 months [18, 31, 49, 55, 56]. Trial characteristics are presented in Table 1.

Table 1 Characteristics of included trials.

Quality assessment

The AMSTAR evaluation of the included systematic review revealed that the review had adequate description of nine out of 11 domains (Supplementary Table S2). Concerning the rigour and transparency of the literature search and inclusion of primary trials (domain 1–4), we judged that the review was of sufficient quality to enable us to base our search for primary trials on their last search date.

Risk of bias assessment

Low risk of selection bias was assessed in 14 trials [17, 18, 20,21,22, 27,28,29,30,31, 45,46,47,48,49,50, 52, 54], and four trials [19, 51, 53, 56] had unclear risk of selection bias due to inadequate reporting regarding random sequence generation and/or allocation concealment. All included trials were assessed to have high risk of performance bias due to lack of blinding of participants and personnel, since blinding for the intervention was not feasible. Thirteen trials [17, 18, 21, 22, 27, 29, 30, 45,46,47,48,49,50, 52, 54, 57, 58] had high risk of detection bias, since the critical outcome ‘disease-specific quality of life’ was self-reported and participants were not blinded. Three trials had unclear risk of attrition bias (incomplete outcome data) [17, 49, 51]. In total, five trials [20, 31, 53, 55, 56] were assessed to have high risk of selective reporting, primarily due to omission of reporting ‘quality of life’ despite the fact that the outcome was stated in a protocol. One trial was reported only in an abstract and at clinicaltrials.gov and had unclear risk of other bias, due to inadequate reporting [48]. See the risk of bias assessment in Fig. 1.

Fig. 1: Risk of bias assessment of included trials.

Assessed by the Cochrane risk of bias tool. Green (+): indicates low risk of bias, red (-): indicates high risk of bias, yellow (?): indicates unclear risk of bias.

Certainty of evidence (GRADE)

The results of the GRADE-process are shown in Table 2. The certainty of the evidence for the two critical outcomes ‘disease-specific quality of life’ and ‘physical performance, walking performance’ was downgraded one level due to serious risk of bias because of lack of blinding of participants and self-reported measures of disease-specific quality of life. Thus, the overall certainty of evidence for supervised exercise therapy compared with no exercise therapy was moderate. Since the funnel plots did not suggest publication bias, no downgrading for this item was performed (Supplementary Figures S3).

Table 2 GRADE Summary of Finding Table. Supervised exercise therapy compared to no exercise therapy for patients with prostate cancer receiving androgen deprivation therapy. Population: Patients with prostate cancer receiving androgen deprivation therapy. Intervention: Supervised exercise therapy. Comparison: No exercise therapy.

Results for critical outcomes

For the critical (primary) outcome ‘disease-specific quality of life’ we found that supervised exercise therapy resulted in clinically relevant improvements compared to no exercise therapy. The standardised mean difference (SMD) was 0.43 (95% CI: 0.29, 0.58); see Fig. 2 and Table 2. When transformed to a mean estimate, the result corresponded to a mean improvement of 8 points on FACT-P (95% CI: 6, 11), range 0–156, Minimum Clinically Important Difference (MCID) 6–10 points [39]. The other critical outcome ‘physical performance measured by walking performance’ also showed clinically relevant improvements in favour of supervised exercise therapy; the SMD was −0.41 (95% CI: −0.60, −0.22), see Fig. 3 and Table 2. The result corresponded to a mean reduction of 23 s on the 400 m walking test (95% CI: 13, 34), MCID 20–30 s [40]. The certainty of the evidence for the two critical outcomes was moderate.
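The transformation from SMD to natural units can be illustrated as follows. The sketch assumes the simplest of the approaches of this kind, in which the SMD and its confidence limits are multiplied by the standard deviation of a familiar reference instrument. The reference standard deviations below are hypothetical values chosen for illustration; the text does not report which values were used.

```python
def smd_to_md(smd, ci_low, ci_high, reference_sd):
    """Re-express an SMD and its CI limits in the units of a reference instrument."""
    return tuple(round(x * reference_sd) for x in (smd, ci_low, ci_high))


# Reference SDs are assumptions for illustration; the text does not report them.
FACT_P_SD = 19     # points, hypothetical
WALK_400M_SD = 57  # seconds, hypothetical

# Returned as (estimate, limit from ci_low, limit from ci_high).
print(smd_to_md(0.43, 0.29, 0.58, FACT_P_SD))
print(smd_to_md(-0.41, -0.60, -0.22, WALK_400M_SD))
```

With these assumed standard deviations, the rounded outputs agree with the reported 8 points (6 to 11) and 23 s reduction (13 to 34), which shows that the reported values are consistent with a simple rescaling but does not confirm the inputs actually used.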

Fig. 2: Forest plot of the critical outcome ‘disease-specific quality of life’.

ADT: androgen deprivation therapy, CI: confidence interval, df: degrees of freedom, EORTC-QLQ-C30: The European Organisation for Research and Treatment of Cancer core quality of life questionnaire, FACT-P: The Functional Assessment of Cancer Therapy - Prostate (range 0-156), Std: standardised.

Fig. 3: Forest plot of the critical outcome ‘physical performance’ measured by walking performance.

ADT: androgen deprivation therapy, CI: confidence interval, df: degrees of freedom, Std: standardised.

Results for important outcomes

The results for the important (secondary) outcomes are shown in Table 2 and the forest plots are shown in Supplementary Figures S3.

Evaluation of ‘physical performance measured by sit to stand performance’ showed a possible difference in effect in favour of supervised exercise therapy compared to no exercise therapy, SMD was 0.35 (95% CI: 0.14, 0.56) (low certainty), corresponding to one extra repetition on 30-second sit to stand test. This is just below the MCID of two repetitions on this test [59].

Evidence of moderate certainty showed that supervised exercise therapy improved muscle strength (SMD 0.47, 95% CI: 0.28, 0.65) and VO2 peak (MD 1.76 ml/kg/min, 95% CI: 0.82, 2.69) compared to no exercise therapy. The latter corresponds to an improvement of 8% (95% CI: 4%, 13%) compared to no exercise therapy, and the guideline panel considered this to represent a clinically relevant improvement for the examined population.

Supervised exercise therapy did not lead to higher dropout compared to no exercise therapy (risk ratio 0.73, 95% CI: 0.54, 0.96) (moderate certainty). The absolute difference was 40 fewer dropouts per 1000 (95% CI: 67 fewer to zero) with supervised exercise therapy. The pre-planned subgroup analysis revealed a significantly lower risk of dropping out when supervised exercise therapy started within one month after starting ADT (P-value 0.04). The risk ratio was 0.36 (95% CI: 0.18, 0.73) for early start vs 0.82 (95% CI: 0.61, 1.11) for later start of supervised exercise therapy.
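As an illustration of how an absolute effect per 1000 follows from a risk ratio and a presumed control-group risk, consider the sketch below. The control-group risk of 150 per 1000 is a hypothetical value, since the presumed risk is not stated in the text, so the output is not expected to reproduce the figures in Table 2.

```python
def absolute_effect_per_1000(control_risk, rr, rr_low, rr_high):
    """Difference in events per 1000 implied by a risk ratio and a control risk."""

    def diff(ratio):
        # Events per 1000 in the intervention group minus events per 1000 in control.
        return 1000 * control_risk * (ratio - 1)

    return diff(rr), diff(rr_low), diff(rr_high)


# control_risk is hypothetical; the presumed control risk is not given in the text.
estimate, low, high = absolute_effect_per_1000(0.15, 0.73, 0.54, 0.96)
print(f"{estimate:.1f} per 1000 (95% CI: {low:.1f} to {high:.1f})")
```

Negative values denote fewer events with the intervention. Because the calculation is linear in the control risk, a different presumed risk rescales the estimate and both confidence limits by the same factor.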

More people had training related injuries with supervised exercise therapy compared to no exercise therapy (risk ratio 5.86, 95% CI: 1.55, 22.06), but the absolute number of persons with injuries was low: nine per 1000 with supervised exercise therapy compared to zero per 1000 with no exercise therapy. One trial of football training was not included in the analysis since data were only reported for the football group [27]. The trial reported 60 training related injuries (19 overuse injuries and 41 acute injuries) among the 109 participants in the football group (41% on ADT).

Sixteen trials reported data regarding adverse events. Fractures were reported as adverse events in the two football trials. The point estimate indicated an increased risk of fractures with supervised exercise therapy (risk ratio 1.86, 95% CI: 0.25, 13.99), but the absolute numbers were very low: three persons with fractures per 1000 with supervised exercise therapy compared to two per 1000 with no exercise therapy. The absolute difference was one more person with a fracture per 1000 with supervised exercise therapy (95% CI: 14 fewer to 16 more).

The important outcomes ‘prevalence of cardiovascular disease’, ‘prevalence of diabetes’ and ‘prevalence of depression’ were not reported in the included trials. Two trials reported data for depressive symptoms, and these data were used as indirect evidence for the outcome ‘prevalence of depression’. The results showed no statistically significant or clinically relevant differences (very low certainty).

Sensitivity and subgroup analyses

Sensitivity analyses with exclusion of one trial where only 61% of the population received ADT [17] did not change any results to a significant degree. No meaningful subgroup analysis regarding individual vs group exercise could be conducted, since only one trial reported individual exercise therapy. The other pre-planned subgroup analyses revealed no additional significant subgroup differences besides the subgroup effect for dropout reported above.
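A test for subgroup differences of the kind referred to for dropout can be approximated from the published subgroup estimates. The sketch below recovers the standard error of each log risk ratio from its 95% CI and compares the two subgroups with a z-test, which for two subgroups is equivalent to a chi-square test with one degree of freedom. The function names are hypothetical, and because the inputs are the rounded values from the text, the result only approximates the reported P-value of 0.04.

```python
import math


def log_rr_and_se(rr, ci_low, ci_high):
    """Recover the log risk ratio and its standard error from a reported 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    return math.log(rr), se


def subgroup_difference_p(a, b):
    """Two-sided p-value for the difference between two subgroup log risk ratios."""
    (log_a, se_a), (log_b, se_b) = a, b
    z = (log_a - log_b) / math.sqrt(se_a**2 + se_b**2)
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability


early = log_rr_and_se(0.36, 0.18, 0.73)  # exercise started within one month of ADT
later = log_rr_and_se(0.82, 0.61, 1.11)  # exercise started later
p = subgroup_difference_p(early, later)
print(f"p = {p:.3f}")
```

Working on the log scale is what makes the confidence interval symmetric around the estimate, so the standard error can be read off the interval width.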

Discussion

Based on evidence of moderate quality, our results show that supervised exercise therapy was superior to no exercise therapy on the two critical outcomes ‘disease-specific quality of life’ and ‘physical performance, walking performance’. The calculated SMD of 0.43 for ‘disease-specific quality of life’ was both statistically significant and clinically relevant. When transformed to MD, the result corresponded to an improvement of 8 points on FACT-P in favour of exercise therapy. This lies within the MCID range of 6–10 points [39].

Other authors have found effects on disease-specific quality of life somewhat smaller than we did [16, 25, 60, 61], but Teleni et al. reported results in line with ours [24]. In a meta-analysis not restricted to patients on ADT, Bourke et al. found no significant effect of exercise compared to usual care on cancer-specific quality of life, but in a sensitivity analysis restricted to high quality trials, they found a statistically significant result similar to ours [16]. The differences in results across meta-analyses could mainly be due to variations in populations [16, 60, 61] and inclusion criteria [16, 25, 60, 61].

Regarding the critical outcome ‘physical performance, walking performance’ we found a statistically significant and clinically relevant SMD of −0.41 in favour of exercise therapy. When transformed to MD, the estimate corresponded to a mean reduction of 23 s on 400 m walking test and lies within the range of the estimated MCID [40]. Our result is in line with Bourke et al. who evaluated sub-maximal aerobic fitness (primarily including outcomes for 400 meters walking test) and found an SMD of −0.49 [16]. In contrast, Keilani et al. reported a result just below the MCID with a reduction of 18 s [62]. Keilani et al. included trials with resistance exercises not restricted to RCT-designs and not limited to patients on ADT. These differences could explain variations in results.

The included trials comprised patients with stage T1-T4 prostate cancer, and the heterogeneity in all analyses for critical outcomes and in nearly all analyses of the important outcomes was very low, meaning there was no important systematic variation between effect sizes in the included trials. This suggests that our results are probably applicable to all patients with prostate cancer receiving ADT regardless of cancer stage.

The debilitating side effects of ADT leading to limitations in physical performance are probably an important factor in reducing quality of life in patients with prostate cancer on ADT. Performance status has been stated as a critical factor for quality of life among cancer patients [63], and increases in physical performance have been suggested to be directly related to maintenance or improvement of quality of life among cancer patients [62]. At the same time, physical exercises are proposed as the most important intervention to mitigate psychological side effects of ADT [64]. We did not analyze whether improvements in physical performance mediated the shown improvement in disease-specific quality of life, but it seems reasonable, as improved physical performance can enable increased participation in everyday life. Other mechanisms of action could be improved self-efficacy and psychological benefits of interaction with other patients.

More people experienced exercise related injuries with supervised exercise therapy compared to no exercise therapy, but the absolute number of injuries was low. Thus, it appears that the number of injuries is not higher among persons on ADT compared to any other person participating in exercise therapy.

We found an increased risk of fractures with supervised exercise therapy compared to no exercise. Fractures were only reported in trials with football interventions. The certainty in the estimate was low, and thus it is uncertain whether supervised exercise therapy protects against fractures in this population and whether football entails a risk of fractures. The included trials were underpowered to detect an effect on fractures. A larger population and long follow-up may be needed to show the effect of exercise therapy on fractures.

When offering supervised exercise therapy, especially to untrained persons, one should take into account that there might be an increased risk of exercise related injuries with football and that it is uncertain whether football entails a risk of fractures. One should consider recommending other exercise modalities to persons not used to football training.

In pre-planned subgroup analyses, we found that early start of exercise therapy is just as effective as later commencement (all outcomes). Interestingly, we found that early start significantly reduced the risk of dropout. As delayed exercise therapy postpones the positive effect of exercise [21, 22, 50, 52], it should be recommended to start exercise therapy with the initiation of ADT [21].

We did not find any trials evaluating the effect of exercise therapy on prevalence of cardiovascular diseases, diabetes and depression. Similar to fractures, larger populations and longer follow-up may be needed to evaluate the effect on these outcomes.

Strengths and limitations

This review was conducted using rigorous and transparent methods in accordance with the Cochrane Collaboration, the PRISMA recommendations and the GRADE-method. The strengths of the GRADE-method include pre-specification of the inclusion criteria and pre-assessment of outcomes as critical or important to patients. This process ideally involves asking patients, as we did in our work. By this, we may have reduced the risk of using outcomes less relevant to patients. The GRADE-method represents a rigorous and transparent method of formulating the research question (PICOT) and assessing the certainty in the evidence by assessing risk of bias, inconsistency, indirectness, imprecision and publication bias both per outcome and as overall certainty. Furthermore, we conducted a systematic literature search, and two independent reviewers conducted trial selection, data extraction and quality assessments. A limitation was that literature searches were restricted to English, Danish, Swedish and Norwegian language. Furthermore, the search for primary trials was based on the last search date of the included systematic review [16]. This may have resulted in not identifying all older relevant primary trials. However, supplemental hand searches for primary trials in both systematic reviews and primary trials have probably limited this risk.

All included trials were assessed to have high risk of performance bias due to lack of blinding of participants and personnel. We could have chosen not to downgrade for this aspect, since it could be argued that blinding for an exercise intervention is not feasible [16]. However, since lack of blinding can still affect the professionals delivering the intervention and the assessment of self-reported outcomes, we decided to maintain this assessment. Despite this, the certainty in the evidence was still moderate.

Conclusion

Evidence of moderate quality shows that supervised exercise therapy is superior to no exercise therapy in improving ‘disease-specific quality of life’ and ‘physical performance’ measured by walking performance in patients with prostate cancer undergoing ADT. The results apply to all patients receiving ADT regardless of cancer stage. Based on moderate certainty of the evidence, the results support a strong recommendation of supervised exercise therapy for managing side effects of ADT in this population. To avoid postponement of the positive effects of supervised exercise therapy and to reduce dropouts, it should be recommended to start exercise therapy when initiating ADT.