Effects of Different Types of Exercise on Bone Mineral Density in Postmenopausal Women: A Systematic Review and Meta-analysis

In this sub-analysis of a comprehensive meta-analysis, we aimed to determine the effect of different types of exercise on (areal) bone mineral density (BMD) in postmenopausal women. A systematic review of the literature according to the PRISMA statement included (a) controlled trials, (b) with at least one exercise and one control group, (c) intervention ≥ 6 months, (d) BMD assessments at lumbar spine (LS), femoral neck (FN) or total hip (TH), (e) in postmenopausal women. Eight electronic databases were searched without language restrictions up to March 2019. The present subgroup analysis was conducted as a mixed-effect meta-analysis with “type of exercise” as the moderator. The 84 eligible exercise groups were classified into (a) weight bearing (WB, n = 30) exercise, (b) (dynamic) resistance exercise (DRT, n = 18), (c) mixed WB&DRT interventions (n = 36). Outcome measures were standardized mean differences (SMD) for BMD changes at LS, FN and TH. All types of exercise significantly affected BMD at LS, FN and TH. SMD for LS averaged 0.40 (95% CI 0.15–0.65) for DRT, 0.26 (0.03–0.49) for WB and 0.42 (0.23–0.61) for WB&DRT. SMD for FN were 0.27 (0.09–0.45) for DRT, 0.37 (0.12–0.62) for WB and 0.35 (0.19–0.51) for WB&DRT. Lastly, SMD for TH changes were 0.51 (0.28–0.74) for DRT, 0.40 (0.21–0.58) for WB and 0.34 (0.14–0.53) for WB&DRT. In summary, we provided further evidence for the favorable effect of exercise on BMD, largely independent of the type of exercise. However, meta-analyses might be too rough a tool for generating dedicated exercise recommendations or exercise guidelines.


Introduction
Exercise is considered a highly relevant component in the prevention and treatment of osteoporosis and fracture reduction [1,2]. Consequently, numerous exercise studies (review in [3]) aim to increase bone strength, predominantly assessed by (areal) bone mineral density (BMD), in postmenopausal women, the most prominent and largest cohort at risk for osteoporosis. However, although there are some evidence-based recommendations for exercise protocols [1,4,5], the most promising exercise to address BMD still remains unsettled [2]. Apart from exercise parameters and principles, even basic decisions, for example about the type of exercise that should be applied, are still (or once again) controversial [6,7]. In a recent meta-analysis, Rahimi et al. [6] reported the absence of effects of resistance exercise and negative effects of weight bearing aerobic exercise on BMD at lumbar spine (LS) and femoral neck (FN) in postmenopausal cohorts 60 years and older (n = 16). Provided that these data are reliable and generalizable to the entire cohort of postmenopausal women, all the current exercise recommendations (e.g., [1,4,5,8,9]) and, even more importantly, the exercise effect on BMD in general are rendered questionable. In order to verify the findings of Rahimi et al. [6], and to estimate the effects of different roughly classified types of exercise on BMD at different regions of interest (ROI), we conducted a sub-analysis based on a recent comprehensive meta-analysis on exercise effects on BMD in postmenopausal women [3]. Similarly to Rahimi et al. [6], we roughly categorized exercises into (dynamic) resistance exercise (DRT), weight bearing (WB) exercise and combined WB&DRT exercise. Our hypotheses were that all types of exercise significantly affect BMD at (1) LS, (2) FN and (3) total hip (TH), (4) albeit without significant differences between the exercise categories at any BMD-ROI.

Material and Methods
The present study is based on a comprehensive systematic review of the effect of exercise on (areal) BMD in postmenopausal women [3] to which the reader is kindly referred for details.

Data Sources and Search Strategy
We strictly followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [10] and registered the study in PROSPERO (CRD42018095097). Briefly, eight databases (PubMed, Scopus, Web of Science, Cochrane, Science Direct, Eric, ProQuest and Primo) were searched for articles published up to March 1, 2019 without language restrictions.
The search strategy comprised a combination of population, intervention, and outcomes. Databases were systematically searched using combinations of the following terms: "Bone Mineral Density", "Exercise", and "Postmenopausal". Following the primary search and duplicate exclusion, one reviewer (MS) screened studies by title and abstract according to the eligibility criteria. A manual search of the reference lists of all included articles was carried out in an attempt to find further relevant studies. Authors of potentially eligible trials were contacted by e-mail for any missing data (e.g., mean change of BMD or SD) or clarification of the data presented.

Inclusion and Exclusion Criteria
We included studies/study arms with (1) randomized and non-randomized controlled protocols with at least one exercise group versus one control group with sedentary/habitual active lifestyle or placebo exercise; (2) women who were postmenopausal at study start; (3) ≥ 6 months intervention duration; (4) areal BMD of the LS, femoral neck (FN) and/or total hip (TH) region at baseline and follow-up assessment as determined by (5) dual-energy X-ray absorptiometry (DXA) or dual-photon absorptiometry (DPA); (6) ≤ 10% of women on osteoanabolic/antiresorptive or osteocatabolic (glucocorticoids) pharmaceutic agents, albeit only when the number of subjects was comparable between exercise and control.
We further excluded studies with (1) mixed gender or mixed pre- and postmenopausal cohorts without separate BMD analyses; (2) women undergoing chemo- and/or radiotherapy and (3) women with diseases that relevantly affect bone metabolism. (4) Duplicates from one study and (5) review articles, case reports, editorials, conference abstracts, and letters were not considered. Lastly, exercise study groups (see below) that could not be assigned to one of the intended types of exercise were also excluded from the present analysis.

Data Extraction
We designed a pre-piloted extraction form to extract relevant data. The form asked for details with respect to publication characteristics, methodology, participant characteristics, exercise characteristics, risk assessment and outcome characteristics. Two reviewers (SvS and MS) independently evaluated full-text articles and extracted data from the included studies; in case of inconsistency, a third reviewer (WK) decided.

Outcome Measures
The primary outcome was change of (areal) BMD at LS-, FN-and TH-ROI as assessed by DXA or DPA between baseline and follow-up. In cases of multiple BMD assessments, we considered only changes between the baseline and final BMD assessments.

Quality Assessment
All included studies were assessed for risk of bias by two independent raters (WK and MV) using the Physiotherapy Evidence Database (PEDro) scale [11]. In case of inconsistency, a third reviewer (SvS) decided.

Data Synthesis
For the detailed procedure to impute missing standard deviations (SD) the reader is kindly referred to the comprehensive meta-analysis of Shojaa et al. [3]. Briefly, if the studies presented a confidence interval (CI) or standard errors (SE), they were converted to SD. In cases of missing CI or SE data we first contacted authors (n = 11) to provide corresponding information. When no reply was received or data were not available, the exact p-value of the absolute change of BMD was obtained to compute the SD of the change. If the p-value was not reported, we calculated the SDs using pre and post SDs.
In order to determine the effects of different types of exercise we categorized the studies according to the following approach: (a) dynamic resistance exercise, i.e., any kind of resistance exercise that involves joint movement to develop musculoskeletal strength. We focused on studies that applied isolated DRT without any adjuvant exercise component and without bone-specific warm ups (e.g., running, hopping, aerobic dance) with validated effect on bone [1,4,5]; (b) weight bearing exercise that involved any kind of aerobic and anaerobic loading of axial skeletal sites due to gravity, i.e., Tai Chi, walking, running, dancing, movement games, heel drops, hopping, jumping; and (c) exercise studies that combined weight bearing and DRT exercise, even if WB exercise was applied only briefly during warm up. The latter approach was selected due to the observation that only a few cycles with high strain rates may induce positive effects on bone [12,13]. Two raters (WK and MV) independently categorized the data; in case of inconsistency, a third reviewer (SvS) decided.
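The conversion cascade described above (SE or CI to SD, then p-value to SD, then pre/post SDs) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of Shojaa et al. [3]: the function names are hypothetical, a normal approximation stands in for the t distribution, and the pre/post correlation `r = 0.5` is an assumed placeholder that the text does not report.

```python
import math
from statistics import NormalDist

# Standard normal; a large-sample stand-in for the t distribution (assumption).
Z = NormalDist()


def sd_from_se(se, n):
    """SD of the change from its standard error and the group size."""
    return se * math.sqrt(n)


def sd_from_ci(lower, upper, n, level=0.95):
    """SD of the change from a confidence interval around the mean change."""
    z = Z.inv_cdf(0.5 + level / 2)
    return math.sqrt(n) * (upper - lower) / (2 * z)


def sd_from_p(mean_change, p_two_sided, n):
    """SD of the change from the exact two-sided p-value of the mean change."""
    z = Z.inv_cdf(1 - p_two_sided / 2)
    se = abs(mean_change) / z
    return se * math.sqrt(n)


def sd_change_from_pre_post(sd_pre, sd_post, r=0.5):
    """SD of the change from baseline and follow-up SDs.

    r is the pre/post correlation; 0.5 is an assumed placeholder value.
    """
    return math.sqrt(sd_pre**2 + sd_post**2 - 2 * r * sd_pre * sd_post)
```

For small groups, replacing the normal quantile with the matching t quantile would give slightly larger SDs.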

Statistical Analysis
The statistical analysis was performed using the statistical software R (R Development Core Team) [14]. Effect sizes (ES) were expressed as standardized mean differences (SMDs) with 95% confidence intervals (95% CI). Random-effects meta-analysis was performed by applying the metafor package [15]. Between-study heterogeneity was tested using the Cochran Q test, and the level of heterogeneity was quantified with the I² statistic. For those studies with two different intervention groups, the control group was proportionally split into two groups for comparison against each intervention group [16]. Sensitivity analysis was conducted to check whether the overall result of the analysis was robust regarding the use of the imputed SDs. Funnel plots with the regression test and the rank correlation between effect estimates and their standard errors, using the t-test and Kendall's τ statistic respectively, were applied to explore potential publication bias. To adjust the results for possible publication bias, we also conducted a trim and fill analysis using the L0 estimator proposed by Duval et al. [17]. The present subgroup analysis was conducted as a mixed-effect meta-analysis with "type of exercise" as the moderator. A P-value of < 0.05 was considered significant for all tests.
The pooled number of participants was 2793 in the exercise and 2319 in the control group respectively. In detail, the number of participants in exercise and control was 1344 and 1175 women in the combined WB&DRT group, 1045 and 815 women in the WB and 404 and 329 women in the DRT group. Table 1 gives the anthropometric participant characteristics of the included studies.
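To make the pooling steps concrete, the following Python sketch computes an SMD per comparison, splits a shared control group, and pools the effects with a random-effects model that also returns Cochran's Q and I². It is not the authors' analysis, which was run in R with metafor: the function names are hypothetical, the SMD is computed as Hedges' g, and the DerSimonian-Laird estimator is swapped in because the text does not state which between-study variance estimator was used.

```python
import math


def smd(mean_e, sd_e, n_e, mean_c, sd_c, n_c):
    """Standardized mean difference (Hedges' g) and its sampling variance."""
    df = n_e + n_c - 2
    sd_pooled = math.sqrt(((n_e - 1) * sd_e**2 + (n_c - 1) * sd_c**2) / df)
    d = (mean_e - mean_c) / sd_pooled
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    var_d = (n_e + n_c) / (n_e * n_c) + d**2 / (2 * (n_e + n_c))
    return j * d, j**2 * var_d


def split_control(n_control, k_arms):
    """Divide a shared control group as evenly as possible among k arms.

    Only the sample size is split; the control mean and SD stay unchanged.
    """
    base, extra = divmod(n_control, k_arms)
    return [base + (1 if i < extra else 0) for i in range(k_arms)]


def random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooling; needs at least two effects."""
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "smd": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }
```

A trial with two exercise arms and 25 controls would, for example, be entered as two comparisons using the control sizes from `split_control(25, 2)`.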

Methodologic Quality
The PEDro scores of the included studies are listed in Table 3. Methodologic quality of the trials ranged from 3 to 9 score points (Table 3), with a mean and SD of 5.44 ± 1.32 score points. Methodologic quality of the DRT studies (6.24 ± 1.30 points) was on average significantly higher (P = 0.024) than that of the other groups.

Outcomes
Apart from two studies [28,30] that applied DPA, all studies used DXA. Furthermore, all studies except two ([23]: hip only; [30]: LS only) determined BMD at both the LS and proximal femur regions of interest.

Effect of Different Types of Exercise on LS-BMD
Sixteen DRT exercise groups, 26 WB exercise groups and 33 combined WB&DRT exercise groups evaluated the effect of exercise on LS-BMD. In summary, the pooled estimate of random effect analysis for DRT was SMD: 0.40, 95% CI 0.15-0.65 (P = 0.009), for WB exercise SMD: 0.26, 95% CI 0.03-0.49 (P = 0.037) and SMD: 0.42, 95% CI 0.23-0.61 (P = 0.001) for the combined WB&DRT exercise. No significant differences between the types of exercise were observed (P = 0.508). All types of exercise revealed a similarly high level of heterogeneity between their trials (I² = 76.3-76.5%) (Fig. 2).
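As a rough plausibility check of the between-type comparison, the subgroup difference can be approximated from the reported LS summary estimates alone. The Python sketch below backs out each standard error from its 95% CI and computes a between-subgroup Q statistic on 2 degrees of freedom. This is an approximation under stated assumptions, not the mixed-effect moderator model used in the analysis, so its p-value is not expected to reproduce the reported P = 0.508 exactly; the function names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error implied by a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * z)


def q_between(subgroups):
    """Between-subgroup Q from pooled estimates given as (smd, lower, upper)."""
    est = {k: (m, se_from_ci(lo, hi)) for k, (m, lo, hi) in subgroups.items()}
    w = {k: 1 / se**2 for k, (_, se) in est.items()}
    mean = sum(w[k] * est[k][0] for k in est) / sum(w.values())
    q = sum(w[k] * (est[k][0] - mean) ** 2 for k in est)
    return q, len(est) - 1


def chi2_sf_df2(q):
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-q / 2)


# Pooled LS-BMD estimates as reported in the text.
ls = {
    "DRT": (0.40, 0.15, 0.65),
    "WB": (0.26, 0.03, 0.49),
    "WB&DRT": (0.42, 0.23, 0.61),
}
q, df = q_between(ls)
p = chi2_sf_df2(q)
print(f"Q = {q:.2f}, df = {df}, p = {p:.3f}")
```

With these inputs the approximate check is also non-significant at the 0.05 level, which is in line with the reported absence of differences between exercise types.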

Effect of Different Types of Exercise on FN-BMD
Fifteen DRT exercise groups, 23 WB exercise groups and 25 combined WB&DRT exercise groups evaluated the effect of exercise on femoral neck-BMD. In summary, the pooled estimate of random effect analysis for DRT was SMD: 0.27, 95% CI 0.09-0.45 (P = 0.003), for WB exercise SMD: 0.37, 95% CI 0.12-0.62 (P = 0.004) and SMD: 0.35, 95% CI 0.19-0.51 (P = 0.001) for the combined WB&DRT exercise. No significant differences between the types of exercise were observed (P = 0.822). Heterogeneity among the included trials was considerable in the WB group (I²: 82.1%) and substantial in the WB&DRT group (I²: 63.6%), but negligible in the DRT group (I²: 16.5%) (Fig. 3).

Effect of Different Types of Exercise on TH-BMD
Ten DRT exercise groups, seven WB exercise groups and 12 combined WB&DRT exercise groups evaluated the effect of exercise on total hip-BMD. In summary, the pooled estimate of random effect analysis for DRT was SMD: 0.51, 95% CI 0.28-0.74 (P < 0.001), for WB exercise SMD: 0.40, 95% CI 0.21-0.58 (P < 0.001) and SMD: 0.34, 95% CI 0.14-0.53 (P < 0.001) for the combined WB&DRT exercise. No significant differences between the types of exercise were observed (P = 0.554). Heterogeneity among the included trials was negligible in the WB and DRT groups (I² < 10%) and moderate in the WB&DRT group (I²: 43.8%) (Fig. 4).
Funnel plots for LS, FN and TH did not suggest positive evidence of publication bias. The regression and rank correlation tests for funnel plot asymmetry did not indicate significant asymmetry for LS or FN, but did for TH, with missing studies to the right (positive difference/effects). The corresponding trim and fill analysis imputed three studies and resulted in a slightly higher total SMD (0.43; 95% CI 0.31-0.54) than the non-adjusted results listed in Fig. 4.

Discussion
In this sub-analysis of a comprehensive meta-analysis, we clearly confirmed the significant positive effects of different types of exercise on BMD at LS, FN and TH in postmenopausal women. Further, WB type exercises, DRT and a combination of both types of exercise revealed no significant group differences for LS, FN or TH-BMD. Thus, we verified all our hypotheses and in turn now question the data of Rahimi et al. [6]. One possible explanation for the diverging results of the present analysis and the data of Rahimi et al. [6] might be the focus on studies with women aged 60 years and older, i.e., the advanced postmenopausal status in the latter study. Considering that menopausal transition and early menopausal status are related to considerably increased bone turnover [92,93], there is some evidence that exercise might be more effective during early than in late post-menopause, at least with respect to trabecular bone loss [76,94]. However, the meta-analysis of Shojaa et al. [3] on this issue observed only slight, non-significant differences between exercise during the early vs. late postmenopausal years³, be it for LS (SMD "early": MV = 0.64, 95% CI 0.33-0.95 vs. "late": 0.39, 0.14-0.55) or total hip ROI (SMD: 0.51, 0.27-0.75 vs. 0.38, 0.20-0.56). Apart from age, the two meta-analyses also differ with respect to eligibility criteria, i.e., randomization, language, publication type, medication and diseases, while the limitation to studies ≥ 6 months with healthy postmenopausal women without hormone replacement therapy and previous DRT is common to both studies. The most striking difference, on the other hand, is the low number of studies classified into the exercise categories by Rahimi et al. [6]. Considering that only two studies were analyzed to determine the effect of WB aerobic exercise on LS-BMD (vs. n = 23 in the present study), one should draw definite conclusions from that data with extreme caution.
Although we consistently determined significant positive exercise effects on BMD-ROIs (SMD: 0.26-0.51), the SMDs of the individual exercise trials vary substantially, particularly for the LS (I² = 76-77%). Even in the DRT group, which can be considered the most homogeneous group with respect to exercise type classification (see above), the heterogeneity level for LS-BMD effects was substantial (i.e., I² > 75%). This is understandable, however, since considerable differences can be observed between the trials or study groups (Table 2), particularly with respect to exercise parameters (i.e., strain magnitude, rate [5]) and training principles (e.g., progression, periodization [5,95]).
Revisiting the effects of different types of exercise, it is noteworthy that the effect of the WB type interventions at the LS was considerably less pronounced compared with the DRT group (SMD: 0.26 versus SMD: 0.40). This is not necessarily related to higher effects of DRT-induced direct muscular impact on LS in general, however, but to the large number of WB studies that applied low ground reaction forces (e.g., walking: n = 11) with corresponding axial impact loading that might not (or no longer) reach the LS area. Two meta-analyses [96,97] that reported significant positive "walking effects" at FN-BMD without effects at LS-BMD support this estimation. Another surprising result is that the combined WB&DRT group failed to generate relevantly higher BMD effects compared with DRT (or, apart from LS-BMD, WB type exercise). Recent evidence-based guidelines that focus on bone development [1,4,5] consistently recommended exercise protocols that included impact activities and progressive resistance training applied with high strain magnitude and rate. However, at this point at the latest, we have to acknowledge and discuss the limited ability of meta-analyses to derive exercise recommendations [98], largely independent of the outcome [99]. Selecting the adequate type of exercise to address a given training aim is only the first, rough decision within the training process [5,95]. Much more challenging, particularly when addressing bone, is the question of how to optimally specify the type of exercise in the light of the large variety of exercise parameters (e.g., strain magnitude, rate, duration, frequency, cycle number, rest periods) [5,95]. Another modifying aspect within the exercise process is the inclusion of exercise principles [5,95]. Applying, e.g., progression and periodization might not be important within a 10-week exercise intervention; however, considering that the studies included in the present analysis on BMD lasted between 6 and 30 months, their relevance becomes obvious. The fact that even slight differences in exercise parameters, e.g., movement velocity of the concentric phase during DRT, significantly modify the effect on BMD [100] suggests that the high complexity of exercise effects on BMD could conflict with the comprehensive meta-analytic approach.

The point is awarded either for intention to treat analysis or when "all subjects for whom outcome measures were available received the treatment or control condition as allocated"
³ Early: ≤ 8 years postmenopausal vs. late: > 9 years postmenopausal for LS-BMD (LS-BMD: 40, total hip: 20 exercise groups).

Fig. 2 Forest plot of meta-analysis results at the LS. The data are shown as pooled standard mean difference (SMD) with 95% CI for changes in exercise and control groups
Fig. 3 Forest plot of meta-analysis results at the femoral neck. The data are shown as pooled standard mean difference (SMD) with 95% CI for changes in exercise and control groups
Fig. 4 Forest plot of meta-analysis results at the total hip. The data are shown as pooled standard mean difference (SMD) with 95% CI for changes in exercise and control groups. HI high intensity, LI low intensity

One may assume that the rather high number of study groups included in the present subgroups might even out differences at individual study levels, but this assumption is frequently wide of the mark. This might be confirmed by the considerably higher effects of DRT versus WB for TH-BMD (SMD: 0.51, 95% CI 0.28-0.74 vs. 0.34, 0.14-0.52), but not for BMD at the adjacent FN-region (SMD-DRT: 0.27, 0.09-0.45 versus SMD-WB: 0.37, 0.12-0.62, Table 2), a constellation for which no serious explanation can be provided. Furthermore, some limitations and study features of the present analysis may limit the strength of evidence and the generalizability of our findings. (1) Although we placed high emphasis on eligibility and reliable classification of the exercise types, some decisions are certainly debatable. This may be the case for the exclusion of the study of Rhodes et al. [101] that combined non-weight bearing exercise (however only as a warm up) and DRT, while still including others (e.g., [66,67,87]) that applied a mixed weight bearing/non-weight bearing & DRT intervention. However, in our defense it should be noted that some studies were very lax in their standards of exercise reporting, and so extracting the relevant information was sometimes challenging. (2) We conducted funnel plots with trim and fill analysis for the entire cohort of included studies for LS, FN and TH (not shown). However, it might have been better to conduct separate funnel plots for the effects of each isolated exercise group at each ROI. On the other hand, reviewing the three funnel plots in detail, we did not observe relevant differences between the different exercise groups that might have significantly changed the present result.
(3) We failed to generate reliable scores/categories for exercise intensity/strain magnitude across the different types of exercise in order to conduct a sub-analysis for this crucial exercise parameter. A sub-analysis of our outcome adjusted for "exercise intensity/strain magnitude" might have resulted in more sophisticated results and higher overall treatment effects. (4) The present literature search was conducted up to March 1, 2019, i.e., some more studies might have been published in the meantime. However, considering the large number of studies included in this systematic review and meta-analysis, we feel that the few additional exercise studies will not considerably modify our findings.
In conclusion, we do not share the enthusiasm for basing exercise recommendations or exercise guidelines on meta-analyses, at least in the area of "bone strengthening". If this is done nonetheless, the acquired data should at least not be accepted uncritically. Accurately designed randomized controlled exercise trials that manipulate a single dedicated aspect while keeping all other exercise parameters and confounders constant will be better suited to generating reliable exercise recommendations.