Introduction

Numerous observational studies have reported higher areal bone mineral content and areal bone mineral density (aBMD) in physically active children compared with sedentary children [4, 5, 10, 11, 20]. Furthermore, there are biophysical reasons to expect higher aBMD and bone mineral content with increased bone-loading activities [14]. However, there is a possibility that physically active children differ from other children in ways that may confound these results. For example, physically active children might eat differently than other children or have more muscle than other children before they are physically active. It is possible these other factors are leading to greater aBMD and bone mineral content and not physical activity per se.

During growth, increased body weight, muscle strength, and longitudinal bone growth lead to increased loads placed on the skeleton. As Wolff is reported to have proposed in 1892, and as Frost later developed in the “mechanostat” theory, bone adapts to these loads by increasing its strength [14]. It has been suggested that periods of growth are the best time to influence bone through increased loading owing to the high rates of bone modeling and remodeling that are occurring [44]. During prepuberty and early adolescence, periosteal surfaces are growing rapidly, whereas during late adolescence, endocortical apposition is occurring and cortical thickness is increasing. It is possible that exercise at these different periods of development may affect bone differently: exercise during the prepubertal period and early adolescence may affect periosteal surfaces, whereas exercise during late adolescence may affect endosteal surfaces and cortical thickness. The bone response to exercise around the time of puberty also may vary by sex. During puberty boys experience greater periosteal expansion likely resulting from growth hormone, IGF-1, and testosterone, whereas girls have greater endosteal contraction likely resulting from the inhibitory effects of estrogen on periosteal formation and stimulatory effects on endocortical bone formation [17, 47].

Randomized trials are the gold standard for evaluating whether an intervention is effective. Numerous trials have been completed evaluating the effect of bone-loading activities on pediatric bone [3, 6, 8, 9, 15, 23, 25, 26, 31, 32, 35, 36, 38, 39, 41–43, 48, 49, 51, 54, 55]. Unfortunately, most used dual-energy x-ray absorptiometry (DXA) methodology to assess bone changes such as bone mineral content and aBMD [6, 9, 15, 23, 25, 26, 35, 36, 38, 39, 41–43, 48, 49, 51, 54, 55] and few report results pertaining to changes in bone area by peripheral quantitative computed tomography (pQCT) [3, 26, 31, 32, 48]. Therefore, it often is not possible to determine whether exercise increases bone size. In our analyses, we therefore included DXA measures of bone mineral content, bone area, and aBMD.

The purpose of this review was to determine whether data from pediatric trials (randomized, prospective, or historically controlled) could answer the following questions: (1) Does exercise in childhood consistently increase bone mineral content, bone area, or aBMD? (2) Do effects of exercise differ depending on pubertal status or sex of the children? (3) Does calcium intake modify the bone response to exercise?

Search Strategy and Criteria

The literature was searched for reports of exercise intervention trials in normal healthy children 3 years old and older. The following search criteria were used in MEDLINE, limited to clinical trials published in English: exercise[All Fields] AND bone[All Fields] AND pediatrics[All Fields] (n = 23); physical activity[All Fields] AND bone[All Fields] AND pediatrics[All Fields] (n = 37); and exercise[All Fields] OR physical activity[All Fields] AND bone[All Fields] AND children[All Fields] (n = 231). From these 291 references, 242 unique references were identified.

Pediatric studies that reported an exercise intervention and had a control group with prospectively collected bone measurements before and after the activity intervention (and control period) were included. Nonrandomized trials or trials that randomized schools rather than individuals were included owing to the small number of trials that randomized individual children.

The 242 references were further screened by one of the authors (BS) based on the article title and the following (n = 190) were deleted (Fig. 1): articles related to interventions in populations with disease or obesity (n = 85), activity interventions conducted on a population younger than 3 years (n = 8), orthopaedic-type interventions or procedures (n = 31), observational studies related to either activity or calcium intake (n = 28), or miscellaneous articles (n = 38; eg, bone studies related to drug treatments, kinetic or metabolic studies, behavioral interventions aimed at increasing activity or calcium intake, vitamin D studies, dietary interventions, studies measuring ground reaction forces, surveys, strength training studies, early life determinants of adolescent bone, anthropometric studies).

Fig. 1 The flowchart shows the numbers of articles initially identified and the exclusion and inclusion steps.

The remaining 52 studies were retrieved and reviewed: eight were excluded because of lack of appropriate bone measures (bone mineral content, bone area, or aBMD not included) or the method used to obtain bone measures was not dual photon absorptiometry, DXA, or peripheral quantitative CT (pQCT); 14 were excluded as the study did not include an activity or exercise intervention (eg, calcium supplementation trials that also measured activity levels). Total body, femoral neck, and spine sites by DXA most often were measured.
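The screening counts reported above can be checked with simple arithmetic. The following is a minimal sketch that only restates the numbers given in the text; the variable names and category labels are illustrative and not part of the original methods.

```python
# Counts are taken directly from the search and screening description above.
search_hits = [23, 37, 231]            # hits for the three MEDLINE queries
total_hits = sum(search_hits)          # references before de-duplication
unique_refs = 242                      # unique references reported in the text

# References removed after screening by article title (labels are shorthand).
title_exclusions = {
    "disease or obesity": 85,
    "younger than 3 years": 8,
    "orthopaedic interventions": 31,
    "observational studies": 28,
    "miscellaneous": 38,
}
excluded_by_title = sum(title_exclusions.values())
retrieved = unique_refs - excluded_by_title

# References removed after full-text review.
full_text_exclusions = {
    "no suitable bone measure or method": 8,
    "no exercise intervention": 14,
}
remaining = retrieved - sum(full_text_exclusions.values())

print(total_hits, excluded_by_title, retrieved, remaining)
```

The printed values correspond to the 291 references identified, the 190 removed by title, the 52 retrieved, and the 30 trials that remained for review.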

A total of 30 trials were reviewed with sample sizes ranging from 16 to 410 total subjects, ages ranging from 3 to 18 years, and the length of the intervention ranging from 3 to 36 months. On review of these trials an additional seven trials were identified and added to those being reviewed [8, 29, 38, 42, 43, 49, 55] (Table 1). Of these trials, 14 were excluded because of duplication of study populations (Table 1), and one did not present measures of variance for changes in bone outcomes [8]. Two studies had inconsistencies between tables or between tables and graphs resulting in only a fraction of the bone data being used [23, 42]. Authors of two studies provided mean percent changes and SDs of percent change by sex and pubertal status for control and intervention groups because these could not be determined from the articles [41, 54].

Table 1 Summary of pediatric intervention studies

Two authors (BS, NWT) reviewed each paper and assigned a level of evidence based on material published by the Oxford Centre for Evidence-Based Medicine, Oxford, UK [24] (Table 1). When there was disagreement, the study was discussed and a consensus reached. Owing to the scarcity of randomized trials related to the effect of bone-loading activities on bone accretion, nonrandomized trials that included a comparison group with prospective bone measurements were included. In addition, some trials involved clustered randomization by classroom or school (n = 7), rather than randomizing the individual (n = 9), owing to the ease of performing the intervention in classrooms or schools. The majority of the cluster-randomized trials did not take clustering into account during the analyses (six of seven), which influenced scoring of the level of evidence. Trials that used cluster randomization were assigned a Level III unless the analysis accounted for clustering [41]. Trials in which the individuals were randomized received a Level II even though individuals were not blinded to the intervention. There were four nonrandomized trials with parallel groups (three involved cluster assignment) and two nonrandomized clustered trials that used a previous control group; these six trials received a Level III rating (Table 1).

Some trials provided estimates of bone change within different subgroups (eg, prepubertal versus pubertal, males versus females) [25, 28, 31, 36, 39, 48, 49, 54]. Results from each of these subgroups were used in this review. In addition, three studies were designed specifically to investigate the effect of exercise on bone by pubertal status [23, 35, 41]. To compare study findings, mean percent changes in bone outcomes were determined for each subgroup and results were expressed as percent change from baseline.

Because DXA measures of bone area may not be sensitive enough to detect subtle effects of exercise on bone size, results from pQCT tibia measurements were reviewed. However, a meta-analysis of these findings was not performed owing to lack of reporting on variability of change in intervention and control groups.

Nutritional intake was not available for the majority of the studies that focused on the effect of bone-loading activity on bone. Although calcium intake may influence the bone response to exercise or bone-loading activities, it would be expected that mean calcium intakes would not differ between intervention groups in a study. There were three studies designed specifically as two-by-two factorial trials to investigate the effect of calcium intake on the bone response to exercise [6, 25, 48]. Because all three studies found statistically significant interactions between calcium intake and exercise group, a meta-analysis was not performed and an overview of the results is presented below.

Analysis

To compare studies on the same scale in the pooled analysis, we calculated the percent change in bone mineral content, bone area, and aBMD for each study population. We used the raw mean difference (intervention mean effect − control mean effect) as our outcome measure. Subgroup analyses were conducted by pubertal status. Subgroup analyses also were performed by sex for bone mineral content outcomes, but the number of studies providing sex-specific data for bone area and aBMD was too small to test for sex-specific effects. All meta-analyses were performed using the metafor package for the R statistical computing environment [46, 53].
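As an illustration of the outcome measure described above, the following sketch computes a raw mean difference in percent change and its sampling variance for a single study. It assumes two independent groups and uses the standard variance formula for a difference in means; the class and function names and the example values are hypothetical and are not data from any reviewed trial.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    """Summary of percent change from baseline in one study group."""
    mean_pct_change: float
    sd_pct_change: float
    n: int


def raw_mean_difference(intervention: Arm, control: Arm) -> tuple[float, float]:
    """Return (mean difference, sampling variance) for two independent groups."""
    md = intervention.mean_pct_change - control.mean_pct_change
    var = (intervention.sd_pct_change ** 2 / intervention.n
           + control.sd_pct_change ** 2 / control.n)
    return md, var


# Hypothetical values chosen only to illustrate the arithmetic.
exercise = Arm(mean_pct_change=5.0, sd_pct_change=3.0, n=40)
control = Arm(mean_pct_change=4.0, sd_pct_change=3.0, n=40)
md, var = raw_mean_difference(exercise, control)
print(f"mean difference = {md:.2f} percentage points, variance = {var:.3f}")
```

Each study or subgroup would contribute one such pair of values to the pooled analysis.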

Heterogeneity and Publication Bias

Because differences in the measurement methods and population characteristics in published studies may introduce variability among studies, a random-effects model was used to account for heterogeneity among studies. Heterogeneity was estimated using the restricted maximum likelihood estimator and tested using Cochran’s Q-test [21]. We performed meta-regression to determine whether study covariates, including intervention length, mean age, mean calcium intake, pubertal status, or sex, could explain heterogeneity among studies. Heterogeneity was not significant for femoral neck aBMD, and we were able to reduce spine aBMD heterogeneity by including calcium intake and intervention length as covariates (Table 2). However, significant heterogeneity among studies was observed for all bone mineral content and bone area measurements. Inclusion of covariates did not explain the variation in bone mineral content and bone area among studies. Almost all reported trial results were expressed as marginal means for bone mineral content adjusting for potential covariates, and the covariates differed significantly among the trials (Table 2). It is possible that this may have led to the significant heterogeneity that was observed. To assess potential publication bias, we examined funnel plots for asymmetry [13]. No evidence of publication bias was apparent from the funnel plot analyses (Table 2).
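The pooling step can be sketched as follows. Note that this sketch swaps in the closed-form DerSimonian-Laird estimator of between-study variance rather than the restricted maximum likelihood estimator used in the review, because the latter requires iterative fitting; the two estimators can give different values. The function name and the input values are hypothetical.

```python
import math


def random_effects_dl(effects, variances):
    """Pool study effects with a DerSimonian-Laird random-effects model.

    Returns Cochran's Q, the between-study variance (tau2), the pooled
    estimate, and its standard error.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                 # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)               # truncated at zero
    w_re = [1.0 / (v + tau2) for v in variances]     # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {"Q": q, "df": k - 1, "tau2": tau2, "pooled": pooled, "se": se}


# Hypothetical mean differences (percentage points) and sampling variances.
result = random_effects_dl([0.0, 2.0, 4.0], [1.0, 1.0, 1.0])
print(result)
```

Under the null hypothesis of homogeneity, Q is compared with a chi-square distribution with k − 1 degrees of freedom. A larger between-study variance widens the CI of the pooled estimate and makes the study weights more nearly equal.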

Table 2 Difference in percent change between intervention and control groups by meta-analysis

Results

Does Exercise in Childhood Consistently Increase Bone Mineral Content, Bone Area, or aBMD?

Children assigned to the exercise interventions had significantly greater increases in bone mineral content and aBMD, but not bone area, than children assigned to the control groups. The overall mean difference between the percent change in bone mineral content in the intervention and control groups was 0.8% (95% CI, 0.3–1.3; p = 0.003) (Fig. 2) for total body; 1.5% (95% CI, 0.5–2.5; p = 0.003) for femoral neck; and 1.7% (95% CI, 0.4–3.1; p = 0.01) for spine.

Fig. 2 The forest plot shows the mean difference between the exercise and control groups in total body bone mineral content percent change by pubertal status. The size of the squares is proportional to the inverse of the variance and the error bars represent the 95% CIs. The CIs for the pooled mean difference are shown by the diamond-shaped figure. CIs that include 0 are not statistically significant. Table 2 shows p values for pooled mean differences.
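The link between a 95% CI that excludes 0 and a two-sided p value below 0.05 can be illustrated with a short sketch. It assumes a normal (Wald-type) approximation for the estimate; the function name and the numbers are hypothetical and do not reproduce any value in Table 2.

```python
import math

Z_95 = 1.959963984540054  # 97.5th percentile of the standard normal distribution


def wald_summary(estimate: float, se: float) -> tuple[float, float, float]:
    """Return (lower, upper, two-sided p) under a normal approximation."""
    lower = estimate - Z_95 * se
    upper = estimate + Z_95 * se
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return lower, upper, p


# Hypothetical estimates with the same standard error.
for est in (1.0, 0.5):
    lower, upper, p = wald_summary(est, se=0.4)
    print(f"estimate={est:.1f}  95% CI=({lower:.2f}, {upper:.2f})  p={p:.3f}")
```

With these inputs the first interval lies entirely above 0 and its p value is below 0.05, whereas the second interval spans 0 and its p value is above 0.05.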

Results for femoral neck and spine bone area were not significant, indicating no effect of exercise on bone area (Table 2). This finding is consistent with the majority of pQCT studies, which report no differences in the increase in bone cross-sectional area between children assigned to exercise and those who were not [3, 26, 32], although a study in 3- to 5-year-old children did find an effect [48]. A meta-analysis on the pQCT results could not be performed because of lack of reported measures of variability in the percent change in intervention and control groups in those reports.

Overall, results for aBMD were similar to those observed for bone mineral content at the femoral neck (Fig. 3) and spine (Fig. 4) with children in the exercise groups having a greater change in aBMD than children in the control groups (Table 2).

Fig. 3 The forest plot shows the mean difference between the exercise and control groups in femoral neck areal bone mineral density (aBMD) percent change by pubertal status. The size of the squares is proportional to the inverse of the variance and the error bars represent the 95% CIs. The CIs for the pooled mean difference are shown by the diamond-shaped figure. CIs that include 0 are not statistically significant. Table 2 shows p values for pooled mean differences.

Fig. 4 The forest plot shows the mean difference between the exercise and control groups in spine areal bone mineral density (aBMD) percent change by pubertal status. The size of the squares is proportional to the inverse of the variance and the error bars represent the 95% CIs. The CIs for the pooled mean difference are shown by the diamond-shaped figure. CIs that include 0 are not statistically significant. Table 2 shows p values for pooled mean differences.

Do Effects of Exercise Differ Depending on Pubertal Status or Sex of the Children?

The beneficial effects of exercise on bone accrual were limited to children who were prepubertal, and there were no sex differences in the response to exercise. Prepubertal children assigned to exercise had larger increases in total body (Fig. 2), femoral neck, and spine bone mineral content than prepubertal children assigned to the control groups (Table 2). A forest plot by pubertal status indicated that the mean difference in percent change between exercise and control groups was significant for spine aBMD (p < 0.001) (Fig. 4) and marginally significant for femoral neck aBMD (p = 0.07). There was no significant effect of exercise on bone mineral content or aBMD in children who were postpubertal (Table 2). Too few studies (fewer than five) reported changes in bone area to determine whether the effect of exercise on bone area differed by pubertal status.

For studies specifically designed to test for a pubertal effect of exercise on pediatric bone, MacKelvie et al. [35] found that a school-based intervention led to greater bone accrual at the femoral neck and spine in females in early puberty but not in females who were prepubertal. Heinonen et al. [23] reported greater gains in femoral neck and spine bone mineral content in females in early puberty who participated in the exercise intervention than those who did not, but no difference was observed in females who were postpubertal. A school-based study by Meyer et al. [41] included two different age groups (6–7 and 11–12 year olds) to specifically determine whether the bone response to exercise differed between children who were pre- or early pubertal. They found a significant group-by-puberty interaction with the effect of exercise on the whole body, femoral neck, and spine bone mineral content being greater in children who were prepubertal than children who were early pubertal; males and females responded to exercise in a similar manner.

There were no sex differences between exercise and control groups in total body, femoral neck, or spine bone mineral content (all p > 0.05; data not shown), and there were insufficient numbers of studies to evaluate sex differences in bone area and aBMD.

Does Calcium Intake Modify the Bone Response to Exercise?

A meta-analysis was not necessary to evaluate whether calcium intake modifies the bone response to exercise because all three trials [6, 25, 48] specifically designed to investigate this question found that the increase in leg bone mineral content with exercise was greater in children who were randomized to receive supplemental calcium (statistically significant calcium-by-exercise interaction) (Fig. 5). The differences in percent change ranged from 1.5% to 3.7% greater in children assigned to exercise compared with children assigned to the control group, and in all three studies this effect was statistically significant. Another study [26] found a correlation between change in leg bone mineral content and calcium intake among the intervention group but not the control group, supporting the hypothesis that calcium intake modifies the bone response to exercise, at least for leg bone mineral content.

Fig. 5 The mean differences in percent change between the exercise and no-exercise groups for leg bone mineral content are shown, with the dashed line connecting the effect of exercise in the calcium and placebo groups in the same study. The differences between the groups were greater at higher calcium intakes in all three studies (all exercise-by-calcium interactions significant at p < 0.05).
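In a two-by-two factorial trial, a calcium-by-exercise interaction can be read as the difference between the exercise effect in the calcium group and the exercise effect in the placebo group. The following minimal sketch shows that contrast using cell means only; the function name, dictionary keys, and percent-change values are hypothetical and are not taken from the three factorial trials. A formal test of the interaction would also need the variance of each cell mean.

```python
def interaction_contrast(cells: dict) -> tuple[float, float, float]:
    """Return (exercise effect with calcium, with placebo, their difference).

    `cells` maps (activity, supplement) to the mean percent change in leg
    bone mineral content for that cell of the two-by-two design.
    """
    effect_calcium = (cells[("exercise", "calcium")]
                      - cells[("no_exercise", "calcium")])
    effect_placebo = (cells[("exercise", "placebo")]
                      - cells[("no_exercise", "placebo")])
    return effect_calcium, effect_placebo, effect_calcium - effect_placebo


# Hypothetical cell means (percent change from baseline).
cells = {
    ("exercise", "calcium"): 9.0,
    ("no_exercise", "calcium"): 6.0,
    ("exercise", "placebo"): 7.0,
    ("no_exercise", "placebo"): 6.5,
}
effect_ca, effect_pl, interaction = interaction_contrast(cells)
print(effect_ca, effect_pl, interaction)
```

A positive contrast corresponds to the pattern described above, in which the exercise effect is larger at higher calcium intake.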

Discussion

The purpose of this meta-analysis was to determine whether children randomized to exercise interventions have greater increases in bone mineral content, bone area, or aBMD than children randomized to a control group, and whether bone benefits of exercise varied by pubertal status, sex, or calcium intake. Although it commonly is believed that exercise has significant bone benefit, results from exercise trials do not always show a greater increase in bone mineral content, bone area, or aBMD in children assigned to exercise compared with no exercise. Our current analysis supports a benefit of exercise on gains in bone mineral content and aBMD, with no effect on bone size. The benefits of exercise appear to be limited primarily to children who are prepubertal, with no sex differences. Calcium intake modifies the bone response to exercise, with a greater exercise effect in children with higher calcium intakes.

One of the limitations of many of the studies is the randomization of schools rather than the individual child to the intervention. A major advantage of randomization is that it increases the chance that the groups do not differ at baseline in other factors, or potential confounders, that could be associated with bone changes. It is likely that children within a school are more similar in many other factors (eg, ethnic background, dietary intakes, and other related factors) than children from different schools. The major advantage of randomizing a group of individuals (classroom or school) rather than the individual is the greater feasibility and lower cost of conducting the study. It is much easier and less expensive to incorporate an exercise program in a classroom or school than to ask individual children to participate in interventions outside their usual classroom activities.

Significant heterogeneity was observed in the meta-analysis. Heterogeneity among studies could arise for several reasons. For instance, the original studies used various covariates, especially for bone mineral content measurements, and most reported marginal or least-square means (not raw means) that were used in the meta-analysis. The majority of trials involved high-impact activities and few reported actual increases in lean mass or muscle strength. Increased lean mass or muscle strength would apply forces on bone beyond the forces from the impact activity and theoretically should lead to even greater bone response. The variable responses in terms of changes in lean mass or muscle strength may have contributed to the heterogeneity that was observed. Additionally, there were wide ranges in the length of the interventions and the types and intensity of exercise prescribed. However, the inclusion of intervention length in the meta-analysis did not reduce the heterogeneity that we observed in total body, femoral neck, and spine bone mineral content.

Based on the meta-analysis, the overall effect of exercise during growth was to increase bone mineral content and aBMD. Some studies found a positive effect of exercise on bone mineral content or aBMD but only when they limited their analyses to compliant participants [49, 54]. Additionally, children who do not routinely load their skeletons seem to be more responsive to an exercise program as supported by findings from two of the trials [36, 52]. MacKelvie et al. [36] found a positive effect of exercise when they limited their analyses to children who had low or average BMI and not those with high BMI. This could be the result of increased loading of the skeleton that already occurs in children with a higher BMI. This is further supported by the findings of Van Langendonck et al. [52], who showed that the effect of exercise was significant only among females with minimal weightbearing activity during leisure time. Although it has been proposed that bone loading during early adolescence may augment the increase in bone size that occurs during this period of growth [44], few results support this. Because DXA measurements of bone area may be unable to detect subtle changes, we also looked at studies that measured cross-sectional area or periosteal circumference of the tibia [3, 26, 32]. Only one study found that children randomized to exercise had a greater increase in periosteal circumference and this was a study of 3- to 5-year-old children [48]. Thus, if exercise or bone-loading activities do influence periosteal expansion, it would be at very young ages because it has not been reported in older children.

Our meta-analysis results showed a benefit in children who were prepubertal but not children who were early or postpubertal. These results are consistent with those of Meyer et al. [41]. They enrolled two age groups with distinct pubertal stages in an exercise intervention and a control group and were able to formally test for pubertal status-by-intervention (significant) and sex-by-intervention (not significant) interactions. There is large natural variability in skeletal growth and sexual maturation among children, both of which are difficult to control for in any longitudinal study around the age of puberty. To determine whether pubertal status or sex of the child modifies the bone response to exercise, it is important to design studies in a manner that will allow for these interactions to be formally tested. The sample sizes required for these types of studies, and therefore the costs associated with conducting them, are large because of the need to have adequate power in each pubertal stage and sex category.

As we showed, the trials designed to test for a calcium-by-exercise effect all found that the gain in leg bone mineral content in response to exercise was greater in the group of children randomized to receive supplemental calcium. Few studies have controlled for calcium intake when investigating the influence of exercise on bone, and this may partially explain the inconsistent findings and heterogeneity among studies. However, we did consider calcium intake in our meta-regression as a possible reason for the high degree of heterogeneity but did not find it to be a significant predictor of total body, femoral neck, or spine bone mineral content. It could be that the modifying effect of calcium intake on the bone response to loading is seen only in the bones that were directly loaded (eg, the legs).

This meta-analysis indicates that bone-loading exercise interventions can lead to a greater increase in bone mineral content and aBMD but may not affect bone area. Children who are prepubertal appear to be more responsive to bone loading than children who are postpubertal, and other factors such as calcium intake may modify the bone response to loading. Simple exercise interventions during childhood led to 0.6% to 1.7% greater annual increase in bone accrual. If this effect were to persist into adulthood it could have substantial implications for osteoporosis prevention. It is important to identify the sources of heterogeneity in the results of the pooled studies to identify factors that may influence the bone response to increased exercise during growth. Because most studies were completed among girls, the question regarding whether increased bone loading during growth affects bone similarly in boys and girls at various times throughout puberty has not been adequately addressed, especially because few studies have been conducted in boys who are prepubertal.