Introduction

Osteoporosis results from increased age-related bone loss [1], insufficient bone accrual during growth [2], or both. Fracture prevention in old age has mainly focused on pharmacological treatment [3], while lifestyle changes have been advocated at young ages [2, 4]. A high peak bone mass is also associated with low fracture risk [2, 5,6,7,8,9]. The most potent modifiable lifestyle factor may be physical activity (PA) [10,11,12,13,14,15,16], especially if provided in the late pre- and early peri-pubertal period [14,15,16]. But bone strength does not depend only on bone mineral content (BMC) [15, 16]. Bone size, micro- and macro-architectural structure, and cortical porosity [10, 11, 14, 17] are traits that are associated with fracture risk independently of bone mineral density (BMD) [10,11,12, 14]. Non-skeletal factors, such as muscle mass, muscle strength, balance, neuromuscular function, and coordination, are also of importance [11, 17,18,19].

The overall positive effect of PA could therefore be underestimated if only BMD is used as a surrogate end point for fracture [20, 21]. A composite score for fracture, including several risk factors that independently predict fracture, would probably estimate fracture risk better. Composite scores have been used in cardiovascular research [22,23,24,25,26] and, with FRAX, when evaluating fracture risk in the elderly [27]. No such studies have been done when estimating fracture risk in children. This is, however, important, since early accumulation of risk factors is believed to have a greater impact on fracture risk than a single risk factor. A further advantage of cluster analysis is that it yields a comprehensive perspective on the effect of a studied intervention to reduce the risk of a disease (fracture) [22,23,24,25,26]. We hypothesized that a composite score would estimate the effect of PA better than BMC (or BMD) alone. We asked the following: (i) What is the effect of PA on single traits and on a composite score for fracture? (ii) Could this score be used to identify the level of PA needed to reach beneficial effects?

Material and methods

The pediatric osteoporosis prevention (POP) study is a population-based prospective controlled PA intervention study that has previously been reported in detail [28, 29]. In summary, four neighboring, government-funded and community-based elementary schools were invited to participate. One school was chosen as the intervention school and three as control schools. The intervention school increased physical education (PE) from 60 to 200 min/week (40 min PE per school day) during the school term. The intervention was composed of ordinary PE lessons of moderate intensity level. The intervention was introduced shortly after the baseline measurement and continued throughout the study period. The control schools continued with the Swedish standard curriculum. All activities were led by the same teachers as before study start. No other changes were made in the school curriculum.

At baseline, we invited all the children in the first or second grade to participate. One hundred twenty-six out of 218 girls and 152 out of 259 boys agreed to participate. We excluded two boys and one girl with medications that could affect skeletal development and one girl who was 8 months younger than the others. The children were 7.8 ± 0.6 years (mean ± SD) at study start, and 98% were of Caucasian ethnicity. Of the 124 girls and 150 boys included in the study, 119 girls and 150 boys attended the baseline measurement. At baseline, one girl had valid data from fewer than two variables, and her baseline composite score was excluded. One hundred eight girls and 138 boys attended the re-measurement after 2 years. One hundred thirteen girls and 145 boys answered the questionnaire regarding physical activity at baseline, and the corresponding figures at follow-up were 97 girls and 132 boys.

At both baseline and follow-up, we collected anthropometric measurements of height (cm) and weight (kg) with standardized equipment (Holtain Stadiometer and Avery Berkel HL 120 electric scale). Body mass index (BMI, kg/m²) was calculated as weight/height². A research nurse assessed Tanner stage [30] at baseline (all in Tanner 1), while self-assessment was used at follow-up (four girls in Tanner 2). Lifestyle (nutrition, alcohol, smoking, diseases, medications, weekly duration of school PA (physical education (PE) classes), and duration of organized leisure-time PA in summer and winter) was evaluated by questionnaires [16]. We calculated the annual mean weekly duration of PA. Total PA was estimated as the sum of school PE and leisure-time PA.
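
The derived variables in this paragraph can be illustrated with a minimal sketch. The function names and the example values are hypothetical, and the equal weighting of summer and winter leisure-time PA is an assumption, since the text does not state how the annual mean was formed.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m^2 from weight (kg) and height (cm)."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def annual_mean_leisure_pa(summer_h_per_week: float, winter_h_per_week: float) -> float:
    """Annual mean weekly leisure-time PA in hours.

    Assumption: summer and winter are weighted equally; the paper does not
    specify the weighting.
    """
    return (summer_h_per_week + winter_h_per_week) / 2.0


def total_pa(school_pe_h_per_week: float, leisure_h_per_week: float) -> float:
    """Total PA as the sum of school PE and leisure-time PA (hours/week)."""
    return school_pe_h_per_week + leisure_h_per_week


# Hypothetical child in the intervention school (200 min PE per week).
example_bmi = bmi(weight_kg=27.0, height_cm=128.0)
example_total = total_pa(200 / 60, annual_mean_leisure_pa(2.0, 1.0))
```

Note that height is recorded in cm, so it must be converted to metres before squaring to obtain BMI in kg/m².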

We measured bone mineral content (BMC, g) and bone area (cm²) in the third lumbar vertebra by dual-energy X-ray absorptiometry (DXA, DPX-L® version 1.3z, Lunar Corporation, Madison, WI, USA); calcaneal broadband ultrasound attenuation (BUA, dB/MHz), which is also said to estimate bone quality [31], by quantitative ultrasound (QUS, Lunar Achilles model 1061®, Lunar Corporation, Madison, WI, USA); neuromuscular function by a vertical jump height (VJH) test; and muscle strength, as concentric isokinetic peak torque (PT) for right knee extension (ext) at a speed of 180°/s, by a computerized dynamometer (Biodex System III Pro®). We used the highest PT value of ten repeated movements. All methods are described in detail in previous publications [11, 16, 29]. Our research technicians performed all measurements and calibrated the DXA apparatus daily with a phantom. There was no long-term drift in the equipment. The coefficient of variation (CV%), evaluated by duplicate measurements in 13 healthy children, was 1.4–5.2% for BMC, 1.5 and 6.7% for BUA, and 12.3 and 12.3% for PT ext.
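
The text does not state how the CV% was derived from the duplicate measurements. One common approach, shown here only as an assumed illustration, is to express the within-subject standard deviation of the duplicates as a percentage of the overall mean. The function name and the example values are hypothetical.

```python
import math


def cv_percent_from_duplicates(pairs):
    """CV% from duplicate measurements, one (first, second) tuple per child.

    Assumed method: within-subject variance = sum(d^2) / (2n), where d is the
    difference within each pair; CV% = 100 * within-subject SD / overall mean.
    """
    n = len(pairs)
    within_var = sum((a - b) ** 2 for a, b in pairs) / (2 * n)
    overall_mean = sum(a + b for a, b in pairs) / (2 * n)
    return 100.0 * math.sqrt(within_var) / overall_mean


# Hypothetical duplicate BMC measurements (g) in three children.
example_cv = cv_percent_from_duplicates([(2.00, 2.04), (2.40, 2.36), (2.80, 2.84)])
```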

Drop-out analysis, using data from the grade-one compulsory school health examinations, found no statistically significant group differences in age, height, weight, or BMI when comparing children who participated in the study with those who did not [28, 29]. There were also no statistically significant baseline differences in the variables included in the study (bone measures, muscle function, and physical activity) between the subjects who dropped out and those who completed the 2-year follow-up. For the children who attended only one of the two measurements, the composite score from the measurement they did attend was not significantly different from the composite score of the children who attended both measurements. As the Z-score for each individual is calculated from the mean and standard deviation of the cohort, the Z-scores might be affected by changes in the cohort. To make sure that including individuals who attended only one measurement did not skew the Z-scores of those who attended both measurements, we made an additional analysis. For the children attending both measurements, we compared the composite score calculated from all children (with both or only one measurement) with the composite score calculated only from children attending both measurements, and found no difference.

We used IBM SPSS Statistics® version 23 for statistical calculations. Skewed variables (baseline BUA and PT in boys; follow-up BUA in boys and BMC, area, BUA, and PT in girls) were normalized by natural logarithm. Data are presented as means with standard deviations (SD) and means with 95% confidence intervals (95% CI). We calculated gender-specific Z-scores for each trait according to the formula ((the value of that specific individual − the gender-specific mean value for the trait) / the SD for the trait). We used the mean of the Z-scores for BMC (third lumbar vertebra), area (third lumbar vertebra), BUA, VJH, and PT as a composite score for fracture. The rationale for including each trait in the composite score was that it has been shown in publications to influence fracture risk and to be influenced by activity [5,6,7,8,9, 18,19,20]. We also decided to include only one measurement of each trait, since, for example, DXA measurements in different anatomical regions within one individual are not independent of each other. A higher composite score would then hypothetically correspond to a lower fracture risk. However, since no studies have used this score to predict fractures, we could not translate, for example, a 0.3 SD benefit into a corresponding reduction in fracture incidence, or a 0.3 SD deficit into a corresponding increase in fracture risk. We then used linear regression (Pearson correlation) to assess the correlation between PA and each trait and between PA and the composite score, adjusted for age and height (as an estimate of growth). The correlations were analyzed cross-sectionally at baseline and at follow-up. We used a multilevel linear mixed model with unstructured covariance structure and fixed intercept only to examine the effect of PA on the Z-score for each trait and on the composite score, adjusted for age and height. In an additional analysis, an interaction term between PA and PA quartile was added to the model. Subjects were nested within schools. The assumptions of the models were checked with residual analyses. Statistical significance was set at p < 0.05.

This study was approved by the Lund University Ethics Committee (LU 453-98, LU-368-99), is registered as a clinical trial (ClinicalTrials.gov NCT00633828), and was carried out in accordance with the Declaration of Helsinki. Written informed consent was obtained from the parents or guardians of each child before study start.
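
The Z-score and composite-score calculation described in the statistical methods can be sketched as follows. This is a minimal illustration and not the SPSS procedure used in the study: the trait keys, the function names, and the example values are hypothetical, the sample SD (n − 1) is an assumption, and the log-transformation of skewed variables is omitted. The requirement of at least two valid traits mirrors the exclusion of the girl with fewer than two valid variables.

```python
from statistics import mean, stdev

TRAITS = ("bmc", "area", "bua", "vjh", "pt")  # hypothetical keys for the five traits


def z_scores(values):
    """Z-scores for one trait within one gender; None marks a missing value."""
    valid = [v for v in values if v is not None]
    m, sd = mean(valid), stdev(valid)  # assumption: sample SD (n - 1)
    return [None if v is None else (v - m) / sd for v in values]


def composite_scores(children, min_valid=2):
    """Mean of the available trait Z-scores for each child of one gender.

    children: list of dicts keyed by trait name.
    Returns None for a child with fewer than `min_valid` valid traits.
    """
    per_trait = {t: z_scores([c.get(t) for c in children]) for t in TRAITS}
    scores = []
    for i in range(len(children)):
        zs = [per_trait[t][i] for t in TRAITS if per_trait[t][i] is not None]
        scores.append(mean(zs) if len(zs) >= min_valid else None)
    return scores


# Hypothetical values for three girls (not study data).
girls = [
    {"bmc": 2.0, "area": 5.0, "bua": 90.0, "vjh": 20.0, "pt": 30.0},
    {"bmc": 2.4, "area": 5.4, "bua": 95.0, "vjh": 24.0, "pt": 34.0},
    {"bmc": 2.8, "area": 5.8, "bua": 100.0, "vjh": 28.0, "pt": 38.0},
]
girl_scores = composite_scores(girls)
```

Because every trait is standardized within gender, the function is meant to be called once for girls and once for boys. Averaging the Z-scores gives each trait equal weight in the composite score.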

Results

Gender-specific data are presented in Table 1. At baseline, we found no correlation between PA and individual traits or the composite score (Table 2).

Table 1 Descriptive data for age, anthropometrics, bone mineral content third lumbar vertebra (L3), bone size (area) third lumbar vertebra, bone quality (QUS) calcaneus, neuromuscular function, and knee muscle strength at baseline and the 2-year follow-up. Data are provided as means (SD)
Table 2 Correlation between duration of physical activity (PA) and Z-score for bone mineral content third lumbar vertebra (L3), bone size (area) third lumbar vertebra, bone quality (QUS) calcaneus, neuromuscular function, and knee muscle strength and the composite score (mean Z-score for BMC, bone size, bone quality, neuromuscular function, and muscle strength) for fractures. Correlations were assessed by Pearson correlation analyses after adjustment for age and height. Statistically significant differences are italicized

At follow-up, we found a correlation between PA during the study period and the composite score (r = 0.18, p = 0.01) but, except for BUA, no correlations between PA and individual traits (Table 2).
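
One way to read a Pearson correlation "adjusted for age and height" is as a second-order partial correlation. The sketch below illustrates that interpretation with the standard recursive formula. It is an assumption about the calculation, not the SPSS procedure used in the study, and the function names and example values are hypothetical.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_one(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_corr_two_covariates(x, y, z, w):
    """Partial correlation of x and y, controlling for z and w.

    Applies the first-order formula recursively: first remove z from every
    pair, then remove w from the z-adjusted correlations.
    """
    r_xy_z = partial_one(pearson(x, y), pearson(x, z), pearson(y, z))
    r_xw_z = partial_one(pearson(x, w), pearson(x, z), pearson(w, z))
    r_yw_z = partial_one(pearson(y, w), pearson(y, z), pearson(w, z))
    return partial_one(r_xy_z, r_xw_z, r_yw_z)


# Hypothetical data for six children (not study data).
pa_hours = [2.0, 5.0, 3.0, 6.0, 1.0, 4.0]
composite = [-0.4, 0.3, 0.1, 0.5, -0.6, 0.0]
age = [7.2, 8.1, 7.5, 8.4, 7.9, 7.6]
height = [121.0, 127.0, 130.0, 129.0, 124.0, 126.0]
r_adjusted = partial_corr_two_covariates(pa_hours, composite, age, height)
```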

The linear mixed model analysis revealed no significant effect on the composite score of which school the children attended. However, PA had an effect on the composite score, BMC, and area (Table 3). Also, when the quartiles of physical activity were added to the model, the lowest quartile (Q1) had a significantly lower composite score than Q2 (p < 0.001), Q3 (p < 0.001), and Q4 (p < 0.001).
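
The grouping of children into PA quartiles can be illustrated with a descriptive sketch. It only assigns quartiles and averages the composite score per quartile. It does not reproduce the multilevel mixed model, and the quartile cut-point method (the default of Python's `statistics.quantiles`) as well as the function names are assumptions.

```python
from bisect import bisect_left
from statistics import mean, quantiles


def pa_quartiles(pa_hours):
    """Assign each child to a PA quartile (1 = lowest, 4 = highest)."""
    cuts = quantiles(pa_hours, n=4)  # three cut points between the quartiles
    # A value equal to a cut point is placed in the lower quartile.
    return [bisect_left(cuts, v) + 1 for v in pa_hours]


def mean_score_by_quartile(pa_hours, scores):
    """Mean composite score within each PA quartile."""
    groups = {1: [], 2: [], 3: [], 4: []}
    for q, s in zip(pa_quartiles(pa_hours), scores):
        groups[q].append(s)
    return {q: mean(v) for q, v in groups.items() if v}


# Hypothetical weekly PA (hours) and composite scores for eight children.
example_pa = [1, 2, 3, 4, 5, 6, 7, 8]
example_scores = [-0.8, -0.6, 0.1, 0.2, 0.1, 0.3, 0.3, 0.4]
example_means = mean_score_by_quartile(example_pa, example_scores)
```

Since the quartile limits reported in the Discussion differ between girls and boys, the assignment would be run separately for each gender.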

Table 3 Estimates for the Z-scores for bone mineral content third lumbar vertebra (L3), bone size (area) third lumbar vertebra, bone quality (QUS) calcaneus, neuromuscular function, and knee muscle strength and the composite score (mean Z-score for BMC, bone size, bone quality, neuromuscular function, and muscle strength) for fractures. A multilevel linear mixed model with unstructured covariance structure and fixed intercept only was used to examine the effect of PA on the composite score after adjustment for age and height. Statistically significant differences are italicized

Discussion

Our study shows that PA enhances the gain in several traits that independently reduce fracture risk, as estimated by a composite score, and that the pre-pubertal years are probably an important period for future fracture risk, as the score could be influenced in this period even by moderate PA.

We have previously shown in the POP cohort that PA confers benefits in BMC [16, 17] and muscle strength [17, 29]. These effects were, however, only found at group level, while no participant-specific correlations could be shown between PA and the gain in each trait [16, 17, 29]. That is, if the effect of PA is related only to single traits, we may have disregarded beneficial effects at the individual level. In contrast, the current study infers that with clustering of risk factors as the end point, there is actually an individual correlation between PA and the composite score. Composite scores have been used as end point variables for the effect of PA in childhood on cardiovascular disease [22,23,24,25,26], but this is, to our knowledge, the first time a pediatric composite score has been used to evaluate the effect of PA on fracture.

With the composite score, we may now be able to explain the previously unexplained magnitude of fracture reduction by PA in the POP study [20, 21]. In the previous studies, we found that the school-based PA intervention program was accompanied by a 50% reduction in the fracture incidence ratio during the seventh and eighth year of the intervention [20, 21]. The accompanying BMD benefit could, however, only explain 25% of the fracture reduction [11, 20, 21, 32]. In the current study, we found a plausible explanation, as PA is associated with measurable fracture risk benefits beyond those rendered by BMD.

Hypothetically, a PA intervention program would provide the greatest benefit in children with the lowest voluntarily chosen PA. However, we have not been able to show this in previous reports when using BMD alone as the surrogate end point for fracture risk [16, 17]. This study could not pinpoint the most beneficial level of PA for the composite risk score. But when comparing children within the different quartiles of PA, we could show that children in the lowest quartile of PA differed from the rest. This is in line with studies evaluating only bone mass, which found that the benefits are gained with a relatively short duration or repetition of activities but that the duration of PA is of less importance [33, 34]. At follow-up, the girls in the lowest quartile had a weekly duration of PA below 3 h and the boys below 4.4 h. This can be put in perspective of the 3.3 h/week of PE to which we exposed the POP intervention group [16, 17, 29], a duration that would lift most girls and boys above the unfavorable level of PA shown in this study.

Study strengths include the population-based and prospective study design and the inclusion of multiple independent surrogate end point variables for fracture risk. A major limitation is the lack of information on non-organized PA during leisure time, on participation rate, and on activity level during the PE classes. It would also have been advantageous to have data on incident fractures, to address whether the composite score can identify fracture-prone individuals better than BMD alone. This is not possible in our study, as the participants were too few in relation to the low fracture risk at these ages. It is thus not possible to state that the children in the lowest quartile of PA actually had more fractures than children in the higher quartiles. It must also be confirmed in future studies that the PA-induced benefits in the composite score remain into older ages. By using mean Z-scores in the model, each factor is given equal importance for fracture risk reduction, which may be presumptuous. Furthermore, we cannot state whether it is the low amount of PA that leads to the clustering effect or whether children with low neuromuscular function and muscle strength are less physically active. The well-known difficulty of estimating PA in children [35] is another concern. However, objective measurements of PA, such as by accelerometer, are also criticized, as these devices most often measure PA for only a few days but are taken to represent the level of PA over a long period.

In summary, we conclude that low daily PA induces an unfavorable clustered risk factor score for fracture already at a young age and that high PA in the pre-pubertal period could induce a beneficial composite score as an estimate of fracture risk. Future prospective studies should evaluate whether the effect of PA on the composite score remains in adulthood and, most importantly, whether the composite score predicts fractures better than BMD alone.