Introduction

Although effective agents for the treatment of osteoporosis have been available for decades, and new agents with more convenient dosing regimens have been more recently introduced, treatment rates remain low, drug compliance remains poor, and the clinical and economic burden of fracture remains high [1,2,3,4,5]. Assessment of fracture risk, which may inform treatment decisions, has tended to focus on a long-term horizon and on populations with a broad risk range; examples include the FRAX tool, which calculates 10-year fracture risk, and other risk assessment instruments [6,7,8,9]. For a high-risk population such as elderly women with osteoporosis, however, it may be more appropriate to develop predictive models for fracture risk over a shorter time horizon. Long-term risk is not evenly distributed over time since annual incidence of fracture increases with age. Moreover, dynamic changes in health status often occur in this population over time, which may impact the risk of near-term fracture. For example, after a recent fracture, the risk of subsequent fracture is highest in the next 12–24 months [10].

Notwithstanding this understanding of the prominent increase in subsequent fracture risk after an initial fracture, there is relatively sparse literature on factors that may contribute to near-term risk of fracture among women aged ≥65 years with established osteoporosis. We thus undertook a study of risk factors and other predictors for hip and non-vertebral fracture occurring within 1 year in this patient population.

Methods

Study design and data source

A repeated-observations design and data from the Study of Osteoporotic Fractures (SOF) were employed. The original cohort, recruited from four US metropolitan areas—Baltimore, Pittsburgh, Minneapolis, and Portland—was enrolled in 1986 and included 9704 (primarily Caucasian) women aged ≥65 years; another 662 African-American elderly women were added in 1997. The SOF has served as the basis of important studies on osteoporosis, aging, and risk factors for fractures [11,12,13,14,15,16,17].

SOF participants underwent exams every 2 years between the first and sixth visit and, thereafter, were examined approximately every 4 years to assess BMD, body weight, cognitive function, lifestyle, medical history, medication use, physical function and performance, quality of life, sleep, vision, vital signs, and/or prior falls (although not all assessments were performed at each exam). Participants were queried tri-annually (i.e., every 4 months) by questionnaire and, occasionally, via direct contact with participants and providers regarding the occurrence of falls and fractures during the prior 4-month period. Data from nine completed visits, spanning approximately 20 years, are currently available.

The current study employs data from exam no. 4 (1992–1994), exam no. 5 (1995–1996), exam no. 6 (1997–1998), exam no. 8 (2002–2004), and exam no. 9 (2006–2008), as well as data from tri-annual questionnaires administered during the intervals between exams. Observations from exam nos. 1–3 and exam no. 7 were not considered since data on potentially important time-dependent risk factors and other predictors (e.g., medication use) were not collected at these exams. Participant observations were excluded from analyses if data on the candidate risk factors were missing for the corresponding exam, and such information could not be reliably imputed from observed data (i.e., because values for these variables may exhibit non-negligible variability over time). In general, data collection was very complete—≥95% of subjects had data at each exam for the great majority of potential risk factors and other predictors of interest. Details concerning data availability and methods of imputation are available from the authors upon request.

Study population

The study population comprised women in the Caucasian cohort who, at SOF exam no. 4 or a subsequent exam (excluding exam no. 7), had osteoporosis defined as a T-score ≤-2.5 at the total hip. Each qualifying exam constituted a separate baseline at which potential risk factors for fracture were measured, and the incidence of subsequent fracture (i.e., after each qualifying exam) was ascertained. Study subjects could contribute up to five observations in total, one per qualifying SOF exam, and all observations were pooled for analyses.

Study outcomes

Study outcomes included fracture of the hip and fracture of any non-vertebral site (including hip) and were ascertained beginning on the day after the date of each qualifying exam and ending 365 days later, on the date of loss to follow-up, or on the date of death, whichever occurred earliest. Fracture of the hip was defined as an incident (i.e., new), non-traumatic fracture of the femoral neck, intertrochanteric line, or other hip-related site. Fracture of any non-vertebral site was defined—by SOF in a composite measure—as an incident, non-traumatic fracture of the ankle, clavicle, elbow, face, foot, finger, hand, heel, hip, humerus, knee, lower leg, pelvis, rib, toe, upper leg, or wrist. Outcomes (and corresponding dates of occurrence) were identified in the SOF via tri-annual participant questionnaires and—occasionally—via direct contact with participants and providers. Only fractures that were confirmed via a formal adjudication process conducted by SOF investigators were considered in the current study.
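The follow-up rule described above can be sketched in code. This is a minimal illustration rather than the SOF investigators' implementation: the function name `follow_up` and its arguments are hypothetical, and the sketch assumes that the 365-day window is counted from the first day of ascertainment (the text could also be read as counting from the exam date).

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def follow_up(
    exam: date,
    fracture: Optional[date] = None,
    lost: Optional[date] = None,
    death: Optional[date] = None,
) -> Tuple[date, bool]:
    """Return (last day of follow-up, whether a fracture was observed)."""
    # Ascertainment begins on the day after the qualifying exam.
    start = exam + timedelta(days=1)
    # Assumption: the window closes 365 days after ascertainment begins.
    end = start + timedelta(days=365)
    # Loss to follow-up or death ends follow-up early, whichever is earliest.
    for censor in (lost, death):
        if censor is not None and censor < end:
            end = censor
    # A fracture counts only if it falls inside the follow-up window.
    if fracture is not None and start <= fracture <= end:
        return fracture, True
    return end, False
```

Because each qualifying exam is a separate baseline, a function like this would be applied once per exam, so one woman could yield up to five such (end date, event) pairs.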

Risk factors

Potential risk factors and other potential predictors for near-term fracture included demographics (e.g., age), BMD, anthropometric measures (e.g., current weight), prior fracture (i.e., since age 50 years)/falls (e.g., number of falls in past 12 months), lifestyle variables (e.g., current smoker, walking for exercise), medical history (e.g., arthritis, diabetes, Parkinson’s disease), medication use (e.g., use of bisphosphonates, anticonvulsants, oral steroid, and estrogen in past 30 days), morphometry (e.g., prevalent vertebral fracture (X-ray confirmed)), cognitive function (short mini-mental state examination (MMSE) exam score), physical function (e.g., number of functional status impairments), physical performance (e.g., chair stand, walking speed), quality of life (e.g., self-rated health, Geriatric Depression score), and vision (e.g., average contrast sensitivity score). A complete list of candidate risk factors and other predictors, and corresponding definitions, is available from the authors upon request.

Statistical analyses

Crude (unadjusted) risks of 1-year hip fracture and non-vertebral fracture were estimated for patients stratified by each potential predictor separately, as were corresponding (unadjusted) hazard ratios using a frailty/mixed-effects model (an extension of the Cox proportional hazards model that accounts for intra-cluster (i.e., intra-subject) dependencies). Continuous risk factors and predictors were also evaluated as multi-level categorical variables; thresholds separating categories for a given predictor were defined initially based on the quintiles of its distribution, and some thresholds were subsequently modified based on distributional properties of the empirical data and thresholds previously employed in published clinical research.
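The initial quintile-based categorization can be illustrated as follows. This is a sketch under stated assumptions: the helper names are hypothetical, the default interpolation of Python's `statistics.quantiles` is used (the study does not state which quantile definition was applied), and the later manual adjustment of thresholds is not represented.

```python
import statistics
from bisect import bisect_left
from typing import List, Sequence


def quintile_cutpoints(values: Sequence[float]) -> List[float]:
    """Return the four cut points that split the values into quintiles."""
    return statistics.quantiles(values, n=5)


def assign_category(value: float, cutpoints: Sequence[float]) -> int:
    """Return the category index 0..4; a value equal to a cut point
    is placed in the lower category."""
    return bisect_left(cutpoints, value)
```

For example, `quintile_cutpoints` applied to a list of walking speeds would give four thresholds, and `assign_category` would map each woman's speed to one of five levels for the unadjusted analysis.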

A multivariable frailty/mixed-effects model was employed to identify independent predictors of 1-year hip and non-vertebral fracture. All factors with a p value <0.10 in unadjusted analyses were initially included in the multivariable model; grouped multi-level factors were included if any level had a p value <0.10. Variables that were no longer important predictors in a multivariable context and those not readily measured in primary care practice (i.e., grip strength, contrast sensitivity, and depth perception) were excluded from the model. Also excluded was receipt of bisphosphonate therapy, as receipt was unusual during the SOF (only 8% of women received bisphosphonates). The importance of interactions between predictors was evaluated via the stepAIC method employing backward and forward selection. Multicollinearity, violations of the proportional hazards assumption, and model overfitting were evaluated using published methods [18, 19]. Because it was anticipated that mortality would be high given the age of the study population, death was treated as a competing risk to avoid overestimating the risk of fracture [20].
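The initial screening rule (p value <0.10, with a multi-level factor retained if any of its levels qualifies) can be sketched as below. The function name, data layout, and example p values are hypothetical and serve only to illustrate the rule; the subsequent exclusions and the stepAIC interaction search are not shown.

```python
from typing import Dict, List


def screen_predictors(
    p_values: Dict[str, List[float]], threshold: float = 0.10
) -> List[str]:
    """Keep a factor if any of its levels has an unadjusted p value
    below the threshold. Single-level factors carry one p value."""
    return [
        name
        for name, level_ps in p_values.items()
        if level_ps and min(level_ps) < threshold
    ]


# Hypothetical unadjusted p values, one entry per level of each factor.
example = {
    "walking_speed": [0.40, 0.03, 0.001],  # multi-level: kept via any level
    "current_smoker": [0.35],              # dropped
    "mmse": [0.02],                        # kept
}
selected = screen_predictors(example)
```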

Model discrimination was quantified based on the c-statistic, which is the probability that among two randomly selected patients the patient with the higher predicted risk of an event will be the first to experience the event. The c-statistic ranges from 0.5 (model discrimination is no better than chance) to 1.0 (model discrimination is perfect). A c-statistic between 0.70 and 0.80 is typically considered “acceptable,” while a value exceeding 0.80 is typically considered “excellent.”
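The pairwise definition of the c-statistic can be made concrete with a short sketch. The code below is a simplified Harrell-type concordance calculation written for illustration; it is an assumption that this form matches the study's computation, and it does not account for clustering within subjects or for the competing risk of death.

```python
from typing import Sequence


def c_statistic(
    times: Sequence[float], events: Sequence[int], risks: Sequence[float]
) -> float:
    """Concordance between predicted risk and order of events.

    A pair is usable when the subject with the shorter follow-up time
    had an observed event. The pair is concordant when that subject
    also has the higher predicted risk; tied risks count as one half.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable
```

A model that assigns every subject the same risk scores 0.5 under this definition, which is the "no better than chance" end of the range.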

A scoring system for hip fracture was developed using methodology set forth by Wilson et al. and employed in other studies [21,22,23,24]. Specifically, beta coefficients from the final multivariable model were converted to scores by multiplying each by 10 and rounding to the nearest integer. A total fracture risk score was calculated for each study subject by summing the individual scores corresponding to her risk factors; the baseline hazard function from the Cox model was then employed to convert the total risk score to a 1-year probability of hip fracture, as follows: $p(\text{hip fracture}) = 1 - 0.997^{\exp(0.1 \times \text{total risk score})}$, where 0.997 is the estimated 1-year probability of not experiencing a hip fracture, and thus 1 − 0.997 is the estimated 1-year probability of hip fracture, for persons with the lowest risk (i.e., those with a total risk score equal to 0). To verify that estimates of risk produced by the scoring system were consistent with observed data, subjects were stratified into quintiles based on their risk scores, and average risks calculated from the scoring system were compared with observed risks. Calibration was evaluated using the chi-square statistic. A corresponding system was not developed for non-vertebral fracture: although several independent risk factors for near-term non-vertebral fracture were identified in the multivariable analysis (as reported herein), model discrimination was poor, which limits the usefulness of a scoring system.
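The two steps of the scoring method (coefficient to points, total points to 1-year probability) can be sketched as follows. The baseline survival of 0.997 and the factor of 10 come from the text; the function names are hypothetical, and the coefficients passed in the usage lines are illustrative values rather than estimates from the study.

```python
import math

# Estimated 1-year probability of no hip fracture at a total score of 0.
BASELINE_SURVIVAL = 0.997


def beta_to_points(beta: float) -> int:
    """Convert a model coefficient to an integer score (beta x 10, rounded).

    Note: Python's round() sends exact .5 ties to the nearest even
    integer, which may differ from the rounding used in the study.
    """
    return round(beta * 10)


def hip_fracture_probability(total_score: int) -> float:
    """1-year hip fracture probability for a given total risk score."""
    return 1.0 - BASELINE_SURVIVAL ** math.exp(0.1 * total_score)


# Illustrative coefficients only (not taken from the study's tables).
points = [beta_to_points(b) for b in (0.52, 1.08)]
risk = hip_fracture_probability(sum(points))
```

Multiplying the total score by 0.1 inside the exponent undoes the scaling by 10, so the exponent approximates the sum of the original coefficients.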

Results

Risk factors for near-term fracture

The study population included 2499 women with osteoporosis who contributed 6811 observations, approximately 23% of the total population of women aged ≥65 years in the SOF Caucasian cohort. During the 1-year follow-up, 2.2% of study subjects had a hip fracture. In bivariate analyses, a number of risk factors were found to be important in predicting hip fracture (online supplement, Table 1).

In multivariable analyses, independent predictors of hip fracture included total hip T-score, prior non-vertebral fracture, walking speed, MMSE, and use of arms for chair stand or poor/very poor tandem stand; model discrimination based on the c-statistic was 0.71 (0.67–0.76) (Table 1). Any non-vertebral fracture (including hip) during the 1-year follow-up was identified in 6.6% of the study population (online supplement, Table 2). Independent predictors of non-vertebral fracture included age, total hip T-score, number of falls in the last 12 months, prior fracture, walking speed, Parkinson’s disease or stroke, and smoking pack-years; the model c-statistic was 0.62 (0.59–0.65) (Table 2). Multicollinearity between independent variables, non-proportional hazards, and overfitting were determined not to be significant in any of the multivariable models; consideration of interaction terms (selected via the stepAIC method) failed to improve model discrimination.

Table 1 Multivariate analyses of risk factors for 1-year hip fracture in osteoporotic women aged ≥65 years
Table 2 Multivariate analyses of risk factors for 1-year non-vertebral fracture in osteoporotic women aged ≥65 years

Fracture risk scoring system

A simplified risk scoring system for hip fracture was developed based on the beta coefficients for each predictor in the multivariable model (Table 3). Overall, the scoring system yielded an estimate of hip fracture risk that was 0.2 percentage points higher than the corresponding observed risk (2.4 vs. 2.2%) (Table 4). The ratio of predicted risk from the scoring system to observed risk ranged from 0.7 to 1.3 across quintiles of the population; absolute differences ranged from 0.1 to 1.0 percentage points. Calibration (p value = 0.411) and discrimination (c-statistic = 0.70 (0.66–0.74)) of the scoring system were good.
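The comparison of predicted and observed risk within score-ordered groups can be sketched as below. The text states only that a chi-square statistic was used; the Hosmer-Lemeshow-type form shown here is an assumption, the function name is hypothetical, and the p value is not computed because that requires a chi-square distribution function not available in the Python standard library.

```python
from typing import List, Sequence, Tuple


def calibration_by_group(
    predicted: Sequence[float], observed: Sequence[int], n_groups: int = 5
) -> Tuple[List[Tuple[float, float]], float]:
    """Return per-group (mean predicted risk, observed risk) and a
    Hosmer-Lemeshow-type chi-square statistic (an assumed form)."""
    pairs = sorted(zip(predicted, observed))  # order subjects by predicted risk
    n = len(pairs)
    rows: List[Tuple[float, float]] = []
    chi2 = 0.0
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        size = len(chunk)
        if size == 0:
            continue
        expected = sum(p for p, _ in chunk)  # expected number of events
        events = sum(o for _, o in chunk)    # observed number of events
        mean_p = expected / size
        rows.append((mean_p, events / size))
        if 0.0 < mean_p < 1.0:
            chi2 += (events - expected) ** 2 / (size * mean_p * (1.0 - mean_p))
    return rows, chi2
```

Dividing the first element of each row by the second gives the predicted-to-observed ratio reported per quintile; a small chi-square value (large p value) indicates no detectable miscalibration.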

Table 3 Scoring system for 1-year hip fracture risk among osteoporotic women aged ≥65 years
Table 4 Observed 1-year risk of hip fracture among osteoporotic women aged ≥65 years and estimates from risk scoring system

The expected 1-year risk of hip fracture for an individual patient can be ascertained by summing the scores for each of the five risk factors and comparing the total score to the corresponding expected 1-year risk of hip fracture (Table 3). For example, the total score for a woman with MMSE of 24, total hip T-score of −2.7, no prior non-vertebral fracture, walking speed of 1.1 m/s, and neither use of arms for chair stand nor poor/very poor tandem stand would be 0 (0 + 0 + 0 + 0 + 0), which corresponds to a <0.5% risk of hip fracture over a 1-year period. By contrast, the total score for a woman with MMSE of 20, total hip T-score of −3.5, history of non-vertebral fracture, walking speed of ≤0.70 m/s, and use of arms for chair stand would be 34 (5 + 8 + 5 + 11 + 5), which corresponds to a ≥5.0% risk of hip fracture over a 1-year period.
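As a check on the two examples above, the total scores can be substituted into the probability formula given in the Methods (baseline 1-year survival of 0.997, exponent equal to 0.1 times the total score). The rounded values below are computed from that formula and are not taken from Table 3.

```latex
\begin{align*}
p(0)  &= 1 - 0.997^{\exp(0.1 \times 0)} = 1 - 0.997 = 0.003 \quad (0.3\%,\ \text{i.e., } < 0.5\%) \\
p(34) &= 1 - 0.997^{\exp(0.1 \times 34)} \approx 1 - 0.997^{29.96} \approx 0.086 \quad (8.6\%,\ \text{i.e., } \geq 5.0\%)
\end{align*}
```

Both results fall in the risk bands stated in the text for total scores of 0 and 34.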

Discussion

Using data from the SOF, we identified several clinical characteristics predictive of hip and non-vertebral (any) fracture within a 1-year follow-up period among women aged ≥65 years with osteoporosis. For hip fracture, independent predictors included total hip T-score, non-vertebral fracture after age 50 years, walking speed, MMSE, and use of arms for chair stand or poor/very poor tandem stand. For non-vertebral fracture (any), independent risk factors included age, total hip T-score, fracture after age 50 years, prior falls, walking speed, Parkinson’s disease or stroke, and smoking.

In addition, because the discriminative ability of the multivariable model for hip fracture was acceptable, we incorporated the five independent risk factors, all of which are readily ascertainable in clinical practice, into an assessment tool that may be used as an aid in targeting those at elevated near-term risk for appropriate therapy. While our risk assessment tool requires validation using data from another source, we note that the discriminative ability of our hip fracture risk model (c-statistic = 0.71) compares favorably with others, including several cardiovascular disease models based on data from the Framingham Heart Study (range, c-statistic = 0.66 to 0.79) that are widely used in clinical practice and clinical research [25,26,27,28]. If validated, our risk assessment tool may be important for decision-making regarding treatment initiation, and possibly choice of osteoporosis treatment, by helping to identify patients with elevated near-term risk of fracture.

The five risk factors in our risk assessment tool are low BMD, prior fracture (i.e., since age 50 years), slow walking speed, poor cognitive function, and use of arms for chair stand. Low BMD is an important biologic factor reflecting increased fracture risk, and prior occurrence of an event (such as fracture) is a common risk factor that identifies patients who are at substantially elevated risk of recurrent events. Slow walking speed, poor cognitive function, and use of arms for chair stand are three factors that increase the risk of falling, a precipitating event in the great majority of osteoporotic hip and non-vertebral fractures [29, 30]. Two of the risk factors were quite common among our study population (walking speed ≤1.0 m/s, 71%; history of non-vertebral fracture, 67%), and none of the others was rare (total hip T-score ≤−3.0, 48%; MMSE ≤23, 33%; use of arms for chair stand or poor/very poor tandem stand, 26%). Any woman who had any two of the risk factors or decreased walking speed alone would have 2.7 to 3 times the risk of hip fracture of a woman without any of the risk factors. Women with more than two risk factors are assigned much higher risk ratios.
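The risk multiples quoted above follow from the scoring formula, in which each point adds 0.1 to the log hazard. The sketch below is illustrative: the function name is hypothetical, and the point totals of 10 (two factors assumed to be worth 5 points each) and 11 (slowest walking speed alone) are inferred from the worked example in the Results, not read from Table 3. The value returned is a hazard ratio, which approximates a risk ratio when 1-year risk is small.

```python
import math


def relative_hazard(total_score: int) -> float:
    """Hazard relative to a woman with a total risk score of 0."""
    return math.exp(0.1 * total_score)


two_factors = relative_hazard(5 + 5)   # about 2.7
slow_walking = relative_hazard(11)     # about 3.0
```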

While the c-statistic for the non-vertebral fracture model—0.62—was low, and thus model discrimination was poor, we note that model discrimination is not the only indicator of the clinical utility of a model [31,32,33,34,35]. This low c-statistic was not unexpected for two principal reasons. First, the operational definition of non-vertebral fracture in SOF is rather broad and includes several types of relatively minor fractures that are not usually considered to be osteoporotic in nature, including finger, toe, rib, and facial bone fractures, constituting 18% of all non-vertebral fractures. The inclusion of such fractures in the non-vertebral category would tend to decrease the discriminatory power of predictive models, and it was not possible to define a composite measure of non-vertebral fracture that includes only types usually considered to be osteoporotic in nature (i.e., a comprehensive set of variables identifying all fracture types was not available at the exams of interest). Second, the risk factors for various types of osteoporotic fractures may be variable, with non-vertebral fractures being a more heterogeneous outcome [36,37,38,39]. For example, more active women appear to be at higher risk of wrist fracture, as compared with frail, inactive women, who are at higher risk of hip fracture [38]. Thus, combining the various types of non-vertebral fracture—including non-osteoporotic fractures—into a single outcome measure for a predictive model would be expected to lower its discriminatory power.

The risk factors that we identified for hip fracture differed in part from those identified in a prior study of risk factors for hip fracture in elderly women with osteoporosis based on SOF data [17]. In the prior study, baseline risk factor values from exam no. 2 were used to predict 5-year risk of hip fracture, and 16 risk factors were considered for their risk prediction model. Not included among these 16 were MMSE score, walking speed, and tandem stand, all of which were considered in our study. In the 5-year multivariable model, six risk factors were significant: previous hyperthyroidism; distance depth perception (lowest quartile); contrast sensitivity, low frequency (lowest quartile); grip strength; prevalent vertebral fracture; and total hip BMD. Of those that were not included in our model, previous hyperthyroidism was not significant in our bivariate analysis. On the other hand, grip strength, contrast sensitivity, and depth perception were significant, but these three factors were excluded from our model because they are not typically assessed in primary care practice.

Differences between these two studies in the list and strength of risk factors—presumably, at least in part, due to their dynamic nature over time—highlight how risk prediction may vary in important ways when focusing on near-term risk for fractures versus risk of fractures occurring over a longer period of time. It is notable, for example, that age—which is universally acknowledged as an important risk factor for osteoporotic fractures—was found in the above-referenced SOF study to be an independent predictor of 5-year hip fracture risk but was found in the current study to be predictive of 1-year hip fracture risk only in unadjusted analyses (i.e., in analyses not considering dynamic factors that may better “explain” near-term fracture risk). In fact, in a multivariable framework incorporating dynamic predictors, age was not found to be an independent predictor of 1-year hip fracture risk in the current study, suggesting that other (i.e., dynamic) factors may be more important in predicting near-term fracture [17]. Long-term risk is not evenly distributed, since annual incidence of fracture increases with age. Moreover, since the existence, levels, and strength of baseline risk factors may vary over time, the significance and strength of risk factors for near-term fracture may differ from those for long-term fracture. Differences between these two studies also highlight the need for a pragmatic approach to risk classification in clinical practice, one that focuses less on ascertaining values for specific variables (many of which are challenging to measure in clinical practice, such as the MMSE) and more on evaluating risk factors from a broader, more general, and more practical perspective.

Our study has a number of limitations. First, the fact that our study population comprised self-selected, community-dwelling volunteers, lacking in ethnic diversity, limits the generalizability (external validity) of our results in unknown ways. Second, changes in practice patterns, technology, and other unobservable factors since 1986 (the year SOF commenced) may impact the generalizability of study findings to current clinical practice. We note, however, that the importance of risk factors for fracture did not vary over the approximately 20-year follow-up period when evaluated using time-dependent variables. Third, recall bias as to family and personal medical histories may have occurred. Fourth, although the 1-year timeframe gives our models several advantages over those with longer timeframes (as mentioned in the introduction), it does have the disadvantage of permitting a greater role for chance in determining the occurrence of fractures. This factor may have detracted somewhat from the discriminatory power of our models. Finally, we note that because our objective was to use all available data to develop a robust risk scoring system, we chose not to split our sample for purposes of validation, and we were unable to validate the risk scoring system using data from a different source.

In conclusion, prior fracture, prior falls, lower BMD, and/or selected factors associated with falls (physical/cognitive dysfunction) help to identify elderly osteoporotic women who have an elevated near-term risk of fracture. Given the dynamic nature of these risk factors, careful consideration should be given to identifying this population and evaluating their fracture risk frequently (e.g., at least annually) so that those at elevated risk may be targeted for appropriate therapy. Additional research using data from other large populations of women with osteoporosis is needed to validate the applicability and accuracy of our scoring system.