Introduction

Osteoporosis, which is characterized by reduced bone mass and micro-architectural deterioration leading to increased bone fragility [1, 2], affects approximately 200 million people worldwide [3]. In 2020, the National Osteoporosis Foundation reported that approximately 54 million Americans, of all ages, are living with osteoporosis or low bone mass [4].

Bone turnover markers (BTMs) can be measured in serum, plasma, and urine [5], with bone formation and bone resorption marker levels relating to osteoblast and osteoclast activity, respectively. Bone formation markers include proteins such as osteocalcin or procollagen type I N propeptide (PINP), and the bone isoform of alkaline phosphatase (bone ALP). Bone resorption markers include fragments released from the telopeptide end region of type I collagen following its enzymatic degradation, such as the N-telopeptide of type I collagen (NTX), carboxy-terminal crosslinking telopeptide of type I collagen (CTX), deoxypyridinoline (DPD), and the enzyme tartrate-resistant acid phosphatase [6].

The International Osteoporosis Foundation and International Federation of Clinical Chemistry and Laboratory Medicine (IOF-IFCC) Bone Markers Working Group has identified CTX and PINP as promising markers for providing clinically useful information for monitoring osteoporosis treatment [7], and recommends that CTX and serum PINP, measured by standardized assays, be used as reference markers in observational and interventional studies [2]. American Association of Clinical Endocrinologists/American College of Endocrinology guideline recommendations for BTMs also advise use of CTX and PINP as monitoring tests for osteoporosis treatment [8], as do National Osteoporosis Foundation (NOF) guidelines [9]. An IOF and European Calcified Tissue Society taskforce has also suggested that PINP and CTX screening may be used to detect lack of adherence to oral bisphosphonate therapy [10].

Beyond their use in monitoring osteoporosis treatment [11] and patients during treatment holidays [12, 13], BTMs have been studied as predictors of fracture: a meta-analysis of published studies has shown that low levels of BTMs are modestly associated with reduced fracture risk [5]. A few studies have measured BTMs prior to hip fracture events [5, 14], with conflicting results showing both positive [15] and negative [16] associations between BTM levels and the risk of osteoporosis-related hip fracture. In clinical practice, the use of BTM levels in predicting fracture outcomes is further complicated by significant within-patient variability of BTM levels due to patient age [17], comorbid conditions such as diabetes and chronic kidney disease [11], or ethnicity [18]. Sources of variability in BTM levels should be considered when interpreting test results. Particular attention should be paid to the appropriate use of reference intervals for determination of abnormal results, specifically related to the age and sex of the patient [19].

Most reports on the use of BTMs in clinical practice have been single-site or small multi-site studies [20] whose results may not be broadly applicable to medically insured patients with osteoporosis in the USA. To help address this gap, we conducted an investigation using real-world data from a large patient population with osteoporosis in the USA. Our aims were threefold: (1) to assess trends in BTM test utilization; (2) to characterize the patterns of BTM testing and baseline characteristics of a heterogeneous population of patients in clinical practice; and (3) to estimate the potential clinical utility of BTM testing for treatment decision-making and its association with fragility fracture.

Methods

Study design and data source

We undertook a population-based retrospective cohort analysis of patients enrolled in the Truven MarketScan® Commercial Claims and Encounters and Medicare Supplemental and Co-ordination of Benefits databases. These databases consist of the outpatient, inpatient, and pharmaceutical claims of approximately 50 million privately insured individuals and their dependents receiving care annually in the USA. Claims originated from more than 150 large employer-sponsored health insurance plans with patient coverage in all 50 states. The Medicare Supplemental and Co-ordination of Benefits databases represent commercially insured individuals, who have both Medicare coverage and supplemental employer-sponsored coverage, for Medicare-eligible active and retired employees and their Medicare-eligible dependents from employer-sponsored Medicare Supplemental plans. All data were anonymized to comply with the Health Insurance Portability and Accountability Act (HIPAA) and the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. Thus, Institutional Review Board approval was not required and formal informed consent was not obtained.

Study design and patient selection

Osteoporosis was defined at baseline in adult patients aged ≥ 50 years who were enrolled in a health plan with pharmaceutical coverage from January 1st 2008 to June 30th 2018. Osteoporosis was defined based on first recorded event according to (1) ≥ 1 inpatient or 2 outpatient claims (≥ 30 to ≤ 360 days apart) for osteoporosis, as defined under the International Classification of Diseases, Ninth (ICD-9-CM) or Tenth Revision (ICD-10-CM); (2) ≥ 1 claim for US Food and Drug Administration (FDA)-approved osteoporosis treatment (National Drug Code [NDC], Healthcare Common Procedure Coding System J- or C-codes); or (3) a fragility fracture considered to be associated with osteoporosis [6, 21, 22] (Online Resource Table 2). For hip fracture claims, ≥ 1 inpatient claim was required, and for other fracture types, ≥ 1 inpatient claim or ≥ 2 outpatient claims, ≥ 30 to ≤ 360 days apart (ICD-9-CM, ICD-10-CM codes, or Current Procedural Terminology [CPT] codes) were required. Patients were excluded if a claim of malignant neoplasm (excluding non-melanoma skin cancers), Paget’s disease of bone, or chronic kidney or end-stage renal disease was made at any time during the study period, to avoid misidentification of patients treated with medications as osteoporotic and inclusion of patients with malignancy-related fractures (Online Resource Table 3).

The index date was defined differently for BTM-tested and untested patients. In both cases, patients were required to have ≥ 360 days of continuous enrollment before the index date (the baseline, or washout, period) and ≥ 360 days of follow-up (Fig. 1), with an allowable gap in coverage of ≤ 30 days. For those tested, the index date was defined as the date of the first BTM claim following osteoporosis diagnosis based on corresponding CPT codes for BTMs: osteocalcin (83937), bone-specific ALP (84080), and collagen cross-links (any method, 82523). PINP was not included in the present study because the CPT code under which it is billed (83519) is not unique to PINP and therefore cannot accurately classify receipt of this test. Untested patients were randomly assigned an index date based on a uniform distribution and the following criteria: to ensure adequate follow-up, the index date was required to fall before the final 360 days of data capture for the patient, and to ensure a sufficient baseline washout period, the index date was required to fall after the first 360 days of data capture. Each patient was followed forward from the index date until an observed outcome, the end of continuous enrollment, reported death, or study end, whichever occurred first.
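The random assignment of index dates to untested patients can be sketched as follows. This is a minimal illustration in Python, not the study's SAS code: the function name `assign_index_date`, the per-patient enrollment bounds, and the fixed random seed are assumptions made for the example.

```python
import random
from datetime import date, timedelta

WASHOUT_DAYS = 360    # baseline enrollment required before the index date
FOLLOW_UP_DAYS = 360  # follow-up enrollment required after the index date


def assign_index_date(enroll_start, enroll_end, rng):
    """Draw an index date uniformly from the eligible enrollment window.

    Returns None when enrollment is too short to contain both the
    washout period and the follow-up period.
    """
    earliest = enroll_start + timedelta(days=WASHOUT_DAYS)
    latest = enroll_end - timedelta(days=FOLLOW_UP_DAYS)
    if earliest > latest:
        return None
    span = (latest - earliest).days
    # randint is inclusive at both ends, so every eligible day is equally likely
    return earliest + timedelta(days=rng.randint(0, span))


rng = random.Random(2018)  # hypothetical seed, used only for reproducibility
index = assign_index_date(date(2010, 1, 1), date(2014, 12, 31), rng)
too_short = assign_index_date(date(2010, 1, 1), date(2011, 6, 1), rng)
```

Drawing from the window between the end of the washout period and the start of the final 360 days gives every untested patient the same guaranteed baseline and follow-up time as a tested patient.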

Fig. 1 Study design schema of patients with osteoporosis aged ≥ 50 years enrolled in Truven MarketScan Commercial Claims and Encounters and Medicare Supplemental and Co-ordination of Benefits databases, 2008–2018

Defining study outcomes

Study outcomes were compared between BTM-tested and untested groups in the follow-up period. Further details on how study outcomes were defined are provided in the Online Resource Methods section.

Outcome 1: Treatment decision-making

Osteoporosis treatments approved by the FDA during the study period were explored according to therapeutic mode of action: anti-resorptive (bisphosphonates, estrogen/progesterone, selective estrogen receptor modulators [SERMs], calcitonin, denosumab), or anabolic (teriparatide). Both injectable and oral routes were considered along with respective days of expected coverage or supply (i.e., 90/180/360 day intervals with allowable coverage gap of ≤ 30 days). Patients were classified according to the next observed treatment decision following a BTM test according to the days of coverage or supply: no treatment prescribed, treatment initiated, continue on the same treatment, restart following a treatment gap of > 30 days, between-class switch, or treatment discontinuation.
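The decision categories above can be illustrated with a small classifier. This is only a sketch under simplifying assumptions: the full definitions are in the Online Resource Methods, and the function name `classify_decision`, its inputs, and the order in which the rules are checked are hypothetical.

```python
GAP_DAYS = 30  # allowable coverage gap stated in the text


def classify_decision(prior_class, next_class, gap_days=0):
    """Label the next treatment decision observed after a BTM test.

    prior_class, next_class: "anti-resorptive", "anabolic", or None
        (None means no treatment before / after the test).
    gap_days: days between the end of the prior supply and the next fill.
    """
    if prior_class is None:
        return "treatment initiated" if next_class else "no treatment prescribed"
    if next_class is None:
        return "treatment discontinuation"
    if next_class != prior_class:
        return "between-class switch"
    if gap_days > GAP_DAYS:
        return "restart following a treatment gap"
    return "continue on the same treatment"


decision = classify_decision("anti-resorptive", "anti-resorptive", gap_days=45)
```

In this sketch a change of therapeutic class takes precedence over the gap rule, so a patient who returns on a different class after a long gap is counted as a switch; that ordering is an assumption, not a detail confirmed by the text.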

Outcome 2: Fragility fracture

Occurrence of a fragility fracture following index was assumed to be associated with osteoporosis and was classified according to methods reported by Song et al. [23]. Incident claims were captured by the presence of a qualified diagnosis of closed fractures of sites that may be associated with an increased risk of fracture, predominantly spine, hip, pelvis, and upper leg fractures [24]. Fractures were defined as those with ≥ 1 inpatient claims (primary or secondary discharge diagnosis) for hip fractures, and ≥ 1 inpatient (primary or secondary discharge diagnosis) or ≥ 2 outpatient claims (30–180 days apart) for all other sites. Fractures that were most likely the result of serious trauma were excluded, including compound or open fractures, multiple fractures within 7 days of a single claim, and vertebral fractures with concurrent spinal cord injury. Analyses were based on determination of the first claim for an osteoporotic fracture following index and any subsequent BTM event thereafter.
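The claim-count rule for qualifying fractures can be expressed as a short check. The sketch below is illustrative only: the function name `qualifies_as_fracture` and the way claims are passed in as lists of dates are assumptions, and the trauma exclusions described above are not implemented.

```python
from datetime import date


def qualifies_as_fracture(site, inpatient_dates, outpatient_dates):
    """Apply the claim-count rule for a single fracture site.

    Hip: at least one inpatient claim.
    Other sites: at least one inpatient claim, or at least two
    outpatient claims dated 30 to 180 days apart.
    """
    if inpatient_dates:
        return True
    if site == "hip":
        return False
    ordered = sorted(outpatient_dates)
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if 30 <= (second - first).days <= 180:
                return True
    return False


# Two outpatient wrist claims 45 days apart satisfy the rule
wrist = qualifies_as_fracture("wrist", [], [date(2015, 3, 1), date(2015, 4, 15)])
# Outpatient-only hip claims do not
hip = qualifies_as_fracture("hip", [], [date(2015, 3, 1), date(2015, 4, 15)])
```

Requiring two outpatient claims separated in time is a common way to reduce the chance that a single rule-out or follow-up visit is counted as an incident fracture.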

Statistical analysis

Descriptive statistics were used to characterize baseline socio-demographic and clinical characteristics of the osteoporotic patient cohort, summarizing continuous variables with means (standard deviation [SD]) or medians (interquartile range [IQR]) and categorical variables with counts and proportions (percentages). To explore trends in testing over calendar time [25], the annual period prevalence and associated 95% confidence intervals (CIs) were estimated among tested patients from 2008 forward. To evaluate longitudinal trends, the Cochran-Armitage test for trend and average annual percentage change (AAPC) were reported [26]. To account for variable enrollment in the MarketScan databases over time, the numerator was defined as the number of patients with one or more BTM tests and the denominator as total enrollment in a calendar year.
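The two trend quantities described above can be sketched in a few lines of Python. This is a simplified illustration rather than the study's method: the text does not state how the confidence intervals or the AAPC were computed, so the normal-approximation (Wald) interval and the single-segment log-linear fit used here are assumptions, as are the function names and the example counts.

```python
import math


def period_prevalence(tested, enrolled, z=1.96):
    """Annual period prevalence per 100 persons with a Wald 95% CI."""
    p = tested / enrolled
    half_width = z * math.sqrt(p * (1 - p) / enrolled)
    lower = max(p - half_width, 0.0)
    upper = min(p + half_width, 1.0)
    return 100 * p, 100 * lower, 100 * upper


def aapc(years, rates):
    """Average annual percentage change from a least-squares fit of
    ln(rate) on calendar year (single segment, no joinpoints)."""
    n = len(years)
    logs = [math.log(r) for r in rates]
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    sxx = sum((x - mean_x) ** 2 for x in years)
    slope = sxy / sxx
    return 100 * (math.exp(slope) - 1)


# Hypothetical counts: 47 tested patients among 10,000 enrollees in one year
prev, lo, hi = period_prevalence(47, 10_000)

# Hypothetical series growing by exactly 5% per year
years = [2008, 2009, 2010, 2011, 2012]
rates = [0.20 * 1.05 ** k for k in range(5)]
change = aapc(years, rates)
```

Because the denominator is total enrollment in each calendar year, the prevalence is not distorted by year-to-year changes in the size of the database, which is the motivation given in the text for this definition.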

To examine the association of index testing with treatment decisions and fragility fracture, a multivariable logistic propensity score model conditioned on values at index (age, sex, year, region of care, insurance type, provider type [for treatment decision outcome model only]) and baseline (Charlson Comorbidity Index [CCI], score 1, 2, 3, or 6, where a higher score indicates a greater risk of 1-year mortality associated with more severe and/or greater co-morbidity burden) was fit [27]. The propensity score was used to match tested and untested patients using a fixed 1:1 ratio and nearest neighbor matching without replacement [25]. Propensity scores represent the conditional probability of assignment to the tested group and may be used to control for multiple observed covariates that are associated with the exposure and outcome [28]. That is, conditional on the propensity score, patients are assumed to have been tested or not by chance, and propensity score matching represents a non-parametric way to control for selection bias. Adequacy of matching in terms of patients’ baseline characteristics was evaluated using standardized differences; a value of < 0.1 was assumed to indicate a negligible difference in the characteristics between tested and untested patients [29]. A doubly robust method [30] was used where, in addition to the propensity score matching, generalized estimating equation (GEE) models were fit to estimate odds ratios (ORs) and 95% CIs for the association of testing with treatment decision-making and fragility fracture. A binomial distribution and logit link function were specified for both models, which were fit with unstructured correlation structures selected based on the quasi-likelihood information criterion [31]. Additional covariates were not included in models as groups were well-balanced on baseline characteristics.
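The matching and balance-checking steps can be sketched as below. This is a minimal illustration, not the study's SAS implementation: it assumes the propensity scores have already been estimated, uses a greedy nearest neighbor search with no caliper, and processes tested patients in the order given. The function names and the example scores are hypothetical.

```python
import math


def match_nearest_neighbor(tested, untested):
    """Greedy 1:1 nearest neighbor matching without replacement.

    tested, untested: dicts mapping patient id -> propensity score.
    Returns a list of (tested_id, untested_id) pairs.
    """
    available = dict(untested)  # copy so the caller's dict is not modified
    pairs = []
    for t_id, t_score in tested.items():
        if not available:
            break
        u_id = min(available, key=lambda k: abs(available[k] - t_score))
        pairs.append((t_id, u_id))
        del available[u_id]  # without replacement: each control is used once
    return pairs


def standardized_difference(x_tested, x_untested):
    """Standardized mean difference for a continuous covariate."""
    def mean(values):
        return sum(values) / len(values)

    def variance(values):
        m = mean(values)
        return sum((v - m) ** 2 for v in values) / (len(values) - 1)

    pooled_sd = math.sqrt((variance(x_tested) + variance(x_untested)) / 2)
    return (mean(x_tested) - mean(x_untested)) / pooled_sd


# Hypothetical propensity scores
tested = {"T1": 0.30, "T2": 0.62}
untested = {"U1": 0.10, "U2": 0.33, "U3": 0.60, "U4": 0.90}
pairs = match_nearest_neighbor(tested, untested)

# Hypothetical ages in two groups; an absolute value < 0.1 indicates balance
smd = standardized_difference([61, 63, 65], [60, 62, 64])
```

Greedy matching depends on the order in which tested patients are processed; an optimal matching algorithm or a caliper would behave differently, and the text does not say which variant was used.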

All statistical tests were two-sided and significance was determined using α = 0.05. Analyses were conducted in SAS version 9.4 (SAS Institute, Inc., Cary, NC).

Results

Patient cohort

From 2008 to 2018, 457,829 individuals were classified as presumed osteoporotic (Fig. 2). Following application of inclusion criteria, 6075 patients (1.3%) were identified with one or more BTM test claims on or following diagnosis. Among all patients with osteoporosis, cohort entry declined over calendar time (Table 1), reflecting the year-over-year decline in patients enrolled in Truven MarketScan. At the time of diagnosis, median age was 62 years (IQR: 57–74), with the majority of patients classified as female (79.6%), and 19.2% of patients having CCI scores ≥ 2 (Table 1). Claims were most frequent from the South (33.7%) or North Central (29.9%) USA, while preferred provider organization (PPO) insurance coverage was common (49.7%). Compared with those untested, patients with BTM claims during follow-up were slightly younger at osteoporosis diagnosis; had lower CCI scores; were more likely to have PPO insurance coverage; and had higher proportions of diagnoses made by endocrinologists, rheumatologists, or primary care providers. Similarly, they were more likely to have an explicit osteoporosis diagnosis claim, not be covered via Medicare, have at least one bone mineral density (BMD) test during baseline, and have longer follow-up.

Fig. 2 Cohort attrition of patients with osteoporosis enrolled in Truven MarketScan Commercial Claims and Encounters and Medicare Supplemental and Co-ordination of Benefits databases, 2008–2018, and 1:1 propensity score matching between those tested and untested for bone turnover markers. BTM, bone turnover marker

Table 1 Characteristics at index or baseline of patients with presumed osteoporosis diagnosis and enrolled in Truven MarketScan Commercial Claims and Encounters and Medicare Supplemental and Co-ordination of Benefits databases, 2008–2018 (matched and all patients)

Following application of the propensity score model, 6075 BTM-tested patients were matched to 6075 untested patients (Table 1). Matched tested and untested patients were well-balanced on their baseline characteristics, with none exhibiting a standardized difference of > 0.1.

Real-world bone turnover marker test patterns

Among the 6075 tested patients, 8828 unique claims were made during the study period, with the majority being markers of resorption (76.6%; Table 2). In total, 14.4% (n = 875) of patients had concurrent claims for both resorption and formation markers. The annualized period prevalence of testing per 100 persons ranged from 0.23 (95% CI: 0.19–0.28) in 2008 to 0.47 (95% CI: 0.45–0.50) in 2018 (Fig. 3). During the study period, the proportion of patients tested increased year-over-year (Cochran-Armitage test for trend, p = 0.03), with most of the increase occurring in the latter half of the study period (2015 onwards) and with an AAPC of 8.1% (95% CI: 5.6–9.0; p = 0.01). The AAPC in prevalence for resorption markers was 4.2% (95% CI: 3.7–3.9; p = 0.04) and for formation markers, it was 6.9% (95% CI: 5.9–7.2; p = 0.02). No substantial difference in annual testing trends was observed when the analysis was repeated by age group deciles and sex (data not shown).

Table 2 Characteristics of osteoporotic patients tested with bone turnover marker and enrolled in Truven MarketScan Commercial Claims and Encounters and Medicare Supplemental and Co-ordination of Benefits databases, 2008–2018

Fig. 3 Annual period prevalence (per 100 persons) of bone turnover marker testing and average testing per patient among patients with osteoporosis enrolled in Truven MarketScan Commercial Claims and Encounters and Medicare Supplemental and Co-ordination of Benefits databases, 2008–2018

On average, patients had 2.2 test claims (SD 2.0) during the study period (Table 2), which remained stable irrespective of year of diagnosis (data not shown). The median number of claims suggests a non-normal distribution (median 1.0; IQR: 1.0–3.0), with only 593 patients (9.8%) having ≥ 3 BTM claims during follow-up. Follow-up BTM and dual-energy X-ray absorptiometry testing after osteoporosis diagnosis are recommended by clinical guidelines [8]; therefore, patterns of repeat testing were examined for those with > 1 test following index. Median time from osteoporosis diagnosis to first BTM claim was 160 days (IQR: 37–471), and for those with two or more claims the median interval between claimed tests was approximately 220 days. Approximately 30% of all tests were ordered by endocrinologists, rheumatologists, and primary care providers, with the majority of claims in the non-ambulatory or hospital or clinical setting.

Impact of bone turnover markers on treatment decision-making and fragility fracture

In total, 1345 patients (22%) had a unique treatment decision within 30 days of BTM testing. Treatment decisions most commonly involved anti-resorptives (89.1%), followed by anabolic (5.6%) and combination therapies (6.3%). These decisions comprised treatment initiation (4.9%), continuation on the same treatment (8.4%), re-starting the same treatment following a gap of > 30 days (0.6%), and treatment discontinuation (8.2%). No treatment switches were observed for tested patients. In the GEE model of treatment decision-making fit to the propensity score matched cohort, tested patients were significantly more likely to have a treatment decision within 30 days compared with those untested (OR 1.14; 95% CI: 1.13–1.15). To further understand this observed effect, we conducted a post-hoc analysis of treatment decision-making by category of decision (new treatment, continuation, treatment restart, treatment switch, discontinuation). Assessment of BTMs was significantly associated with the decisions to re-start treatment within 30 days of testing (OR 2.67; 95% CI: 2.51–2.93), to continue treatment (OR 1.03; 95% CI: 1.03–1.04), and to discontinue treatment (OR 1.03; 95% CI: 1.02–1.04). While no statistically significant association was observed for the decision to initiate treatment (OR 1.01; 95% CI: 1.00–1.01) or to switch treatment following testing (OR 1.02; 95% CI: 1.00–1.04), the point estimates suggest potential weak clinical significance.

The impact of testing on fracture events was also explored. A total of 1409 tested patients (23.2%) had a fragility fracture assumed to be due to osteoporosis following index, and these patients accounted for 3236 unique fracture events during the study period. The most common fracture type was wrist/forearm (562 events, 17.4%), followed by hip (440, 13.6%), vertebra (429, 13.2%), and femoral (381, 11.8%). In the model predicting fragility fracture following a BTM test, testing was associated with lower odds of fracture compared with untested patients (OR 0.87; 95% CI: 0.85–0.88).

Discussion

To our knowledge, this is the first US nationwide epidemiological study of BTM testing among patients with presumed osteoporosis. We analyzed data from persons with a presumed osteoporosis diagnosis in the USA from 2008 to 2018 and observed that the annual proportion tested using BTMs rose from 0.23 per 100 patients in 2008 to 0.47 per 100 patients in 2018, with most of the increase occurring in the latter half of the study period. The observed rise in testing is encouraging, yet testing rates remain below what international guidelines recommend for monitoring response to therapy. Among various BTMs, serum CTX-I and serum PINP have recently been recommended as monitoring tests for osteoporosis treatment by several osteoporosis guidelines, including those of the NOF, the Japanese Osteoporosis Society, and the IOF [2, 9, 32].

BTMs may be employed as clinical tools for treatment decision-making at several important junctures of osteoporosis treatment. For example, baseline measurements of resorption and formation markers before commencement of anti-resorptive and anabolic therapies, respectively, are of utility in monitoring treatment response and adherence. BTMs are also of potential clinical value in deciding whether patients should resume therapy following treatment holidays, and for monitoring patients during these periods [33]. Our results suggest that BTM testing was significantly associated with the decision to re-start treatment for osteoporosis within 30 days of testing, to continue treatment, or to discontinue treatment. Published literature substantiates BTMs as having considerable utility in treatment decision-making in patients with osteoporosis [11]. In particular, measurement of BTMs can reflect response to therapies earlier than measurement of BMD, and can be used to monitor treatment compliance [6, 34]. PINP or CTX may be used to identify treatment responders and non-responders, and as a marker of poor patient adherence to common osteoporosis treatments [35, 36].

Our analysis showed that, among patients with osteoporosis, BTM testing was associated with lower odds of fracture compared with no testing. This association could potentially be due to turnover data leading to changes in pharmacotherapy that reduce fracture risk. Supporting this, it has previously been reported that high levels of the BTMs NTX, DPD, and CTX are predictive of subsequent risk of hip fracture in women aged ≥ 75 years, independently of hip BMD [14]. High levels of NTX, DPD, CTX, and serum bone ALP have also been shown to be associated with increased risk of osteoporotic fracture in post-menopausal women, independently of BMD [37, 38]. BTM testing offers potential advantages versus traditional BMD testing, as the latter does not completely capture the risk of osteoporotic fracture, and the use of serial BMD measurements to assess treatment response requires an interval of more than a year. Bone turnover, by contrast, changes early and can be assessed within 3 months of starting treatment [34]. BTM measurements are also repeatable, relatively inexpensive, and non-invasive [39], potentially lowering the cost of care [40] and decreasing patient inconvenience as opposed to BMD testing. However, unlike BMD, BTM measurements are subject to a number of pre-analytical variations, including seasonal and diurnal variations [41].

As with all observational studies, and especially with studies using commercial insurance claims databases where changes in enrollment (including left censoring) and loss to follow-up (≥ 20%) [42] reduce the sample size of longitudinal studies, the results of the present study should be interpreted with caution. Firstly, the study provided an overall picture of BTM testing and it was not the intention of the claims data mining to determine which BTMs were being tested. It is, therefore, not possible to specify which BTMs are associated with an impact on treatment or predict fragility fracture risk. As previously mentioned, the CPT code under which PINP is billed (83519) is not unique to PINP and cannot accurately classify receipt of this test. Serum osteocalcin was included in this analysis and has been shown to correspond well with levels of PINP [33]. Secondly, outpatient claims may be recorded by a variety of staff with limited clinical training; therefore, misclassification is possible. In this study, the inclusion criterion for incident presumed osteoporosis diagnosis was based on > 1 claim, which may minimize the risk of misclassification bias. Finally, administrative claims do not provide insight into individual test results, which may be drivers of the observed association, or into potential confounders not captured in the present database that may have biased the observations. The strengths of our study include the use of a large, longitudinal claims database from which we were able to analyze a heterogeneous, real-life population of patients in terms of decisions made about their treatment and incident fracture outcomes. MarketScan is a large, nationally representative database of individuals receiving employer-sponsored healthcare insurance, and the coding of inpatient claims in the USA is typically performed reliably by professional coders.

Conclusions

In this large, heterogeneous sample of US-based patients with presumed osteoporosis, we found that BTM testing was associated with both treatment decision-making and lower odds of fragility fracture following testing, findings which are consistent with published literature. While further investigation to validate the findings and understand the drivers is warranted, the results presented here provide further evidence of the value of in vitro diagnostic BTM testing for monitoring patients with osteoporosis.