Introduction

Osteoporotic fractures (OF) are a major global public health concern, associated with significant clinical, economic, and societal burden worldwide, with various healthcare and treatment approaches among countries [1, 2]. An important component of the direct economic burden of OF is the direct cost of care, which includes inpatient/outpatient care, treatments, use of rehabilitation and equipment, nursing facilities, and other medical and pharmacy services. A retrospective US study of over 300,000 OF patients estimated an average of $34,855 in annual all-cause healthcare costs per patient [1]. The OF-related costs in the Australian healthcare system in 2012 were estimated as $1.6 billion in direct costs [3]. The OF-related direct costs in six European countries (France, Germany, Italy, Spain, Sweden, and the UK) amounted to €37.5 billion in 2017 [4]. In South Korea, the OF-related direct cost to the national health system increased from $549 million in 2008 to $722 million in 2011 [5]. The variation in cost estimates among countries could be attributed to differences in country-specific healthcare and payer systems, socioeconomic determinants of health, geopolitical factors, national wealth, and health status of the population [6,7,8], as well as differences across studies in the fracture sites of interest, data availability, and methodology. These differences lead to challenges in interpreting and comparing economic evaluation involving healthcare resource utilization (HCRU) and costs of care across studies. Hence, to facilitate appropriate interpretations and fair comparisons of OF-related HCRU and cost data across studies, it is critical to account for country-specific differences in study design and participant selection, along with national differences in healthcare systems and costs. 
Most publications documenting the economic burden of OF are based on data from individual countries; therefore, there is a shortage of directly comparable multi-country real-world evidence. Addressing the inconsistencies due to country-specific and methodological variations would not only highlight the magnitude of economic burden of OF, but also aid appropriate economic evaluation and healthcare resource allocation and prioritization.

Here, to overcome several of these shortcomings, we propose a standardized methodological approach using real-world datasets for economic evaluation of the burden of OF compared with its non-OF pair across multiple countries. We aimed to illustrate the advantages of using a standardized methodology by presenting HCRU and cost data from Australia, Germany, South Korea, Spain, and the USA to provide global insights from patients across North America, Europe, Australia, and Asia. This study may also be used as a reference point for further in-depth country-specific publications.

Materials and methods

Overview of study design

This multinational, retrospective matched cohort study assessed the direct economic burden of OF from 2013 to 2020 in women aged ≥ 50 years from Australia, Germany, South Korea, Spain, and the USA. Within each country, HCRU and cost data derived from patients with newly identified OF (OF cohort) and matched patients without OF (non-OF cohort) were compared. The index date for the OF cohort was defined as the date of the first incident OF within each 6-month calendar period during the index period, and the same date was then assigned as the index date for the matched non-OF cohort. The pre-index period for each patient corresponded to at least 18 months preceding the index date, during which baseline information, such as comorbidities, medication use, pre-fracture HCRU, and costs of care, was collected. The follow-up period for each patient was defined as the period extending from the index date to the end of follow-up (defined as the earliest of the end of the study period, death (if available), fracture event (applicable to non-OF cohort), or loss to follow-up). Only fractures within each 6-month matching period (called “panel” or 6-month index period) were used as the index fracture for the OF cohort. If a fracture occurred after the 6-month matching period and within the follow-up of up to 5 years, it was considered either a recurrent event for the OF cohort or an event to be censored for the non-OF cohort. The follow-up period ranged from a minimum of 1 month to 5 years from the index date (Fig. 1).
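The end-of-follow-up rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's actual code: the function and parameter names (`follow_up_end`, `disenrollment`, `fracture_after_index`) are hypothetical, and loss to follow-up is assumed to be represented by a disenrollment date.

```python
from datetime import date
from typing import Optional

MAX_FOLLOW_UP_YEARS = 5  # follow-up is capped at 5 years from the index date


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def follow_up_end(
    index_date: date,
    study_end: date,
    death: Optional[date] = None,
    disenrollment: Optional[date] = None,  # hypothetical proxy for loss to follow-up
    fracture_after_index: Optional[date] = None,
    is_of_cohort: bool = True,
) -> date:
    """Return the earliest of the events that close a woman's follow-up."""
    candidates = [study_end, add_years(index_date, MAX_FOLLOW_UP_YEARS)]
    if death is not None:
        candidates.append(death)
    if disenrollment is not None:
        candidates.append(disenrollment)
    # A later fracture censors only non-OF women; for OF women it is a
    # recurrent event and does not end follow-up.
    if not is_of_cohort and fracture_after_index is not None:
        candidates.append(fracture_after_index)
    return min(candidates)
```

For example, a non-OF woman with a fracture during follow-up is censored on the fracture date, whereas an OF woman with the same dates is followed to the end of the study period or the 5-year cap, whichever comes first.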

Fig. 1

Study design. The study periods for the Germany and US studies reflect the latest available data at the time of the study and are different from dates shown above. OF, osteoporotic fracture. *Study period for Germany: July 1, 2013, to December 31, 2018; for the USA: July 1, 2013, to December 31, 2020. **Subject identification period for Germany: July 1, 2014, to November 30, 2020; for the USA: January 1, 2015, to November 30, 2020. There were 12 panels during the subject identification period; each panel spanned 6 months for identifying subjects, except the last panel, which spanned 5 months to allow a 1-month follow-up period

Study population

The OF cohort was defined as patients meeting all the following criteria: (i) women aged ≥ 50 years who experienced an incident OF (i.e., low or no trauma fracture; Supplementary Table 1) during each 6-month index period/panel, (ii) with continuous enrollment in the database for ≥ 18 months prior to and ≥ 1 month after the index date, and (iii) without an OF in the 18-month pre-index period. Claims with diagnosis codes for fractures after the 18-month fracture-free period were treated as incident fractures. A hierarchical approach was used to categorize the type of index fracture for a patient with two different OFs on the same day, giving precedence to fractures in the following order: hip, vertebral, pelvis, tibia-fibula, radius-ulna, humerus, ribs, clavicle, and scapula. Diagnosis codes for trauma (Supplementary Table 2) were used to exclude patients with a traumatic fracture during each 6-month index period. To minimize the potential inclusion of fractures related to traumatic accidents/events but not recorded as traumatic, women with three or more types of fractures on the index date were excluded from the selection process. The non-OF cohort was defined as patients meeting the following criteria: (i) women aged ≥ 50 years, (ii) with continuous enrollment in the database for ≥ 18 months prior to and ≥ 1 month after the assigned index date, and (iii) with no record of OF during the pre-index period. For both cohorts, women with a record of participation in a clinical trial pertaining to osteoporosis treatment during the pre-index period, or with a diagnosis of cancer (except non-melanoma skin cancer), Paget’s disease of the bone, osteitis deformans, osteopathies, or metabolic bone diseases (e.g., osteomalacia, hyperparathyroidism, and osteogenesis imperfecta) during the study period, were excluded to minimize the potential impact of these conditions on OF-related outcomes.
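The fracture-type hierarchy and the exclusion of women with three or more fracture types on the index date can be illustrated with a short Python sketch. The function name and the plain-string fracture labels are assumptions made for illustration; in the study, fracture sites were identified from diagnosis codes (Supplementary Table 1).

```python
from typing import Optional, Set

# Precedence order stated in the text, highest priority first.
FRACTURE_HIERARCHY = [
    "hip", "vertebral", "pelvis", "tibia-fibula", "radius-ulna",
    "humerus", "ribs", "clavicle", "scapula",
]
# Women with three or more fracture types on the index date are excluded.
MAX_FRACTURE_TYPES_ON_INDEX = 2


def index_fracture_type(types_on_index_date: Set[str]) -> Optional[str]:
    """Return the index fracture type, or None if the woman is excluded."""
    unknown = types_on_index_date - set(FRACTURE_HIERARCHY)
    if unknown:
        raise ValueError(f"unrecognized fracture types: {sorted(unknown)}")
    if not types_on_index_date:
        return None  # no qualifying fracture on this date
    if len(types_on_index_date) > MAX_FRACTURE_TYPES_ON_INDEX:
        return None  # possible unrecorded trauma: excluded from selection
    # The type that appears earliest in the hierarchy takes precedence.
    return min(types_on_index_date, key=FRACTURE_HIERARCHY.index)
```

A woman with radius-ulna and hip fractures on the same day would therefore be classified as having a hip index fracture.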

Matching

The subject identification period in each country was divided into 6-month index periods (“panels”), except the last 5-month panel, to identify eligible women with OF and a pool of non-OF women eligible for matching. Supplementary Figure 1 depicts the overall processes. For each 6-month panel, non-OF women were first age-matched to OF women using their birth month and year (± 12 months). The matched non-OF women were assigned a random date during the 6-month panel as their “temporary” index date. After identifying an age-matched group of non-OF women for all OF women in each 6-month panel, propensity scores were generated for OF and non-OF women using multivariate logistic regression with the following baseline characteristics: geographic region, race/ethnicity (if available in the country dataset), pre-index glucocorticoid use, pre-index hormone replacement therapy use, pre-index anti-osteoporosis drug use, residence (i.e., living at home or in an institution), Charlson Comorbidity Index (CCI), selected comorbidities (i.e., osteoporosis, cardiac disease, cerebrovascular disease, chronic obstructive pulmonary disease (COPD), asthma, diabetes, depression, and anxiety), and number of pre-index hospitalizations [9,10,11]. For each OF woman, up to three age-matched non-OF women with the smallest difference in propensity score were identified. After the identification of matches for all OF women, the index date of OF women was assigned to their non-OF matches, and all were added to the analysis set. The same matching process was applied to the second 6-month panel and so forth. If a non-OF woman was matched to multiple OF patients, she was considered matched only to the OF woman with the earliest index date. If a non-OF woman had a fracture during follow-up, she was censored on the date of her fracture; her post-fracture data were excluded from all analyses, and she was not reassigned to the OF cohort.
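The within-panel matching steps can be sketched as follows. This is an illustrative simplification rather than the study's implementation: propensity scores are assumed to be precomputed by the logistic model, the `Woman` class and `match_panel` function are hypothetical names, ties are broken by patient identifier, and a control claimed by several OF women is simply kept by the one with the earliest index date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass(frozen=True)
class Woman:
    pid: str
    birth_year: int
    birth_month: int
    propensity: float  # assumed to come from a previously fitted logistic model


def months_apart(a: Woman, b: Woman) -> int:
    """Absolute difference between two birth dates, in calendar months."""
    return abs((a.birth_year - b.birth_year) * 12 + (a.birth_month - b.birth_month))


def match_panel(
    of_women: List[Woman],
    index_dates: Dict[str, date],
    pool: List[Woman],
    max_age_gap_months: int = 12,
    max_matches: int = 3,
) -> Dict[str, List[str]]:
    """Map each OF woman's id to the ids of her matched non-OF women."""
    # Step 1: among age-eligible controls, propose those with the
    # smallest absolute difference in propensity score.
    proposed: Dict[str, List[str]] = {}
    for case in of_women:
        eligible = [c for c in pool if months_apart(case, c) <= max_age_gap_months]
        eligible.sort(key=lambda c: (abs(c.propensity - case.propensity), c.pid))
        proposed[case.pid] = [c.pid for c in eligible[:max_matches]]

    # Step 2: a control proposed for several OF women stays with the
    # OF woman who has the earliest index date.
    owner: Dict[str, str] = {}
    for case_pid, controls in proposed.items():
        for ctrl in controls:
            if ctrl not in owner or index_dates[case_pid] < index_dates[owner[ctrl]]:
                owner[ctrl] = case_pid

    return {
        case_pid: [c for c in controls if owner[c] == case_pid]
        for case_pid, controls in proposed.items()
    }
```

After matching, each retained control would inherit the index date of her OF match, replacing the temporary index date assigned earlier.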

Data sources

Data were extracted from electronic medical records (EMR) or administrative claims databases or surveys. The countries assessed were selected as examples of four continents, and data sources were selected for diversity in terms of data types and feasibility to generate high-quality real-world evidence. A summary of data source characteristics across the countries is presented in Table 1. These data sources are briefly described in the Supplementary material.

Table 1 Data source characteristics and components across countries

Outcome measures

Direct all-cause HCRU included any resource/service directly available or identifiable from the study data source in each country: hospitalizations (defined as at least one overnight stay in the hospital), emergency room/accident and emergency (ER/A&E) visits (defined as hospital ER visit without an overnight stay or visit to a stand-alone urgent care facility), physician visits (defined as a visit to primary care or a specialist or an outpatient clinic), nurse visits, home visits (by a nurse or a physician), diagnostics and/or procedures, and prescriptions. Use of physical therapy (physiotherapy) and/or occupational therapy and ambulance use were reported if available in the data sources. Reported costs of each resource/service were used to calculate sums by service type (i.e., medical (sum of medical inpatient, outpatient, and ER), medical inpatient (excluding ER), outpatient (excluding ER), ER, and pharmacy). Total direct all-cause healthcare costs were the sum of medical and pharmacy costs. For South Korea, pharmacy costs were summed with medical costs as pharmacy costs could not be reported separately in the data. All costs were adjusted using a country-specific consumer price index and presented in 2018 USD for all countries except the USA (2021 USD). If there were multiple data on baseline demographic characteristics during the pre-index period, the information reported closest to the index date was chosen.
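The cost roll-up and price-index adjustment described above can be illustrated with a small Python sketch. The dictionary keys, function names, and the index values used in the example are hypothetical; the text does not give the adjustment formula, so the usual ratio form (reference-year index divided by cost-year index) is assumed, and currency conversion to USD is not shown.

```python
from typing import Dict


def total_direct_cost(costs: Dict[str, float]) -> Dict[str, float]:
    """Sum costs by service type; missing components count as zero."""
    medical = (
        costs.get("inpatient", 0.0)
        + costs.get("outpatient", 0.0)
        + costs.get("er", 0.0)
    )
    pharmacy = costs.get("pharmacy", 0.0)
    # Total direct all-cause cost is medical plus pharmacy.
    return {"medical": medical, "pharmacy": pharmacy, "total": medical + pharmacy}


def inflate(cost: float, cpi_cost_year: float, cpi_reference_year: float) -> float:
    """Express a cost in reference-year prices using a consumer price index."""
    if cpi_cost_year <= 0:
        raise ValueError("price index must be positive")
    return cost * cpi_reference_year / cpi_cost_year
```

For instance, a cost of 100 incurred when the index stood at 80 corresponds to 125 in a reference year whose index is 100.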

Statistical analysis

Descriptive statistics were used to describe patient characteristics and outcomes in the OF and non-OF cohorts within each country. The proportion of women with at least one utilization for each resource type (e.g., ≥ 1 hospitalization) during the post-index follow-up period was reported. Post-index all-cause HCRU and associated costs are presented as a rate on a per person-year basis as the frequency of utilizations or total costs divided by the total follow-up time (in years) contributed by each woman. The adjusted rate ratio of OF (numerator) compared with non-OF (denominator) cohorts in all-cause HCRU person-year rates was assessed using negative binomial regression models with log-link function, adjusted for baseline characteristics with a standardized difference ≥ 10% after matching. The adjusted rate ratio of OF (numerator) compared with non-OF (denominator) cohorts in all-cause costs was assessed using generalized linear models with gamma distribution and log-link function, adjusted for baseline characteristics with a standardized difference ≥ 10% after matching and residence at index date (Supplementary Table 3). The gamma distribution with the log-link was chosen considering the skewed distribution of costs. The matching group was included as a random intercept to account for the matching design, and person-years was included as an offset (applying the logarithm function) to account for differences in follow-up time for the HCRU and cost rate ratio models. These regression models were chosen based on the distribution of their dependent variables and have been used in real-world data analyses [12, 13]. Sub-analyses included post-index HCRU and cost outcomes by year since index in 1-year intervals, index fracture type, and imminent risk groups of fracture. Further interpretation and results of the subgroup analysis will follow in future country-specific publications.
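The person-year rate can be made concrete with a short sketch. The code below computes only a crude (unadjusted) rate and rate ratio; it does not reproduce the adjusted negative binomial or gamma models, the random intercept for matching group, or the covariate adjustment described above. Function names are hypothetical.

```python
from typing import Iterable, Tuple


def rate_per_person_year(records: Iterable[Tuple[float, float]]) -> float:
    """Total events (or costs) divided by total follow-up time in years.

    Each record is (events_or_cost, follow_up_years) for one woman.
    """
    total_value = 0.0
    total_years = 0.0
    for value, years in records:
        total_value += value
        total_years += years
    if total_years <= 0:
        raise ValueError("total follow-up time must be positive")
    return total_value / total_years


def crude_rate_ratio(
    of_records: Iterable[Tuple[float, float]],
    non_of_records: Iterable[Tuple[float, float]],
) -> float:
    """OF rate (numerator) over non-OF rate (denominator), unadjusted."""
    return rate_per_person_year(of_records) / rate_per_person_year(non_of_records)
```

In the regression models, the same normalization by follow-up time is achieved by including the logarithm of person-years as an offset under the log link.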

Results

The total number of patients meeting the study eligibility criteria by country is shown in Supplementary Table 4. After matching, the following number of OF patients and matched controls, respectively, was identified: Australia, 4809 and 13,921; Germany, 11,452 and 34,090; South Korea, 47,238 and 134,813; Spain, 25,214 and 75,308; and the USA, 193,262 and 570,856 (Supplementary Table 5).

Baseline demographics and clinical characteristics

Most patients’ baseline demographic and clinical characteristics between OF and matched non-OF cohorts within each country were balanced, except for the number of pre-index hospitalizations for Germany and geographic region, cardiac disease, and cerebrovascular disease for Australia (standardized difference of means ≥ 10%; Supplementary Table 3). Key baseline characteristics are presented by country in Table 2. The mean (SD) age of OF patients ranged from 63.3 (9.2) years in the USA to 76.6 (10.1) years in Germany. For controls, the mean age ranged from 63.1 (9.2) years in the USA to 76.7 (10.0) years in Germany. The average follow-up time (months) for the OF cohort ranged from 18.1 in the USA to 34.6 in South Korea. For the controls, the range in follow-up time was from 19.1 in the USA to 35 in South Korea. During the pre-index period, percentages of osteoporosis diagnosis in the OF cohort ranged from 3.9% in Spain to 21.6% in Australia. Percentages of the OF patients with a record of anti-osteoporosis therapy ranged from 5.9% in Germany to 19.8% in South Korea. In the OF cohort, the top two fracture types at index date were radius-ulna (29.2%) and hip (26.2%) for Australia, hip (32.7%) and vertebral (18.7%) for Germany, vertebral (42.4%) and radius-ulna (30.4%) for South Korea, radius-ulna (24.0%) and vertebral (18.1%) for Spain, and radius-ulna (31.4%) and tibia-fibula (17.4%) for USA (Table 3).

Table 2 Baseline demographic and clinical characteristics
Table 3 Fracture type on the index date among OF women

Healthcare resource utilization

The adjusted rate ratios of HCRU (OF vs. non-OF) by country and service type are presented in Table 4. In each country, OF cohorts had significantly higher rates than non-OF cohorts in all types of HCRU, except for outpatient specialist visits in Germany. Specifically, where data were available, the OF cohort had significantly higher rates than the non-OF cohort of inpatient admissions (adjusted rate ratio, 2.1 in Australia to 3.9 in the USA), nights stayed at the hospital (1.71 in Germany to 11.52 in the USA), all-type outpatient visits (1.19 in Australia to 2.0 in the USA), ER visits (1.04 in Germany to 3.19 in the USA), home visits (1.27 in Germany to 4.74 in the USA), and prescriptions. The adjusted rate ratios for nights stayed at the hospital, ER visits, and home visits were highest in the USA. The adjusted rate ratio for prescriptions ranged from 1.04 in Australia to 1.45 in South Korea. HCRU per 100 person-years during the follow-up period between OF and non-OF cohorts for each country is presented in Supplementary Table 5.

Table 4 Adjusted rate ratios of healthcare resource utilization between OF and non-OF by country and service type

Direct costs of care

The adjusted rate ratios of costs of care (OF and non-OF) by country and service type are presented in Table 5. In each country, OF cohorts had significantly higher rates of cost of care than non-OF cohorts for all cost types, except for ER (1.04; 95% CI 0.99–1.08) in Australia. The adjusted rate ratio of total costs of care was significantly higher in the OF cohort compared with the non-OF cohort, ranging from 1.38 (95% CI 1.35–1.41) in Germany to 3.11 (95% CI 3.09–3.13) in the USA. The highest adjusted rate ratio of total medical cost was observed for the USA (3.69; 95% CI 3.67–3.72) and the lowest for Germany (1.42; 95% CI 1.38–1.46). Mean costs during the follow-up period between OF and non-OF cohorts for each country are presented as per 100 person-years in Supplementary Table 6.

Table 5 Adjusted rate ratio of healthcare costs between OF and non-OF by country and service type

Discussion

The current study explored the economic burden of OF across multiple countries compared to non-OF women that reflected care in the contemporary landscape in different parts of the world. The main study results suggest that OF patients tended to use more healthcare resources and incur higher costs than non-OF patients. These differences illustrate the impact of OF on direct economic burden when a standardized methodology was used, despite variance in available data sources (e.g., some components of HCRU or costs were not captured in data sources) across countries.

The findings of the significant impact of OF on direct economic burden are aligned with most findings reported in the literature [2, 4]. In a US matched case-control study, Tran et al. reported low osteoporosis diagnosis (4–10.6%) and treatment rates (8.2–17.6%) prior to fracture in commercially insured female patients [1, 14]. In 2017, the treatment gap among women in EU6 was 73%, with the highest rate observed for Germany (78%) [4]. The highest radius-ulna fracture rate observed in the current study for the USA (31.4%) may be attributable to a younger study population compared with the other countries (mean age = 63 vs. 71–77 years). In a 2017 US claims–based analysis among commercial and Medicare patients, the radius-ulna (19.5%) was the second most common index fracture type after the spine (21.9%) [1]. The top three fracture types observed for Spain (radius-ulna, vertebral, and humerus) are somewhat consistent with the epidemiologic literature, although hip fractures are known to be the most frequent followed by forearm and vertebral fractures [15]. In the current study, propensity score matching with 6-month periods/panels allowed for the adjustment of several potential confounders, which is advantageous over the direct matching technique used by Tran et al. Nonetheless, the authors were able to demonstrate significantly higher adjusted excess costs for OF vs. non-OF cohorts over each follow-up period [14].

This study used a standardized methodology for all five countries to minimize methodological variance when data were compared across countries. The advantages of the current methodology included (i) use of propensity score matching in each of the 6-month panels to identify non-OF cohorts, which accounted for observed potential confounders and ensured that matched patients were from similar time frames, reducing any difference that may have been caused by changes in care practice over time, (ii) use of the person-year approach to account for variable follow-up time, (iii) use of multivariate regression to adjust for additional variables, including those that remained imbalanced after propensity score matching, and (iv) use of the ratio approach to present results, which theoretically cancels out unobserved variables (such as economic ramifications and regional variance mentioned in “Introduction”) because these variables are likely equal in both OF (in numerator) and non-OF (in denominator) cohorts within the same country. All of these were designed to minimize potential biases when differences in all-cause HCRU and costs of care between the OF and non-OF cohorts were assessed. To the best of our knowledge, this is the first use of the adjusted rate ratio approach in a published multi-country analysis in the therapeutic area of bone health.
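The cancellation argument behind the ratio approach can be written out briefly. The notation is introduced here for illustration only and is an assumption: suppose that unobserved country-level factors act as a common multiplicative factor $k_c$ on the per person-year rate in country $c$, with $\mu_c^{\mathrm{OF}}$ and $\mu_c^{\mathrm{non\text{-}OF}}$ denoting the underlying rates in the two cohorts.

```latex
\[
\mathrm{RR}_c
  = \frac{k_c \, \mu_c^{\mathrm{OF}}}{k_c \, \mu_c^{\mathrm{non\text{-}OF}}}
  = \frac{\mu_c^{\mathrm{OF}}}{\mu_c^{\mathrm{non\text{-}OF}}}
\]
```

Under this assumption the factor $k_c$ drops out of the within-country ratio, which is why rate ratios may be more comparable across countries than absolute rates or costs. If a country-level factor affected the two cohorts differently, it would not cancel.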

Overall, the adjusted total cost of care was significantly higher in the OF cohort than in the non-OF cohort in each country, ranging from a ratio of 1.38 in Germany (95% CI 1.35–1.41) to 3.11 in the USA (95% CI 3.09–3.13). The adjusted cost ratios for pharmacy, medical inpatient, ER, and outpatient visits were similarly higher in the OF cohort across the participating countries.

Country data should be interpreted with caution, as the demographic and clinical makeup of country-specific data varied. For example, the US cohorts were about 10 years younger on average than those in other countries, and the average length of follow-up time was shorter. This can be partially explained by the features of the PharMetrics® Plus database, where (a) data were primarily from working-age populations based on commercial health plans and self-insured employer groups and (b) plan switches may be common because of the nature of annual subscriptions, which led to shorter follow-up time. Changes in insurance coverage (most applicable to the USA) and covered populations in the data sources may also have an impact on HCRU or its subtypes (or components). In Spain, SIDIAP data have high specificity but low sensitivity with respect to reporting of OF in primary care alone compared with cohort studies or hospital discharge data because of the under-reporting of OF in primary care records [16]. However, our linkage to CMBD regional hospital data allowed for increased sensitivity in identifying OF. The total direct cost of care is underestimated in Spain, Germany, and South Korea due to the unavailability of ER costs for Germany and Spain and the combined medical and pharmacy costs for South Korea. Additionally, all-cause costs of care presented in this study may be influenced by comorbidities that may be related to OF, despite using non-OF cohorts as controls. Future studies may consider an assessment of OF-related costs of care.

Although direct comparisons between countries should be made with caution, a clear trend of increased economic burden in OF compared to non-OF patients, especially presented as rate ratios, was observed across countries. In the current study, the highest relative effect of OF compared with non-OF among HCRU settings was observed for nights stayed at hospital across all countries. Among the cost categories, the highest relative effect of OF compared with non-OF was observed for total medical costs. When HCRU is presented in person-year rates and cost data are presented in absolute numbers, it is difficult to ascertain such relationships across countries with varying healthcare or payer systems, expenditure, and gross domestic product (GDP).

From a clinical perspective, the observations underscore the need for greater awareness of OF impact among patients and healthcare providers, particularly the need for prevention (as evidenced by low rates of osteoporosis diagnosis and anti-osteoporosis therapy in the pre-index period) and post-care programs (as suggested by the varying levels of HCRU). Tran et al. demonstrated that the cost of fracture in the first year after fracture was higher than that in subsequent years [14].

In addition to generating valuable real-world evidence, the current study is uniquely positioned in terms of methodology, and its uniform application across data sources provides a high level of generalizability of the findings due to the similarity in trends observed, as presented in rate ratios, across countries. The current work presents a more robust approach to assess the direct economic burden of OF utilizing the rate ratio approach to demonstrate study results. Incentives to employ resources for osteoporosis or OF-related care that are country-specific, or that are relevant to all disease states, could have an impact on HCRU (as reflected in HCRU for non-OF cohorts). It is challenging to compare HCRU across countries using presentations that employ various HCRU frequency levels. Additionally, many of these country-specific variations may be eliminated if the differences between OF patients and similar non-OF patients from the same country are considered for economic burden evaluations. A standardized methodology to present the economic burden of OF across countries can address the current gaps in the literature, which can help stakeholders and policymakers tailor preventive fracture and post-fracture treatment programs.

Limitations

This study has inherent limitations with respect to its retrospective cohort design. Eligible female patients were required to be enrolled for a minimum of 18 months for the study as fracture-free during the pre-index period to ensure that the index fracture was an incident fracture, which may exclude some women who changed health insurance (applicable for the USA) or health practice or did not capture previous fractures incurred prior to the pre-index period. Furthermore, the less granular M80 codes (Supplementary Table 1) by themselves do not discriminate between past or new fractures and may be vague as to the specific fracture sites (e.g., M80.9). In our consideration of various clinical practices and adaptations of the ICD coding system across countries (e.g., M80.5 and M80.9 are not valid diagnosis codes in the US ICD-10-CM coding system and indeed were limited to the Korean study population in which these codes identified less than 0.3% of the OF group) and attempt to capture all possible incident fractures, we assumed that claims with ICD codes for prevalent fractures (e.g., M80.5 and M80.9) after the 18-month fracture-free period would be treated as incident fractures for the purpose of this study. The analysis does not include the data of patients who were moved to care homes/assisted living, so related HCRU was not reported, and the true costs of care would be underestimated. South Korean medical costs may need to be interpreted carefully as they could not be separated from pharmacy costs. The claims data are gathered for billing purposes rather than research, and hence, several clinical risk factors for OF were unavailable (e.g., bone density). Specific to the Australian dataset, “use of supportive equipment” was not reported, because this was neither collected nor reported in the administrative government datasets. Specialist visits in a hospital setting could be under-reported. 
This is not quantifiable as outpatient specialist care within a hospital is not picked up by the Australian MBS and PBS or specifically within NSW databases (though this information is included in cost data). Public hospitals see patients at “outpatient clinics” after discharge. However, this service is not billed through any of the available datasets and will not be selected. Similarly, private hospitals may bill through MBS so some patient visits may be collected. Furthermore, the response rate of the baseline 45 and Up questionnaire was 19% [17]; therefore, findings may not be generalizable to the entire population in NSW or Australia. Due to the characteristics of the SIDIAP database, which is a primary healthcare database and despite the linkage to hospital data, some information was not available. This information included data from patients’ visits to the ER (not leading to hospitalization) and outpatient specialist visits directly referred by the hospital. This lack of information can be relevant for those fractures that are admitted directly to the ER (without being referred by a primary care physician) and that will not undergo surgery or hospitalization or for those fractures that are treated in a more conservative way and are not hospitalized. Moreover, the cost of a specialist visit after the fracture could not be assessed, either by not having the actual costs of the fracture (we could only retrieve referrals to specialists) or by not having any information on the specialist outpatient visits when referred directly by the hospital. Although the SIDIAP database is representative of the Catalonia population [16], study findings may not be generalizable to the entire population of Spain.

Our approach to identify OF by excluding patients with ICD codes for major trauma may have resulted in misclassification for some of the participants. However, we believe this proportion to be rather small, and therefore, the impact of potential misclassification on the study results is likely negligible. This approach also excludes patients who do not seek medical attention for low-trauma fractures. Differences in unobservable characteristics or differences outside of the study period (e.g., medical history prior to the start of the study period) might still exist. The presence of a record (EMR, prescription, or claim) for a prescribed medication does not indicate that the medication was consumed or taken as prescribed. In the US analysis, patients with commercial insurance primarily comprised the study population, and the findings may not be generalizable to the uninsured or those with other insurance, such as Medicare. In contrast to earlier cost evaluations in the literature, our analysis investigated the adjusted HCRU and cost ratios (OF vs. non-OF). As a result, incompatibility caused by differences between countries (such as GDP/healthcare expenditures) may be lessened, and we suggest that a standardized method of evaluating and comparing economic burden across countries be introduced. Even if they are not ideal, we believe that adjusted ratios are a better approach than direct expenses for comparing results between nations.

Conclusion

The current study demonstrates the substantial economic burden of OF across the five participating countries when compared to non-OF patients. Women with OF use healthcare services at a significantly higher rate. The current study addresses a critical gap in the assessment of the economic burden of osteoporosis research by allowing researchers to assess the relative economic burden of OF across countries using a standardized methodological approach. Strategies directed toward prevention, accurate diagnosis, and intensive treatment of OF are likely to help reduce the economic burden on healthcare systems.