Key Points for Decision Makers

Stated preference research on osteoporosis treatment most often includes attributes regarding process and outcome, and patients are willing to make trade-offs between treatment characteristics in most cases.

Preferences differ significantly between patients; reasons for this heterogeneity are often demographic- or treatment-related factors.

Physicians and policymakers should take into account that preferences vary between patients and tailor treatment approaches to the individual patient.

1 Introduction

Osteoporosis is a pervasive skeletal disorder that has garnered significant attention due to its widespread prevalence and substantial impact on public health [1]. Characterized by weakened bone strength, osteoporosis elevates the risk of fractures, which can lead to a cascade of health issues including impaired mobility, loss of independence, diminished quality of life and excess mortality [2]. The management of osteoporosis is intricate, involving a multifaceted approach that can include pharmacological treatments such as antiresorptive treatment or anabolic treatment, as well as lifestyle modifications like diet and exercise [3, 4].

While the clinical efficacy of these interventions is well-documented, their real-world effectiveness is often compromised by poor medication adherence [5]. Adherence is a complex issue influenced by a myriad of factors, one of which can be a perceived lack of adequate treatment options when existing options do not sufficiently meet patient preferences. Patient preferences encompass individual attitudes towards the perceived benefits, risks, and inconveniences associated with different treatment options [6]. These preferences can vary widely among patients due to differences in personal values, experiences, and expectations, and can thereby affect their willingness to adhere to prescribed treatments that do not align with their preferences.

In this respect, treatment options that better align with patient preferences are imperative for a patient-centered approach to osteoporosis management that aims to increase adherence and, thereby, the effectiveness of treatment. The concept of patient-centered care has gained substantial traction in healthcare policy and practice, emphasizing the need to align medical interventions with patient preferences and values [7]. Eliciting patient preferences, particularly through quantitative research methods, is increasingly important, as it can offer valuable insights into the relevance of specific treatment characteristics and the trade-offs that patients are willing to make between them. These insights can inform policy-making and be used to tailor osteoporosis treatments more closely to the needs of patients and thereby help improve adherence [6]. To our knowledge, no systematic review has specifically focused on stated patient preferences regarding treatment options in the management of osteoporosis. This gap highlights the need for comprehensive research in this area.

By addressing this gap, this systematic review aims to contribute to a more nuanced understanding of patient preferences in the management of osteoporosis, which could lead to more personalized treatment plans or be used in the development of entirely new treatment options, thereby potentially improving medication adherence, treatment outcomes, and overall patient satisfaction. The primary objective is to critically appraise preference research in the field of osteoporosis by summarizing and analyzing the existing literature and to provide a comprehensive understanding of current scientific knowledge in the field as well as of areas that remain to be explored.

2 Methods

The reporting in this systematic review was guided by the 2020 Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [8] as well as PRISMA-Search and Peer Review of Electronic Search Strategies (PRESS) guidelines [9, 10] for literature searches. In this context, a protocol for the systematic review was registered in PROSPERO (International Prospective Register of Systematic Reviews) with the ID CRD42024502379.

2.1 Literature Search

The systematic literature search was conducted in MEDLINE (via Ovid) and Embase and targeted studies that reported quantitative stated preference data elicited by conjoint analysis (CA), discrete choice experiments (DCEs), or best-worst scaling (BWS) and that were published in peer-reviewed journals. In line with the findings of Morrison et al. (2012) and Dobrescu et al. (2021), the search was limited to English-language articles published up to February 29, 2024 [11, 12]. In addition to electronic searches, experts in the field were contacted to provide any missing references. Manual searches of the bibliographies of identified studies and of forward references, as well as of previous systematic literature reviews of DCEs in healthcare [13,14,15], were also conducted.

2.2 Search Strategies

The final search strategies are presented in Electronic Supplementary Material Appendix A and were constructed following the guidance in PRISMA-Search [9] and peer-reviewed in adherence with PRESS guidelines [10]. They were informed by previous research and use a combination of keywords and MeSH terms related to osteoporosis, patient preferences, and specific methodologies of stated preference research [13,14,15,16]. The search strategies were developed in an iterative process supported by experienced researchers (CB, MH, and NKS). Starting with candidate search terms, draft search strategies were formulated, which were in turn used to expand and refine the candidate search terms. For validation purposes, three known relevant studies that met all inclusion criteria were selected a priori; all three were successfully identified by the final search strategies.

2.3 Study Records

Covidence, a web-based collaboration software platform that streamlines the production of systematic and other literature reviews, was used as a systematic review data management tool to manage search results. After the identified literature was uploaded into Covidence, duplicates were eliminated using the automation functions of the software. Two independent reviewers (ELH and LN) individually screened all titles and abstracts of the remaining articles to assess their eligibility for inclusion (see Table 1). Full-text articles were retrieved for those that met the inclusion criteria, and any discrepancies between the reviewers were resolved through discussion or consultation with a third reviewer (CB). Where it was uncertain whether all inclusion criteria were met, full-text records were retrieved and discussed between the reviewers. Reasons for excluding retrieved records were documented (see Fig. 2).

Table 1 Inclusion criteria

2.4 Data Extraction and Analysis

Data extraction and analysis followed a clearly defined process. First, all included papers were summarized using a standardized data extraction form. This extraction form was informed by existing literature and pretested by two reviewers with three studies. Relevant insights of these studies were used to adjust the standard data extraction form. Subsequently, the extraction form featured general study characteristics (title, author, year and journal of publication, country, availability, time and duration of data collection), population characteristics (number of participants, mean age, share of female respondents, response rate), information regarding the methods (study design, data collection method, method of attribute and level elicitation, attributes and levels used in questionnaire, pilot study, number of choice sets/tasks per participant, number of attributes per choice set/task, maximum number of levels in an attribute, number of alternatives per choice set/task, additional opt-out option), and results (information on conditional relative importance [most important attribute, attribute ranking], information on heterogeneity/subgroups).

2.5 Quality Assessment

The quality of the included studies was assessed by two independent reviewers (ELH and LN) using a combined checklist integrating the Purpose, Respondents, Explanation, Findings, and Significance (PREFS) checklist [17] as well as the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) checklist [18] as previously established, for example, by Bien et al. (2017) [19], Tünneßen et al. (2020) [20], Lack et al. (2020) [21], and Sain et al. (2020) [22]. Any differences in scoring were resolved through discussion.

The PREFS checklist consists of five items, each of which is assigned a binary score (0 or 1), resulting in a maximum possible score of 5 points. The items include the clear identification of a purpose of the study (Purpose), analysis of similarities between responders and non-responders (Respondents), clarity of method explanation (Explanation), comprehensive reporting of results (Findings), and application of significance testing (Significance).

The ISPOR checklist consists of ten topics (research question, attributes and levels, construction of tasks, experimental design, preference elicitation, instrument design, data-collection plan, statistical analyses, results and conclusions, and study presentation) with three respective sub-questions. Each of these sub-questions was rated as 0 or 1 as suggested, for example, by Scheres et al. (2023) [23] and Al-Aqeel et al. (2023) [24], leading to a maximum possible score of 30 points for the ISPOR checklist.
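The scoring scheme described above can be illustrated with a minimal sketch. The function names and the example ratings are hypothetical and serve only to show how binary item ratings add up to the maximum scores of 5 (PREFS) and 30 (ISPOR); they do not reproduce the ratings of any included study.

```python
from typing import Dict, List

PREFS_ITEMS = ["purpose", "respondents", "explanation", "findings", "significance"]
ISPOR_TOPICS = 10        # ten topics in the ISPOR checklist
ISPOR_SUBQUESTIONS = 3   # three sub-questions per topic


def prefs_score(ratings: Dict[str, int]) -> int:
    """Sum the binary ratings of the five PREFS items (maximum 5)."""
    for item in PREFS_ITEMS:
        if ratings[item] not in (0, 1):
            raise ValueError(f"PREFS item '{item}' must be rated 0 or 1")
    return sum(ratings[item] for item in PREFS_ITEMS)


def ispor_score(ratings: List[List[int]]) -> int:
    """Sum the binary ratings of 10 topics x 3 sub-questions (maximum 30)."""
    if len(ratings) != ISPOR_TOPICS:
        raise ValueError("expected ratings for ten topics")
    for topic in ratings:
        if len(topic) != ISPOR_SUBQUESTIONS or any(r not in (0, 1) for r in topic):
            raise ValueError("each topic needs three ratings of 0 or 1")
    return sum(sum(topic) for topic in ratings)


# Hypothetical ratings for a single study (illustration only).
example_prefs = {"purpose": 1, "respondents": 0, "explanation": 1,
                 "findings": 1, "significance": 1}
example_ispor = [[1, 1, 1]] * 8 + [[1, 0, 0], [1, 1, 0]]

print(prefs_score(example_prefs), ispor_score(example_ispor))
```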

2.6 Data Analysis

The conditional relative importance of attributes was extracted from the studies where available or, if sufficient information was reported, calculated using the range method, as proposed by the ISPOR Conjoint Analysis Good Research Practices Task Force [18]. For each attribute, this method takes the maximum range of its level coefficients, that is, the difference between the highest and lowest coefficient estimates for that attribute. The conditional relative importance of an attribute is then its coefficient range divided by the sum of the coefficient ranges of all attributes, and attributes are ranked accordingly. Figure 1 visualizes the coefficient estimates for each attribute. A large range for a specific attribute in comparison to the other attributes indicates greater sensitivity to changes in the levels of this attribute and thus a relatively high impact on respondents’ preferences, that is, a high conditional relative importance of that attribute.

Fig. 1 Visualization of coefficient estimates (including confidence intervals) and their range within an attribute (data from Graham-Clarke et al. 2020 [34])
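The range method described above can be sketched in a few lines. The attribute names, level labels, and coefficient values below are hypothetical and are not taken from any included study; the function name `relative_importance` is likewise an illustrative choice.

```python
from typing import Dict


def relative_importance(coefs: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Range method: (max - min level coefficient) per attribute,
    divided by the sum of these ranges over all attributes."""
    ranges = {
        attr: max(levels.values()) - min(levels.values())
        for attr, levels in coefs.items()
    }
    total = sum(ranges.values())
    if total == 0:
        raise ValueError("all coefficient ranges are zero")
    return {attr: r / total for attr, r in ranges.items()}


# Hypothetical level coefficients (reference levels fixed at 0.0).
hypothetical = {
    "efficacy": {"20% risk reduction": 0.0, "50% risk reduction": 1.2},
    "administration": {"daily tablet": -0.4, "weekly tablet": 0.0,
                       "yearly infusion": 0.2},
    "cost": {"low": 0.0, "high": -0.2},
}

importance = relative_importance(hypothetical)
ranking = sorted(importance, key=importance.get, reverse=True)
print(ranking)
```

With these assumed values, the coefficient ranges are 1.2, 0.6, and 0.2, so efficacy accounts for roughly 60% of the summed ranges and is ranked first. The result is conditional on the attributes and level ranges included in the experiment, which is why importance values from different studies are not directly comparable.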

Subsequently, attributes were divided into three categories (outcome, process, and cost), following the proposition of previous research [19,20,21,22, 39,40,41], to facilitate a comprehensive overview of the included studies. The category ‘outcome’ included attributes relating to efficacy as well as side-effects. The ‘process’ category incorporated attributes pertaining to the mode and frequency of administration, duration of treatment as well as convenience of handling or storage. Across these categories, attribute frequency and significance of an attribute were estimated and compared between the included studies.

Finally, a pairwise comparison of attributes was conducted, where the importance of attributes was compared by analyzing specific attribute pairings within the included studies following the proposition of Purnell et al. (2014) [42]. For a more nuanced analysis of the general choice determinants, the attributes within this analysis were further categorized into treatment benefit attributes and treatment burden attributes, and conditional relative importance was evaluated for these categories.

3 Results

Overall, the searches yielded 1026 potential studies, 252 of which were excluded as duplicates. The titles and abstracts of the remaining 774 were screened; 716 did not meet the inclusion criteria and were excluded, and full texts could not be retrieved for an additional two studies. The remaining 56 were included in the full-text assessment. Here, 42 studies were excluded; the reasons for exclusion are documented in the Electronic Supplementary Material Appendix B. No additional studies were identified during the manual search; thus, overall, we included 14 studies in our analysis. The flow diagram of the study selection process according to PRISMA is presented in Fig. 2.

Fig. 2 PRISMA 2020 flowchart for systematic reviews [8]. PRISMA Preferred Reporting Items for Systematic Reviews and Meta-Analyses
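The record counts reported for the selection process can be checked with simple arithmetic. The sketch below uses only the counts stated in the text; the function and key names are illustrative.

```python
def prisma_flow(identified: int, duplicates: int, excluded_at_screening: int,
                not_retrieved: int, excluded_at_full_text: int,
                added_by_manual_search: int = 0) -> dict:
    """Derive the intermediate counts of a PRISMA-style selection flow."""
    screened = identified - duplicates
    full_text_assessed = screened - excluded_at_screening - not_retrieved
    included = full_text_assessed - excluded_at_full_text + added_by_manual_search
    return {
        "screened": screened,
        "full_text_assessed": full_text_assessed,
        "included": included,
    }


# Counts as reported in the Results section.
flow = prisma_flow(identified=1026, duplicates=252, excluded_at_screening=716,
                   not_retrieved=2, excluded_at_full_text=42)
print(flow)
```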

3.1 Study Characteristics

Out of the 14 included studies, half were published before 2014 (two CA and one BWS) [25,26,27,28,29,30,31], while all seven studies published in 2014 or after were DCEs [6, 32,33,34,35,36,37]. Thirteen of the studies focused on pharmaceutical treatment [6, 25,26,27,28,29,30,31,32,33,34,35, 37]; only Beaudart et al. (2022) [36] investigated non-pharmaceutical interventions for prevention or treatment of osteoporosis. Overall, five studies were published during the last 5 years [33,34,35,36,37]; however, two of these used data from a study published in 2017 [35, 36], leaving only three studies reporting more recent data.

Six studies each were conducted in the USA [25,26,27,28, 31, 37] and in Europe [6, 29, 30, 32, 35, 36]; one study each was conducted in South America [34] and in Asia [33]. The studies’ sample sizes ranged from 76 to 1124 participants, with an average of 337 participants. All studies focused on adult populations, with the mean age (where specified) ranging from 52 to 78 years. Eighty-eight percent of the study population was female, with four studies focusing entirely on female populations [25, 28, 29, 31]. An excerpt of study characteristics can be found in Table 2 and a detailed overview in Electronic Supplementary Material, Appendix D.

Table 2 Overview of study characteristics and attributes

3.2 Quality Assessment

All included studies underwent quality assessment and were rated according to both the PREFS and ISPOR checklists. Table 3 shows the scores of all studies in each checklist. Overall, the average score was 3.6 for the PREFS checklist, with a maximum score of 4 and a minimum score of 2 points. All studies sufficiently stated the purpose, 12 studies sufficiently explained the methods of assessing preferences and included all respondents in their findings, and 11 studies used significance tests to assess preference results. Only one study reported information comparing the responders to non-responders.

Table 3 Quality assessment according to PREFS and ISPOR

The average score for the ISPOR checklist was 22.8, with a maximum score of 30 and a minimum score of 12 points. The highest scoring items were the definition of a research question and appropriateness of CA (item 1), study presentation (item 10), and results and conclusions (item 9), while instrument design (item 6) and data collection plan (item 7) were least sufficiently addressed by the included studies.

Interestingly, when distinguishing studies by their publication date, study quality increased noticeably after the publication of the respective checklists. Studies published prior to the publication of the ISPOR checklist in 2011 reached an average score of 17.7, whereas studies published later reached an average of 26.6 points. Regarding PREFS, studies published before the publication of the checklist yielded an average of 3.4 points, whereas studies published after 2013 were scored 3.7 on average.

3.3 Study Design and Choice Sets

The process to inform attribute and level selection was specified by 12 of the 14 studies (86%) [6, 26,27,28,29,30,31,32,33,34,35,36]. Three studies used a combination of literature reviews, expert opinion, and patient discussions [29, 32, 35], two studies used a combination of literature review and expert opinion [30, 36], two studies used patient discussions [31, 34], two studies used the formal Nominal Group Technique (NGT) process [6, 33] as defined by Hiligsmann et al. (2013) [38], and three studies used primarily attributes and levels connected with real-life treatment options [26,27,28]. Overall, literature reviews, expert opinion, and patient discussion were equally often used (50%). Nine studies (64%) subsequently pilot-tested their study instrument and adjusted it according to the feedback obtained [6, 29,30,31,32, 34,35,36,37].

Focusing on CA and DCE, most studies used four [26, 28, 33, 35] or five [6, 29, 32, 34] attributes (four studies each), with an average of four attributes per study. Where the number of attributes was not conclusively reported (i.e., attribute description varied from choice set example or analysis), attribute reporting in the choice set example was used. The maximum number of levels was not specified for three studies [25, 28, 31]; the remaining studies used on average a maximum of four levels in the respective attribute with the most levels. Three studies did not specify the number of choice sets per participant [26, 28]; the remaining 11 studies used on average 13 choice sets per participant.

3.4 Main Insights Regarding Preferences

3.4.1 Attribute Classification

Out of the 14 studies, 12 were included in the attribute classification process. Silverman et al. (2013) [31] was excluded as the study used BWS with 39 statements, the classification of which could not be reproduced by the authors. Furthermore, Cornelissen et al. (2020) [35] was excluded as their publication built on data already included in Hiligsmann et al. (2017) [32]. Thus, 12 studies were considered in this part of the analysis, 11 of which focused on medical treatment for osteoporosis, while one study focused on lifestyle behavior and food supplements, which were categorized as process attributes [36]. Figure 3 shows the distribution of all 52 (non-distinct) attributes included in this analysis, 26 of which (50%) were classified as process attributes, followed by 21 outcome attributes (40%) and five cost attributes (10%).

Fig. 3 Classification of attributes

All analyzed studies included process-related attributes, nine studies included attributes pertaining to outcomes [6, 25,26,27,28,29, 32, 33, 37], and five studies included cost-associated attributes [6, 29, 30, 32, 33]. All in all, four studies included attributes of all three categories [6, 29, 32, 33], and two studies only included process attributes [34, 36]. Six studies (50%) used attributes of two categories, five of which [25,26,27,28, 37] combined outcome and process attributes, while one study combined cost and process categories [30]. A detailed overview of all attributes used in the included studies can be found in Electronic Supplementary Material Appendix C.

The process category included attributes regarding mode or frequency of administration, convenience (dosage settings, storage), and other characteristics (lifestyle changes, supplements, duration). Within this category, the majority of attributes were associated with either mode of administration (n = 7) [6, 32, 34, 37], frequency of administration (n = 3) [6, 32, 37], or a combination of both (n = 7) [25,26,27,28,29,30, 33]. Beaudart et al. (2022) was the only study to focus on non-pharmaceutical interventions and, as such, included six attributes regarding lifestyle behaviors [36]. Summarized in the process category ‘other,’ de Bekker-Grob et al. (2008) included one attribute pertaining to treatment duration [29], Graham-Clarke et al. (2020) included two attributes regarding convenience (dosage and storage) [34], and Darbà et al. (2011) included one attribute describing the place of administration (self-administered, administered through medical support, or hospitalization) [30].

Attributes were classified as outcome attributes if they pertained to treatment efficacy (reduced risk of fractures) or adverse events (mostly gastrointestinal adverse events, flu-like symptoms, skin reactions, or only specified as serious and non-serious). Within this category, the majority of attributes pertained to efficacy (n = 12), which always included the risk reduction of either hip fractures (n = 5) [26,27,28,29, 37], spine fractures (n = 3) [26, 28, 37], or general, not-specified fractures (n = 4) [6, 25, 32, 33]. The remaining nine attributes in the outcome category all fall into the realm of adverse events, which were mostly specified by their levels and included gastrointestinal symptoms only (n = 3) [26, 28, 29], a combination of gastrointestinal and flu-like symptoms (n = 1) [25], a combination of gastrointestinal and flu-like symptoms as well as skin reactions (n = 3) [6, 32, 33], or a general description of serious or non-serious side-effects (n = 2) [37].

3.4.2 Attribute Significance

In line with previous research [19], attributes were considered significant if at least one of their respective level coefficients was statistically significant at the 5% level (for a detailed list, see Electronic Supplementary Material Appendix C). It should be noted that, as Hiligsmann et al. (2017) compared preferences across seven European countries and reported attribute significance at the country level, overall attribute significance could not be derived [32]. Here, attributes were considered to be significant if they were significant at the 5% level in at least one of the included countries. Silverman et al. (2013) was excluded from this analysis as the statements used in their BWS could not be properly assigned to attributes [31]. Furthermore, attribute significance could not be analyzed for five studies, spanning 19 attributes [25,26,27,28, 37], as neither sufficient information on p values or confidence intervals nor information on coefficients was reported. Thus, 37 attributes were included in this analysis, the vast majority of which (n = 36) were reported as being significant and thus relevant to the patients, while only one attribute (smoking [36]) was explicitly stated as being not significant.
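The significance rule described above can be expressed as a short sketch. It assumes that "significant at the 5% level" corresponds to p < 0.05; the function names and the p values in the example are hypothetical.

```python
from typing import Dict, Iterable

ALPHA = 0.05  # assumed threshold for "significant at the 5% level"


def attribute_significant(level_p_values: Iterable[float],
                          alpha: float = ALPHA) -> bool:
    """An attribute counts as significant if at least one of its
    level coefficients has a p value below alpha."""
    return any(p < alpha for p in level_p_values)


def attribute_significant_any_country(p_by_country: Dict[str, Iterable[float]],
                                      alpha: float = ALPHA) -> bool:
    """Country-level reporting: the attribute counts as significant if it
    is significant in at least one of the included countries."""
    return any(attribute_significant(ps, alpha) for ps in p_by_country.values())


# Hypothetical p values for the levels of one attribute.
print(attribute_significant([0.30, 0.04]))
print(attribute_significant([0.30, 0.12]))

# Hypothetical country-level p values for one attribute.
print(attribute_significant_any_country({"Country A": [0.20, 0.60],
                                         "Country B": [0.01, 0.45]}))
```

Because a single significant level in a single country is sufficient, this rule is lenient, which should be kept in mind when interpreting the high share of significant attributes.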

3.4.3 Conditional Relative Importance

Eight studies (57%) explicitly stated information pertaining to conditional relative attribute importance [25,26,27, 31, 34,35,36,37]; an additional five studies (36%) reported sufficient information for calculation of the conditional relative importance using the range method [6, 29, 30, 32, 33]; Fraenkel et al. (2007) did not provide sufficient information for range calculation [28]. Additionally, Graham-Clarke et al. (2020) [34] and Beaudart et al. (2022) [36] were excluded from this analysis as they exclusively covered attributes that were categorized as process attributes, rendering ranking inconclusive, and Cornelissen et al. (2020) [35] was excluded from this part of the analysis as the attribute importance of their patient sample was already included in Hiligsmann et al. (2017). Thus, ten studies were included in this part of the analysis; a detailed overview of the conditional relative importance of their respective attributes can be found in Electronic Supplementary Material Appendix C.

As Hiligsmann et al. (2017) provided information for seven European countries separately, the range method was used to calculate conditional relative attribute importance for each respective country and the most important attribute classified according to frequency (in this case, efficacy was the most important attribute for respondents from five of the seven countries) [32].
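A minimal sketch of this frequency-based classification follows. The country labels and importance values are hypothetical placeholders, chosen only so that efficacy ranks first in five of seven countries as reported above; they are not the published estimates.

```python
from collections import Counter
from typing import Dict, Tuple


def most_frequent_top_attribute(
    importance_by_country: Dict[str, Dict[str, float]]
) -> Tuple[str, int]:
    """Find the top-ranked attribute per country, then return the attribute
    that ranks first most often together with its count."""
    top_per_country = {
        country: max(importance, key=importance.get)
        for country, importance in importance_by_country.items()
    }
    counts = Counter(top_per_country.values())
    return counts.most_common(1)[0]


# Hypothetical relative importance values per country (illustration only).
efficacy_first = {"efficacy": 0.45, "cost": 0.30, "administration": 0.25}
importance_by_country = {
    "Country A": efficacy_first,
    "Country B": efficacy_first,
    "Country C": efficacy_first,
    "Country D": efficacy_first,
    "Country E": efficacy_first,
    "Country F": {"efficacy": 0.30, "cost": 0.45, "administration": 0.25},
    "Country G": {"efficacy": 0.30, "cost": 0.25, "administration": 0.45},
}

attribute, n_countries = most_frequent_top_attribute(importance_by_country)
print(attribute, n_countries)
```

Note that `Counter.most_common` breaks ties by first occurrence, so a tie between attributes would need an explicit rule in a real analysis.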

Overall, outcome [29, 31, 32, 37] and process [25,26,27, 30] attributes were each rated most important in four studies (40% each) (see Fig. 4), covering 1915 (outcome) and 660 (process) respondents, while cost was considered most important in two studies (524 respondents) [6, 33]. Notably, when further reducing the studies included in this part of the analysis to those five studies that incorporated all three categories of attributes, outcome was deemed most important in three studies (60%) [29, 31, 32] and cost in two studies (40%) [6, 33].

Fig. 4 Attribute ranking according to conditional relative importance

To further analyze the conditional relative importance of attributes, this study examined the frequency with which each attribute was ranked as the most important in pairwise comparisons across various studies. Here, attributes were categorized into treatment benefits (efficacy) and treatment burden (administration, side-effects, cost). Within the investigated ten studies, the most common attribute comparisons included administration with side-effects (n = 8) [6, 25, 26, 29, 31,32,33, 37], administration with cost (n = 6) [6, 29,30,31,32,33], administration with general efficacy (n = 5) [6, 25, 31,32,33], side-effects with general efficacy (n = 5) [6, 25, 31,32,33], and side-effects with cost (n = 5) [6, 29, 31,32,33]. These pairings resulted in a total of 45 comparisons, the details of which are depicted in Table 4.

Table 4 Comparison of conditional relative importance in attribute pairings with attribute ranked as most important (n ranked most important/total n of comparisons)

Efficacy was featured in 26 comparisons, eight of which were related to the prevention of hip fractures, four to the prevention of spine fractures, and 14 to the efficacy in preventing general fractures. The analysis reveals that efficacy consistently ranks as the most important attribute, being deemed more important than its counterpart in 21 out of 26 pairings. Specifically, efficacy was superior to administration (eight of 11 comparisons), side-effects (ten of ten comparisons), and cost (three of five comparisons).
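The tally for efficacy can be reproduced from the counts given above. The sketch below uses only the reported numbers; treating an attribute as dominant in a pairing when it wins more than half of the comparisons is an assumption made for illustration.

```python
# (comparisons won by efficacy, total comparisons) per counterpart, as reported.
efficacy_pairings = {
    "administration": (8, 11),
    "side-effects": (10, 10),
    "cost": (3, 5),
}

wins = sum(won for won, _ in efficacy_pairings.values())
total = sum(n for _, n in efficacy_pairings.values())

# Assumed rule: efficacy dominates a counterpart if it wins more
# comparisons than it loses within that pairing.
dominated = [name for name, (won, n) in efficacy_pairings.items()
             if won > n - won]

print(f"efficacy ranked higher in {wins} of {total} comparisons")
print("dominates:", dominated)
```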

Administration was included in 25 comparisons and was ranked dominant in 12 of these; it was, overall, more important than a reduction in spine fractures (two of two comparisons), and side-effects (five of eight comparisons). Additionally, although administration was not dominant over efficacy or cost in the comprehensive evaluation, two studies identified it as more critical than efficacy in preventing hip fractures [26, 27] and cost [30, 32] each, with one study ranking it higher than general efficacy [25].

Cost was incorporated in 16 comparisons and was dominant in nine; overall, it was deemed more important than administration (four of six comparisons), side-effects (three of five comparisons), and general efficacy (two of four comparisons).

Side-effects were examined in 23 comparisons but did not demonstrate overall dominance. Nevertheless, three studies ranked side-effects above administration [6, 29, 31] and two above cost [29, 31], respectively.

When summarizing the ranking across all pairings, efficacy was more dominant than any other attribute, followed by cost, which dominated administration and side-effects, and administration, which was superior to side-effects.

In evaluating attributes according to their categories of treatment benefit and treatment burden (26 comparisons), treatment benefits were ranked higher in 19 comparisons, whereas treatment burden was ranked as more important in seven comparisons. Within the treatment burden category (19 comparisons), cost and administration were ranked highest (seven comparisons each), followed by side-effects (five comparisons).

3.4.4 Heterogeneity and Subgroups

Preference heterogeneity was evaluated by most studies to some extent; only Darbà et al. (2011) [30] did not provide an analysis of preference heterogeneity. Preference heterogeneity was observed in 12 studies [6, 26,27,28,29, 31,32,33,34,35,36,37]; one study explicitly stated that it found no significant associations in its subgroup analyses [25]. Beaudart et al. (2022) did not provide information on the statistical method used to analyze preference heterogeneity but nonetheless reported the results of subgroup analyses [36].

Regarding the qualities of models to assess heterogeneity, five studies used mixed logit (or random parameters logit) models [6, 32,33,34,35], allowing for preference parameters to vary across individuals, indicating preference heterogeneity if standard deviations are significant. Two studies included latent class models, estimating the probability of each respondent belonging to a segment or class with similar preferences and thereby showing heterogeneity [35, 37]. Four studies used hierarchical Bayes models, estimating parameters at the individual level to allow for an analysis of heterogeneity [25, 27, 31, 37]. Six studies investigated the significance and coefficients of interaction terms related to demographic characteristics of respondents and included them in subgroup analyses [6, 29, 32,33,34,35]. Some studies used a combination of models, and while all studies that used mixed logit models also used interaction terms, Cornelissen et al. (2020) used a combination of mixed logit and latent class models as well as an investigation of a joint model using interaction terms [35].

The majority of studies investigated demographic factors, such as age, gender, or education [6, 25, 27, 28, 31,32,33,34,35,36,37], closely followed by factors related to osteoporosis itself (diagnosis of osteoporosis, previous fractures, perceived and actual fracture risk) [6, 26,27,28, 31,32,33,34,35,36]. Six studies included factors related to the respondent’s treatment status and experiences (prior use of osteoporosis medication, experience with injections, attitude towards medication) [6, 27, 28, 34, 36, 37], four studies investigated the impact of factors related to the respondents’ health status (body mass index, perceived health status, comorbidities) [26,27,28, 34], and four studies included other factors [26, 28, 34, 37].

The 11 studies investigating demographics included on average three factors in this analysis, resulting in 36 observations, 15 of which were significant and 21 non-significant. For osteoporosis-related factors, 16 observations were included (eight significant); treatment-related factors were observed nine times and were significant in six of these, while health-related factors were observed six times (three significant). Regarding individual factors, age was most often investigated (n = 10) and was found to be a significant contributor to preference heterogeneity in six studies, closely followed by education (n = 8), which was equally often reported as significant and non-significant, and previous fractures (significant in two studies).

4 Discussion

This systematic review represents a comprehensive analysis of stated preference research in the treatment and management of osteoporosis, underscoring the significant role of patient preferences in optimizing treatment outcomes. The inclusion of 14 studies, encompassing a variety of treatment modalities and patient demographics, provides a nuanced understanding of the complex landscape of patient preferences in osteoporosis management.

This review showed that the relative importance of treatment attributes varied across studies. The comparison of most important attributes within each study revealed that outcome and process attributes were equally often rated as most important. This suggests that patients value both treatment outcomes, in terms of health improvement and risk of side-effects, and the manner in which treatment is administered. However, when further comparing attribute importance by investigating which attribute dominated in relevant pairings, efficacy was deemed more important than any other attribute most often, followed by cost, administration, and side-effects in that order. The predominance of efficacy in these comparisons suggests a robust preference among patients for treatments that promise the highest potential to reduce fracture risk, affirming the critical role of efficacy in patient-centric treatment planning. The results also reveal a complex valuation of cost, which, while often ranked lower than efficacy, still dominates over administration and side-effects in certain scenarios. This reflects the nuanced considerations patients make regarding the affordability and value of treatments relative to their benefits and burdens.

Furthermore, the fact that the vast majority of attributes were significant suggests that patients are willing to make trade-offs between outcome, process, and cost attributes. Only one reported attribute, smoking, which was analyzed as one of the lifestyle factors by Beaudart et al. (2022) [36], was non-significant and among the least important to patients. Beaudart et al. suggest that this may be due to the relatively small percentage of smokers in the study population, rendering a lifestyle intervention in this regard irrelevant for the majority of the population. Interestingly, cost was included in a subset of studies investigating preferences in Spain, the Netherlands, Belgium, Ireland, Switzerland, China, and the USA [6, 29, 30, 31, 32, 33]. Cost was the most important attribute in two of these studies [6, 33], which might reflect the impact of variations in healthcare systems, insurance coverage, and out-of-pocket expenses across different countries.

The findings of this review have several important implications for clinical practice. First, they highlight the necessity of engaging patients in discussions about their treatment options, ensuring that their preferences and values are taken into account. Such patient-centered approaches could enhance adherence to treatment, as patients are more likely to follow through with a treatment plan that they have had a role in selecting. Moreover, the preference for specific treatment attributes over others suggests that healthcare providers should emphasize these aspects when discussing treatment options with patients. This point is reinforced by our findings of significant preference heterogeneity across studies, which demonstrate the wide range of individual preferences regarding treatment attributes. In this respect, Hiligsmann et al. (2017) provided evidence of considerable cross-country differences in preferences, especially regarding the mode of administration, thus highlighting the complexity of aligning treatment options with patient desires in a global context [32]. Cornelissen et al. (2020) further elaborated on the intricacies of patient preferences by showing that osteoporosis patients exhibit diverse preference patterns that do not align neatly with socio-demographic or clinical characteristics, indicating that their treatment choices might be driven by factors not yet fully understood [35]. Furthermore, Beaudart et al. (2022) explored patient preferences for lifestyle behaviors in osteoporotic fracture prevention, adding another layer to the understanding of patient-centered care by revealing a spectrum of patient attitudes towards non-pharmacological treatment or prevention strategies, which may often be overshadowed by the focus on medication [36].

These findings collectively underscore the need for a patient-centered approach in osteoporosis management that goes beyond traditional demographic and clinical indicators to include a broad spectrum of individual preferences, highlighting the necessity for healthcare practitioners to engage patients in detailed conversations about their treatment desires and concerns. By doing so, treatment strategies could be more closely aligned with patient values and expectations. Aligning treatment options with patient preferences can enhance their acceptability, thereby reducing perceived inconvenience or burden. Moreover, insights from stated preference research can target specific obstacles to adherence, such as the perceived risk of side-effects, by enabling proactive communication and offering diverse treatment choices. Additionally, certain models can uncover underlying dynamics in the decision-making processes, highlighting key barriers to adherence. Latent class models are particularly beneficial in identifying distinct subgroups within patient populations that share similar preferences. This can facilitate the development of more targeted and personalized interventions, tailored according to individual patient characteristics or previous experiences. Those approaches can acknowledge the diversity within patient groups, allowing healthcare providers to better match treatments to the varied needs of different patient segments, and potentially improving adherence to treatment.

This multifaceted perspective on patient preferences highlights the complexity of managing osteoporosis and calls for a patient-centered care model that is both flexible and responsive to individual needs and preferences.

Another observation from our review is the relatively modest increase in the use of DCEs to study patient preferences for osteoporosis management. Although there is a notable trend in other disease areas towards incorporating DCEs in the body of evidence, this does not seem to be the case for osteoporosis. Despite the growing recognition of DCEs as a robust method for eliciting patient preferences, capable of capturing complex decision-making processes, their uptake in osteoporosis research lags behind. Additionally, our findings suggest a pronounced need for contemporary data specific to osteoporosis, as half of the included studies were published before 2014, and among the five more recent publications of the last 5 years, two utilized data included in a publication from 2017. This gap underscores the urgency for up-to-date research that reflects current treatment options, healthcare policies, and patient expectations, ensuring that the insights derived are relevant and applicable to today's osteoporosis management practices.

Additionally, our review underscores the need for stated preference research to explore the role of lifestyle and behavioral factors in osteoporosis management. Despite the well-documented challenges of treatment adherence in osteoporosis, there is a significant gap in our understanding of how patients value lifestyle modifications and behavioral interventions alongside medical treatments or even as preventive measures. Here, Beaudart et al. (2022) [36] found that most contextual factors investigated in the subgroup analysis did not significantly influence preferences, apart from a confirmed diagnosis of osteoporosis and age above 64 years. As reluctance towards preventive measures is a considerable impediment to the prevention of osteoporosis, further contextual factors that evoke this reluctance remain to be investigated.

This gap highlights the necessity for a more holistic approach to osteoporosis care that incorporates patient preferences for non-pharmacological strategies. By broadening the scope of stated preference research to include these aspects, we can gain insights into comprehensive management strategies that not only aim to improve bone health but also enhance overall patient well-being and quality of life. This holistic perspective is essential for developing patient-centered care plans that address the multifaceted nature of osteoporosis management.

Furthermore, the comparative lack of focus on cost attributes in the included studies suggests that further research could be beneficial in understanding the role of financial considerations in patient decision-making. Financial considerations, although less frequently identified as a primary concern in the included studies, remain an important aspect of the decision-making process for many patients, particularly in regions with significant out-of-pocket healthcare costs.

The quality assessment of studies included in our review highlights a spectrum of methodological rigor, reflecting the inherent challenges in stated preference research. The application of the PREFS and ISPOR checklists allowed for a structured evaluation, revealing a general improvement in study quality over time, particularly after the publication of these guidelines. This trend suggests a growing adherence to established methodological standards in the field, which is relevant for ensuring the reliability and validity of findings. However, our analysis also identified areas for improvement, particularly in the explanation of methods and the comparison of responders to non-responders, as indicated by the variability in PREFS scores across studies. This underscores the necessity of adopting a standardized approach to conducting and reporting preference studies in osteoporosis, facilitating the comparability and synthesis of findings. High-quality preference studies are paramount to deriving robust insights that can inform patient-centered care practices and policymaking in osteoporosis management.

This review is subject to specific limitations that should be acknowledged. First, the exclusion of non-English-language studies may have omitted relevant research, potentially limiting the comprehensiveness of our findings. Second, limiting the review to two databases could have resulted in the exclusion of relevant studies. To mitigate this risk, the review incorporated manual searches of relevant literature, including forward and backward referencing. Third, the quality assessment of the included studies focused largely on reporting quality and may not account for all potential forms of bias, and may thus miss relevant aspects. This review used a combination of two well-established checklists to counteract this limitation to a certain extent. Another limitation lies in the review's scope, as it primarily focuses on quantitative stated preference studies, which may not capture the full complexity of patient preferences that qualitative methods can uncover. However, restricting a systematic review to stated preference research is well in line with recent literature [19, 20, 21, 22] and was done to ensure the comparability of included studies, the feasibility of analyses, and the relevance of findings. Additionally, the heterogeneous quality of the included studies may affect the quality of this review, although it was conducted in line with best practices in the field of systematic reviews.

5 Conclusion

This systematic review comprehensively examined patient preferences for the treatment of osteoporosis through the lens of stated preference research. Our investigation illuminates the nuanced landscape of patient preferences in the context of osteoporosis management. The review highlights significant preferences, meaning that patients are willing to make trade-offs between attributes, as well as significant preference heterogeneity among patients. In conclusion, our systematic review underscores the importance of a patient-centered approach that encompasses a thorough understanding of individual preferences for treatment attributes. As the field evolves, continued research is essential to deepen our understanding of patient preferences. Such research can help osteoporosis management move towards more personalized, effective, and patient-aligned strategies.