Key Points for Decision Makers

The study employed latent class analysis (LCA) to uncover preference variations among patients with chronic heart failure.

Market simulation indicated that, even with frequent monitoring, most classes preferred therapy options offering significant improvements in mobility, reduced mortality, and fewer hospitalizations.

The findings enable the adaptation of treatment alternatives to cater to the individual needs of patients and specific patient groups.

Overall, the study's results have the potential to enhance both clinical decision-making and resource allocation, ultimately improving the quality of clinical data interpretation in the context of chronic heart failure management.

1 Introduction

Heart failure (HF) affects approximately 26 million people worldwide [1]. The prevalence is 2–3% of the adult population in the Western world [2]. An estimate of the prevalence in the total European population is between 0.4 and 2% [3]. In 2015, heart failure was the most frequent individual diagnosis treated in hospitals, with around 450,000 patients. Most patients are over 65 years old [4]. The incidence of heart failure increases with age, and men have a higher incidence than women [5].

In chronic heart failure, the heart is no longer capable of providing a sufficient supply of blood. As a result, there is not enough oxygen to maintain metabolism under both stress and rest conditions. This malfunction can be caused by a systolic disorder, in which the pump function is restricted, or by a diastolic disorder, in which the ventricular blood flow is reduced. The symptoms of heart failure vary. They can range from physical weakness to edema or even organ failure [6, 7].

Heart failure is usually diagnosed clinically on the basis of symptoms. If the symptoms indicate heart failure, a thorough anamnesis and physical examination should follow [8]. The anamnesis can be used to determine risk factors such as smoking, hypertension, diabetes mellitus, or obesity, which can lead to coronary heart disease and, later on, to chronic heart failure [6, 9].

The New York Heart Association (NYHA) classification is used to describe the severity of symptoms and exercise intolerance of heart failure patients [6].

Therapy goals include reducing heart failure symptoms, increasing exercise tolerance, reducing mortality and hospitalization rates, and improving quality of life [10]. Heart failure patients with NYHA stages I–III should only practice moderate physical exercise [11]. If underlying diseases that cause heart failure can be treated, interventional or surgical treatment should be performed [12]. Drug therapy is another option. Innovative approaches to monitoring heart failure have been developed that allow electronic transfer of physiological data using remote access technology via external, wearable, or implantable electronic devices. These remote approaches allow frequent or continuous assessment of physiological parameters related to HF and serve as the basis for early detection of HF worsening, thus permitting remote disease management [13].

The aim of the study was to analyze the preferences of heart failure patients for the implementation of a monitoring system. Only a few studies have been published that analyze the perspective of patients with heart failure on benefit–risk tradeoffs [14, 15]. A best–worst scaling case 3 experiment was performed to assess the impact of treatment effects on choice decisions for a monitoring system. The study results should contribute to the improvement of treatment decisions and outcomes.

2 Methods

2.1 Best–Worst Scaling Case 3

Study design and analysis are based on guidelines for conducting a discrete choice experiment (DCE) [16,17,18,19]. The DCE methodology is based on the multi-attribute utility theory, which assumes that every product or service can be characterized by several key attributes and levels. It implies that people rationally choose between different choice alternatives, e.g., treatment options, by comparing the attributes and levels while evaluating different aspects of the alternatives. The relative importance of attributes is determined by analyzing tradeoffs between decision-relevant attributes and their levels. DCEs are quantitative methods for measuring stated preferences, which are widely used and accepted in healthcare, and increasingly discussed by regulatory bodies [20,21,22,23].

Best–worst scaling (BWS) is a specific variant of the traditional DCE. In contrast to the DCE approach, where respondents choose only the best option from a set of options, BWS requires respondents to identify both the best and worst option within a given choice scenario. The BWS method distinguishes three different cases: the object case (case 1), the profile case (case 2), and the multiprofile case (case 3) [24]. In the object case, different attributes are presented from which individuals choose the best and worst. In the profile case, individuals are presented with attribute levels that represent a choice alternative, such as a medical intervention, and they then choose the best and worst attribute levels instead of the attributes themselves. The multiprofile case used in this study involves three or more alternatives (profiles) from which individuals choose the best and, unlike a traditional DCE, the worst alternative. The fundamental principle underlying this approach is that respondents define the extremes of a latent, subjective spectrum. Essentially, they are asked to choose the pair of options that maximizes the difference in value (on the scale of latent utility) between them [25]. With three alternatives in a choice scenario, a complete ranking of preferred alternatives can be created. In the case of more than three alternatives, a follow-up question can be asked regarding the remaining alternatives [19]. In contrast to the traditional DCE, which only uses information on the best choice, BWS provides a larger amount of information for the analysis. BWS is recognized as a valuable method for analyzing patients' preferences [26, 27], and it has already been used successfully in the healthcare sector [28,29,30].

2.2 Attributes and Levels

The aim of the study was to analyze the preferences of heart failure patients for the implementation of a monitoring system. Hence, the “treatment” in this case encompasses a monitoring system rather than actual therapy. Consequently, the attributes used refer to the place and frequency of monitoring, comparing an implantable (home-based) device with a standard regular check at the physician's office. To determine patient-relevant attributes and levels for the BWS survey, a nonsystematic literature search, focusing on current treatment options and potential monitoring alternatives, was conducted. Qualitative pretest interviews were used to identify and outline the benefits and risks of an innovative implantable monitoring system. Development of the questionnaire and identification of attributes and levels for the BWS were based on qualitative pretest interviews (N = 9) conducted with patients in Germany. The purpose of the interviews was to use the information provided by the patients in an open dialogue to identify and evaluate their wishes and needs for heart failure treatment. In the semi-structured interviews, all relevant attributes of a therapy for chronic heart failure were identified. In subsequent interviews (N = 5), the developed questionnaire was tested for comprehensibility and meaningfulness.

Based on the preliminary investigations, five attributes with different levels were determined as critical for heart failure care (see Fig. 1): mobility (500 m, 400 m, 300 m, 200 m, 100 m, 50 m), risk of death (low: 3%, medium: 13%, high: 23%), risk of hospitalization (low: 10%, medium: 25%, high: 40%), type and frequency of monitoring (9×, 32×, or 56× per year, at home or at the doctor's office), and risk of medical device and system relevant complications (no risk: 0%, mild: 1%, high: 2%).

Fig. 1 Example choice task

In the decision model, complications are exclusively associated with the device and system and pose no life-threatening risks. Therefore, even a 2% chance of complications would only mean encountering a system malfunction (e.g., false measurement of heart rate) without any impact on the individual health condition. System failures (e.g., sensor does not fully deploy during implant, embolism of sensor) simply result in a return to standard routine doctor visits. By contrast, a high risk of death could indicate inadequate monitoring of heart failure, insufficient control of the device, or system- or device-related failures. Prior to starting the choice experiment, all attributes and their respective levels were presented to the respondents in a clear and easily comprehensible manner, using plain language, minimizing the use of technical jargon, and incorporating common everyday terms.

2.3 Recruitment

The study population included German patients with NYHA class 3 heart failure aged 18 years and older, recruited with the help of an external market research company. The referral was made via the treating physicians. Since the recruitment was done by a specialized vendor, no information on the total number of possible participants invited can be given.

The participants had to have good-to-very-good German language skills and give their consent to participate in the study. The survey was conducted between October 2019 and August 2020. No quotas were used except for age (≥ 18 years) and classification into NYHA class (validated by the treating physician as well as self-reported by respondents). Recruiting such severely ill patients for a study in sufficient numbers is extremely difficult. Therefore, no quotas were applied to ensure representativeness.

2.4 Experimental Design

A D-efficient fractional factorial design was used for the survey instrument. The efficient Bayesian design was created using Ngene software version 1.2.0 [31]. Prior coefficients were based on pretest interviews and objective assumptions. The experimental design consisted of a total of 240 choice tasks, which were divided into 20 blocks. Each respondent was thus assigned 12 choice tasks. The D-error was used as a measure of design efficiency [32]. The given experimental design and the prior assumptions resulted in a D-error of 0.015, which is sufficiently low for the given design dimension. A dominance test was applied to assess the validity (rationality) of the respondents' choice decisions. In addition, a reliability test was performed by means of a duplicated choice task (test–retest). In total, the respondents therefore had to complete 14 choice tasks. The two additional choice tasks were excluded from the statistical model analysis. Within a choice task with three treatment alternatives, the respondents were asked to choose the most and least preferred alternative. The order of the attributes was randomized once per respondent to avoid order effects.

2.5 Statistical Analysis

The statistical analysis and interpretation mainly focused on latent class analysis (LCA) to explore preference heterogeneity. Latent class models assume that individual choice behavior depends on observable attributes and latent heterogeneity, which can vary with unobservable factors. Individuals are implicitly assigned to a number of different classes with homogeneous preferences. Heterogeneity in latent class models is analyzed by discrete parameter variation. The model assumes that there are classes of respondents within the sample that have identical preferences within each class but that differ systematically from preferences in other classes [33, 34]. Part-worth utility estimates were calculated for attribute levels; the relative importance of the attributes was also estimated. The statistical analysis was performed with Stata 15 [35] and the lclogit module [34]. Probabilities of class membership were predicted after lclogit estimation. The resulting distribution of sociodemographic characteristics in each class can be seen in the electronic supplementary material.
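The prediction of class membership can be illustrated with a minimal Python sketch. It is not the Stata lclogit routine used in the study; it only applies Bayes' rule under the assumption of a multinomial logit model within each class. The function names, class shares, coefficients, and choices are hypothetical and chosen for illustration.

```python
import math

def mnl_probs(utilities):
    """Multinomial logit probabilities for one choice set."""
    top = max(utilities)
    exps = [math.exp(u - top) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def posterior_class_membership(class_shares, class_betas, choices):
    """Posterior probability that one respondent belongs to each class.

    class_shares: prior class shares (sum to 1)
    class_betas:  one coefficient list per class
    choices:      list of (alternatives, chosen_index); each alternative is a
                  list of attribute values aligned with the coefficients
    """
    joint = []
    for share, beta in zip(class_shares, class_betas):
        likelihood = 1.0
        for alternatives, chosen in choices:
            utils = [sum(b * x for b, x in zip(beta, alt)) for alt in alternatives]
            likelihood *= mnl_probs(utils)[chosen]
        joint.append(share * likelihood)
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical two-class example with two attributes (not study estimates).
shares = [0.5, 0.5]
betas = [[1.0, 0.0], [0.0, 1.0]]
# The respondent twice picks the alternative that is better on attribute 1.
choices = [([[1, 0], [0, 1], [0, 0]], 0), ([[1, 0], [0, 1], [0, 0]], 0)]
posterior = posterior_class_membership(shares, betas, choices)
```

In this made-up example the respondent is assigned mostly to the first class, because only that class values the attribute on which the chosen alternatives were better.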

Several data formats are possible for BWS case 3 experiments, each modeling different possible decision processes [36]. For data analysis, the best and worst choice data were combined in a sequential best–worst logistic regression. The dataset was exploded by retaining all alternatives in the subsequently constructed pseudo choice observation. In this process, the coding of the worst choice was inverted, that is, the values of the variables were the negatives of the best values. These modified data were then appended to the best data, creating a combined dataset. This approach reflected the pattern of how the choices were made within the choice task.
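The data preparation described above can be sketched in a few lines of Python. This is an illustrative reading of the text, not the study's actual Stata code: the function name, the row layout, and the example attribute values are assumptions, and the worst-choice observation keeps all alternatives because the text states that all alternatives were retained.

```python
def explode_best_worst(task_id, alternatives, best, worst):
    """Turn one best-worst task into two pseudo choice observations.

    alternatives: list of attribute-value lists (one per alternative)
    best, worst:  indices of the alternatives chosen as best and worst
    Returns rows of (observation_id, alternative_index, attributes, chosen).
    """
    rows = []
    # Best choice: attributes coded as presented.
    for i, attrs in enumerate(alternatives):
        rows.append((f"{task_id}-best", i, list(attrs), int(i == best)))
    # Worst choice: all alternatives retained, attribute values negated.
    for i, attrs in enumerate(alternatives):
        rows.append((f"{task_id}-worst", i, [-x for x in attrs], int(i == worst)))
    return rows

# Hypothetical task with three alternatives and two coded attributes.
rows = explode_best_worst("t1", [[1, 0], [0, 1], [-1, -1]], best=0, worst=2)
```

Each original task thus contributes two stacked observations, one for the best and one for the worst choice, which can be passed to a conditional logit routine.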

Data from all 278 respondents were used for the final analysis. In addition, model calculations were performed that only included those respondents who had passed the dominance (N = 255) and reliability (N = 166) tests (see electronic supplementary material). Incorporating respondents who fail specific tests or struggle to comprehend the choice tasks and context can introduce biases. Yet, apparently illogical choices might genuinely represent preferences, and their exclusion could introduce bias, especially if certain choice behaviors are more common in certain populations like those with lower education levels or older age [37]. Even seemingly irrational choices can be meaningful. Failing a repeated choice task test is not, on its own, a reliable criterion for exclusion [38]. Moreover, as the survey progresses, a learning effect takes place and respondents tend to make more accurate and consistent decisions as they become more familiar with the survey structure and their own preferences. However, the impact of learning effects on parameter estimates is expected to be minor [39]. Comparing outcomes across models with various class solutions (refer to electronic supplementary material) reveals consistent results regardless of sample size. Variations exist only in specifics, with no difference in interpretation. Hence, the decision was made to include the entire sample, encompassing all 278 respondents.

Finally, to analyze the demand of the individual classes for different therapies, a choice simulation using the coefficients of the LCA was conducted. Choice simulations (or what-if simulators) are used to predict how the real market would probably choose among a set of competing alternatives [40]. The simulation offers the possibility to predict the possible or potential market share of new products or therapy alternatives that do not yet exist on the market. However, the preference shares from these predictions do not have to match the actual market shares. These predictions are based on (aggregate) model results, the decision model, availability of an opt-out option in the choice decisions, and the study sample used. In the real world, other factors may also play a role in the decision for or against a therapy alternative.

For the study, we performed a simulation with different treatment alternatives including a baseline therapy that mimics the results of competing monitoring devices for patients with chronic heart failure and predicted the choice probability for each latent class.
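A minimal sketch of such a segmented choice simulation is given below. It assumes the standard logit share rule, in which the utility of a therapy is the sum of the part-worths of its levels. The class labels, level names, part-worths, and therapy profiles are hypothetical and do not reproduce the study's estimates or scenarios.

```python
import math

def logit_shares(utilities):
    """Convert total utilities of competing alternatives into choice shares."""
    top = max(utilities.values())
    exps = {name: math.exp(u - top) for name, u in utilities.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}

def simulate(class_coefs, therapies):
    """Predict choice shares of each therapy separately for each latent class."""
    result = {}
    for cls, coefs in class_coefs.items():
        utilities = {name: sum(coefs[level] for level in levels)
                     for name, levels in therapies.items()}
        result[cls] = logit_shares(utilities)
    return result

# Hypothetical part-worths for two classes (illustration only).
class_coefs = {
    "mobility-focused": {"mob_500m": 1.0, "mob_200m": -1.0,
                         "death_low": 0.2, "death_high": -0.2},
    "mortality-focused": {"mob_500m": 0.1, "mob_200m": -0.1,
                          "death_low": 1.5, "death_high": -1.5},
}
# Hypothetical therapy profiles described by their attribute levels.
therapies = {
    "base case": ["mob_200m", "death_high"],
    "therapy A": ["mob_500m", "death_high"],
    "therapy B": ["mob_200m", "death_low"],
}
shares = simulate(class_coefs, therapies)
```

With these made-up values, the mobility-focused class mostly chooses therapy A and the mortality-focused class mostly chooses therapy B, which mirrors the logic of comparing predicted shares across classes.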

3 Results

3.1 Sample Population

Overall, 278 respondents completed the survey (see electronic supplementary material). In total, 156 men and 122 women participated in the study. On average, the respondents were 59.4 years old (min: 18, max: 85 years old). Nearly half of the respondents were married (47%). Forty-two percent of respondents were retired, 28% were employed full time, and 19% part time. Most respondents passed the dominance test (N = 255), and 166 respondents passed the reliability test.

3.2 Latent Class Model

The purpose of LCA is to divide the respondents into two or more classes that differ substantially from each other and are internally homogeneous. In the LCA, the four-class solution produced the best and most interpretable model. The two- to four-class solutions were calculated using the entire sample and the adjusted sample based on the consistency (test–retest stability) and dominance tests (those who failed were excluded). The statistical results and segment sizes of the classes can be found in the electronic supplementary material. The Bayesian information criterion (BIC) was used as a measure of goodness of fit. Because of the small sample size, convergence problems with a five-class solution, and ease of interpretation, the four-class model was used for the final statistical analysis. Adding further classes, with class sizes well below 10% and expected large standard errors, would add little information to the model. The assignment of respondents to individual latent classes resulted in two smaller and two larger classes. Class 2 was the smallest, with 9% (N = 25) of the sample. Class 1 contained 10.8% (N = 30) of the sample. The second largest class was class 3 with 30.9% (N = 86). Class 4 contained the largest share of the total sample with 49.3% (N = 137).
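The comparison of class solutions by BIC can be sketched as follows. The log-likelihood values are invented for illustration and are not the study's results. The parameter count assumes 16 effects-coded coefficients per class (one fewer than the number of levels for each attribute) plus class share parameters without covariates, and taking the number of respondents as the sample size is likewise an assumption.

```python
import math

def n_parameters(n_classes, n_coefs_per_class):
    """Utility coefficients per class plus (n_classes - 1) class share parameters."""
    return n_classes * n_coefs_per_class + (n_classes - 1)

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower values indicate a better trade-off."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# Invented log-likelihoods for two- to four-class solutions (illustration only).
candidates = {2: -3100.0, 3: -3000.0, 4: -2900.0}
scores = {c: bic(ll, n_parameters(c, 16), 278) for c, ll in candidates.items()}
best = min(scores, key=scores.get)
```

The class solution with the lowest BIC is preferred, although, as in the text, interpretability and class sizes may also enter the decision.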

In the statistical analysis, the signs of the coefficients were mostly as expected. Coefficients with positive signs indicate preferred levels that benefit the participants, whereas negative signs indicate nonpreferred levels and rejection of these levels in the choice of treatment alternatives. The coefficients are centered on zero (effects coded) within each attribute and interpreted as the relative utility of each attribute level. The higher a level coefficient, the greater the benefit for the respondents and the greater the influence on the choice decisions. Conversely, the smaller a level coefficient, the smaller the benefit and the smaller the influence. A negative level coefficient reduces the overall benefit of a treatment alternative. Table 1 presents the results of the LCA. For each of the four classes, the mean coefficients (Coef.) are shown together with the standard errors (Std. Err.) and significance at the 95% confidence level (Sign.).
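Effects coding implies that the coefficients of one attribute sum to zero, so the coefficient of the omitted level equals the negative sum of the estimated ones. A small sketch, with a hypothetical function name and made-up coefficients for a three-level attribute:

```python
def complete_effects_coding(estimated, omitted_level):
    """Add the omitted level so that all coefficients of an attribute sum to zero.

    estimated: dict mapping level name to estimated coefficient
    omitted_level: name of the level left out during estimation
    """
    full = dict(estimated)
    full[omitted_level] = -sum(estimated.values())
    return full

# Hypothetical estimates for "risk of death" (not taken from Table 1).
risk_of_death = complete_effects_coding({"low": 0.9, "medium": 0.1}, "high")
```

Here the omitted level "high" receives a coefficient of about -1.0, which shows why a negative value marks a level that lowers the utility of an alternative relative to the attribute's average.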

Table 1 Latent class models

Better mobility, i.e., being able to cover a greater distance by walking, is usually preferred. However, there are differences in the individual classes in this respect. Better mobility is clearly preferred by class 1, but not by classes 2 or 3.

All classes are relatively consistent in their preference for the risk of death. A lower risk is preferred over a higher risk. With regard to the degree of preferability, however, there are also differences between the individual classes.

Apart from class 2, the same applies to the risk of hospitalization. A lower risk is clearly preferred over higher risks by participants in three classes. Class 2 seems to be completely indifferent regarding the risk of hospitalization.

A clear difference between the classes is seen in the “type and frequency of monitoring”: more frequent monitoring is not always considered better. There are also differences in the preference for monitoring at home or at the doctor's office. Only class 2 shows a clear preference regarding monitoring frequency, with nine times being the most preferred and preference decreasing monotonically as the frequency increases. Class 3 also shows monotonically decreasing preferences for higher monitoring frequency, but these preferences are not substantively differentiated.

In the context of the given decision scenario, the significance of medical device and system-relevant complications appears to be relatively low for participants across all classes compared with the other attributes.

The significance analysis sheds light on the primary considerations of each class when deciding on a treatment alternative. None of the classes exhibited statistically significant estimates across all attributes, indicating that individual attributes exerted a relatively minor impact on choice decisions. Substantial differences among the classes emerged, each with distinct focal points in their decision-making processes.

Figure 2 shows the coefficients graphically. The graphs are displayed separately for each of the four classes. The vertical bars around the coefficients represent the 95% confidence interval. If the confidence intervals for adjacent levels overlap within an attribute, the coefficients of these levels cannot be clearly distinguished statistically. If the confidence interval of a level coefficient crosses the zero line, the level is not statistically significantly different from zero. The figure shows from top to bottom the individual classes and within each class from left to right the utility values for the levels of the attributes mobility, risk of death, risk of hospitalization, type and frequency of monitoring, and risk of medical device and system relevant complications. In the following, the individual classes are described in terms of their preferences.
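The reading of the confidence intervals can be made concrete with a short sketch. It assumes a normal approximation with a critical value of 1.96 for the 95% interval; the function names and the coefficient and standard error values are hypothetical.

```python
def confidence_interval(coef, std_err, z=1.96):
    """Approximate 95% confidence interval under a normal approximation."""
    return coef - z * std_err, coef + z * std_err

def is_significant(coef, std_err, z=1.96):
    """True if the interval excludes zero, i.e., it does not cross the zero line."""
    lower, upper = confidence_interval(coef, std_err, z)
    return lower > 0 or upper < 0

# Hypothetical level coefficients with their standard errors.
clearly_positive = is_significant(0.5, 0.1)   # interval 0.304 to 0.696
near_zero = is_significant(0.1, 0.1)          # interval -0.096 to 0.296
```

A coefficient of 0.5 with a standard error of 0.1 would therefore count as significant, whereas a coefficient of 0.1 with the same standard error would not.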

Fig. 2 Part-worth utilities of the latent class analysis

3.2.1 Class 1

Class 1 seems to attach great importance to mobility. In contrast to the other classes, all coefficients of this attribute are significant (different from zero). A low risk of death is important, but of less benefit than in the other classes. It is noticeable that class 1 is the only class that prefers to be monitored 32 times by a physician. Monitoring nine times at home would be equally beneficial. All other levels of the monitoring attribute tend to be rejected. The coefficient for monitoring 56 times at the doctor's office is significantly negative, indicating a strong rejection. Class 1 can be described as the "mobile class," which does not hesitate to visit the doctor 32 times a year.

3.2.2 Class 2

For class 2, the type and frequency of monitoring seems to be very important. For this attribute, all coefficients are statistically significantly different from zero (p < 0.001), except the levels 32 times per year at home and 32 times per year at the doctor. The risk of death (level difference 0.85) and the risk of medical device and system relevant complications (level difference 0.51) are both important, with the risk of death slightly ahead. However, the risk of complications seems to matter more to respondents in this class than to those in the other classes. Preferences of class 2 respondents for mobility appear to be undifferentiated. All level coefficients are close to zero and confidence intervals overlap. The attribute risk of hospitalization is of low importance to respondents in class 2. In summary, class 2 can be described as the “immobile and monitored class.” However, it needs to be stated that class 2 is the smallest of the four classes, consisting of 9% of respondents (N = 25).

3.2.3 Class 3

Class 3 seems to attach great importance to risk of hospitalization. In contrast to the other classes, all coefficients of this attribute are statistically significantly different from zero. The risk of death is less important than risk of hospitalization, but more important than for respondents in classes 1 and 2. In comparison with the other risk attributes, the risk of medical device and system relevant complications is less important. A high risk of complications is nevertheless clearly rejected, as indicated by a level that deviates significantly from zero, albeit with a relatively small impact.

Class 3 respondents would prefer to have their heart failure monitored nine times a year, regardless of whether the monitoring is done at home or at the doctor. Regarding mobility, a longer distance is preferred over a shorter distance. The curve of the levels is almost linear, but rather flat with only the first and last level coefficient significant. The class seems to consist of “risk avoiders.”

3.2.4 Class 4

For respondents in class 4, the attribute risk of death seems to be more important than for the remaining classes. Level coefficients for low and high risk are statistically highly significant and therefore have a strong influence on the respondents’ choice decisions. This attribute is followed by risk of hospitalization, which also has a great impact on choices. The remaining attributes seemed to have less influence on choice decisions of class 4 respondents. Due to the main focus on the attribute risk of death, the class can be described as “mortality risk avoiders.”

3.3 Attribute Nonattendance

As a starting point for the identification of attribute nonattendance (ANA), a latent class model with the maximum number of ANA classes can be estimated. However, the main focus of this study was not to distinguish between attribute nonattendance, i.e., the systematic ignoring of individual attributes, and actual preferences. Moreover, ignoring attributes can also mean that they were merely paid less attention and assigned less importance. The electronic supplementary material presents an additional analysis of attribute nonattendance using latent class modeling techniques. The analysis shows that the various latent classes that represent different combinations of attendance and nonattendance across attributes have very low probabilities of class membership, i.e., only a few respondents were assigned to each of the latent classes in the analysis. This indicates that the attributes were not systematically ignored in the choice decisions. One exception was a class in which class members appeared to pay attention only to the risk of hospitalization attribute and ignored all other attributes.

3.4 Relative Attribute Importance

The relative importance of each attribute in this given decision context of the study is derived from the range between the lowest level coefficient and the highest level coefficient. This range is then normalized on a scale of ten, where ten is the highest and thus the best score.
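The calculation can be sketched as follows. The sketch assumes that the attribute with the largest coefficient range within a class receives the score of ten and that the other attributes are scaled proportionally; the text does not spell out the exact normalization, so this is one plausible reading. The function name and coefficients are hypothetical.

```python
def relative_importance(coefs_by_attribute, scale=10.0):
    """Scale each attribute's coefficient range so that the largest range equals `scale`."""
    ranges = {attr: max(values) - min(values)
              for attr, values in coefs_by_attribute.items()}
    largest = max(ranges.values())
    return {attr: scale * r / largest for attr, r in ranges.items()}

# Hypothetical level coefficients of one class (not taken from Table 1).
importance = relative_importance({
    "mobility": [0.8, 0.3, -0.2, -0.9],   # range 1.7
    "risk of death": [0.5, 0.0, -0.5],    # range 1.0
})
```

In this example mobility scores ten and risk of death about 5.9, i.e., the spread between the best and worst level of mobility moves utility roughly 1.7 times as much as that of risk of death.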

The relative attribute importance, i.e., the focus of the respondents of the individual classes in their choice decisions, is compared across classes in Fig. 3.

Fig. 3 Relative attribute importance of the latent classes

Class 1 respondents focused mainly on mobility. All other attributes seem to be subordinate to mobility and far less important. Class 2 respondents attached the greatest importance to monitoring. The risk of hospitalization seems to have had little importance. Class 3 respondents paid most attention to the risk of hospitalization in their choices, and only a little less attention to risk of death. For the respondents grouped in class 4, the risk of death was by far the most important attribute. Monitoring and risk of medical device and system relevant complications seemed to have little importance.

3.5 Segmented Choice Simulation

The LCA revealed that classes 1 (11%) and 2 (9%) each contain only a small proportion of the study population, while classes 3 (31%) and 4 (49%) comprise the larger proportion. The respondents of each class have varying preferences and different rank orders of attributes (see Fig. 3). We designed four scenarios based on the assumptions in Table 2 to simulate therapy choice. The simulation assumes that all therapies have the same level of complications. However, compared with the base case, treatment alternatives are improved in terms of hospitalization, mobility, and mortality. Table 2 also presents the choice probabilities of the individual classes calculated in the simulation.

Table 2 Attribute levels in the simulated choice scenario and the simulation results

Class 1 derives the most benefit from the improvement in mobility. Given the alternatives in the table, the majority of class 1 (78%) would probably choose therapy 3, which offers the highest level of mobility. Class 2, which includes the smallest share of the sample, is the only class that places greater importance on a lower frequency of doctor visits. Accordingly, about half of the respondents in this class would choose the base case (49%). The most important attributes for the larger classes 3 and 4 were mortality and hospitalization risk. The majority of both classes would opt for therapy 3, which is improved over the base case in terms of mortality as well as hospitalization risk. The prediction of the choice probabilities of class 4 indicates that for the respondents of this class the attribute “Mobility” plays no or only a minor role in the choice of a therapy alternative. Therefore, the allocation of respondents to therapies 2 and 3 is about the same in the simulation. Therapy 2 is objectively dominated by therapy 3 because of the better mobility level, but this does not play a role for the respondents in their choices.

4 Discussion

Chronic heart failure is a serious disease that requires patients to adhere to the therapy. The provision of therapy and a monitoring system is very complex and requires a balance between potential benefits and risks. It is therefore important to know patients' preferences regarding treatment and monitoring alternatives and their consequences. The practical approach of BWS can help to improve communication between patients and caregivers, support clinical and allocative decision-making, and improve the quality of interpretation of clinical data. Utilizing the acquired knowledge, therapies can be administered in a more patient-oriented approach, enhancing the effectiveness and efficiency of care delivery, and, ultimately, resulting in greater patient benefit.

In the analysis, four classes of respondents with different preferences were identified. Overall, the results showed a very heterogeneous profile of the study population. Except for the attribute risk of system relevant and medical complications, each class had one attribute with at least two very significant attribute levels. In the decision context, the risk of complications had the least influence on patients' choices and thus the least significance or benefit compared with the other attributes. However, the degree of importance of the individual attributes varied from class to class.

The classes could be described in terms of sociodemographic characteristics and patients' experiences with the disease. However, the small sample size of the study population is a limitation of the study. A total sample of 278 tends to be at the lower limit for an LCA. This is also reflected in the partly high standard errors of greater than 0.01. For the LCA, a larger sample size is strongly recommended [40, 41]. This must also be considered when comparing and interpreting the characteristics of the four classes. Especially in the two smaller classes, only tendencies toward certain characteristics can be indicated. With a larger sample, further studies on possible interaction effects and scale heterogeneity could be conducted.

For some classes, it could be assumed that the respondents on average focused on fewer attributes, or even only one attribute, in their choice decisions. Attribute nonattendance models could be applied to consider attribute processing strategies and to analyze the existence of dominant preferences in more depth [42, 43]. It also has to be considered whether the nonconsideration of attributes is a heuristic or a preference [44].

Limitations: A larger sample size could have improved the statistical precision of the results by yielding smaller standard errors. An attribute level can be statistically significant even with a larger standard error, but it is then estimated with less precision. Moreover, it must be stated that this is a rather young, well-educated, well-managed group of patients who are doing well with treatment. They might not be completely representative of the majority of patients with heart failure or of the sicker end of the spectrum that might benefit most from intensive monitoring. The BWS included two additional choice tasks, which were designed as a dominance test to identify irrational decision makers and as a consistency test. The dominance test was not passed if the respondent chose the dominated alternative with the highest risks and poorest mobility. In total, 255 respondents passed this test. The consistency test, involving the duplication and repetition of a choice task within the BWS, was passed by 166 respondents. Both tests were passed by 154 respondents.

Various models estimated with the full and the reduced sample are shown in the electronic supplementary material. The BWS models benefit from using all of the available choice data, particularly when the sample size is relatively small. This is evident in the smaller standard errors, and hence the greater statistical significance, in the full data set compared with the reduced sample.
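The size of this effect can be approximated with a short calculation. The following Python sketch assumes that standard errors scale with the inverse square root of the number of respondents, with everything else (number of choice tasks per respondent, error variance) held constant; this is a textbook approximation, not a result of the study. The sample sizes 278 and 154 are taken from the text, whereas the coefficient and its standard error are hypothetical values chosen for illustration.

```python
import math


def se_inflation(n_full: int, n_reduced: int) -> float:
    """Approximate factor by which standard errors grow when the sample shrinks.

    Assumes SE is proportional to 1 / sqrt(N), all else being equal.
    """
    return math.sqrt(n_full / n_reduced)


def z_statistic(coef: float, se: float) -> float:
    """Wald z-statistic of a coefficient."""
    return coef / se


# Sample sizes reported in the text: full sample and respondents passing both tests.
factor = se_inflation(278, 154)

# Hypothetical coefficient and standard error in the full sample.
coef_full, se_full = 0.30, 0.12
se_reduced = se_full * factor

z_full = z_statistic(coef_full, se_full)
z_reduced_same_coef = z_statistic(coef_full, se_reduced)

# Coefficient needed in the reduced sample to reach the same z as in the full sample.
coef_needed = coef_full * factor

print(f"SE inflation factor: {factor:.2f}")
print(f"z (full sample): {z_full:.2f}")
print(f"z (reduced sample, same coefficient): {z_reduced_same_coef:.2f}")
print(f"coefficient needed for equal z: {coef_needed:.2f}")
```

Under these assumptions, the standard errors in the reduced sample are roughly 1.34 times those in the full sample, so a coefficient would have to be about one third larger to reach the same level of significance.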

With decreasing sample size, the standard errors of the coefficients increase and the precision of the estimation therefore decreases. However, excluding irrational decision-makers can lead to more consistent choices in the analysis, and thus to larger coefficients, which can partly compensate for the negative effect of larger standard errors on significance.

Monitoring technologies can be of great benefit to disease management. Patients in the study were asked about monitoring without its value being explicitly explained. It is possible that patients did not see or understand the benefits of monitoring. It remains uncertain whether respondents view continuous monitoring as burdensome. Individuals with more severe medical conditions may prioritize their wellbeing over prognosis, while those in good health may place a higher value on longevity because of their quality of life. To address this question, a subgroup analysis based on severity, gender, and age was conducted.

Additionally, the survey was conducted before the outbreak of the COVID-19 pandemic. It is quite possible that the results would have been different had the survey been conducted after the outbreak, and that patient preferences for home monitoring would have been significantly greater than for monitoring at a doctor’s office. The rise of telemedicine and digital healthcare consultations in the wake of the COVID-19 pandemic has led to the possibility, and in some cases the expectation, that individuals have distinct perspectives and preferences regarding their healthcare options.

Avoiding the risk of infection through contact in waiting rooms, doctors’ offices, or public transportation could be critical in this regard. However, healthcare also depends on the personal doctor–patient relationship. Furthermore, the discussion about data protection could also be perceived negatively and reinforce an unpleasant feeling of being constantly monitored. A further study under different circumstances may provide insight here. After all, the coronavirus pandemic could act as a digital health accelerator for a change in perception and uptake of digital health solutions.

A forced-choice task format was used in this study because the goal was to analyze potential participation in treatment. Respondents were forced to choose one of the alternatives in each choice task. There was no option to choose none of the alternatives. The use of this format was intended to reflect the actual choices of the study population. Medical decisions often involve critical choices and, in certain healthcare situations, patients may not have the option to opt out of treatment. The lack of an opt-out option thus reflects real-world decision-making. Future DCE studies could incorporate a dual response scheme, in which respondents are first asked to choose from among all presented alternatives and are then asked, in a second question, whether they would opt out if given the choice [45]. This would reduce the risk that, with an integrated no-choice option, a large number of respondents would not choose any of the available alternatives, leaving researchers with insufficient information about preferences to estimate the analytic model [46].

Scale factors were not taken into account in this study. Models that incorporate scale, such as scale-adjusted latent class models, allow for the exploration of variation in scale and preferences within individual classes. However, consistent with the goals of our study and our nontechnical audience, we chose to use a simple latent class model to explore preference differences. This decision balanced the simplicity of the model against the potential benefits of including the scale factor [47].
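What a scale factor does can be illustrated with a standard multinomial logit choice rule. The Python sketch below is a generic illustration and not the model estimated in this study: the utilities and the scale values are hypothetical. A larger scale corresponds to more deterministic choices, whereas a smaller scale corresponds to noisier choices, even though the underlying preference ordering is identical.

```python
import math


def choice_probabilities(utilities, scale=1.0):
    """Multinomial logit choice probabilities with a scale factor.

    P_i = exp(scale * V_i) / sum_j exp(scale * V_j)
    """
    scaled = [scale * v for v in utilities]
    max_scaled = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - max_scaled) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ranking(values):
    """Indices of the alternatives sorted from most to least preferred."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)


# Hypothetical utilities of three therapy alternatives (same preferences in both groups).
utilities = [0.8, 0.2, -0.5]

p_low_scale = choice_probabilities(utilities, scale=0.5)   # noisier choices
p_high_scale = choice_probabilities(utilities, scale=2.0)  # more consistent choices

print([round(p, 3) for p in p_low_scale])
print([round(p, 3) for p in p_high_scale])
```

Because only the product of scale and utility enters the choice probabilities, differences in choice consistency can be mistaken for differences in preferences when scale is ignored, which is the issue that scale-adjusted models are meant to address.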

5 Conclusions

In this study, a choice simulation crafted a specific market scenario featuring diverse therapy options, allowing for predictions of participant preferences within the latent classes. The results revealed substantial market shares for therapies with low mortality and hospitalization risks in three of four classes, particularly when accompanied by significant mobility improvements.

BWS provides a practical avenue to enhance patient-provider communication, support clinical and allocative decision-making, and improve the interpretation of clinical data. Despite these advantages, the study has limitations, notably the constrained predictive capabilities of market choice simulations, which forecast potential preferences for nonexistent products. These projections may not align with actual preferences.

In essence, these findings have the potential to significantly enhance patient care effectiveness and benefits. Tailoring medical interventions to individual preferences in chronic heart failure treatment can boost patient satisfaction, treatment adherence, and clinical outcomes. Policymakers can incorporate these insights into resource allocation and healthcare optimization, ultimately improving overall patient care.

In summary, our focus on patient preferences offers avenues for direct improvements in treatment decisions at the patient group level.