Key Points

The HOSPITAL score and LACE index showed good accuracy in predicting mortality in older patients with multiple medications and comorbidities.

The HOSPITAL score and LACE index can help clinicians and patients to make decisions about investigations and treatment after hospitalization.

1 Introduction

Accurately predicting mortality from readily available clinical data is critical to informing future diagnostic and treatment choices. This is especially important in potentially high-risk individuals such as older patients with multimorbidity, which affects 60% of adults aged ≥ 65 years and is associated with polypharmacy, lower quality of life, higher healthcare resource utilization and mortality, and substantial burden for patients, caregivers, and healthcare systems [1,2,3,4,5]. Increased healthcare use and polypharmacy in older multimorbid patients are partly related to screening procedures and preventive medications, which may not be appropriate for patients whose survival time is likely too short to benefit [6,7,8]. In this context, the harm and burden of additional tests and medications may outweigh the benefits. Estimating life expectancy may thus help patients and caregivers make informed decisions about the need for further screening and preventive care. We hypothesize that avoiding procedures or medications that are no longer appropriate might in turn improve the quality of life of patients with limited life expectancy and reduce overuse of care.

Several models have been developed in recent years to predict mortality [9,10,11,12,13,14]; however, most were developed in a single country or in a specific population (e.g., palliative care) and have not been externally validated in different older populations across countries. Furthermore, none of these models has yet been widely implemented in clinical practice, possibly because they require data beyond what is available during routine clinical care. For example, some models require assessment of functional status [11, 12], which is not routinely collected in electronic medical records. Simpler tools are thus needed.

The HOSPITAL score and the LACE index have been developed and broadly validated in different countries and populations to predict potentially avoidable hospital readmission (HOSPITAL score) and death or non-elective readmission (LACE index) within 30 days after discharge [15,16,17,18,19,20,21,22,23,24]. These models are less complex than most previous scores and are easily computed from routinely available electronic medical record data. They therefore have the potential for large-scale implementation.

Given likely similar risk factors for death and readmission, we hypothesized that the HOSPITAL score and the LACE index could accurately predict mortality. The aim of this study was therefore to assess the performance of the HOSPITAL score and the LACE index to predict 1-year and 30-day mortality following an acute care hospitalization in older multimorbid patients with polypharmacy.

2 Methods

2.1 Study Design and Population

We included all patients with completed follow-up from the OPERAM (OPtimising thERapy to prevent Avoidable hospital admissions in the Multimorbid elderly) trial. The design and results of this multicenter European trial, which aimed to reduce inappropriate prescribing as a means of preventing drug-related admissions (DRAs), have been described previously [25, 26]. Enrolled patients were aged ≥ 70 years, had multimorbidity (three or more chronic conditions) and polypharmacy (five or more chronic medications), and were admitted to a medical or surgical ward for an acute care hospitalization. Patients who died before discharge were not included in the OPERAM trial. Participants were recruited from four hospitals in Belgium, Ireland, The Netherlands, and Switzerland (patient randomization occurred from December 2016 to October 2018). The OPERAM trial’s primary outcome was any DRA following the index hospitalization, while 1-year post-discharge mortality was a secondary outcome. All variables (including those required to calculate the HOSPITAL score and the LACE index) were collected prospectively and systematically in the OPERAM trial according to a predefined protocol. The baseline visit consisted of a face-to-face interview. Follow-up visits were conducted by phone with the patient and/or their relatives and/or their general practitioner. Diagnoses were extracted from electronic medical records.

2.2 Predictors: HOSPITAL Score and LACE Index

The HOSPITAL score was computed according to its simplified version, in which the variable ‘procedures’ is omitted and ‘oncologic diagnosis’ is used instead of oncology ward (as oncology wards were not part of the OPERAM trial), resulting in a total of six items for a maximum of 12 points (Table 1) [17, 27]. Furthermore, we used the threshold of ≥ 8 days for prolonged length of stay, which has been validated in Europe, instead of the 5-day cut-off employed in the US [15]. The LACE index was used in its original version, consisting of four variables for a maximum of 19 points (Table 1) [19]. To compute the Charlson Comorbidity Index included in the LACE index, we used the method developed by Quan et al., based on International Classification of Diseases (ICD) codes [28]. Data were missing in 6 (0.3%) patients for previous hospitalizations, 8 patients (0.4%) for previous emergency room visits, 12 patients (0.6%) for serum sodium, and 17 patients (0.9%) for hemoglobin. These missing data points were assigned a score of zero (i.e., coded as normal) in the application of both prediction models, as done in the derivation study of the HOSPITAL score [17].
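To make the scoring concrete, the sketch below illustrates how the simplified HOSPITAL score and the original LACE index could be computed from routinely collected variables. This is a minimal illustration, not the study’s code: the function and field names are hypothetical, the point values follow the commonly published versions of the two scores, and Table 1 remains the authoritative reference. Missing laboratory or utilization values are coded as normal (zero points), as done in this study.

```python
# Illustrative sketch (not the study's code). Point values follow the commonly
# published versions of the simplified HOSPITAL score and the LACE index;
# Table 1 of the paper is authoritative. Missing values are coded as normal.

def hospital_score(hemoglobin_g_dl, sodium_mmol_l, oncologic_diagnosis,
                   nonelective_admission, admissions_previous_year,
                   length_of_stay_days, los_threshold=8):
    """Simplified HOSPITAL score (max 12 points): 'procedures' omitted and
    oncologic diagnosis used instead of discharge from an oncology ward.
    The European length-of-stay threshold (>= 8 days) is the default."""
    score = 0
    if hemoglobin_g_dl is not None and hemoglobin_g_dl < 12:
        score += 1
    if sodium_mmol_l is not None and sodium_mmol_l < 135:
        score += 1
    if oncologic_diagnosis:
        score += 2
    if nonelective_admission:
        score += 1
    n_adm = admissions_previous_year or 0
    score += 5 if n_adm > 5 else (2 if n_adm >= 2 else 0)
    if length_of_stay_days >= los_threshold:
        score += 2
    return score


def lace_index(length_of_stay_days, acute_admission, charlson_index,
               ed_visits_previous_6_months):
    """Original LACE index (max 19 points): Length of stay, Acuity of
    admission, Comorbidity (Charlson), Emergency department visits."""
    los = length_of_stay_days
    if los < 1:
        l_pts = 0
    elif los <= 3:
        l_pts = int(los)
    elif los <= 6:
        l_pts = 4
    elif los <= 13:
        l_pts = 5
    else:
        l_pts = 7
    a_pts = 3 if acute_admission else 0
    cci = charlson_index or 0
    c_pts = 5 if cci >= 4 else cci
    e_pts = min(ed_visits_previous_6_months or 0, 4)
    return l_pts + a_pts + c_pts + e_pts
```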

Table 1 HOSPITAL score and LACE index

2.3 Outcomes

The primary outcome of the present analysis was all-cause mortality within 1 year after discharge from the index hospitalization; all-cause mortality within 30 days after discharge was the secondary outcome. Mortality was determined during the follow-up phone calls. If neither the patient, their relatives/contact persons, nor their general practitioner could be reached by phone, we contacted their place of residence.

2.4 Statistical Analyses

We assessed the performance of each model in terms of overall accuracy, discrimination, and calibration. First, to assess and compare the overall accuracy of the models, we computed the scaled Brier score, as described by Steyerberg et al. (the lower the score, the better) [29, 30]. Overall accuracy refers to the probability that the score correctly classifies individuals [31]. Second, to assess discrimination (i.e., how well the score separates lower- from higher-risk individuals; 0.5 = no discrimination, 1 = perfect discrimination) [29], we obtained the C-statistic for each model and used bootstrapping with 1000 replications to compute 95% confidence intervals (CIs). We further computed the discrimination slope, defined as the absolute difference in mean predicted risk between those with and those without the outcome [29, 32, 33]. To facilitate visualization, we presented boxplots of the predicted risk according to outcome occurrence. We compared the discrimination of the HOSPITAL score and the LACE index with a test of the equality of their C-statistics. Finally, to assess calibration (i.e., the agreement between predicted and observed outcomes) [29], we displayed predicted versus observed proportions of death according to (1) lower- and higher-risk categories, and (2) score point categories. We evaluated statistical significance using the Hosmer–Lemeshow goodness-of-fit test (a significant p-value would indicate an overall lack of fit) [29]. We performed all analyses using Stata/MP 16.0 (StataCorp LLC, College Station, TX, USA).
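For readers who wish to reproduce these performance measures, the following is a minimal sketch in Python (the study itself used Stata/MP 16.0). The data and variable names are hypothetical, and the scaled Brier score is implemented here as the Brier score divided by that of a non-informative model predicting the overall event rate, one common scaling consistent with the "lower is better" convention described above.

```python
# Minimal sketch of the performance measures described above; the study used
# Stata/MP 16.0. Inputs are hypothetical: y is a 0/1 numpy array of deaths and
# p an array of predicted risks (e.g., from a logistic model of score points).
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2016)

def scaled_brier(y, p):
    """Brier score divided by the Brier score of a non-informative model that
    assigns everyone the overall event rate (lower = better)."""
    return np.mean((p - y) ** 2) / (y.mean() * (1 - y.mean()))

def c_statistic_ci(y, p, n_boot=1000):
    """C-statistic (area under the ROC curve) with a bootstrap 95% CI."""
    estimate = roc_auc_score(y, p)
    boots = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), len(y))
        if 0 < y[idx].sum() < len(idx):      # resample must contain both outcomes
            boots.append(roc_auc_score(y[idx], p[idx]))
    lower, upper = np.percentile(boots, [2.5, 97.5])
    return estimate, lower, upper

def discrimination_slope(y, p):
    """Absolute difference in mean predicted risk between patients who died
    and patients who survived."""
    return abs(p[y == 1].mean() - p[y == 0].mean())

def calibration_by_group(y, p, groups):
    """Observed vs. mean predicted mortality within groups (e.g., score points
    or deciles of predicted risk), as plotted in the calibration figures."""
    return {g: {"observed": y[groups == g].mean(),
                "predicted": p[groups == g].mean()}
            for g in np.unique(groups)}
```

Under these assumptions, calling `c_statistic_ci(y, p)` would return a point estimate and bootstrap CI analogous to those reported in the Results.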

3 Results

3.1 Baseline Characteristics and Mortality Rates

Among the 2008 patients included in the OPERAM trial, 119 withdrew consent and 10 were lost to follow-up, leaving 1879 patients for analysis. Their mean age was 79.4 years (standard deviation [SD] 6.3), 835 (44.4%) were female, and the median (interquartile range) number of chronic medications was 9 (7–12). Within 1 year of discharge, 375 (20.0%) patients had died, 94 of whom (25.1%; 5.0% overall) died within 30 days. Mortality rates during the 12-month follow-up varied by country (Table 2). Baseline characteristics according to death within 1 year of discharge are described in Table 2. Patients who died were older, had higher morbidity (Charlson Comorbidity Index, cancer diagnosis), and received more medications than patients who survived follow-up. They also had a longer length of stay and were less frequently discharged home or to a nursing home. The HOSPITAL score ranged from 0 to 11 points (mean 3.7 [SD 2.0], median 4) and the LACE index ranged from 2 to 19 points (mean 11.2 [SD 2.9], median 11).

Table 2 Baseline characteristics according to death at 1 year

3.2 Primary Outcome: 1-Year Mortality

The overall accuracy for prediction of 1-year mortality was good, with a scaled Brier score of 0.08 for both models. Calibration assessment showed no systematic deviation from the reference line, with well-matching predicted and observed proportions overall (Fig. 1, Appendix Table 3). The Hosmer–Lemeshow goodness-of-fit test indicated no overall lack of fit, with a p-value of 0.37 for the HOSPITAL score and 0.95 for the LACE index. The discrimination of the two models was similar, with a C-statistic (95% CI) of 0.69 (0.66–0.72) for the HOSPITAL score and 0.69 (0.66–0.72) for the LACE index (p-value for comparison of the C-statistics: 0.81) (Appendix Fig. 3). The discrimination slope was 0.08 for both models (Fig. 2).

Fig. 1
figure 1

Calibration of the HOSPITAL score and LACE index to predict 1-year mortality. CIs confidence intervals. Predicted vs. observed mortality at 1 year after discharge, by score points (left panels) and deciles (right panels)

Fig. 2
figure 2

Boxplots and discrimination slope for predicted mortality according to death status at 1 year. The discrimination slopes were calculated as the mean predicted mortality in patients who died minus that in patients who survived

3.3 Secondary Outcome: 30-Day Mortality

The overall accuracy for the prediction of 30-day mortality was good, with a scaled Brier score of 0.02 for the HOSPITAL score and 0.01 for the LACE index. Calibration assessment showed no systematic deviation from the reference line, with well-matching predicted and observed proportions overall (Appendix Fig. 4, Appendix Table 3). The Hosmer–Lemeshow goodness-of-fit test p-value was 0.02 for the HOSPITAL score and 0.41 for the LACE index. The discriminatory power of the two models was similar, with a C-statistic (95% CI) of 0.66 (0.61–0.71) for the HOSPITAL score and 0.66 (0.60–0.71) for the LACE index (p-value for comparison of C-statistics: 0.94) (Appendix Fig. 3). The discrimination slope was 0.02 for the HOSPITAL score and 0.01 for the LACE index (Appendix Fig. 5).

4 Discussion

In this multicenter trial, the HOSPITAL score and the LACE index showed good overall performance for predicting 30-day and 1-year mortality after an acute medical or surgical hospitalization in older multimorbid patients. Discrimination was moderate and similar for both models (0.69; 0.5 = no discrimination, 1 = perfect discrimination). As mortality prediction tools, the HOSPITAL score and the LACE index are easy to use and may thus help to evaluate life expectancy in older multimorbid patients. The predicted risk of death according to the number of score points can provide useful information to both patients and caregivers when deciding upon further need for screening procedures and preventive medications, whose potential adverse effects may outweigh the expected benefit for patients when the time to benefit is too long.

Previous studies have developed various scores to predict mortality in hospital settings [9,10,11,12,13,14]. Some of these scores showed good performance, but none has yet been broadly implemented to estimate life expectancy. This might be explained by the items included in the models, which may be relatively complex to collect in a standardized way in clinical practice. For example, the Walter Index [12] and the Burden of Illness Score for Elderly Persons (BISEP) [11] include an evaluation of functional status, which is partly subjective and affected by the method of inquiry about functioning [34]. In addition, this assessment may represent extra workload for healthcare professionals, or additional chart abstraction, which often limits implementation. In contrast, the HOSPITAL score and the LACE index have the advantage of including only items that can be automatically and easily retrieved from electronic medical records, increasing their scalability. The HOSPITAL score has the additional benefit of not requiring assessment of comorbidities, except for active oncological disease, a diagnosis that is unlikely to be omitted from medical records given its clinical importance. In contrast, the LACE index requires calculation of the Charlson Comorbidity Index, which may be affected by coding quality and underreporting.

In this study, we found lower discrimination for the HOSPITAL score and the LACE index compared with other survival prognostic tools (C-statistics 0.75–0.90) [10, 12,13,14]. However, the studies developing these prognostic tools were conducted retrospectively, in a single country, in medical (i.e., non-surgical) patients only, or in selected populations, limiting their generalizability to other settings [9,10,11,12,13,14]. For example, the CARING criteria were developed by retrospective review of Veterans’ charts, so their generalizability to females is unknown [10]. In contrast, the OPERAM trial was a prospective multinational study in unselected older adults with multimorbidity, which enhances the generalizability of our findings. It is also noteworthy that, for the high-risk category of the BISEP model, the C-statistic dropped from 0.76 in the development cohort to only 0.59 in the validation cohort, suggesting that the performance of a prediction model may not be similar in populations or settings that differ from the development cohort [11]. Further studies should compare the HOSPITAL score and the LACE index with other prediction models in similar populations.

Cut-offs are commonly used to categorize a screening or diagnostic test as normal or pathological, although in reality most tests reflect a continuum between health and illness. Similarly, categories have frequently been created to classify patients as being at low, moderate, or high risk of death [9,10,11,12,13]. This approach has the disadvantage of assigning the same prediction to patients with the lowest and the highest number of points within a category, even though their predicted mortality risks may differ substantially. Estimating a probability for each possible number of points, as we did in this study and as has been done for other prediction models (e.g., the Framingham score for cardiovascular mortality risk) [35], may thus be more informative.

Accurately estimating life expectancy is critical to understanding whether a patient is expected to live long enough to benefit from screening tests or long-term preventive treatments. For example, it takes approximately 10 years of screening 1000 patients to prevent one death from colorectal or breast cancer [7], whereas the time to benefit is approximately 2–5 years for statin treatment to prevent myocardial infarction [6] and 11 months for alendronate treatment to prevent one fragility fracture [8]. Tools predicting mortality, such as the HOSPITAL score or the LACE index, may enable patients to make healthcare choices that better align with their personal goals, accounting for overall benefit [36].

Whereas previous models were developed to predict mortality over 1 year or longer, we also studied short-term (30-day) mortality. Overall performance was similar for 1-year and 30-day mortality. These models may thus also play a role in predicting short-term mortality, whose clinical implications may differ from those of longer-term mortality. Such an estimate may help reduce uncertainty about short-term prognosis and consequently inform end-of-life conversations between caregivers and patients. This helps to focus end-of-life care on the patients for whom it is appropriate, to avoid treatments or procedures whose benefit may be time-limited and that might not be acceptable to patients, and thus potentially to improve quality of life during the final weeks and months of life of older people.

4.1 Limitations and Strengths

We must acknowledge some limitations. First, the Charlson Comorbidity Index used in the LACE index is based on ICD codes, which might be subject to underreporting. Second, patients who died during hospitalization were not included, and we included only patients aged ≥ 70 years with multimorbidity and polypharmacy, limiting generalizability to other populations. Third, the number of outcomes at 30 days was limited. Finally, the population was enrolled in a clinical trial whose intervention aimed to optimize prescribing and might thus have influenced mortality; however, this is unlikely given that the trial was negative and that we adjusted for the intervention arm [26].

This study also has significant strengths. First, we used prospective data collected systematically throughout the OPERAM trial. Second, we compared two prediction models and assessed both short- and long-term mortality. Third, there were few missing data for the predictive variables (< 1% for each). Finally, our study was larger than most previous studies [14], was conducted in four different countries, and included both surgical and medical patients with only a few exclusion criteria (i.e., real-world patients, including patients with dementia, were enrolled) [25], increasing the generalizability of our findings.

5 Conclusion

In hospitalized older multimorbid patients, the HOSPITAL score and LACE index showed very good overall accuracy (i.e., very low Brier score), good calibration (i.e., well-matching predicted and observed proportions), and moderate discrimination for predicting 1-year mortality. Their performance was slightly lower for 30-day mortality. These simple tools may help predict mortality risk in older multimorbid patients after acute hospitalization, which may inform the intensity of post-hospitalization care.