Introduction

Healthcare expenditure in the last year of life, the “cost of dying”, accounts for between 8.5 and 11.2%1 of total medical spending. This is even more pronounced at higher ages, where mortality is concentrated. Reducing this expense is an alluring prospect, as it would not only ease the economic burden of an ageing population but potentially also spare people futile treatment2,3,4,5.

However, the assertion that we spend a lot on those who die is misleading: we can only speak of money wasted on those who go on to die if we know, at the time of starting treatment, that the patient will die within a time frame short enough to make treatment futile. Recent applications of machine learning to mortality prediction have shown impressive results6,7. Nonetheless, mortality seems to remain to a large extent stochastic8,9: even at very high ages, there is still a substantial probability of living to see another birthday. In fact, the discriminative ability of a number of ‘classic’ health indicators seems to decline with age10. In a 2018 paper2, Einav et al. showed that even a well-performing, state-of-the-art model predicting one-year mortality for a representative sample of American Medicare enrollees finds very few individuals with more than a 50% risk of dying within a year. Only with a very high mortality prediction in hand could one consider refraining from initiating medical treatment, but because they are so rare, individuals with very high mortality risk make up a negligible share of population healthcare expenditure, even though their individual healthcare expenses are high2.

In this paper, we attempt to reproduce the analyses of Einav et al.2 and explore the distribution of healthcare expenditure by predicted mortality in the Danish population over age 65, using state-of-the-art machine learning methods and the rich national registry data to obtain the best possible prediction of mortality. With this exercise we will explore whether the patterns of mortality and health care expenditure as described for American Medicare enrollees can be found in a different population (covering an entire country) and healthcare system. Next, we explore whether the inclusion of a wider range of socioeconomic variables that are present in the Danish national registries will improve prediction to any material extent, and how the inclusion of the costs of communal care affects the conclusions. The costs of communal care are often ignored when discussing the cost of dying, but these make up a sizeable proportion of the actual societal costs of dying, and the proportion grows with the age of decedents11.

Results

We identified 1,140,242 Danish residents aged 65 and above on January 1st 2016 (Fig. 1). After excluding 7210 individuals who had immigrated less than two years before the start of follow-up and 8971 who were missing from registries, we were left with a study population of 1,124,061, of whom 43,838 (3.9%) died during the year of follow-up. The characteristics of the test sample by one-year survival status are shown in Table 1.

Figure 1

Chart describing inclusion in the study population, and data flow into the prediction algorithm. Individuals were included in the national population study if they were alive and aged 65+ on January 1st 2016 and had been living in Denmark for two years prior to this baseline date. A random sample of two thirds was selected, and a prediction model was trained on this sample, predicting individual-level mortality risk in the year after baseline based on characteristics observed during the two years before baseline. The prediction model was then used to predict one-year mortality risk for the remaining test sample, and this sample was followed up for mortality and healthcare expenditure for up to one year after baseline.

Table 1 Characteristics of the test sample used for calculating health care expenditures by predicted mortality.

The distribution of one-year mortality predicted by the machine learning ensemble for the test sample of 374,687 individuals is shown in Fig. 2. While the classification was reasonably good (AUC 0.87), and the distributions of predicted mortalities for survivors and decedents were markedly distinct, only a small proportion (0.6%) had a predicted one-year mortality risk of more than 50%.
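As an illustration of how good discrimination can coexist with few high-risk predictions, the following minimal Python sketch computes an AUC as the probability that a randomly chosen decedent has a higher predicted risk than a randomly chosen survivor, together with the share of individuals above a risk threshold. The function names and the toy risks are hypothetical and are not taken from the study data or from the authors' code.

```python
def auc(decedent_scores, survivor_scores):
    """AUC as the share of decedent/survivor pairs ranked correctly (ties count half)."""
    wins = 0.0
    for d in decedent_scores:
        for s in survivor_scores:
            if d > s:
                wins += 1.0
            elif d == s:
                wins += 0.5
    return wins / (len(decedent_scores) * len(survivor_scores))


def share_above(scores, threshold=0.5):
    """Proportion of individuals whose predicted risk exceeds the threshold."""
    return sum(s > threshold for s in scores) / len(scores)


# Toy predicted one-year mortality risks (hypothetical values).
decedent_risk = [0.9, 0.4]
survivor_risk = [0.1, 0.3, 0.5]

toy_auc = auc(decedent_risk, survivor_risk)
toy_share = share_above(decedent_risk + survivor_risk)
print(f"AUC: {toy_auc:.3f}, share above 50% risk: {toy_share:.1%}")
```

The AUC depends only on how decedents rank relative to survivors, so it can be high even when almost nobody crosses a fixed risk threshold.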

Figure 2

Description of the one-year mortality predictions for the test sample. Violin plot of the distribution of predicted mortalities for survivors and decedents (a), ROC curve (b), calibration plot (c).

Decedents accounted for 13% of 2016 health care expenditure in the test sample (14% of care-related expenditures and 13% of treatment-related expenditures). Figure 3 shows kernel-smoothed mean individual healthcare expenditure per day alive by predicted one-year mortality, for decedents and survivors and for total, treatment-related and care-related expenditure. Mean health care expenditure per day alive was higher in decedents than in survivors (1.46 and 0.14 thousand DKK per day, respectively). As frail people are more likely to die and also have higher healthcare expenditures, we calculated health care expenditures in a hypothetical population of survivors with the same distribution of predicted mortality as the decedent population. Mean health care expenditure per day alive in this hypothetical population was 0.57 thousand DKK per day, meaning that decedents outspent survivors by a factor of 10 but outspent equally frail survivors by a factor of only 2.5. Some 39% of decedent healthcare expenditure was explained by high spending on those with high ex-ante mortality. The corresponding percentages for treatment- and care-related expenditure were 18% and 75%, respectively.
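The ratios quoted above can be retraced from the three reported means. The short Python sketch below does so, assuming that the share "explained" by ex-ante mortality is the reweighted survivor mean divided by the decedent mean; that reading is an assumption, although it reproduces the reported 39%. Variable names are illustrative.

```python
# Reported mean expenditure per day alive, in thousand DKK.
decedent_mean = 1.46
survivor_mean = 0.14
reweighted_survivor_mean = 0.57  # survivors reweighted to the decedents' risk distribution

# Crude comparison: decedents versus all survivors.
raw_ratio = decedent_mean / survivor_mean

# Frailty-adjusted comparison: decedents versus equally frail survivors.
adjusted_ratio = decedent_mean / reweighted_survivor_mean

# Assumed definition: share of decedent spending that equally frail
# survivors would also have incurred.
share_explained = reweighted_survivor_mean / decedent_mean

print(f"raw ratio: {raw_ratio:.1f}")
print(f"adjusted ratio: {adjusted_ratio:.1f}")
print(f"share explained: {share_explained:.0%}")
```

Because the inputs are rounded to two decimals, the recomputed ratios can differ from the reported factors in the last digit.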

Figure 3

Kernel smoothed per-person healthcare expenditure per day by predicted one-year mortality and type of expenditure.

Figure 4 shows total healthcare expenditure by predicted mortality. While spending was clearly higher at higher predicted mortalities, the group with a predicted mortality of more than 50% accounted for only 2.8% of total spending. The corresponding figures for treatment and care were 1.8% and 4.4%, respectively. While 75% of total treatment-related spending occurred at a predicted mortality of less than 8%, care-related expenditure was concentrated among those with moderately high predicted mortality, with 75% of expenditure occurring at predicted mortalities over 11%.
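The two kinds of summary used in this paragraph, the share of spending above a risk threshold and the risk level below which a given share of spending falls, can be sketched as follows in Python. The function names and the four-person toy data are hypothetical and serve only to show the calculation.

```python
def spending_share_above(risks, spending, threshold):
    """Share of total spending incurred by individuals with risk above the threshold."""
    total = sum(spending)
    return sum(s for r, s in zip(risks, spending) if r > threshold) / total


def risk_at_spending_share(risks, spending, share):
    """Smallest predicted risk at which cumulative spending reaches the given share."""
    target = share * sum(spending)
    cumulative = 0.0
    for r, s in sorted(zip(risks, spending)):  # ascending predicted risk
        cumulative += s
        if cumulative >= target:
            return r
    return max(risks)


# Toy data (hypothetical): predicted risk and yearly spending per person.
risks = [0.01, 0.05, 0.20, 0.60]
spending = [40.0, 30.0, 25.0, 5.0]

high_risk_share = spending_share_above(risks, spending, 0.5)
risk_at_75_percent = risk_at_spending_share(risks, spending, 0.75)
print(high_risk_share, risk_at_75_percent)
```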

Figure 4

Healthcare expenditure by predicted one-year mortality and type of expenditure.

Discussion

Decedents accounted for 13% of yearly healthcare expenditure at age 65 and above, but only 2.8% was spent on those who, according to our machine learning model, had a likelihood of dying of more than 50%. While the mean healthcare expenditure per day alive on a decedent was ten times that of a survivor, when comparing to an equally frail population of survivors, the mean expenditure per day alive on a decedent was only 2.5 times higher. The main strength of the study is the availability of data for an entire population, with rich health care and sociodemographic predictor data and registry coverage of 97% of all healthcare expenditure12, as well as the inclusion of communal care in addition to treatment. As healthcare expenditure in Denmark is tax-funded, differences will not be artefacts of differential insurance coverage and rates. Individual-level expenditure data, however, may be misestimated to some extent: hospital costs are DRG rates, which are averages and may not entirely correspond to the actual cost of treatment, and the computation of individual-level expenditure on nursing home and home care involves some amount of estimation and imputation. The study deals only with expected mortality at baseline, which may arguably be a limited indicator of cost-efficiency of healthcare spending, and other measurements such as quality-adjusted life years could have been taken into account.

The distribution of predicted mortalities resembles that estimated2 for American Medicare enrollees. The inclusion of a wider array of personal characteristics has not improved prediction materially, as our AUC is essentially the same as that of the Medicare study. This result compares reasonably well to what other studies have achieved6,7,13,14,15,16, particularly considering the relatively wide time horizon of prediction in our study. The very low proportion with high predicted mortality might be due to essential randomness in mortality, the accrual of health-impacting events after the start of follow-up, or shortcomings of the available data. While we could certainly point to health indicators that were not available for the study, there are indications10,17,18 that these may not improve mortality prediction much.

The mass of treatment costs is concentrated at low predicted mortalities in a pattern resembling that of Einav et al.2. Care-related costs, conversely, are concentrated at higher mortalities and increase more markedly with increasing mortality, whereas the costs of treatment among decedents actually decrease up to a predicted mortality of about 30%. This is not surprising—predicted mortality is a proxy for frailty and thus for the need for communal care, and the need for care is likely to change less as the result of health-impacting events over the course of follow-up. It is interesting that we see a decline with predicted mortality in treatment-related expenditure per day alive for decedents. This was not observed for the American population and may reflect different medical culture in Denmark and the US, but the different prediction algorithms might also be part of the explanation—treatment-related expenditure decreases with age in Danish decedents11, and if a high predicted mortality is more reflective of age and frailty in our algorithm than for the American data, that might explain the difference.

At similar predicted mortalities, there is little difference between the care-related expenditure per day alive of decedents and survivors. The treatment-related expenditure of decedents is much higher than that of survivors, although the differences are lower at higher mortalities. This pattern may in part be explained by the passage of time—by the time a person dies, their health has likely deteriorated since their status at entry, and it seems likely that a person who dies at low predicted mortality will have experienced some dramatic health event requiring treatment, while death at higher predicted mortality might be a more direct continuation of patterns already established by the time of entry. Also, a person with low predicted mortality might be a better candidate for treatment, being less frail. But to the extent that the difference between survivors and decedents at the same mortality is not due to curveball events, it might be seen as the “true” cost of dying.

Thus, nearly all healthcare expenditure occurs in situations where there is a reasonable expectation that the patient can survive, and so the concept of “the cost of dying” is confounded by frailty: we spend more on the frail, and the frail are more likely to die, but not certain to do so, at least within a relevant time frame. This underlying frailty, operationalized as high predicted one-year mortality, accounted for 39% of healthcare expenditure in the last year of life in Denmark, an estimate in line with that in American Medicare enrollees2. The idea of a potential for reductions in health care expenditure at the end of life is enticing, and it seems possible to find groups that could benefit from a switch to a palliative course of treatment. Still, our results, along with those of our model paper, add to a list of arguments for why it might be illusory to reduce healthcare expenditure much by cutting the cost of dying. The proportion of spending occurring at the end of life is lower than has previously been reported1, decedents make up a relatively small share of high-cost individuals3, rising levels of demand drive increasing health care costs in ageing populations at least as much as the cost of dying19, and high end-of-life costs seem driven more by multimorbidity than by last-ditch lifesaving efforts1,11,20. Our study design does not touch upon the question of individual treatment effects (whether specific treatments improve survival for specific individuals), and it may be that better methods than ours can detect high-mortality subgroups, but it seems unlikely that such subgroups would be large enough for cost reductions there to matter on the scale of a national budget.

Methods

The use of personal data in this study followed Danish data protection legislation. Processing of the data at Statistics Denmark is legal according to the Act on Statistics Denmark21 and the General Data Protection Regulation (GDPR) art 6 s 1 ss e22, under which Statistics Denmark has the right to process individual-level registry data. Per GDPR art 14 s 5 ss b22, obtaining individual consent for the use of registry data in research is not required.

The study design is illustrated in Fig. 1. The study population consisted of all individuals of age 65 and above who had been living in Denmark for two years at the point of entry into the study (January 1st 2016). We collected personal characteristics from individually linkable national registries for the last two years before entry and followed up for death or emigration for one year after entry, as well as for use of healthcare services in the follow-up period (Supplementary information 1).

A machine-learning algorithm to predict one-year survival was trained on a training sample consisting of two thirds of the population. The algorithm was an ensemble of three methods: boosting, random forest and lasso. Sub-samples of 2.5% and 7.5% of the training sample were set aside for ensemble weight calculation and calibration. The remaining training sample was reduced to a balanced sample with a 50–50 distribution of decedents and survivors by random selection of a subsample of survivors, and the three learners were trained on the balanced training sample, with parameter tuning by five-fold cross-validation. Having trained the individual learners, we created an ensemble by weighting them together: using the ensemble weight calculation sample (after reducing it to a balanced sample), we computed predicted values for the three learners and fitted a linear combination of these to the observed values. As the prediction ensemble was trained on balanced data, we re-balanced its predictions using Bayes’ rule. Following our model paper, we would have used the calibration sample to fit a third-degree polynomial in the predicted values to observed mortality, but as we observed better calibration simply using Bayes’ rule, we abandoned that approach. The prediction model was trained using R software and the tidymodels framework23, and the ranger24, xgboost25,26 and glmnet27 packages.
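The re-balancing step can be illustrated with a minimal Python sketch of the standard prior-shift correction by Bayes' rule. It is not the authors' R code: the function name is hypothetical, and the population prevalence of 3.9% is borrowed from the overall study population as an assumption about the value that would be used.

```python
def rebalance(p_balanced, prevalence, train_prevalence=0.5):
    """Convert a risk predicted by a model trained at `train_prevalence`
    into a risk valid for a population with outcome rate `prevalence`."""
    if not (0.0 < prevalence < 1.0 and 0.0 < train_prevalence < 1.0):
        raise ValueError("prevalences must lie strictly between 0 and 1")
    # Rescale the evidence for death and for survival by the prior ratios.
    death = p_balanced * prevalence / train_prevalence
    survival = (1.0 - p_balanced) * (1.0 - prevalence) / (1.0 - train_prevalence)
    return death / (death + survival)


# Assumption: one-year mortality in the population is 3.9%.
population_prevalence = 0.039

for p in (0.50, 0.90, 0.99):
    corrected = rebalance(p, population_prevalence)
    print(f"balanced-sample risk {p:.2f} -> population risk {corrected:.3f}")
```

Under this correction, a prediction of 0.5 on the balanced scale maps back to the population prevalence, so a balanced-scale prediction must be very close to 1 before the re-balanced risk exceeds 50%.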

The personal characteristics were sex, age, country of origin, education, marital status and household size, municipality, variables on income and financial assets, use of primary healthcare grouped by specialization, number of hospital contacts by ICD-10 chapter of the main diagnosis for the contact, number of prescriptions by five-digit ATC code, amount of home care (personal and practical help) provided by the municipality and an indicator for admission to a nursing home. The financial variables were collected on a yearly basis for the last two years before entry; all other time-varying variables were collected on a quarterly basis for the same period. In order to reduce the size of the predictor space, we performed principal component analysis on the quarterly-level prescription data and retained the first 66 components.

The trained algorithm was used to predict one-year survival for the test sample of one third of the population, and the distribution of healthcare expenditure over the year of follow-up by predicted one-year mortality was examined. Healthcare expenditure was classified as either treatment-related (the costs of hospitals, primary care and prescription drugs) or care-related (the costs of home care and residential care). Healthcare and eldercare in Denmark are primarily taxpayer-funded, so the information available in registries accounts for 97% of healthcare expenditure in Denmark12. Individual-level expenditures for the period of follow-up were computed, as well as individual-level expenditure per day of follow-up, for total health expenditure and for treatment- and care-related expenditures. The methods are described in ref. 11.

In order to assess the proportion of health care expenditure explained by predicted mortality, we reweighted the survivor population to have a similar distribution of predicted mortality as the decedent population and compared mean spending per day alive in the weighted survivor population to mean spending among decedents. The weights were computed using kernel density estimates of the densities among decedents and survivors: a survivor with an estimated survival probability of p was given a weight of k_d(p)/k_s(p), where k_d is the density function estimated for decedents and k_s is the density function estimated for survivors.
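The reweighting described above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the Gaussian kernel, the fixed bandwidth of 0.05, the function names and the simulated risks and spending are all hypothetical choices.

```python
import math
import random


def gaussian_kde(sample, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)

    return density


def survivor_weights(survivor_risk, decedent_risk, bandwidth=0.05):
    """Weight each survivor by the density ratio kd(p) / ks(p) at their risk p."""
    kd = gaussian_kde(decedent_risk, bandwidth)
    ks = gaussian_kde(survivor_risk, bandwidth)
    return [kd(p) / ks(p) for p in survivor_risk]


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Simulated toy data (hypothetical): decedents have higher predicted risks,
# and survivor spending per day rises with predicted risk.
random.seed(0)
survivor_risk = [random.betavariate(1, 12) for _ in range(500)]
decedent_risk = [random.betavariate(2, 5) for _ in range(200)]
survivor_spend = [0.1 + 1.0 * p for p in survivor_risk]

weights = survivor_weights(survivor_risk, decedent_risk)
unadjusted = sum(survivor_spend) / len(survivor_spend)
adjusted = weighted_mean(survivor_spend, weights)
print(f"survivor mean: {unadjusted:.3f}, reweighted survivor mean: {adjusted:.3f}")
```

Each survivor's own observation contributes to the survivor density at their risk, so the denominator of the weight is never zero. The result is, however, sensitive to the bandwidth in regions where survivors are sparse.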

Data availability statement

Due to restrictions in Danish law, the confidential health care data used in this study can only be accessed through Statistics Denmark, the state organization holding the rights to the data. Danish scientific organizations can be authorized to work with data within Statistics Denmark and can provide access to individual scientists inside and outside of Denmark. Data are available via the Research Service Department at Statistics Denmark (www.dst.dk/da/TilSalg/Forskningsservice) for researchers who meet the criteria for access to confidential data. The authors of this study had no special access privileges others would not have.