Background

End-of-life care is an integral component in the delivery of critical care [1]. Epidemiological data indicate that 15–30% of patients admitted to intensive care units (ICUs) around the world die [2], while 10–12% undergo limitations of life-sustaining treatments [3,4,5]. Such treatments prolong life without reversing the underlying medical condition; examples include cardiopulmonary resuscitation (CPR), mechanical ventilation and renal replacement therapy [6].

In the past three decades, several studies have investigated the attitudes and practices of patients, families, physicians and nurses regarding life support at the end of life [7,8,9,10]. The main concerns were symptom control, patient and family satisfaction, adequate communication and management of conflicts between individuals involved in end-of-life decision-making. Few studies have focused on formal organizational or system-level support, or on the existing infrastructure of individual ICUs that helps health care staff deliver end-of-life care at a high standard [11].

The novel end-of-life practice score (EPS) is a 12-component score developed through expert consensus and review of existing literature. The EPS was designed to measure the end-of-life care infrastructure and organization of an ICU. It was first developed to interpret the increases in treatment limitations over time in 22 ICUs situated across Northern, Central and Southern Europe [4]. This increase was the principal finding of a two-part longitudinal study of ICU end-of-life care delivery, hereafter termed the “comparison study”. The comparison study had two data collection periods, 16 years apart, in the context of the Ethicus-1 study (1999–2000) and the Ethicus-2 study (2015–2016) [3,4,5]. Exploratory logistic regression analysis revealed a significant association between the EPS and time-dependent changes in the frequency of treatment limitation decisions [4]. However, the relative contribution of each of the 12 binary end-of-life practice variables in explaining temporal changes in the frequency of limitation decisions remained unclear.

Using data from the comparison study’s European cohort [4], the current study first aimed to identify specific aspects of end-of-life practice with possibly strong, clinically relevant associations with the comparison study’s time-dependent variations in limitation decisions in European ICUs [4]. The second aim was to develop an EPS with appropriately weighted components. Such a weighted EPS might aid in interpreting the recently reported global variation in treatment limitation frequency across 199 ICUs from 8 world regions (worldwide Ethicus-2 study) [5], and potentially have general application in future similar studies. Adequately interpreting this contemporary global variation might help improve end-of-life practice worldwide.

Methods

This study includes data analyses from two previously approved and published studies [4, 5]. Therefore, there was no requirement for ethical approval.

Comparison study summary description

Participating ICUs were in Northern (4 countries), Central (4 countries), and Southern (6 countries) Europe. Center-level and patient-level data were collected prospectively. Data on 4592 patients who died or had a limitation of life-sustaining interventions (2807 and 1785 from the Ethicus-1 and Ethicus-2 studies, respectively) were available [4]. Comparison study and worldwide Ethicus-2 study data forms and collection methodologies were identical [5].

The primary outcome was application of any limitation in life-prolonging therapy (withholding, or withdrawing, or active shortening of the dying process [4]). Patients were categorized into 5 prospectively defined and mutually exclusive end-of-life categories: withholding of life-sustaining therapy, withdrawing of life-sustaining therapy, active shortening of the dying process, failed CPR and brain death [4].

Original (unweighted) EPS development

In the comparison study, 12 end-of-life practice variables (i.e., EPS subcomponents) were collected post hoc from 22 participating ICUs [4]. A simple questionnaire with two possible answers (i.e., no = “absence” or yes = “presence”) for each practice variable was administered electronically. These variables reflect key aspects of ICU end-of-life practice [4, 5] and include (1) routine family meetings [12,13,14], (2) daily deliberation for appropriate level of care [12], (3) end-of-life discussions during family meetings [12], (4) written triggers for treatment limitations [15, 16], (5–6) written end-of-life guidelines [17] and protocols [15], (7) palliative care consultations [14, 18], (8) ethics consultations [12, 14], (9–10) staff taking communication or bioethics courses [12,13,14, 18] and (11–12) country end-of-life guidelines or legislation [12, 17, 18]. Variables were graded as 0 or 1 according to their reported absence or presence, respectively. The sum of these grades was the “original” EPS, which ranged from 0 to 12. Thus, the higher the EPS, the more end-of-life practices were concurrently present. Definitions of practice variables and the EPS are presented in Table 1. The same post hoc collection of binary end-of-life practice data was performed for the Ethicus-2 worldwide study [5].
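The unweighted score described above can be sketched in a few lines. The snippet below is only an illustration: the variable labels are shorthand invented for this sketch (following the order in which the 12 variables are listed), and the example ICU is hypothetical.

```python
# Shorthand labels for the 12 binary end-of-life practice variables.
# The labels are assumptions made for this sketch, not names from the study.
EPS_VARIABLES = [
    "routine_family_meetings",
    "daily_level_of_care_deliberation",
    "eol_discussions_in_meetings",
    "written_triggers",
    "written_eol_guidelines",
    "written_eol_protocols",
    "palliative_care_consultations",
    "ethics_consultations",
    "communication_courses",
    "bioethics_courses",
    "country_eol_guidelines",
    "country_eol_legislation",
]


def original_eps(responses):
    """Return the unweighted EPS: the sum of the 0/1 grades (range 0-12).

    `responses` maps each variable label to True (present) or False (absent).
    """
    missing = [name for name in EPS_VARIABLES if name not in responses]
    if missing:
        raise ValueError(f"missing responses: {missing}")
    return sum(1 if responses[name] else 0 for name in EPS_VARIABLES)


# Hypothetical ICU reporting three practices as present.
example_icu = {name: False for name in EPS_VARIABLES}
example_icu["written_eol_protocols"] = True
example_icu["palliative_care_consultations"] = True
example_icu["country_eol_legislation"] = True
print(original_eps(example_icu))  # 3
```

Because every variable contributes exactly one point, this version of the score cannot distinguish between practices that differ in their association with treatment limitation.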

Table 1 Definitions of subcomponent variables of end-of-life practice score and derivation of its weighted/rescaled form

Derivation of the weighted EPS

The comparison study showed a substantial increase in the frequency of treatment limitations over time and a decrease in the frequency of death without limitation [4]. This was considered a time-dependent improvement in end-of-life practices [19]. To determine the relative importance of each end-of-life practice variable as an explanatory variable, generalized estimating equations (GEE) analysis with robust standard errors and an exchangeable working correlation structure accounting for the factor center [5] was applied to the entire comparison study population [4]. Additional explanatory variables included study period (i.e., 2015–2016 vs. 1999–2000), region (i.e., Northern, Central and Southern Europe), age, gender, acute ICU admission diagnoses, chronic diseases and physician religion. Type of model was set at “binary logistic.” The binary dependent variable was patients with “any treatment limitation or no treatment limitation” (Fig. 1). For these patient-level analyses, it was assumed that a specific, ICU-level grading of an end-of-life practice variable should correspond to all patients originating from that ICU. For example, if an ICU contributed 100 patients and the site principal investigator responded positively to “end-of-life discussions during weekly meetings” (meaning that this was a typical ICU-level characteristic), then “end-of-life discussions” were assumed to have occurred for all 100 participants from that ICU [4].

Fig. 1 Flowchart of the employed analytic methodology. ICU, intensive care unit; GEE, generalized estimating equations; EPV, end-of-life practice variable; ROC, receiver operating characteristic; EPS, end-of-life practice score; CPR, cardiopulmonary resuscitation. *The weighted EPS was determined by first multiplying the comparison study’s [4] GEE-derived EPV coefficients by the 0 or 1 response grades of the 12 EPVs from the worldwide dataset [5], and then by summing up the aforementioned products. The EPS rescaling formula is presented in Table 1. The original, unweighted EPS was calculated as the sum of the 0 or 1 response grades of the 12 EPVs from the worldwide dataset [5]; author consensus definitions of the EPVs are provided in Table 1

The comparison study GEE model was cross-validated using the fivefold cross-validation technique [20]. More specifically, the entire study dataset was randomly split into five equally sized groups (folds), with one of the folds (20% of the data) serving as the validation group and the remaining four folds (80% of the data) serving as the training group for constructing probabilistic models. The model was fit on the training group, and its coefficient estimates were used to predict treatment limitation probability in the validation group. This process was repeated five times in total; each time, a different fold was used as the validation group [20, 21].
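The splitting step can be illustrated with a short sketch. It assumes simple random splitting at the patient level; the function name, the fixed seed and the interleaved fold assignment are choices made for this illustration and are not taken from the study's R code.

```python
import random


def five_fold_splits(n_rows, n_folds=5, seed=0):
    """Yield (training_indices, validation_indices) for each fold.

    Rows are shuffled once, dealt into `n_folds` near-equal folds, and each
    fold serves exactly once as the validation group.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Deal shuffled rows round-robin so fold sizes differ by at most one.
    folds = [indices[start::n_folds] for start in range(n_folds)]
    for k in range(n_folds):
        validation = folds[k]
        training = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield training, validation


# Example with the comparison study's sample size (4592 patients).
for training, validation in five_fold_splits(4592):
    print(len(training), len(validation))
```

In each iteration a model would be fit on `training` and used to predict limitation probabilities for `validation`, so that every patient receives exactly one out-of-fold prediction.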

Agreement between predicted and observed treatment limitations in the validation group (i.e., model discrimination) was assessed by constructing a receiver operating characteristic (ROC) curve based on the entire dataset and calculating the “area under the curve” (AUC).
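For readers unfamiliar with the AUC, the sketch below computes it directly from its rank-based definition: the proportion of (limitation, no-limitation) patient pairs in which the patient with a limitation received the higher predicted probability, with ties counted as one half. This is a plain illustration with made-up numbers; the study itself used the R package pROC.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    `labels` are 0/1 outcomes and `scores` are predicted probabilities.
    The quadratic loop is adequate for a small illustration only.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of patients with and without a limitation.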

Weighted EPS rescaling and validation

Patient-level GEE analysis accounting for center on the worldwide Ethicus-2 dataset (n = 11,574) included a weighted EPS and the following explanatory variables: world region (i.e., Africa, Latin America, North America, Asia, Australia/New Zealand and Northern, Central and Southern Europe), age, gender, acute ICU admission diagnoses, chronic diseases, and center type (i.e., private vs. public) (worldwide model 1; Fig. 1). The worldwide dataset did not include brain-dead patients, and the dependent variable remained “limitation yes/no” [5], reflecting treatment limitation vs. failed CPR. The weighted EPS was calculated by multiplying the 0 or 1 end-of-life practice variable response grades by the GEE coefficients determined in the comparison study data analysis and summing up the resulting 12 end-of-life practice variable-specific products (Table 1). Consequently, the weighted EPS was derived according to both the presence and relative importance of its subcomponents. Subsequently, the weighted EPS values were linearly transformed (i.e., rescaled) to the original 0–12 range [4] (Table 1).
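The weighting and rescaling steps can be sketched as follows. Two points are assumptions of this sketch rather than facts from the text: the exact rescaling formula is given in Table 1, which is not reproduced here, so a min-max transformation between the lowest and highest attainable weighted sums is assumed; and the three coefficients in the example are hypothetical placeholders, not the study's GEE estimates.

```python
def weighted_eps(grades, coefficients):
    """Sum of coefficient x grade over all practice variables.

    `grades` maps variable name -> 0 or 1; `coefficients` maps the same
    names to GEE coefficients (log odds ratios).
    """
    return sum(coefficients[name] * grades[name] for name in coefficients)


def rescale_to_0_12(raw_score, coefficients):
    """Linearly map a weighted sum onto the 0-12 range.

    ASSUMPTION: min-max rescaling, where the minimum is reached when only
    negatively weighted practices are present and the maximum when only
    positively weighted practices are present.
    """
    lowest = sum(c for c in coefficients.values() if c < 0)
    highest = sum(c for c in coefficients.values() if c > 0)
    if highest == lowest:
        raise ValueError("all coefficients are zero; cannot rescale")
    return 12.0 * (raw_score - lowest) / (highest - lowest)


# Hypothetical three-variable example (the real score has 12 variables).
toy_coefficients = {"practice_a": 2.0, "practice_b": 1.0, "practice_c": -1.0}
toy_grades = {"practice_a": 1, "practice_b": 0, "practice_c": 0}

raw = weighted_eps(toy_grades, toy_coefficients)
print(raw, rescale_to_0_12(raw, toy_coefficients))  # 2.0 9.0
```

Under this assumed transformation, a unit reporting only the most heavily weighted practices scores close to 12, whereas a unit reporting only negatively weighted practices scores close to 0.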

Three additional GEE models were fit on the worldwide study data, namely a recently reported worldwide model 2 [5] and two additional models, i.e., worldwide models 3 and 4. Worldwide model 2 differed from worldwide model 1 in including the 12 end-of-life practice variables as separate explanatory variables instead of the EPS [5] (Fig. 1). Worldwide model 3 (reference model) included all the explanatory variables of models 1 and 2, except for the EPS and the end-of-life practice variables (Fig. 1). Worldwide model 4 included the variables of worldwide model 3 plus the original, unweighted EPS version (i.e., the simple sum of the 1/0 grades of the end-of-life practice variables [4, 5]) (Fig. 1). Weighted EPS validation was the primary aim of fitting worldwide model 1. The purpose of additionally fitting worldwide models 2–4 was to comparatively determine any potential weighted EPS-associated improvement in GEE model performance. Analyses were designed by SDM, SC and JN.

All models were subjected to fivefold cross-validation. Furthermore, for all worldwide models, ROC construction and corresponding AUC determinations were used to assess agreement between predicted and observed treatment limitations in the validation groups. Finally, goodness of fit was compared between worldwide models 1, 2, and 4 vs. reference model 3 using the “analysis of variance (ANOVA)” function in R (Fig. 1).

EPS and failed CPR worldwide study data were compared among regions by the Kruskal–Wallis test and the Pearson chi-square test, respectively. All analyses were conducted with R (version 4.0.2). GEE analysis was conducted with the R package geepack, and figures were produced using the R package pROC [22]. Figure 1 is a summary illustration of the above-described analytic methodology. Additional methodological details are presented in Additional file 1.

Results

GEE model for weighted EPS derivation

Data from 4592 patients were included in the comparison study’s GEE model. Patient characteristics have been reported elsewhere [4] and are presented in Additional file 1: Table S1.

Table 2 displays the comparison study GEE results (Ethicus-2 vs. Ethicus-1), which reconfirm that the 2015–2016 Ethicus-2 cohort was strongly associated with treatment limitation [odds ratio (OR) 36.3, 95% confidence interval (9.1–144.5)]; patient age, physician religion, and acute diagnoses/chronic diseases were also associated with limitation decisions. Among end-of-life practice variables, end-of-life discussions during weekly meetings [OR 0.55, (0.30–0.99)], written ICU end-of-life guidelines [OR 0.52, (0.31–0.87)], written ICU end-of-life protocols [OR 15.08, (3.88–58.59)], palliative care consultations [OR 2.63, (1.23–5.60)], and national end-of-life legislation [OR 3.24, (1.60–6.55)] were significantly associated with limitation decisions. The AUC of the comparison study model was 0.865 after applying fivefold cross-validation (Fig. 2).

Table 2 Comparison study generalized estimating equations model for “any treatment limitation or no treatment limitation”

Fig. 2 Receiver operating characteristic curve based on the comparison study’s [4] generalized estimating equations model

GEE model for weighted EPS validation

In the worldwide study cohort, EPS/end-of-life practice variable data were available from 186/199 participating ICUs (93.5%), corresponding to 11,574 patients who died or had a treatment limitation [5]. Baseline characteristics of worldwide study participants are presented in Additional file 1: Table S2. Regional and overall original and weighted/rescaled EPS values and frequency distribution of failed CPR are presented in Table 3.

Table 3 Regional end-of-life practice score and frequency of failed cardiopulmonary resuscitation

Worldwide GEE models 1, 2, 3, and 4 are presented in Table 4 and Additional file 1: Tables S3, S4 and S5, respectively. As also reported elsewhere [5], region, age, acute diagnoses/chronic diseases and center type were associated with limitation decisions in all GEE models. In worldwide model 1, weighted/rescaled EPS was an independent predictor of treatment limitation [OR 1.12, (1.03–1.22)] (Table 4: EPS data highlighted in bold), i.e., for each 1-point increment in weighted/rescaled EPS, the odds of treatment limitation increased by 12%.
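Because the model is logistic, an OR of 1.12 multiplies the odds of treatment limitation; the corresponding change in probability depends on the starting point. The sketch below makes this conversion explicit. The baseline probabilities are hypothetical values chosen for illustration, and the function name is not taken from the study.

```python
import math


def probability_after_increment(baseline_probability, odds_ratio, points=1):
    """Probability implied by raising the score by `points`.

    The odds ratio applies per 1-point increment, so the log-odds shift is
    points x ln(odds_ratio).
    """
    if not 0.0 < baseline_probability < 1.0:
        raise ValueError("baseline probability must lie strictly between 0 and 1")
    coefficient = math.log(odds_ratio)  # GEE coefficient on the log-odds scale
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * math.exp(coefficient * points)
    return new_odds / (1.0 + new_odds)


# Hypothetical baselines: the same OR shifts each probability differently.
for baseline in (0.50, 0.80):
    print(baseline, round(probability_after_increment(baseline, 1.12), 3))
```

For a hypothetical baseline probability of 0.50, a 1-point increment gives about 0.528, which illustrates that a 12% increase in odds is not a 12% increase in probability.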

Table 4 Worldwide generalized estimating equations model 1 for “treatment limitation vs. failed cardiopulmonary resuscitation”

The AUCs of worldwide models 1, 2, 3, and 4 were 0.745, 0.752, 0.727, and 0.730, respectively (Fig. 3). Between-model comparisons (by R’s “ANOVA”) demonstrated that only worldwide model 1 had significantly better goodness of fit vs. the reference model (P = 0.008). In contrast, the goodness of fit of worldwide models 2 and 4 was not significantly better when compared to the reference model (P = 0.056–0.23) (Fig. 3).

Fig. 3 ROC curves of the 4 generalized estimating equations models of the worldwide study [5]. ROC receiver operating characteristic, EPS end-of-life practice score, EPV end-of-life practice variable, AUC area under the curve, CI confidence interval. A: Model with weighted and rescaled EPS (worldwide model 1); B: Model with EPVs (worldwide model 2); C: Reference model without EPVs or EPS (worldwide model 3); D: Model with original, unweighted EPS (worldwide model 4)

Additional exploratory analyses

In the worldwide study population [5], presence of country end-of-life legislation and/or combined presence of end-of-life practice variables with significant ORs (see also above and Table 2) was associated with failed CPR frequencies of 8.4–9.4%, whereas upper-quartile weighted/rescaled EPS of ≥ 8.22 was associated with a failed CPR frequency of < 8% (Table 5). Region-level proportions of “raw” positive responses to the 12 end-of-life practice variables for both the comparison and worldwide studies [4, 5] are provided in Additional file 1: Table S6. In the comparison study [4], maximal European regional increases over time in positive responses for end-of-life protocols, palliative care consultations, and end-of-life legislation amounted to 50%; the respective maximal differences in the positive responses of the worldwide study [5] varied within 68–100%.

Table 5 Worldwide study [5] frequency of failed cardiopulmonary resuscitation under specific conditions of end-of-life practice

Worldwide, country-level, weighted/rescaled EPS data are shown in Additional file 1: Figure S1.

Discussion

A novel end-of-life practice score for ICUs, the EPS, weighted according to the strength of the associations of its subcomponents with limitation decisions, was derived from data obtained from a large ethical comparison study [4]. The weighted EPS was rescaled to 0–12, to align with the originally proposed score, and subsequently validated as an explanatory variable for treatment limitation decisions using data from a larger worldwide study of end-of-life decision-making [5]. A high EPS was best achieved by ensuring the combined presence of the three end-of-life practice variables with the highest coefficient estimates in the comparison study’s GEE analysis, namely end-of-life ICU protocols, palliative care consultations and national end-of-life legislation. Notably, in a hypothetical case of concurrent positive responses for these variables and negative responses for the remaining 9 variables, the weighted/rescaled EPS value would amount to 10.76, which is quite close to its maximum value of 12.00. The results from the worldwide study’s data suggest that regions with a high weighted/rescaled EPS demonstrate an increased frequency of life-support limitation and a reduction in failed CPR. Indeed, an upper-quartile EPS was associated with failed CPR rates of < 8%.

In the comparison study’s analyses, palliative care consultations had the third highest OR for predicting limitation decisions and the third highest coefficient in EPS weighting. Currently, palliative care is widely recognized as a key component of patient/family centered ICU care [13, 14, 19, 23,24,25,26,27,28,29,30,31,32]. Nevertheless, our data and other findings reveal that the presence of ICU-based palliative care may substantially vary across world regions (current study’s range: 21–89%, corresponding to 68% variation), countries and hospitals [4, 5, 19, 33,34,35,36], and even among physicians working in the same ICU [37, 38], or according to daily ICU bed pressure [39].

Four systematic reviews suggested that consultative or integrative palliative care interventions may reduce ICU/hospital length of stay and cost, without increasing mortality [23,24,25,26]. Educational interventions aimed at ICU staff, and interventions comprising screening for palliative care referral, goals-of-care discussions and specialist palliative care involvement were associated with significant increases in limitation of life-sustaining treatments and CPR [26]. However, review findings were limited by study quality, heterogeneity of interventions/outcomes and uncertainty about generalizability [23,24,25,26].

Randomized trials of palliative care-led family meetings or complex, integrative interventions targeted at clinicians reported neutral and/or negative results, including worsening of post-traumatic stress disorder symptoms [26, 40, 41]. Conversely, randomized trials of multi-component interventions delivered by an interprofessional ICU team or of early-triggered palliative care consultations reported mainly positive results, including better clinician-family communication, more limitations in life-sustaining treatments and transitions to hospice care, shorter ICU stay or decreased ICU resource utilization and no significant effect on in-hospital mortality [42, 43]. Published data and current results support the need for further, evidence-based integration of well-designed and multifaceted palliative care interventions in standard ICU care [26]. Such interventions should result in timely provision of effective physical, psychological and spiritual comfort care by specifically trained/skilled ICU clinicians and/or palliative care specialists.

According to our findings, end-of-life protocols in the context of withdrawing or withholding life-sustaining measures should be considered a positive factor during the terminal period of provision of effective palliative care. End-of-life protocol application should be supported by a weighted shared decision that continuation of life-sustaining treatments would confer more harm than benefit to the individual patient [13, 14, 44,45,46]. End-of-life protocols should focus on the prevention/alleviation of any associated distressful patient symptoms (e.g., pain, dyspnea, or delirium) and minimization/prevention of any potential long-term psychological impact to family members (e.g., post-traumatic stress disorder, anxiety, depression, and complicated grief [16, 44, 45]). A preceding roundtable conference concluded that withdrawing of treatments such as mechanical ventilation should be tailored to individual patient needs [47]. A recent systematic review reported a worldwide variation and ambiguity of practices of withdrawal of mechanical ventilation [45]. Nevertheless, in countries from world regions with high (e.g., 100%) positive response rates for end-of-life protocols (e.g., USA), the quality of dying and death has also been rated high by families of decedent ICU patients [48]. Global variation in end-of-life protocol use is consistent with the worldwide study’s data, although the substantial temporal increase observed in the comparison study suggests that implementation remains dynamic and is evolving with time.

In several countries, the potential for exposure to legal risk may prevent ICU physicians from limiting invasive treatments (including CPR) in patients with poor prognosis [49]. End-of-life legislation is a well-established key factor not only for the prevention of disproportionate treatments [44], but also for the development of inter-professional decision-making and consensus building practices that take into account the patient’s values, goals and preferences, and ameliorate the moral distress associated with end-of-life decisions [46, 50]. Notably, legislative processes may be protracted, depending on cultural, religious, social, linguistic and political barriers, and the presence/intensity of lobbying/support by groups of stakeholders and members of regulatory bodies [51,52,53]. Furthermore, the implementation of in-force laws may still be limited due to lack of awareness, perceived ambiguity, or non-compliance by involved parties (e.g., healthcare professionals) [54, 55]. Nevertheless, the enactment of end-of-life legal frameworks, followed by the development of multifaceted end-of-life care programs/initiatives with electronic infrastructure (e.g., the Physician Orders for Life-Sustaining Treatments [56]) has been shown to substantially promote concordance between recorded patient wishes and administered end-of-life treatments and care [18, 57,58,59]. Our results confirm this relationship, as a strong association between end-of-life legislation and treatment limitation [5] was observed in all analyses.

Increased rates of limitation and consequent reduction in failed CPR may imply a higher quality of end-of-life care, strictly in the context of the concurrent presence of the above-discussed end-of-life practices and legal support. Nevertheless, decisions on life-support/CPR should still be individualized, taking into account patient prognosis and preferences. The Ethicus study protocol [3,4,5] did not include any collection of patient-level data on the fulfilment of criteria for withholding or withdrawing CPR [20].

End-of-life practice variable data were not uniformly consistent. Notably, in the comparison study analyses, end-of-life discussions and departmental end-of-life guidelines were negatively associated with treatment limitations. Recently reported limitations of clinician-family end-of-life conferences include insufficient information exchange about the patient’s values and preferences and deficient deliberation; this implies that the communication skills of clinicians need to be improved [18, 59]. Regarding guidelines per se, these may not effectively address problems of reaching consensus decisions, prognostication challenges, barriers in communication, patient palliative care needs, and physician-related variability in end-of-life decision-making [58,59,60,61,62]. Absence of association or negative associations between a number of the end-of-life practice variables and treatment limitation were noted, and appear to be counter-intuitive and difficult to explain. A possible explanation is that in the presence of end-of-life ICU protocols, palliative care consultations, and national end-of-life legislation, other local end-of-life practices may become somewhat redundant. Nevertheless, these local practices may also improve end-of-life care [12,13,14,15,16,17,18, 63] and should therefore be retained in the EPS, for further development in validation studies. Collectively, our results on the relative importance of 12 end-of-life practices highlight the need for further, high-quality, research-based improvement of interventions related to communication, ethics consultations, education, palliative care, and advance care planning or goals-of-care discussions [26]. The resulting progress in end-of-life practice might then be quantifiable by concurrent changes in the weighting of the corresponding EPS subcomponents.

Current results may also indicate the need for further evaluation and improvement of currently accepted end-of-life practices [5]. The establishment of EPS/end-of-life care registries might facilitate the periodic (e.g., biyearly) determination of potential, time-dependent changes in the associations between the 12 end-of-life practice variables and treatment limitation. This should enable EPS reweighting (and subsequent prospective validation) according to the evolution of end-of-life care, thereby maintaining and/or enhancing its potential usefulness as a simple tool for continuous assessment and improvement of end-of-life care.

Strengths of the current analyses include using robust analytic methodology and large datasets to derive and validate the weighted/rescaled EPS. Limitations include the post hoc EPS data collection, which may have introduced recall and/or social desirability bias [4, 5]. Also, for the purpose of analyses, ICU-level responses for end-of-life practice variables were assumed to uniformly reflect individual patient-level practice; pertinent consequences could include (1) biased results on end-of-life practices with known, patient-level, qualitative variability (e.g., end-of-life discussions) [59], and (2) additional bias due to potential, physician-related variability in end-of-life practice [62, 64]. Nevertheless, our comparison study analysis was actually adjusted for physician religion, which partly explains end-of-life practice variation [3, 64]. Additional limitations comprise uncertainty about the validity and reliability of end-of-life practices derived by expert consensus, absence of prospective EPS validation and lack of data on patients not admitted to the ICU in the context of treatment limitation decisions in hospital wards [4, 5]. Potential perception and measurement bias cannot be excluded, since the presence of the variables was determined by subjective perception. Lastly, only physicians were asked and not nurses or patients/family members.

Conclusions

A weighted/rescaled EPS developed on the basis of changes in limitation decisions over a 16-year period [4] partly explained the substantial variation in contemporary treatment limitation decisions observed in the worldwide study [5]. The most important weighted/rescaled EPS components were ICU end-of-life protocols, palliative care consultations, and country end-of-life legislation. ICUs wishing to improve the quality of end-of-life care may consider introducing palliative care consultations and end-of-life protocols into their organizational structures. Furthermore, national lawmakers might consider establishing and/or improving country-specific end-of-life legislation and healthcare policies targeted at facilitating its implementation.