Background

Cardiac arrest is a global public health challenge and an important cause of premature death [1, 2]. Mortality is high, and survivors frequently carry a significant disease burden due to unfavourable neurological outcomes [3,4,5]. As a consequence, cardiac arrest leads to significant socio-economic costs [6]. At intensive care unit (ICU) arrival, cardiac arrest patients with return of spontaneous circulation (ROSC) are frequently unconscious and sedated, rendering clinical neurological evaluation difficult [7,8,9]. Therefore, clinicians have to rely on history and ambiguous clinical and diagnostic findings for early prognostication and adequate counselling of relatives [8, 10]. Although substantial progress has been made in the prognostication of short-term outcomes after cardiac arrest [8, 11,12,13], prospective data for early prognostication of long-term outcomes (≥ 2 years) after cardiac arrest are scarce [14,15,16]. Recently, the focus in post-cardiac arrest research has shifted from short-term to long-term outcomes, as highlighted in the 2021 European guidelines for post-resuscitation care and two recently published systematic reviews and a meta-analysis [8, 17, 18]. Reliable prediction of long-term outcomes is of great importance because a change in the level of neurological functioning can still occur months after hospital discharge [8, 13]. In many cases, patients and relatives would agree to limiting therapeutic efforts in the light of poor prognosis, foreseeable poor quality of life or high risk of physical or mental disability [19, 20]. There is a wide consensus among health care professionals that life-sustaining interventions should only be used if they are consistent with the patient’s values and goals [21]. Therefore, prediction of long-term outcomes might provide important additional information to guide early discussions about goals-of-care and the extent of therapeutic effort.

In the past 15 years, several clinical risk scores have been developed and validated specifically for early prognostication after cardiac arrest [22, 23]. Still, only a few have been adequately validated in independent cohorts [22]. In 2006, the Out-of-Hospital Cardiac Arrest (OHCA) score was developed, which relies on five clinical (no-flow and low-flow interval, initial rhythm) and laboratory parameters (creatinine, lactate) available at ICU admission [11]. The Cardiac Arrest Hospital Prognosis (CAHP) score was developed in 2016 and includes additional information regarding resuscitation measures (location of cardiac arrest, adrenaline [epinephrine] dosage) and a different laboratory parameter (pH) on ICU admission [12]. The severity-of-illness scores APACHE II (Acute Physiology and Chronic Health Evaluation II) and SAPS II (Simplified Acute Physiology Score II) have been widely used in critical care research, and the required clinical and laboratory parameters are readily available for all post-cardiac-arrest patients [24, 25]. All four scores were successfully validated for the prediction of short-term neurological outcomes and mortality [26,27,28,29,30,31,32,33,34,35]. However, these scores have not been evaluated regarding long-term outcomes.

This study aims to evaluate the performance of the OHCA, CAHP, APACHE II, and SAPS II scores for early prognostication of long-term mortality and long-term neurological outcome in a large-scale prospective cohort of cardiac arrest patients.

Methods

Study setting

This study was conducted using the prospective COMMUNICATE/PROPHETIC cohort of consecutive cardiac arrest patients admitted to the 42-bed interdisciplinary ICU of the University Hospital Basel, Switzerland (tertiary teaching hospital). The COMMUNICATE/PROPHETIC study investigates the outcomes of cardiac arrest patients and the psychosocial stress of their relatives. The details of the study conduct have been published previously [29, 36,37,38,39,40,41]. Informed consent was obtained either from the patient or from the relatives, depending on the decision-making capability of the index patient. In cases of missing relatives, permission was obtained from an independent physician not involved in the study. The study was approved by the Ethics Committee of North-western and Central Switzerland (www.eknz.ch) and followed the principles of the Declaration of Helsinki and its amendments. Analysis and reporting for this study were conducted in accordance with the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) statement [42].

Participants

Between October 2012 and November 2019, all patients with ROSC admitted to the ICU after OHCA or in-hospital cardiac arrest (IHCA) were prospectively included. Patients who suffered a cardiac arrest while being monitored (ICU, intermediate care unit, operating theatre, cardiac catheterisation laboratory) were not eligible. Further exclusion criteria were age < 16 years or denial of informed consent. The patients were treated according to the standardised local treatment protocol, including targeted temperature management, in compliance with the guidelines of the European Resuscitation Council [8, 43, 44].

Outcomes

The primary outcome was long-term mortality at 2 years. Secondary outcomes were long-term neurological outcome at 2 years assessed by the Cerebral Performance Category (CPC), and long-term mortality at 6 years. The CPC system was used by the original development studies [11, 12] and most validation studies in accordance with the Utstein Style of reporting data from OHCA [45]. It divides neurological outcome into five categories: CPC = 1: good cerebral performance; CPC = 2: moderate cerebral disability; CPC = 3: severe cerebral disability; CPC = 4: coma or vegetative state; CPC = 5: death or brain death. Good neurological outcome was defined as a CPC of 1 or 2, and poor neurological outcome as a CPC of 3 to 5 in accordance with the development studies [11, 12].
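The dichotomisation of the CPC described above can be written down as a small lookup. The following is only an illustrative sketch: the function and variable names are hypothetical and are not taken from the study's analysis code (which was run in Stata).

```python
# CPC categories as listed in the Outcomes section.
CPC_CATEGORIES = {
    1: "good cerebral performance",
    2: "moderate cerebral disability",
    3: "severe cerebral disability",
    4: "coma or vegetative state",
    5: "death or brain death",
}


def neurological_outcome(cpc: int) -> str:
    """Return 'good' for CPC 1-2 and 'poor' for CPC 3-5."""
    if cpc not in CPC_CATEGORIES:
        raise ValueError(f"CPC must be an integer from 1 to 5, got {cpc!r}")
    return "good" if cpc <= 2 else "poor"
```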

Data collection

The clinical information was prospectively collected from patient records by the study team. For the calculations of the respective scores, the methodologies of the original publications were strictly applied [11, 12, 24, 25]. Information on resuscitation parameters was collected, including no-flow and low-flow interval, setting of cardiac arrest, initial rhythm, drugs administered, and whether bystander basic life support was performed, as well as clinical data (e.g., heart rate, blood pressure, respiratory rate, urine output, temperature, Glasgow Coma Scale [GCS], intubation status), demographic data (age, sex), pre-existing medical conditions (hypertension, coronary artery disease, congestive heart failure, chronic obstructive pulmonary disease, diabetes, chronic kidney and liver disease, malignant disease) and blood parameters (pH, lactate, base excess, bicarbonate, creatinine, urea, sodium, potassium, bilirubin). Predictor data for calculation of the scores were complete in 79.0% (OHCA), 67.2% (CAHP), 82.9% (APACHE II), and 84.6% (SAPS II). Missing values were primarily the no-flow time (missing in 12.8%), which is necessary for calculating the OHCA and CAHP scores, and the initial pH (missing in 14.0%), which is required for the CAHP score only. To account for missing predictor data, datasets imputed by multiple imputation by chained equations were used for comparisons between scores. Imputations were calculated using multiple covariables (i.e., socio-demographics, comorbidities, resuscitation information, vital signs), also including the main outcomes (death, neurological outcome), to reduce bias as previously suggested [46].

Follow-up and survival

In the context of the COMMUNICATE/PROPHETIC study, all patients who had consented to be contacted by the study team were scheduled for a standardised telephone follow-up after 2 years with an assessment of vital status and neurological performance. After the 2-year follow-up period, survival data were obtained by directly contacting either the patient, their relatives, or their general practitioner. If patient contact was not possible, the medical records of the University Hospital Basel (Switzerland) and publicly available death registries were consulted for information concerning vital status. Patients lost to follow-up were censored at the date of the last follow-up. The follow-up time was calculated from the moment of admission to death or censoring date, whichever date came first.
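As an illustration of the follow-up rule above (time from admission to death or censoring, whichever came first), a minimal sketch is given below. The function name, argument names and example dates are hypothetical; the sketch only mirrors the rule as stated in the text.

```python
from datetime import date
from typing import Optional, Tuple


def follow_up(admission: date,
              death: Optional[date],
              last_follow_up: Optional[date]) -> Tuple[int, bool]:
    """Return (follow-up time in days, death observed?).

    The end of follow-up is the earlier of the death date and the
    censoring date (date of last follow-up); at least one must be known.
    """
    candidates = []
    if death is not None:
        candidates.append((death, True))
    if last_follow_up is not None:
        candidates.append((last_follow_up, False))
    if not candidates:
        raise ValueError("either a death date or a last follow-up date is required")
    # Earliest date wins; on equal dates the death entry (listed first) is kept.
    end, event = min(candidates, key=lambda c: c[0])
    if end < admission:
        raise ValueError("end of follow-up precedes admission")
    return (end - admission).days, event
```

With hypothetical dates, `follow_up(date(2013, 1, 1), None, date(2015, 1, 1))` describes a patient censored after 730 days, and `follow_up(date(2013, 1, 1), date(2013, 1, 11), None)` a death after 10 days.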

Score risk categories

According to the original publication of the CAHP score [12] and a validation study of the OHCA score [26], the CAHP score results were divided into three categories (< 150; 150–200; > 200 points), and the OHCA score results into four categories (≤ 20; > 20–40; > 40–60; > 60 points). Higher risk score categories are associated with a higher risk of an unfavourable outcome [12, 26]. For the APACHE II and the SAPS II scores, corresponding risk categories do not exist [24, 25].
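The category boundaries above translate directly into two small mapping functions. This is a sketch only: the function names are hypothetical, the labels simply repeat the cut-offs quoted in the text, and the calculation of the scores themselves (which follows the original publications) is not reproduced here.

```python
def cahp_category(score: float) -> str:
    """Map a CAHP score to one of the three categories quoted in the text."""
    if score < 150:
        return "< 150"
    if score <= 200:
        return "150-200"
    return "> 200"


def ohca_category(score: float) -> str:
    """Map an OHCA score to one of the four categories quoted in the text."""
    if score <= 20:
        return "<= 20"
    if score <= 40:
        return "> 20-40"
    if score <= 60:
        return "> 40-60"
    return "> 60"
```

Note that the boundary values 150 and 200 fall into the middle CAHP category, and the boundary values 20, 40 and 60 fall into the lower of the two adjacent OHCA categories.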

Statistical analysis

For continuous variables, descriptive statistics such as means, medians, and interquartile ranges were used, and categorical or binary variables were analysed by counts and proportions. Binary and categorical variables between groups were compared using Pearson’s χ² test. Continuous data were checked graphically for normality of the distribution. Continuous, normally distributed variables were compared using analysis of variance (ANOVA) or t-test, and continuous, skewed variables were compared using the Wilcoxon rank-sum test. The scores were calculated according to the original publications [11, 12, 24, 25]. To assess the prognostic performance of the scores, measures of discrimination and calibration were calculated. Discriminatory performance was summarised by the area under the receiver operating characteristic curve (AUROC) for all endpoints. An AUROC of 0.7–0.8 was classified as acceptable, 0.8–0.9 as good, and > 0.9 as excellent. The approach suggested by DeLong et al. was used to compare ROC curves between groups or between scores [47]. For the OHCA and CAHP scores, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and positive and negative likelihood ratios were calculated for the cut-offs described above. The association of the score value with the outcomes was assessed by conducting regression analyses with calculation of hazard ratios (HR) and their 95% confidence intervals (CI) for mortality using Cox regression models for time-to-event data (i.e., mortality) and logistic regression analyses with odds ratios (OR) for poor neurological outcome. Calibration of the OHCA and CAHP scores was assessed graphically by depicting observed vs. expected outcome event numbers per decile of predicted risk on a calibration plot. All statistical analyses were conducted using STATA 15 (Stata Corporation, College Station, United States of America).
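To make the discrimination and accuracy measures named above concrete, the following sketch computes an AUROC and the cut-off-based measures from a list of score values and outcomes. It is an illustration under stated assumptions, not the study's Stata code: the function names are hypothetical, the AUROC is computed through its rank-based (Mann-Whitney) interpretation, a score above the cut-off is treated as a prediction of death, and confidence intervals and the DeLong comparison are omitted.

```python
import math
from typing import Dict, Sequence


def auroc(scores: Sequence[float], died: Sequence[bool]) -> float:
    """Probability that a non-survivor has a higher score than a survivor.

    Ties count as one half. This equals the area under the ROC curve.
    """
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_at_cutoff(scores: Sequence[float], died: Sequence[bool],
                       cutoff: float) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, NPV and likelihood ratios for score > cutoff."""
    tp = fp = fn = tn = 0
    for s, d in zip(scores, died):
        predicted_death = s > cutoff
        if predicted_death and d:
            tp += 1
        elif predicted_death:
            fp += 1
        elif d:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both outcome classes are required")
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        # Predictive values are undefined (NaN) if no patient falls on that side.
        "ppv": tp / (tp + fp) if tp + fp else math.nan,
        "npv": tn / (tn + fn) if tn + fn else math.nan,
        # Likelihood ratios are reported as infinity when the denominator is zero.
        "lr_positive": sens / (1 - spec) if spec < 1 else math.inf,
        "lr_negative": (1 - sens) / spec if spec > 0 else math.inf,
    }
```

For example, with the invented scores `[10, 25, 35, 45, 55, 70]` and outcomes `[False, False, True, False, True, True]`, eight of the nine non-survivor/survivor pairs are ranked correctly, so `auroc` returns 8/9.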

Results

Baseline characteristics

During the study period, 486 cardiac arrest survivors were admitted to the ICU of the University Hospital Basel, Switzerland. The final analysis included 415 patients, as 38 patients were excluded due to withheld informed consent or screening failure, and 33 patients were lost to follow-up before assessment of vital status at 2 years. Secondary outcome data were available for 80.2% (CPC at 2 years) and 70.4% (6-year mortality) of the included patients. Baseline characteristics stratified by the primary endpoint of 2-year survival are presented in Table 1. Factors significantly associated with higher mortality were female gender, higher age, chronic comorbidities (chronic obstructive pulmonary disease, diabetes, chronic kidney disease, cancer), longer no-flow and low-flow intervals, unwitnessed cardiac arrest, and non-shockable initial rhythm, as well as lower GCS score, higher lactate levels and lower pH on ICU admission.

Table 1 Baseline characteristics

Mortality and neurological outcome

Of 415 patients, 201 (48.4%) died during the initial hospital stay. Withdrawal of life-sustaining therapy was conducted in 179 of these 201 patients (89%). After 2 years, 241 of 415 patients (58.1%) had died. Of the 92 survivors with an assessment of neurological outcome after 2 years, 89 (96.7%) had a good neurological outcome (CPC 1–2), and 3 (3.3%) a poor neurological outcome (CPC 3–4). After 6 years, 241 of 292 patients (82.5%) had died. A Kaplan–Meier survival estimate of the total cohort is shown in Additional file 1: Figure S1.

Prognostic performance of risk scores

Table 2 summarises the prognostic performance of the OHCA, CAHP, SAPS II, and APACHE II scores for the prognostication of the primary and secondary outcomes. For 2-year mortality, all scores showed good discriminatory performance, with the CAHP score yielding an AUROC of 0.87 (95% CI 0.84–0.90), followed by the APACHE II score (0.83 [95% CI 0.79–0.87]), the OHCA score (0.82 [95% CI 0.78–0.86]) and the SAPS II score (0.81 [95% CI 0.76–0.85]). The differences between the AUROC values were statistically significant (χ² = 19.4, p < 0.001). A graphical comparison of ROC curves for 2-year mortality is shown in Additional file 1: Figure S2. The CAHP score showed good discriminatory performance for the secondary endpoints with an AUROC of 0.86 (95% CI 0.82–0.90) for the 2-year neurological outcome and an AUROC of 0.88 (95% CI 0.83–0.93) for 6-year mortality. All other scores showed acceptable to good discriminatory performance for the secondary endpoints (for details see Table 2). For the OHCA and CAHP scores, prognostic accuracy at the predefined cut-offs is presented in Tables 3 and 4, respectively. An OHCA score of > 40 points predicted 2-year mortality with a specificity of 98.9% (95% CI 95.9–99.9), and the highest risk category (> 60 points) reached a specificity of 100% (95% CI 63.1–100.0). The CAHP score’s high-risk category (> 200 points) predicted 2-year mortality with a specificity of 97.1% (95% CI 93.4–99.1). Figures 1 and 2 show Kaplan–Meier survival estimate curves with numbers at risk stratified by OHCA and CAHP score categories. For all endpoints, the AUROC was additionally calculated for the subgroups of OHCA and IHCA patients, and the results are shown in Additional file 1: Table S1.

Table 2 Comparison of long-term prognostic performance between scores
Table 3 Performance of OHCA score at different cut-off points
Table 4 Performance of CAHP score at different cut-off points
Fig. 1

Kaplan–Meier survival estimates with numbers at risk for predefined OHCA score categories. Below the x-axis, the numbers at risk at the individual time points are reported

Fig. 2

Kaplan–Meier survival estimates with numbers at risk for predefined CAHP score categories. Below the x-axis, the numbers at risk at the individual time points are reported

The CAHP score showed good calibration for 2-year mortality with a slight overestimation of mortality in the upper two-thirds of the risk spectrum (Additional file 1: Figure S3). Calibration of the OHCA score was poor, with underestimation of 2-year mortality in the low-risk spectrum and overestimation in the high-risk spectrum (Additional file 1: Figure S4). For 2-year neurological outcome, the calibration of the CAHP score was good, with a slight underestimation of poor neurological outcome, especially in the lower risk categories (Additional file 1: Figure S5). The OHCA score showed poor calibration for 2-year neurological outcome with underestimation in the low-risk spectrum and overestimation in the higher risk spectrum (Additional file 1: Figure S6).
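The calibration findings above rest on comparing observed with expected event numbers per decile of predicted risk. A minimal sketch of that tabulation follows. It assumes that a predicted probability of the outcome is already available for each patient (for the OHCA and CAHP scores this would come from the original publications, which are not reproduced here); the function name and the simple equal-size grouping are illustrative choices rather than the study's actual procedure.

```python
from typing import List, Sequence, Tuple


def calibration_table(predicted_risk: Sequence[float],
                      outcome: Sequence[int],
                      groups: int = 10) -> List[Tuple[int, float, int]]:
    """Return (n, expected events, observed events) per risk group.

    Patients are sorted by predicted risk and split into `groups`
    groups of (nearly) equal size; groups=10 gives deciles.
    """
    if len(predicted_risk) != len(outcome):
        raise ValueError("inputs must have the same length")
    pairs = sorted(zip(predicted_risk, outcome))
    n = len(pairs)
    rows = []
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue  # fewer patients than groups
        expected = sum(p for p, _ in chunk)  # sum of predicted probabilities
        observed = sum(o for _, o in chunk)  # number of observed events
        rows.append((len(chunk), expected, observed))
    return rows
```

In a group where the expected number exceeds the observed number, the score overestimates risk; the reverse indicates underestimation.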

Discussion

This study has validated two prognostic cardiac arrest scores (OHCA and CAHP scores), and two ICU severity-of-illness scores (APACHE II and SAPS II scores) for the prediction of long-term mortality and neurological outcome in a prospective cohort of cardiac arrest survivors followed for up to 8 years. The CAHP score showed the best discriminatory performance for the prediction of 2-year mortality, 6-year mortality, and 2-year neurological outcome. Calibration of the CAHP score was good for 2-year mortality and 2-year neurological outcome. As already demonstrated for the prognostication of short-term outcomes [29], two non-specific severity-of-illness scores, the APACHE II and the SAPS II, showed promising discriminatory performance for the prognostication of long-term survival as well as long-term neurological outcome. The main drawback of the APACHE II and the SAPS II scores is that the worst value of the first 24 h after ICU admission is required for each included parameter. This results in a time delay compared with the OHCA and CAHP scores, which require only parameters readily available on ICU admission.

The findings of this study are generally in line with previous validation studies evaluating outcomes at hospital discharge or 30 days post-event, where the CAHP score showed a slightly better performance than the OHCA score [29, 30, 33, 48]. In two different cohorts of cardiac arrest patients evaluating outcomes at hospital discharge [49] or 90 days [32], the OHCA score performed somewhat better than the CAHP score. The severity-of-illness scores had a slightly inferior performance when compared to the cardiac arrest-specific scores, which was also noted in previous studies looking at short-term outcomes [29, 50, 51]. A British group recently developed and validated a post-cardiac arrest score for OHCA patients to predict neurological outcome 6 months after OHCA and compared it with the OHCA and CAHP scores [52]. Their score only showed marginally better discrimination when compared with the CAHP score in their cohort (AUROC 0.88 vs. 0.87, respectively) [52]. One may argue that the development of new scores for the prognostication of long-term outcomes may not provide additional value, as established scores perform well in predicting long-term outcomes. Efforts to improve established scoring systems by adding known predictors of outcome after cardiac arrest, such as laboratory parameters (e.g., neuron-specific enolase), imaging or electrophysiological examination results, or clinical signs (e.g., GCS motor score), have shown promising results [30, 32, 40]. Such modifications with corresponding validation studies might be helpful to keep established scores up to date and improve their predictive value based on current and evolving science.

A major and overarching concern in research regarding prognostic factors in post-cardiac-arrest patients is the effect of self-fulfilling prophecy, meaning that the documentation of poor prognosis early in the treatment process per se may lead to a change or withdrawal of care, which again leads to a higher occurrence of poor outcome in this patient group [53,54,55]. In our study, score values were calculated by the study team and were not provided to the treating ICU physicians, thus minimising the risk of a low score value influencing the ICU team in their decision-making. However, treating physicians inevitably knew about different clinical factors which also have been used for calculating the score values (e.g., no-flow and low-flow intervals, laboratory values). These factors might have influenced their decision-making, and blinding the involved clinicians to them is not possible.

The presented data suggest that the established cardiac arrest-specific scores OHCA and CAHP, which have been thoroughly validated for predicting short-term mortality and neurological outcome in OHCA and IHCA patients, might be suitable for predicting long-term outcomes. Although the OHCA and CAHP scores were originally intended for the prognostication of outcomes after OHCA, both scores have since been successfully validated in OHCA and IHCA survivors [29, 56, 57], which is confirmed by the results of the present study. However, a subgroup analysis showed significantly lower discriminatory performance of both the OHCA and CAHP scores for the prediction of 2-year mortality in IHCA patients than in OHCA patients. This finding is in line with previous studies and was expected, as the scores were originally developed for use in OHCA patients only. Before applying the scores to IHCA patients, further validations and, if necessary, adaptations and/or recalibration of the scores are recommended.

The OHCA and CAHP scores can easily be calculated using openly accessible online calculators, rendering their use easy and straightforward [58, 59]. Further randomised controlled trials using the scores as decision aids are needed to test the impact of prediction models on decision-making, outcomes, and healthcare costs in clinical practice. In addition, validation studies based on other long-term cohorts of cardiac arrest patients are needed to further validate and, if necessary, recalibrate the scores for the prognostication of long-term outcomes.

Our study has limitations. First, we did not have complete data for all parameters to calculate the scores and thus had to impute the missing data. Second, due to the study setting, a relatively large proportion of patients were lost to follow-up, resulting in a possible selection bias. This is mainly the case for the secondary outcomes. Assessment of neurological outcome at 2 years of follow-up required a telephone interview, which some patients declined while others could not be contacted by the study team. As a 6-year follow-up was not part of the original study design, a substantial proportion of patients were lost to follow-up. Therefore, the results for 2-year neurological outcome and 6-year mortality have to be interpreted with caution due to a possible selection bias. Third, we only assessed all-cause mortality and thus patients may have died from other unrelated causes. Fourth, the single-centre setting of the study limits the generalisability of the results to other centres, regions or countries. External validation studies evaluating the herein validated scores for long-term outcomes in other populations are necessary to address this issue. However, the relatively large cohort size of the study, the fact that treatment modalities were in line with other Swiss and European medical centres, and the inclusion of unselected cardiac arrest patients indicate a high external validity of the results. Furthermore, analysis and reporting were conducted according to current state-of-the-art methodological guidelines, so that the results can be of the greatest possible use for future research in this field.

Conclusion

In our single-centre cohort of cardiac arrest survivors, the OHCA, CAHP, APACHE II, and SAPS II scores showed good performance in early prognostication of long-term mortality at 2 years and acceptable to good performance for the prognostication of 6-year mortality and neurological outcome at 2 years after cardiac arrest. Of the herein validated scoring systems, the CAHP score showed the best discriminatory performance and is a simple-to-use risk-stratification tool available early after cardiac arrest. These scores thus may guide clinicians by stratifying patients according to the risk of poor long-term outcome and may help to support discussions about goals-of-care and the extent of therapeutic effort.