1 Introduction

The implantable cardioverter-defibrillator (ICD) is a highly effective therapy for the prevention of sudden cardiac death (SCD) in high-risk patients [1]. However, many patients whose risk of short-term mortality following device implantation is high may gain no significant benefit from an ICD, irrespective of their SCD risk. Such patients, who have significant non-cardiac comorbidity or advanced heart failure, typically die of a non-cardiac cause or pump failure [2, 3]. These patients are important as they are exposed to all of the risks of ICD therapy, without the opportunity to gain significant mortality benefit.

A number of complex scoring systems have been proposed to identify these high-risk patients [4–7] (Table 1). However, it is unclear which scoring system is most useful and whether any adds incremental value over a single risk marker, serum urea, used alone.

Table 1 Scoring systems to identify patients at high risk of early mortality after ICD implantation

The aims of this study were to use the proposed scoring systems to (1) establish how many current ICD recipients may be too high risk to derive significant benefit from ICD therapy and (2) evaluate how well the proposed scoring systems predict short-term mortality in an unselected cohort of ICD recipients.

2 Methods

2.1 Study design

We conducted a single-centre retrospective analysis of consecutive patients undergoing first-time ICD implantation at King’s College Hospital (London, UK) between January 2009 and October 2013.

2.2 Derivation of risk scores

The presence or absence of specific clinical variables such as atrial fibrillation (AF), diabetes, peripheral arterial disease (PAD) and chronic obstructive pulmonary disease at the time of ICD implant was determined by review of the clinical records.

Assigned or measured variables such as age, New York Heart Association (NYHA) heart failure functional class, creatinine, urea, QRS duration and left ventricular ejection fraction (LVEF) were taken at the time of ICD implant or the closest available value.

AF was defined as a history of paroxysmal or permanent AF on the electrocardiogram. PAD was defined as in the Kramer study: a previous intervention on the carotid arteries, lower extremities, or thoracic or abdominal aorta, or clinical claudication [6]. Chronic kidney disease was defined as an estimated glomerular filtration rate of <60 mL/min/1.73 m² using the Modification of Diet in Renal Disease (MDRD) equation.
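As an illustration of the chronic kidney disease definition, the following is a minimal sketch of the four-variable MDRD calculation. The text does not state which version of the equation was used; the IDMS-traceable form (coefficient 175) is assumed here, and the function names `egfr_mdrd` and `has_ckd` are hypothetical.

```python
def egfr_mdrd(creatinine_umol_l, age_years, female, black=False):
    """Estimated GFR in mL/min/1.73 m^2 from the 4-variable MDRD equation.

    Assumption: IDMS-traceable form with coefficient 175; the study does
    not state which version of the equation was applied.
    """
    scr_mg_dl = creatinine_umol_l / 88.4  # convert creatinine from umol/L to mg/dL
    egfr = 175.0 * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr


def has_ckd(egfr, threshold=60.0):
    """Chronic kidney disease as defined in the text: eGFR below 60."""
    return egfr < threshold
```

With identical creatinine and age, the sex coefficient alone can move a patient across the 60 mL/min/1.73 m² threshold, which is one reason the equation version and inputs matter when the definition is reproduced.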

2.3 Statistical analysis

Continuous variables are presented as median (interquartile range) and categorical variables are expressed as absolute and relative frequency.

The Bilchick [4], Goldenberg [5], Kramer [6] and Parkash [7] risk scores were calculated from patients’ clinical characteristics according to the original publications. On the basis of these risk scores, patients were further classified into risk categories as set out in the papers [4–7]. The Bilchick, Kramer and Parkash models distinguished two risk categories for mortality (low and high risk). In the Goldenberg model, patients were stratified into three risk categories for mortality (low, intermediate and high risk). For the purposes of our study, we combined the low and intermediate categories into one ‘low-risk’ group. We used serum urea to categorize patients into low and high risk, based on the value derived from the Goldenberg study, an analysis of MADIT-2 (cut-off of >9.28 mM) [5].
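The two dichotomisation rules described above can be sketched as follows. The 9.28 mM cut-off is taken from the text; its correspondence to a blood urea nitrogen of 26 mg/dL via the standard conversion factor of 0.357 is our reading rather than something stated here, and the function names are hypothetical.

```python
UREA_CUTOFF_MMOL_L = 9.28          # cut-off quoted in the text
BUN_MG_DL_TO_UREA_MMOL_L = 0.357   # standard unit conversion (assumed origin of the cut-off)


def urea_risk_group(urea_mmol_l):
    """High risk only if urea is strictly above the cut-off (>9.28 mM)."""
    return "high" if urea_mmol_l > UREA_CUTOFF_MMOL_L else "low"


def collapse_goldenberg(category):
    """Merge the low and intermediate Goldenberg categories into 'low'."""
    mapping = {"low": "low", "intermediate": "low", "high": "high"}
    return mapping[category]
```

Because the cut-off is a strict inequality, a value of exactly 9.28 mM falls in the low-risk group.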

Cox proportional hazards regression modelling was used to evaluate the independent contribution of each of the clinical parameters within all scoring systems to the occurrence of mortality during follow-up. Each clinical parameter and risk scoring system was first entered into a univariate model, and those found to be significant at a level of P < 0.02 were then entered into a stepwise forward multivariate model.

Risk model calibration was assessed by the Hosmer-Lemeshow goodness-of-fit test, which determines how closely the predicted and observed incidences of events agree over a range of scores. In this test, a significant result indicates poor model fit.
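For readers unfamiliar with the test, a minimal sketch of the Hosmer-Lemeshow statistic is given below. It is not the SPSS implementation used in the study. The grouping into equal-sized risk strata, the function names and the closed-form P value helper (valid only for an even number of degrees of freedom) are assumptions made for illustration.

```python
import math


def hosmer_lemeshow(predicted, observed, groups=10):
    """Return (statistic, degrees_of_freedom) for binary outcomes.

    predicted: model-predicted probabilities of death (0..1)
    observed:  1 if the patient died, 0 otherwise
    """
    pairs = sorted(zip(predicted, observed))  # order patients by predicted risk
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected deaths in this stratum
        events = sum(y for _, y in chunk)     # observed deaths in this stratum
        mean_p = expected / size
        variance = size * mean_p * (1.0 - mean_p)
        if variance > 0:
            statistic += (events - expected) ** 2 / variance
    return statistic, groups - 2


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))
```

With the conventional ten groups the statistic is referred to a chi-square distribution with eight degrees of freedom, so a small P value from `chi2_sf_even_df(statistic, 8)` would indicate that observed and predicted event rates diverge.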

We assessed the discriminatory capacity of the risk models, as well as serum urea, for mortality by deriving their C-statistics, using receiver operating characteristic (ROC) curves. In general, a C-statistic above 0.70 is taken to indicate acceptable discriminatory capacity. The C-statistics were compared to each other using a non-parametric test developed by DeLong et al. [8].
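For a binary outcome at a fixed time point, the C-statistic equals the proportion of (died, survived) patient pairs in which the patient who died had the higher score, with ties counted as one half. A minimal sketch under that assumption follows; the function name is hypothetical, censoring is ignored, and the DeLong comparison is not shown.

```python
def c_statistic(scores, died):
    """Area under the ROC curve via pairwise concordance.

    scores: risk score (or urea value) per patient
    died:   1 if the patient died by the time point of interest, else 0
    """
    positives = [s for s, d in zip(scores, died) if d]
    negatives = [s for s, d in zip(scores, died) if not d]
    if not positives or not negatives:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                concordant += 1.0   # death correctly ranked above survivor
            elif p == q:
                concordant += 0.5   # tied scores count as half
    return concordant / (len(positives) * len(negatives))
```

A value of 0.5 corresponds to chance ranking and 1.0 to perfect separation of patients who died from those who survived.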

Survival for risk score categories for each scoring system was compared with Kaplan-Meier curves and the log-rank statistic.
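The Kaplan-Meier (product-limit) estimate underlying these curves can be sketched as below. This is an illustrative implementation with a hypothetical function name, not the SPSS routine used in the study, and the log-rank comparison between groups is not shown.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    times:  follow-up time per patient (e.g. days from implant)
    events: 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve
```

Censored patients contribute to the denominator until their last follow-up and are then removed without changing the survival estimate, which is how patients still alive at the end of follow-up are handled.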

To evaluate the ability of each scoring system to identify patients at risk of early (1-year) mortality following ICD implant, the sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were calculated for each scoring system and urea.
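These four measures follow directly from a 2×2 table of risk group against 1-year death. The sketch below uses a hypothetical function name. In the urea example, the true-positive and false-negative counts (19 and 14) come from the 33 deaths at 1 year reported in the Results; the false-positive and true-negative counts are back-calculated by us from the reported 25.1% high-risk proportion in 406 patients, assume complete urea data, and should not be read as reported values. Table 6 remains the authoritative source.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # deaths correctly flagged as high risk
        "specificity": tn / (tn + fp),  # survivors correctly flagged as low risk
        "ppv": tp / (tp + fp),          # 1-year mortality within the high-risk group
        "npv": tn / (tn + fn),          # 1-year survival within the low-risk group
    }


# Urea >9.28 mM: tp and fn are from the text; fp and tn are back-calculated
# approximations (about 102 high-risk patients out of 406), not reported values.
urea_metrics = diagnostic_metrics(tp=19, fp=83, fn=14, tn=290)
```

Note that the PPV is simply the 1-year mortality in the high-risk group, which is why it depends strongly on how many patients a given cut-off labels as high risk.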

Cox proportional hazards regression modelling was used to evaluate the relationship between each scoring system and the occurrence of appropriate and inappropriate ICD therapy during follow-up.

SPSS (version 21.0, SPSS Inc., Chicago, Illinois, USA) was used for the statistical analysis. The areas under the ROC curve for clinical event models were compared using MedCalc (version 15.8, MedCalc Software, Mariakerke, Belgium). A two-sided P value < 0.05 was considered statistically significant.

3 Results

3.1 Baseline characteristics

The study cohort was composed of 406 patients (Table 2). The most common underlying aetiology was coronary artery disease (70.2%, n = 285) and the majority were primary prevention implants (58.4%, n = 237).

Table 2 Baseline characteristics of the study population. Values are median (interquartile range) or n (%)

3.2 Predictors of mortality

During a mean follow-up of 936 ± 560 days, 96 patients died. In univariate Cox regression analyses, the absolute score of each scoring system was significantly associated with survival, with higher scores associated with worse survival (P < 0.0001 for all scoring systems) (Table 3). In addition, apart from AF, all risk factors included in each scoring system were also significantly associated with mortality (Table 3).

Table 3 Univariate and multivariate analyses for mortality by scoring system and constituent clinical parameters

In multivariate analysis, including both the four risk scores and their individual components, the only independent predictors of mortality were the Kramer scoring system (P = 0.032), the Bilchick scoring system (P = 0.008), serum urea >9.28 mmol/L (P = 0.006) and peripheral arterial disease (P = 0.024).

3.3 Comparison of risk model discrimination

The calibration of the Goldenberg, Bilchick, Kramer and Parkash scoring systems and urea for prediction of death was excellent, as demonstrated by the results of the Hosmer-Lemeshow test (Table 4).

Table 4 Calibration and discrimination of the Goldenberg, Bilchick, Kramer and Parkash scoring systems and urea for predicting death at 1- and 3-years following ICD implantation

By calculating the area under the ROC curve, we evaluated the accuracy of each scoring system and urea in predicting 1- and 3-year mortality (Table 4). The C-statistic values for the Kramer score (AUC 0.76–0.77), the Bilchick score (AUC 0.70–0.76) and urea (AUC 0.71–0.70) were consistently 0.7 or above, suggesting good discrimination. In contrast, the values for the Parkash and Goldenberg models were consistently below 0.7.

These findings were broadly consistent across the subgroups of primary and secondary prevention devices, as well as single/dual chamber ICDs and CRT-Ds (Table 4). However, the C-statistic values tended to be higher in the CRT-D subgroups compared to the single/dual chamber ICDs. Additionally, while there was a trend towards the C-statistic values being higher in the non-ischaemic cardiomyopathy (NICM) subgroup compared to those with ischaemic cardiomyopathy, the model calibration was generally better in those with ischaemic cardiomyopathy.

We compared the discriminative capacity of the four scoring systems and urea to predict 1- and 3-year mortality. The C-statistics for the Kramer and Bilchick scores were significantly higher than those for either the Parkash or Goldenberg scores at both time points (P < 0.05) (Table 5). However, neither had significantly better discriminative capacity than urea for the prediction of 1- or 3-year mortality (Table 5).

Table 5 Comparison of C-statistics of the Bilchick, Goldenberg, Kramer and Parkash scoring systems and urea for mortality at 1- and 3-years

3.4 Identifying patients at high risk of early mortality

Using the published cut-off values for each scoring system, the proportions of ICD recipients in the high-risk groups were 5.9, 34.7, 7.4 and 21.4% for the Bilchick, Goldenberg, Kramer and Parkash scoring systems, respectively (Table 6). For urea (cut-off of >9.28 mM), the proportion was 25.1%. One-year mortality in these five high-risk groups ranged from 11.7% (Goldenberg) to 40% (Kramer).

Table 6 Predictive accuracy of the Bilchick, Goldenberg, Kramer and Parkash scoring systems and urea for the prediction of 1-year mortality post-ICD implantation in 406 new ICD implants

For each of the four scoring systems and urea, Kaplan-Meier survival analyses demonstrated significantly worse survival in high- compared to low-risk patients (P < 0.0001 for each analysis) (Fig. 1).

Fig. 1 a–d Kaplan-Meier survival curves following ICD implantation in different prognostic groups according to the (a) Bilchick, (b) Goldenberg and (c) Kramer scoring systems and (d) urea

Overall, in our cohort, 1-year mortality was 8.1% (n = 33). Of these 33 patients, 8 (24.2%) received documented appropriate ICD therapy (ATP or shocks) and 2 (6.1%) received inappropriate ICD therapy prior to death. The risk marker that identified the largest proportion of these 33 patients was urea (n = 19, sensitivity 57.6%).

The sensitivity, specificity, PPV and NPV for each of the scoring systems and urea to predict 1-year mortality are shown in Table 6.

3.5 Relationship of scoring systems to device therapy

During follow-up, 106 (26.1%) patients experienced appropriate device therapy (36 ATP, 30 shock and 40 ATP followed by shock) and 38 (9.4%) patients experienced inappropriate therapy (18 ATP, 12 shock and 8 ATP followed by shock). In univariate Cox regression analyses, only the Kramer scoring system was associated with the occurrence of appropriate device therapy (hazard ratio 3.15, P = 0.003) (Table 7). None of the scoring systems was associated with the occurrence of inappropriate therapy.

Table 7 Univariate and multivariate analyses for appropriate and inappropriate ICD therapies by scoring system or urea level

4 Discussion

There are two main findings of this study. First, using published scoring systems, a significant proportion of current ICD recipients—between 6 and 35% in our cohort—are at high risk of early mortality following device implantation. Second, although all of the published scoring systems we evaluated predicted post-implant mortality, none significantly outperformed serum urea in terms of discrimination.

ICD therapy significantly improves survival in the low-LVEF patient population by the successful termination of ventricular arrhythmias that underlie preventable SCD. However, it has no impact on the risk of non-SCD. On the basis of results from multiple large randomised controlled trials (RCTs), ICD therapy is targeted at patients at highest SCD risk. However, its clinical effectiveness is critically dependent not only on the risk of SCD but also on the risk of non-SCD [5, 9].

Using a simplified version of the Seattle Heart Failure Model, the SCD-HeFT investigators created a risk prediction model to divide the 2487 study patients into quintiles of increasing predicted baseline mortality risk [9]. Although in the overall study cohort ICD therapy improved survival, patients in the highest risk quintile of predicted mortality did not benefit from a device (relative risk for all-cause mortality 0.98, P = 0.89). There were similar findings in an analysis of the 1232 patients enrolled in MADIT-II, where again patients at highest pre-implant mortality risk failed to gain benefit from their ICD despite mortality benefit in the total study population [5].

Furthermore, it is possible that in some cases, ICD therapy may actually increase the risk of non-SCD. The occurrence of ICD shock therapy has been associated with worsening heart failure status and an excess mortality [10]. In addition, unnecessary right ventricular pacing may also worsen LVEF and increase mortality.

Although it is clear that some potential ICD recipients may be too sick to gain meaningful benefit from ICD therapy, it is unclear how best to accurately and reproducibly identify these patients prior to device implantation. The current guidance is limited to suggesting that ICD therapy is not indicated in patients with advanced heart failure, defined as NYHA functional class IV, or in patients who do not have a reasonable expectation of survival with an acceptable functional status for at least a year [11]. However, no guidance is provided on how best to risk stratify patients in accordance with this recommendation, making clinical interpretation difficult. Moreover, NYHA class is a relatively inaccurate prognostic variable, whose classification is often subjective.

A variety of alternative strategies to identify potential ICD recipients with an elevated non-SCD risk have been proposed. While early studies evaluated the use of individual risk markers, such as renal function and age, recent investigators have developed more complex risk scores in an attempt to improve prediction [4–7]. These different approaches reflect the observation that in the low-LVEF population, the main contributor to non-SCD is pump failure, though non-cardiac mortality may also play an important role in patients with significant comorbidity.

Our finding that urea, a measure of renal function, is a powerful marker of increased mortality following ICD implantation is consistent with published data. In a meta-analysis of patient-level data from 2867 patients enrolled in three RCTs of prophylactic ICD therapy, Pun et al. found that benefit from ICD therapy was strongly related to renal function, with impaired renal function at implant associated with a decrease in survival benefit from a device [12]. These findings have been reproduced by other investigators. These results emphasize the uncertain benefit of ICD therapy in patients with renal dysfunction and question the use of ICDs in this patient population.

Our data suggest that despite consensus guidelines stating that patients at increased short-term mortality risk should not receive ICD therapy, many such patients are still implanted. Using the four published scoring systems in our cohort, 6 to 35% of implanted patients were identified as being at high risk of short-term mortality.

In our cohort, while all four scoring systems predicted survival, the Kramer and Bilchick scores had the best discriminative capacity, with C-statistics for both models consistently 0.7 or above at each of the two measured time points. Despite this, none of the scoring systems outperformed serum urea when evaluated using the area under the ROC curve. Furthermore, pre-implant serum urea, using the published cut-off of 9.28 mmol/L, identified the largest proportion (58%) of patients who died within 1 year of ICD implant.

When comparing predictive models, it is important to find a balance between mathematical accuracy and clinical applicability [13]. All of the proposed scoring systems used a minimum of four variables, with the Bilchick scoring system including seven variables and a nomogram to calculate the overall risk. In contrast, serum urea is universally available and simple to use.

Post hoc analysis of SCD-HeFT suggested a threshold of benefit may be present based on an annual mortality risk of 20–25%, with patients at greater annualised risk than this unlikely to benefit from an ICD [9]. Interestingly, 1-year mortality in the high-risk groups identified by the Kramer model (40%), the Bilchick model (29%) and urea (19%) was around or above this level. This supports the possibility that these models not only identify patients at high risk of mortality but also a group that may not benefit from ICD therapy.

The issue of identifying patients who fulfill current international guidelines but are unlikely to gain significant survival benefit from ICD therapy due to their high non-SCD risk is an important one. ICDs continue to be an expensive technology, and avoiding implanting patients who are unlikely to gain survival benefit is likely to improve clinical and cost-effectiveness. Furthermore, avoiding implanting unnecessary ICDs would prevent exposing patients with advanced cardiac and non-cardiac disease to the risks and potential complications of a high-energy device.

5 Limitations

Our study has several potential limitations. First, it is a single-centre retrospective analysis and at risk of the inherent bias of this type of study. Although we analysed patients’ electronic records in detail, it is possible that important clinical variables used in the scoring systems, such as the presence or absence of PAD, were not recorded adequately. This may have resulted in a miscalculation of patients’ individual risk scores and impacted on our results.

Second, although we included data on 406 patients with 96 deaths, our analysis is relatively small by the standards of previous studies in this area.

Third, in our analysis, we included an unselected population of primary and secondary prevention patients, patients with CRT and non-CRT ICDs and patients with both ischaemic cardiomyopathy and NICM. The rationale for this was that the issue of identifying patients too sick to benefit from ICD therapy is important in all potential ICD recipients, irrespective of their ICD indication or the aetiology of their cardiac disease. However, some of the prediction models we evaluated were developed in purely primary prevention populations, or patients with only single/dual chamber ICDs, which may reduce their accuracy when evaluated in a mixed population. In addition, the recent publication of the DANISH study may impact on the guidelines for prophylactic ICD implantation in the NICM population [14]. For this reason, we have performed subgroup analyses based on indication, type of device and aetiology of cardiac disease.

Furthermore, when making decisions regarding complex device therapy for individual patients, it is important to balance the potential benefits against the risks. The benefit from ICD therapy is likely to be influenced by the indication (primary vs. secondary prevention), as well as the concomitant use of CRT. Although the association between the risk factors/models and mortality was relatively consistent across the different patient groups (primary/secondary prevention and ICD/CRT-D recipients), the numbers in each group are relatively small, and these models should be used with caution in patients with secondary prevention indications or CRT devices.

Fourth, given the observational design of our study, it is not possible to establish cause of death, which may have influenced interpretation of our results. However, all of the scoring systems we evaluated were designed to predict all-cause mortality and in none of the studies was cause or mode of death given.

Lastly, it is an observational study, and therefore, it is not possible to say that patients who died did not have their life meaningfully prolonged by ICD therapy.

6 Conclusion

Using published scoring systems, a significant proportion of current ICD recipients are at high risk of short-term mortality following device implantation. Although all four published scoring systems we evaluated predicted early mortality following ICD implantation, none outperformed serum urea. We advocate the use of urea as a simple, clinically applicable risk marker to better identify patients at high risk of early mortality post-ICD implantation.