Introduction

Despite diagnostic and therapeutic improvements, postoperative mortality for infective endocarditis (IE) remains high [1,2,3]. Cardiovascular surgery (CS) is indicated in nearly half of IE patients [4, 5]. A timely operation to restore hemodynamic function, to eliminate embolic risk, and/or to control infection contributes to survival [6, 7]. Although indications for surgery might be clear from a structural or infective standpoint, in daily clinical practice the decision to offer CS is often challenging. Nearly a quarter of patients with a surgical indication do not receive CS [1, 8]. This reality may be a consequence of perceived futility or of denial of operative management on the basis of a high estimated risk score.

The use of risk scores for endocarditis aims at an objective measurement of the mortality risk inherent to the disease and may assist in benchmarking healthcare systems. Specific recommendations for clinical practice and individual decision-making based on available risk scores remain to be further developed in current ESC guidelines for IE [7]. With the intention of predicting mortality, multiple IE-specific risk scores have been developed and adopted [5, 9,10,11,12,13,14,15]. Some have also been shown to predict in-hospital mortality even in patients treated without CS [16]. One significant drawback of these scores is their limited performance in external cohorts [2, 17,18,19,20]. At an individual patient level, the indication for CS in the face of extreme operative risk and marginal chances of meaningful survival is controversial.

We aim to explore the performance of multiple operative mortality risk scores in our most contemporary surgical experience of active left-sided IE. In addition, we examine the outcomes at high-risk thresholds to explore the hypothetical loss of life if arbitrary cut-offs for operability had been used.

Materials and methods

Patients

All consecutive patients admitted to our institution from 1 May 2014 to 31 August 2019 with a diagnosis of definite acute left-sided IE who underwent CS were reviewed. All patients with IE were discussed prospectively, and the indications to pursue CS were agreed within the Hospital Clínic Endocarditis Team as described elsewhere [21]. Patients’ baseline characteristics and intraoperative, postoperative, and follow-up data were collected from the departmental database, outpatient clinic visits, telephone interviews, and referring physicians’ notes.

Definitions

The diagnosis of definite IE was made according to the modified Duke criteria in all cases [22]. Results were reported following the Guidelines for Reporting Mortality and Morbidity After Cardiac Valve Interventions by Akins [23]. CS was considered urgent when it was performed within the first 7 days following hospital admission and emergent when performed in the first 24 h [7]. Definitions related to the IE pathology process and outcomes follow those already published by the International Collaboration on Endocarditis [14].

IE-CS mortality risk scores’ performance and QoL assessment

The estimated mortality risk was calculated at the time of surgery for each patient using the scores most widely used in CS to predict mortality (STS-risk score and EuroScore I and II) [24,25,26,27]. Moreover, we calculated the predicted mortality risk using specific IE-CS risk scores, which include several items known to affect patient outcome, such as prosthetic valve IE or the existence of a paravalvular complication. These risk scores included PALSUSE [5], Risk-E [11], Costa [9], De Feo-Cotrufo [10], AEPEI [13], the modified STS-IE [28], APORTEI [12], and ICE-PCS scores [14]. Composite morbimortality risk was also assessed by the STS-IE score [28]. Scores were calculated using the same definitions of variables as stated in their original articles [5, 9,10,11,12,13,14, 24,25,26,27,28].

We performed a cross-validation study on the basis of different preoperative mortality risk thresholds. We used arbitrary cut-off points for mortality risk of 45%, 60%, and 70% for each score to assess the impact of operability at these incremental risks. We excluded those scores for which fewer than 5 patients had a risk over 45% (STS, STS-IE). To better understand the risk prediction, we calculated the positive predictive value (PPV) with its 95% confidence interval (95% CI) for EuroScore II, STS-risk, Risk-E, and ICE-PCS scores. We selected these scores due to their performance in our cohort and their general applicability in CS. PPVs were calculated at cut-off values indicating either the maximal sensitivity or specificity.
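As an illustration of the threshold analysis described above, the following minimal Python sketch tabulates, for a given cut-off, how many operated patients exceed it, how many of them died, and the resulting PPV with a 95% CI. The Wilson score interval is an assumption (the method used to obtain the CI is not specified here), the function names are hypothetical, and the data are invented for demonstration only; they are not patient data from this cohort.

```python
import math


def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a proportion events/n (assumed CI method)."""
    p = events / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def summarize_cutoff(predicted_risk, died, cutoff):
    """Describe patients whose predicted risk (%) exceeds the cut-off.

    PPV here is the proportion of flagged patients who actually died.
    """
    flagged = [d for r, d in zip(predicted_risk, died) if r > cutoff]
    n = len(flagged)
    deaths = sum(flagged)
    if n == 0:
        # No patient above the cut-off: PPV is undefined.
        return {"n": 0, "deaths": 0, "ppv": None, "ci": None}
    return {"n": n, "deaths": deaths, "ppv": deaths / n,
            "ci": wilson_interval(deaths, n)}


# Invented example: predicted risk (%) and in-hospital death (1 = died).
predicted_risk = [12, 30, 46, 48, 50, 55, 62, 65, 71, 75, 80, 90]
died = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1]

for cutoff in (45, 60, 70):
    print(cutoff, summarize_cutoff(predicted_risk, died, cutoff))
```

In this invented example, most patients above each cut-off survive, so the PPV stays low and its interval is wide. The same pattern appears whenever few deaths occur among the flagged patients.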

In addition to the perioperative outcomes, we also analyzed the characteristics of incremental subgroups of high-risk patients defined by the Risk-E score. Using the 36-Item Short Form Survey (SF-36), we evaluated different quality of life (QoL) parameters at latest follow-up for patients with higher estimates of mortality by the Risk-E score (> 45%). This particular score was selected because it was the one that performed best in our cohort.

Statistical analysis

Variables are expressed as the median and interquartile range [IQR], or as proportions, as appropriate. In-hospital mortality and survival rates were assessed at 30 days, 6 months, and 1 year. Discrimination of the risk scores was studied by constructing the receiver operating characteristic (ROC) curve for each risk score and calculating its corresponding area under the curve (AUC). The thresholds selected for each score from the ROC curves were those maximizing the sum of sensitivity and specificity. Calibration for each score was assessed using the Hosmer–Lemeshow goodness-of-fit test. The statistical analysis was performed using STATA statistical software version 14.1 (Stata Corp., College Station, TX).
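The discrimination analysis can be sketched as follows. This is a minimal illustration in Python, not the STATA code used for the study: the AUC is computed through its rank interpretation (the probability that a randomly chosen non-survivor has a higher score than a randomly chosen survivor, ties counting one half), and the threshold is the cut-off that maximizes sensitivity plus specificity. The function names and the example values are hypothetical.

```python
def auc(scores, died):
    """Empirical AUC via pairwise comparison of non-survivors and survivors."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def roc_points(scores, died):
    """(threshold, sensitivity, specificity) for each distinct score value.

    A patient is predicted to die when score >= threshold.
    """
    n_pos = sum(1 for d in died if d)
    n_neg = len(died) - n_pos
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, d in zip(scores, died) if d and s >= t)
        tn = sum(1 for s, d in zip(scores, died) if not d and s < t)
        points.append((t, tp / n_pos, tn / n_neg))
    return points


def best_threshold(scores, died):
    """Cut-off maximizing sensitivity + specificity."""
    return max(roc_points(scores, died), key=lambda r: r[1] + r[2])[0]


# Invented example: predicted risk (%) and in-hospital death (1 = died).
scores = [5, 10, 20, 30, 40, 50, 60, 80]
died = [0, 0, 0, 1, 0, 0, 1, 1]

print(round(auc(scores, died), 3))   # 0.867
print(best_threshold(scores, died))  # 60
```

With very few deaths, as in this toy example, a single additional event can shift both the AUC and the selected threshold considerably, which is one reason to report confidence intervals alongside the AUC.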

Results

A total of 235 patients with acute left-sided infective endocarditis were evaluated at our institution. Data on the patients who did not undergo CS have been published elsewhere [8]. A total of 142 patients (60%) underwent surgery and 2 patients (1.4%) were lost to follow-up.

Characteristics of the operated IE cases in the cohort (Table 1)

Table 1 Baseline characteristics, microbiology, and perioperative and postoperative variables of 142 patients undergoing cardiovascular surgery for infective endocarditis from 2014 to 2019

Male gender predominated (76.8%) and median age was 64 years (IQR 30–82). There were 95 cases of native valve IE (66.9%) and 47 cases of prosthetic IE (33.1%). Perivalvular abscess, fistulas, or perforation occurred in 100 cases (70.4%). Twenty cases (14.1%) suffered documented preoperative emboli in the central nervous system and 33 cases (26.2%) presented with cardiogenic shock.

The causative microorganism was identified in 97.9% of patients. The most prevalent causative agents were viridans group streptococci (VGS), followed by coagulase-negative staphylococci (CoNS) and Staphylococcus aureus. These results show a trend towards a higher proportion of CoNS and a lower frequency of S. aureus as causative microorganisms, in line with the most recent literature [29, 30].

Postoperative outcomes

In-hospital and 30-day mortality was 5.6% (eight patients). The causes of death and the profiles of the deceased patients are reported in Supplementary Table 1. Six-month and 1-year survival rates were 93.7% and 90.8%, respectively. No relapses of the index infection were observed at follow-up. Median follow-up was 33.1 months (IQR 18.6–49.9).

Risk score data and performance in our cohort of patients

The mean, range, standard deviation, and CI for each risk score model are summarized in Table 2. The ROC curve and its AUC for the risk scores are shown in Fig. 1. The majority of IE scores showed a high estimated risk of mortality in our cohort, between 15 and 30%. The risk scores with lower estimated mortality were STS (5.74–10.18%) and De Feo-Cotrufo score (9.2–27.3%). Meanwhile, the risk scores with higher predicted mortality were EuroScore I, Costa, and APORTEI. The risk score with the best performance was Risk-E (AUC = 0.89, 95% CI 0.823–0.970) and the one with the worst performance was the STS-IE risk score (AUC = 0.61, 95% CI 0.370–0.846). When tested in our cohort, all scores appeared to show adequate calibration (Supplementary Table 2). The number of operated patients whose risk exceeded arbitrary cut-off points for every score is summarized in Table 3. Survival for the highest cut-off values remained ≥ 79%. Similarly, the PPV and the 95% CI of the scores studied are shown in Table 4. As the risk score increased, the PPV also increased, indicating better discrimination. However, the calculated PPVs were consistently low across scores.

Table 2 Performance of risk scores for in-hospital mortality and composite outcomes after surgery for infective endocarditis
Fig. 1 ROC curves and AUC for the scores analyzed. The risk scores with the best performance were the Risk-E score (AUC = 0.89), ICE-PCS score (AUC = 0.87), APORTEI score (AUC = 0.87), and PALSUSE score (AUC = 0.87). The worst performance was found for the STS-IE risk score (AUC = 0.61)

Table 3 Cross-validation of risk scores with hypothetical cut-off values and outcomes of operative management
Table 4 Positive predictive values and 95% CI of EuroScore II, STS-risk score, Risk-E score, and ICE-PCS score. Score’s cut-off values were defined according to the maximum possible specificity or sensitivity

Profile, 1-year survival, and QoL of high-risk patients according to the Risk-E score

We found increasing proportions of S. aureus etiology, cardiogenic shock, thrombocytopenia, and acute renal failure as the predicted risk rose. Patient characteristics for each group at incremental risks are detailed in Supplementary Table 3. In-hospital mortality slightly increased as the risk rose but remained low compared to estimated mortality. In the highest risk group (> 70%), only one patient died in the first year after surgery. Further analysis of QoL subcomponents for patients with Risk-E > 45% is shown in Supplementary Table 4. Follow-up data were obtained at a median of 37.0 months after surgery (IQR 19.0–52.7). On a global evaluation, this subgroup of patients scored slightly lower than the general population in the physical subcomponent and similarly in the mental subcomponent [31].

Discussion

We externally assessed the discriminative power of the currently available IE risk scores in a center with an Endocarditis Team active for more than 30 years. From a statistical point of view, the general discrimination of most risk scores in our cohort appears appropriate. However, the difference between predicted and observed mortality was striking from both a clinical and a statistical standpoint. Experienced, collegial, and multidisciplinary care by the Endocarditis Team may account for such outcomes. Prior data from the group of Botelho-Nevers et al. showed a significant reduction in 1-year mortality (18.5 to 8.2%, p = 0.008) after implementing a management-based approach through a standardized diagnostic and therapeutic protocol for patients with a diagnosis of IE [32]. Therefore, we hypothesize that the presence of an IE Team, as well as timely cardiac surgery, might be used as a correction factor for predicting mortality in future scoring systems.

We acknowledge that none of the available scores has influenced our IE practice. Conceptually, the ability to count on a perfect score when complex IE decision-making is required would prevent denial of surgery to a viable patient and limit unnecessary escalation. In our experience, mortality after surgery for endocarditis arises from brain bleeding following preoperative embolism, severely depressed systolic function not amenable to transplantation or mechanical circulatory support, poor tissue-suture anchorage, and patients’ advance directives. Important features impacting survival after cardiac surgery (e.g., non-reversible pulmonary hypertension, overestimated ventricular function with multiple valve regurgitation, calcified aorta, need for extended cardiac ischemic times, and coronary disease not amenable to revascularization) are not captured by scoring systems [33,34,35].

Understanding how risk scores have been created provides clues to the current study findings. Most of the scores include patients with right- and left-sided IE (De Feo score, PALSUSE score) and different stages of the disease, and merge patients with healed IE and patients undergoing medical treatment alone (De Feo score, PALSUSE score, AEPEI score, Costa score, ICE-PCS) or general cardiac patients (STS score, EuroScore I and II). Our cohort’s risk profile is higher than those that served to formulate the majority of scores and most of the reported surgical experiences [7, 36]. The reported mortality rates vary from 8.2% (STS-IE) [13] to 28.3% (Risk-E) [9], with De Feo-Cotrufo and STS-IE being the only ones with mortality below 10% [10, 28]. Wang and colleagues [18] concluded that EuroScore I overestimated the risk in IE and failed to discriminate operative mortality. Varela et al. added that EuroScore II underestimated mortality in patients with low risk, most likely as a result of not capturing important inherent features of IE patients [2]. Other studies showed similar results, in which EuroScore II tended to underestimate mortality by 5–10% when predicted mortality was greater than 10% for IE patients [37]. A recent meta-model by Fernández-Félix et al., built with weighted IE-specific variables from individual scores, showed increased discriminatory power compared to the pre-existing scores [38].

The score proposed by De Feo-Cotrufo [10] is the only score included in current clinical guidelines [7]. It arises from patients operated on over a 30-year period and is limited to native valve IE. Wang also demonstrated that its applicability could be extended to prosthetic valve IE patients [18]. The Costa score [9] performed poorly in our cohort. We believe that differences in patient characteristics between the two cohorts (the Costa cohort was a Brazilian population with a mean age of 33.9 years and included patients in whom surgery was not performed) may be reasons for the observed results. The ICE-PCS score is the only one that predicts mortality at 6 months [14]. Similarly to other scores, it merges patients undergoing surgery (48.1%) with patients treated only medically. For patients not undergoing surgery, this score has demonstrated reasonable prediction of mortality [16]. The PALSUSE [5], Risk-E [11], and APORTEI [12] scores had the best performance in our cohort. Similar preoperative characteristics and the inclusion of IE-specific and critical preoperative variables may explain our findings.

Contemporary IE clinical guidelines clearly state the theoretical indications to pursue surgery [7]. However, it is acknowledged that in practice the final decision to offer surgery usually relies on the patient’s condition and risk profile. We previously reported our non-operated cohort and compared it with operated patients [8]. In our experience, the estimated risk profile of patients not undergoing surgery (despite an indication under contemporary guidelines) was significantly lower than that of operated patients (EuroScore II 9.4% vs 23.3%, p < 0.007). Over a similar time period, a total of 46 patients with acute left-sided IE and an indication for surgery did not undergo surgery, which accounted for 27% of patients with a formal guideline indication for surgery. We also reviewed the reasons for not pursuing surgery; these usually overlapped, but a high score was never a reason in itself. Thirty-day, 1-year, and 2-year mortality were 63%, 85%, and 90%, respectively. Recent data suggest that the implementation of several risk scores, including EuroScore II, might be useful in predicting mortality even in non-operated patients [39]. In our experience, EuroScore II seems to underestimate mortality in non-operated patients while overestimating mortality in surgical patients. Importantly, the reasons for not undergoing surgery in this cohort of non-operated patients usually include cerebral bleeding, end-stage cancer, unwillingness to undergo surgery, and end-stage liver disease, among other reasons that are not usually captured by prediction scores.

Our data indicate that, had a high numeric risk score been used as the sole criterion to deny surgery, preventable deaths might have occurred. The perfect score would be a tool with 100% specificity in death prediction, so that surgery would not even be undertaken. Anything below this capability will lead to questioning how much risk is reasonable to accept. Undeniably, at times, the borders of operability and futility remain arbitrary. As for any urgent cardiac intervention, the likelihood of survival and return to previous status relies on the extent of biological reserve and the acuity of organ dysfunction. We evaluated not only mortality but also QoL in this very high-risk subgroup to understand whether futile interventions were pursued. In those high-risk patients, data on survival beyond the acute phase, along with follow-up QoL, point to a beneficial and appropriate use of surgery.

Procedural reporting of mortality has been associated with surgical risk-averse behavior, as it poses a challenge to the surgeon at an individual level. At a departmental and institutional level, public reporting of outcomes and benchmarking through inaccurate risk scores threaten the best interest of patients. A charitable decision erring on the side of the patient is the logical course of action regardless of high estimates of death, if meaningful recovery with surgery is possible. Improved scores are still needed to guide resource allocation and the referral of surgical candidates to the best-performing IE teams.

Limitations

The first limitation is the single-center, observational, retrospective nature of the study, although the local Endocarditis Team prospectively evaluated all patients. Second, given that our institution acts as a referral center for endocarditis, the pattern of surgical candidates may be biased towards patients with different stages of the disease and more complex interventions that were at times denied elsewhere. However, it is possible that this also occurred at centers participating in the development of risk scores. Third, the low number of adverse outcomes challenges the analysis of the different risk scores. Finally, the QoL assessment occurred at variable time frames after CS. Thus, if QoL assessment is obtained years after the operation, unmodifiable natural events not related to the CS-IE (such as aging itself or progressive organ dysfunction) may have led to worse scores.

Conclusion

The observed mortality in our cohort is significantly lower than predicted by contemporary risk scores. Despite the reasonable numeric performance of the analyzed scores, their utility in judging the operability of a given patient remains questionable, as demonstrated in the cross-validation analysis. On the sole basis of a high-risk value, many patients would be denied a lifesaving operation with the potential to restore QoL. Individual assessment of risk by a specialized team might improve outcomes in this complex subset of patients. Future IE guidelines may recommend that denial of surgery should only follow evaluation by a highly experienced Endocarditis Team.