Preoperative renal dysfunction is a well-established risk factor for postoperative complications.1-4 Quantifying renal dysfunction is thus clinically important; consequently, preoperative serum creatinine is included in various risk stratification models, including the Revised Cardiac Risk Index and the Cleveland Clinic Risk Score.5-7 A limitation is that serum creatinine concentration is an imprecise measure of renal function. It is an especially poor estimate of mild renal dysfunction; furthermore, comparable creatinine concentrations may represent different degrees of renal impairment in patients of different age, sex, and ethnicity. Finally, the relationship between serum creatinine and glomerular filtration rate (GFR) – a clinically more important measure – is complex and depends on various factors, including sex, ethnicity, recent dietary intake, drug use, muscle mass, and extra-renal creatinine clearance.8

Direct measurement of GFR using exogenous markers is the gold standard but remains impractical in most clinical situations. In contrast, estimated GFR (eGFR) provides a reasonable compromise between accuracy and practicality.9,10 The two most popular methods of calculating eGFR are the Modification of Diet in Renal Disease (MDRD) Study equation and the Chronic Kidney Disease-Epidemiology (CKD-EPI) Collaboration equation. Both have been validated against measured GFR and offer acceptably precise assessment of renal function.9-11

In both cardiac3,12-14 and non-cardiac2,4 surgery settings, preoperative renal dysfunction is strongly associated with adverse outcomes. Most studies were conducted on the basis of serum creatinine,5,6,15,16 although some used eGFR.2-4,12-14 For example, Shavit et al. showed that even relatively small preoperative increases in serum creatinine or decreases in eGFR are associated with significant risk of mortality and morbidity.3 Specifically, mortality increased significantly with every 15 mL·min−1·1.73 m−2 decrease in preoperative eGFR.

A meta-analysis of studies assessing the impact of preoperative renal dysfunction on postoperative outcomes confirms that both an elevation in serum creatinine and a decrease in eGFR are associated with worse outcomes.1 Nevertheless, cardiac and vascular surgery patients were significantly overrepresented among the included studies. Cardiac and vascular operations are high-risk interventions, with the majority of patients exposed to cardiopulmonary bypass, large fluid shifts, and a high risk for hemodynamic perturbations. In contrast, the general surgical population is exposed to a wide spectrum of procedures with invasiveness ranging from low to high. Accordingly, the association of preoperative renal dysfunction with postoperative outcomes may differ between cardiac and non-cardiac surgery patients. Hence, it remains unknown whether use of eGFR enhances clinicians’ ability to predict adverse perioperative outcomes compared with serum creatinine alone in patients having non-cardiac surgery.

We therefore tested the hypothesis that preoperative eGFR, which was estimated using the CKD-EPI formula, predicts 30-day postoperative mortality better than preoperative creatinine in a large cohort of non-cardiac surgery patients.

Methods

Data for this retrospective investigation were obtained from the Perioperative Health Documentation System (IRB #8167), and the need for consent was waived by the Cleveland Clinic Institutional Review Board (April 2013). We considered adults who had inpatient non-cardiac surgery lasting at least one hour between January 1, 2006 and December 31, 2011 at the Cleveland Clinic Main Campus. For patients who had multiple operations, only the first surgery was included in our investigation. We included patients whose age, sex, race, and preoperative serum creatinine level within 30 days of the surgical procedure were available. We excluded patients on dialysis preoperatively and patients with a creatinine level > 10 mg·dL−1 or an eGFR > 180 or < 10 mL·min−1·1.73 m−2.

Basic demographic data were retrieved, including: age, sex, race, height, weight, type of surgical procedure, medical history (coronary artery disease, hypertension, diabetes mellitus, cerebrovascular disease, peripheral vascular disease, chronic obstructive pulmonary disease), and surgical class of the procedure. We also retrieved preoperative hemoglobin concentrations along with the Charlson Comorbidity Index.17

The intraoperative data included type of anesthesia (general vs regional), intraoperative fluid administration (crystalloids, colloids, blood products), intraoperative estimated blood loss (EBL), and the duration of the surgery. Preoperative eGFR was calculated using the CKD-EPI Collaboration equation10:

\(\text{eGFR} = 141 \times \min\left(\text{Scr}/\kappa,\ 1\right)^{\alpha} \times \max\left(\text{Scr}/\kappa,\ 1\right)^{-1.209} \times 0.993^{\text{Age}} \times 1.018\ [\text{if female}] \times 1.159\ [\text{if black}],\) where Scr is serum creatinine in mg·dL−1, κ is 0.7 for females and 0.9 for males, α is −0.329 for females and −0.411 for males, min indicates the minimum of (Scr/κ) or 1, and max indicates the maximum of (Scr/κ) or 1.
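As an illustration only, the equation above can be transcribed directly into a short function. This is a minimal sketch rather than the code used in the study; the function name `ckd_epi_egfr` and its parameter names are assumptions introduced here.

```python
def ckd_epi_egfr(scr_mg_dl: float, age_years: float, female: bool, black: bool) -> float:
    """Estimate GFR (mL/min/1.73 m^2) with the CKD-EPI creatinine equation as written above.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if scr_mg_dl <= 0:
        raise ValueError("serum creatinine must be positive")
    kappa = 0.7 if female else 0.9          # sex-specific creatinine scaling constant
    alpha = -0.329 if female else -0.411    # sex-specific exponent below the knot
    ratio = scr_mg_dl / kappa
    egfr = (
        141.0
        * min(ratio, 1.0) ** alpha          # applies when Scr/kappa is at most 1
        * max(ratio, 1.0) ** -1.209         # applies when Scr/kappa is above 1
        * 0.993 ** age_years
    )
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr


# Hypothetical patient: 60-year-old non-black woman with Scr 1.0 mg/dL
example_egfr = ckd_epi_egfr(1.0, 60, female=True, black=False)
```

Because only one of the min and max terms differs from 1 for any given patient, the function is continuous at Scr = κ and decreases monotonically as serum creatinine rises.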

Patients were divided into four blood loss groups based on the intraoperative EBL: blood loss minimal (0-50 mL), blood loss 50-500 mL, blood loss 500-1,500 mL, and blood loss > 1,500 mL. Separately, patients were dichotomized by need for intraoperative transfusion of red blood cells.

Preoperative eGFR was compared with preoperative serum creatinine (both assessed as continuous variables) on the ability to predict 30-day mortality, primarily in all patients and secondarily in each blood loss group and in males and females. The blood transfusion groups were used as a surrogate for invasiveness of the surgical procedure.

Baseline and intraoperative characteristics of the study population were described using standard summary statistics. For the primary outcome, 30-day mortality, we assessed the ability of preoperative serum creatinine vs preoperative eGFR to discriminate between patients who died and those who survived using the area under the receiver operating characteristic curve (i.e., c-statistic). The c-statistics for eGFR and creatinine were compared using the DeLong method of comparing dependent areas under the curve (AUCs).18 We assessed the calibration of all models, i.e., how well the model fits the data, using the Hosmer-Lemeshow goodness of fit test.
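For readers less familiar with the c-statistic, the sketch below computes it from its pairwise definition: the proportion of (death, survivor) pairs in which the patient who died has the higher risk score, with ties counted as one half. It is illustrative only; the function name and the toy data are assumptions, and it covers neither the DeLong comparison nor the Hosmer-Lemeshow test.

```python
def c_statistic(scores, died):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition.

    scores: risk scores where HIGHER means higher predicted risk of death.
    died:   matching sequence of truthy (death) / falsy (survivor) flags.
    """
    cases = [s for s, d in zip(scores, died) if d]
    controls = [s for s, d in zip(scores, died) if not d]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5   # ties contribute half a concordant pair
    return concordant / (len(cases) * len(controls))


# Toy data (hypothetical). Lower eGFR implies higher risk, so eGFR is negated
# before being used as a risk score.
toy_egfr = [35.0, 95.0, 60.0, 110.0, 80.0]
toy_died = [1, 0, 1, 0, 0]
toy_auc = c_statistic([-e for e in toy_egfr], toy_died)
```

The double loop is quadratic in the worst case; for cohorts of the size analyzed here, a rank-based formulation would be preferable in practice.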

For each of eGFR and serum creatinine, we searched for a best cut-point that might be used in clinical decision-making, and we decided a priori that a clinically useful cut-point would have sensitivity and specificity > 0.75 (corresponding to a positive likelihood ratio of ≥ 3 and a negative likelihood ratio of ≤ 0.33). With such a small incidence of the outcome (1.4% mortality), we did not expect the positive predictive value to be high – even with a sensitivity and specificity of 0.90, it would be only about 0.11.
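The arithmetic behind these thresholds can be checked with a few lines of code. The sketch below uses the standard definitions of likelihood ratios and positive predictive value; the function names are assumptions introduced for illustration.

```python
def likelihood_ratios(sensitivity: float, specificity: float):
    """Return (positive LR, negative LR) for a dichotomous test."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Probability of the outcome given a positive test (Bayes' rule)."""
    true_positive = sensitivity * prevalence
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)


# Threshold chosen a priori: sensitivity = specificity = 0.75
lr_pos, lr_neg = likelihood_ratios(0.75, 0.75)

# Even a test with sensitivity = specificity = 0.90 has a low PPV at 1.4% mortality
ppv_at_090 = positive_predictive_value(0.90, 0.90, 0.014)
```

With a sensitivity and specificity of 0.75, the likelihood ratios are 0.75/0.25 = 3 and 0.25/0.75 ≈ 0.33; with 0.90 and an incidence of 1.4%, the positive predictive value is 0.0126/(0.0126 + 0.0986) ≈ 0.11.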

We further assessed the discriminative ability of preoperative serum creatinine and preoperative eGFR within levels of blood loss for patients who received and did not receive transfusions, and for males and females. We again assessed calibration with Hosmer-Lemeshow goodness of fit tests.

In addition to the primary univariable analysis, we used multivariable logistic regression to assess the ability of each of eGFR and serum creatinine to improve discrimination of 30-day mortality (i.e., the c-statistic) beyond what is achieved using the baseline variables, i.e., sex, age, body mass index (BMI), and American Society of Anesthesiologists (ASA) physical status.

Finally, we calculated two complementary reclassification indices to assess the practical utility of using eGFR vs serum creatinine in predicting 30-day mortality – the net reclassification improvement (NRI) and the integrated discrimination improvement (IDI).19,20 The NRI index compares upward and downward reclassification of risk for a new marker (or prediction model) vs a standard model between patients having the event (here, 30-day mortality) and controls (here, alive at 30 days). A better predictor of the event would have more upward reclassification for patients with events than for those without events (controls). Specifically, the NRI index was calculated as twice the difference between cases and controls on the proportion of patients for whom the predicted risk was higher using eGFR than using serum creatinine minus the proportion for whom the predicted risk was lower using eGFR than using serum creatinine. A large NRI would indicate that eGFR improves risk categorization compared with serum creatinine. The corresponding formula is as follows:

$${\text{NRI}} = 2\left\{ {\left[ {\left. {{\text{P}}({\text{up}}) - {\text{P}}({\text{down}})} \right|{\text{event}}} \right] - \left[ {\left. {{\text{P}}({\text{up}}) - {\text{P}}({\text{down}})} \right|{\text{non - event}}} \right]} \right\}$$

where, for example, “P(up) | event” is the proportion of patients having the event who reclassified with higher (i.e., upward) risk with eGFR than with serum creatinine, and “P(down) | event” is the proportion of patients having the event who reclassified with lower (i.e., downward) risk with eGFR than with serum creatinine. Since the results are dependent on the size of the categories, we calculated the NRI with several methods of risk categorization: using the raw predictions for each patient and using categories of each 1%, 2%, and 4% of the predictions.
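A minimal sketch of this calculation follows, implementing the formula exactly as written above (including its factor of 2). The function name, the `bin_width` parameter, and the toy predictions are assumptions; with `bin_width=None` the raw predictions are compared, and otherwise predictions are grouped into categories of the given width (e.g., 0.01 for 1% categories).

```python
import math


def net_reclassification_improvement(risk_new, risk_old, event, bin_width=None):
    """NRI = 2 * {[P(up) - P(down) | event] - [P(up) - P(down) | non-event]}.

    risk_new / risk_old: predicted probabilities from the new and old marker.
    event: truthy for patients with the event (death), falsy for controls.
    Note: category boundaries are subject to floating-point rounding.
    """
    def category(p):
        return p if bin_width is None else math.floor(p / bin_width)

    net_moves = {True: 0, False: 0}   # (number moved up) - (number moved down)
    counts = {True: 0, False: 0}
    for p_new, p_old, had_event in zip(risk_new, risk_old, event):
        group = bool(had_event)
        counts[group] += 1
        if category(p_new) > category(p_old):
            net_moves[group] += 1
        elif category(p_new) < category(p_old):
            net_moves[group] -= 1
    if counts[True] == 0 or counts[False] == 0:
        raise ValueError("need at least one event and one non-event")
    return 2 * (net_moves[True] / counts[True] - net_moves[False] / counts[False])


# Toy predictions (hypothetical): two deaths followed by two survivors
toy_new = [0.05, 0.04, 0.01, 0.02]
toy_old = [0.03, 0.05, 0.02, 0.02]
toy_event = [1, 1, 0, 0]
toy_nri_raw = net_reclassification_improvement(toy_new, toy_old, toy_event)
toy_nri_4pct = net_reclassification_improvement(toy_new, toy_old, toy_event, bin_width=0.04)
```

Wider categories leave more patients in the same category under both markers, which is why the index generally shrinks as the category width grows.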

The IDI assesses the difference (typically, an increase) in the mean probability of an event using the new marker vs the old marker for cases (i.e., true events) minus the difference (typically, a decrease) in non-cases. Equivalently, it is the difference in discrimination slopes between the new and the old marker, where the discrimination slope is the absolute difference in the mean predicted probability of outcome between cases and controls for a marker. We also calculated the relative IDI, which is the ratio of discrimination slopes minus 1. The IDI can be considered a continuous version of the NRI.
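The two quantities defined above can be sketched as follows. This is an illustration under assumed names (`discrimination_slope`, `integrated_discrimination_improvement`) with hypothetical toy predictions, not the code used for the analysis.

```python
from statistics import mean


def discrimination_slope(risk, event):
    """Mean predicted probability in cases minus mean predicted probability in controls."""
    cases = [p for p, e in zip(risk, event) if e]
    controls = [p for p, e in zip(risk, event) if not e]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")
    return mean(cases) - mean(controls)


def integrated_discrimination_improvement(risk_new, risk_old, event):
    """Return (IDI, relative IDI) for a new marker vs an old marker."""
    slope_new = discrimination_slope(risk_new, event)
    slope_old = discrimination_slope(risk_old, event)
    idi = slope_new - slope_old                # difference in discrimination slopes
    relative_idi = slope_new / slope_old - 1   # ratio of slopes minus 1
    return idi, relative_idi


# Toy predictions (hypothetical): two deaths followed by two survivors
toy_idi, toy_relative_idi = integrated_discrimination_improvement(
    [0.06, 0.04, 0.01, 0.01],   # new marker
    [0.03, 0.03, 0.01, 0.01],   # old marker
    [1, 1, 0, 0],
)
```

In the toy data the new marker's slope is 0.05 − 0.01 = 0.04 and the old marker's is 0.03 − 0.01 = 0.02, giving an IDI of 0.02 and a relative IDI of 1.0. The relative IDI is undefined when the old marker's slope is zero.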

The significance level was 0.05 for each hypothesis. Bonferroni correction was used to adjust for comparisons within subgroups. SAS® 9.3 statistical software (Cary, NC, USA) was used for all analyses.

Results

We considered 93,012 patients for our investigation; however, 30-day mortality could not be determined in 124 cases, leaving 92,888 patients for analysis. Table 1 summarizes baseline patient demographics, intraoperative factors, and in-hospital (1%) and 30-day (1.4%) mortality.

Table 1 Descriptive summary of patient population

Estimated GFR (using the CKD-EPI formula) was a better discriminator of 30-day mortality than serum creatinine, with an estimated c-statistic (95% confidence interval [CI]) of 0.67 (0.65 to 0.68) for eGFR vs 0.61 (0.59 to 0.63) for serum creatinine (P < 0.001) (Table 2 and Figure). For comparison, the c-statistic (95% CI) using the MDRD formula for eGFR was only 0.62 (0.60 to 0.64). Neither single-predictor model had adequate calibration (Hosmer-Lemeshow goodness of fit, P < 0.001 for each variable).

Figure

Comparative receiver operating characteristic curve plots of eGFR and creatinine as predictors of 30-day mortality. Area under the curve (AUC), or c-statistic, was significantly higher for eGFR, indicating better discrimination of 30-day vital status compared with serum creatinine alone (P < 0.001, DeLong method). Diagonal line indicates an AUC of 0.5 corresponding to chance discrimination, to which the eGFR and creatinine AUCs are compared. eGFR = estimated glomerular filtration rate

Table 2 Comparing eGFR and creatinine on discrimination of 30-day vital status

Furthermore, eGFR was consistently a better discriminator of 30-day mortality across blood loss, transfusion, and sex groups (Table 2). Univariably, the estimated odds ratio (95% CI) for 30-day mortality was 1.30 (1.27 to 1.32) for a ten-unit decrease in eGFR and 2.3 (2.1 to 2.5) for a doubling of creatinine (i.e., for an increase of 1 on the log-transformed base 2 scale).

No particular cut-off values of either eGFR or serum creatinine predicting 30-day mortality had sufficient diagnostic accuracy to be recommended for clinical decision-making. For example, a value of 76 for eGFR achieved estimated sensitivity and specificity of 0.66, negative predictive value (NPV) of 0.66, and positive predictive value (PPV) of 0.03. The positive likelihood ratio (probability of a positive test for diseased vs non-diseased) was 1.9, while the negative likelihood ratio (probability of a negative test for diseased vs non-diseased) was 0.52. A value of 0.94 mg·dL−1 for serum creatinine achieved an estimated sensitivity and specificity of 0.62, NPV of 0.62, and PPV of 0.02. The positive likelihood ratio was 1.6, while the negative likelihood ratio was 0.61.

Reclassification analyses also suggested that eGFR is a better predictor than serum creatinine for individual patients. The net reclassification index was significantly greater than zero for all categorizations attempted, indicating improved prediction using eGFR compared with serum creatinine. Table 3 outlines the net reclassification index results, including the proportion of patients who were reclassified upwards and downwards for both cases and controls when using eGFR vs serum creatinine, as well as a sample calculation of the NRI. As expected, the NRI was highest when no categorizations were used (0.53; 95% CI 0.44 to 0.58). In that analysis, any increase (and separately, decrease) in the predicted probability of 30-day mortality, when using eGFR compared with serum creatinine, was considered reclassification. Using categories as large as each 4% for the predicted probabilities still showed a modest improvement using eGFR vs serum creatinine.

Table 3 Net reclassification improvement

The estimated IDI index was also significantly greater than zero, indicating slightly higher on-average predictions for true events using eGFR vs using serum creatinine (Table 4). Although the estimated IDI is quite small (0.0037), the relative IDI of 0.567 indicates somewhat better prediction using eGFR vs serum creatinine. Due to the low incidence of mortality in the study, the mean probability of events was very small for both deaths (eGFR 0.024 vs serum creatinine 0.021) and non-deaths (eGFR 0.014 vs serum creatinine 0.014).

Table 4 Integrated discrimination improvement

A multivariable combination of baseline characteristics ASA physical status, age, and BMI (all P < 0.001) discriminated 30-day mortality with a c-statistic of 0.850. Adding eGFR to the model improved the c-statistic to only 0.851 (multivariable model, Table 5), while separately adding serum creatinine did not change the c-statistic. Nevertheless, in their separate multivariable models, both predictors were still significantly associated with 30-day mortality, with estimated odds ratio (95% CI) of 1.06 (1.04 to 1.09) for a ten-unit decrease in eGFR and 1.22 (1.11 to 1.34) for a doubling of creatinine.

Table 5 Multivariable model* for association with 30-day mortality

In exploratory analyses, we found that the relationship of both log-transformed serum creatinine and eGFR with mortality was somewhat nonlinear and that calibration could be improved by adding quadratic and cubic forms of the predictor to each model. For eGFR, this did not affect the c-statistic, but for serum creatinine, the c-statistic improved to 0.67 (the same as eGFR). Reclassification was still marginally better with eGFR compared with serum creatinine in the more complex model. Nevertheless, since, in practice, clinicians would be more likely to use the raw serum creatinine or eGFR values and not a prediction with linear, quadratic, and cubic forms of the variables, we opted to report the simpler prediction model for our main results.

Discussion

Creatinine-based eGFR equations, which aim to circumvent some of the limitations of serum creatinine, are well validated.10,11 Various equations have been proposed, but we chose the CKD-EPI Collaboration equation because it offers high precision, especially at higher GFR values.10,11 Precision at a higher GFR is important when the equation is applied to overall surgical populations in which renal function is generally good. Stevens et al.11 compared precision of MDRD and CKD-EPI formulas across multiple ranges of GFR and found that the CKD-EPI equation can be used for an accurate estimate of GFR in patients with measured GFR > 60 mL·min−1·1.73 m−2. On the other hand, the MDRD equation loses its accuracy if GFR is > 60 mL·min−1·1.73 m−2. For that very reason, when eGFR exceeds 60 mL·min−1·1.73 m−2, MDRD is reported as “eGFR > 60 mL·min−1·1.73 m−2” rather than as a discrete value. Therefore, mild renal dysfunction (in the range of 60 mL·min−1·1.73 m−2 to normal value) cannot be quantified with the MDRD equation; in practice, a patient with normal GFR becomes indistinguishable from a patient with GFR just above 60 mL·min−1·1.73 m−2 (clearly with some renal dysfunction), which would be a gross oversimplification. Thus, replacing serum creatinine with the CKD-EPI (not MDRD) estimate of GFR improves detection of mild preoperative renal dysfunction, which otherwise often goes unnoticed but has been associated with worse outcomes.1,3,4

Our primary result is that eGFR is indeed a better discriminator of 30-day mortality than serum creatinine in a large cohort of patients having a wide variety of non-cardiac operations. The magnitude of the improvement, while highly statistically significant, was relatively small (c-statistic of 0.67 vs 0.61, respectively). Nonetheless, this represents an important improvement in discrimination, as eGFR is often reported by clinical laboratories and is otherwise easy to calculate. Although we did not find specific cut-points in eGFR or serum creatinine with adequate sensitivity and specificity to be reliable predictors of 30-day mortality, eGFR was clearly a better discriminator of mortality when used as a continuous variable.

Besides addressing discrimination and calibration, which are more global measures of prediction, we reported two reclassification indices to assess how well eGFR performs compared with serum creatinine in predicting 30-day mortality for individual patients. Our results indicate that using eGFR would more appropriately predict 30-day mortality compared with serum creatinine, i.e., patients who would eventually die by 30 days would be more likely to have a higher prediction of mortality using eGFR, adjusting for those who were wrongly reclassified. The NRI has possible limitations: it is sensitive to the calibration of the model; it may be significantly different from zero for very subtle improvements in prediction, and it is dependent on the number of reclassification categories used.21,22 Nevertheless, in our study, the NRI showed a clear benefit of eGFR over serum creatinine – eGFR, on average, placed patients who would eventually die by 30 days into a higher risk category than would serum creatinine, independent of the number of risk categories used (i.e., up to 34% of deaths correctly reclassified).

The benefit of using eGFR rather than creatinine alone extended across blood loss and transfusion groups – both of which can be considered surrogates for the invasiveness of the surgical procedure – as well as across sex groups. We did separate analyses for males and females because, as opposed to the CKD-EPI formula, serum creatinine does not account for differences between sexes. Estimated GFR was thus a modestly superior predictor of mortality in each circumstance we considered. Our results strongly suggest that eGFR, as calculated using the CKD-EPI formula, should be used in preference to serum creatinine. For the sake of interest, we also calculated GFR using the MDRD formula and observed a c-statistic (standard error) of only 0.62 (0.01), indicating similar discrimination of 30-day mortality compared with creatinine alone.

As shown in many previous studies, even slight impairments in renal function were associated with substantial increases in 30-day mortality and/or cardiac events – whether renal function was evaluated on the basis of serum creatinine or eGFR.1,23 Nevertheless, postoperative mortality is affected by multiple perioperative factors, with renal function being just one of them. It was therefore unlikely that preoperative eGFR alone would be a strong predictor of 30-day mortality in surgical patients. As one might thus expect, a multivariable combination of ASA physical status, age, and BMI predicted 30-day mortality far better than just eGFR (c-statistic of 0.85 compared with 0.67, respectively).

In summary, eGFR using the CKD-EPI formula is a modestly better predictor of 30-day mortality than serum creatinine in patients having non-cardiac inpatient surgery. Our study was based on a large and diverse non-cardiac surgical population. Furthermore, eGFR proved modestly superior to serum creatinine alone in various subgroups; our conclusions about the relative merits of eGFR and creatinine thus appear to be broadly applicable. Given that eGFR is often reported by clinical laboratories and is otherwise easy to calculate, it should generally be used in preference to serum creatinine alone.