Introduction

In recent years, advances in anesthesia-related fields have made surgery applicable to a wider range of diseases and patients, and the annual number of operations performed is increasing globally [1]. In terms of patient safety and medical economics, an important issue is how to reduce the incidence of perioperative complications and mortality. At least half of postoperative complications can be prevented, and improvements in anesthesia-associated factors contribute greatly to their prevention [2–4]. Thus, many assessment methods for estimating the incidence of postoperative complications and postoperative mortality have been proposed. Among them, the acute physiology and chronic health evaluation (APACHE), the physiological and operative severity score for the enumeration of mortality and morbidity (POSSUM), and others have been reported to be highly useful, and many revised versions with improved accuracy have been reported [5–7]. However, these methods were designed for use in intensive care, and their large number of required test items and complex calculation procedures have been problematic. They are therefore unsuitable for immediate calculation of scores after surgery, identification of patients at high risk, and determination of intensive care unit (ICU) admissions, and they have not been widely adopted for predicting postoperative outcomes. Conventionally, the American Society of Anesthesiologists physical status classification (ASA-PS) is well known for its simplicity. However, this classification system depends largely on the subjective judgment of evaluators, and its categories are broad [8]. In addition, ASA-PS scores are determined solely from preoperative patient status, without consideration of surgical invasiveness or other intraoperative factors.
For these reasons, although the ASA-PS is simple and useful for assessing preoperative physical status, it has been regarded as insufficient for predicting postoperative outcomes. Against this background, in 2007 Gawande et al. proposed the surgical Apgar score (sAs) (Table 1), named after the obstetric Apgar score [9]. This scoring system, in which scores are calculated from only 3 intraoperative factors (lowest intraoperative heart rate, lowest mean intraoperative blood pressure, and volume of intraoperative blood loss), attracted attention for its simplicity. It has subsequently been shown to be highly useful for predicting the incidence of postoperative complications and postoperative mortality in many surgical specialties beyond general and vascular surgery, for which it was originally developed [10]. However, in contrast to the ASA-PS, the sAs is based mainly on intraoperative patient status and does not directly incorporate an assessment of preoperative patient status. There is still no easy and highly useful method for comprehensively assessing both preoperative and intraoperative patient statuses to predict postoperative outcomes.

Table 1 Surgical Apgar score used in this study

In this study, we attempted to develop a new scoring system that would enable a comprehensive assessment of preoperative and intraoperative patient statuses instantly following entry of data into an electronic anesthesia chart, accurately and automatically predicting postoperative mortality. The usefulness of our new scoring system was compared and analyzed with that of the sAs and ASA-PS, which are the components of our new system.

Materials and methods

Subjects

Ethical approval for this study (Approval number 2521) was provided by the Ethical Committee of Tokyo Women’s Medical University, Tokyo, Japan (Chairman Prof S. Miyazaki) on 25 June 2012. In addition, this study was registered with the University Hospital Medical Information Network Clinical Trial Registry (UMIN-CTR) (unique trial number: UMIN000016990). The study included 32,555 patients who underwent surgery under general or regional anesthesia at Tokyo Women’s Medical University Hospital between February 1, 2008, and February 29, 2012.

Exclusion criteria

The following patients were excluded: those aged 16 years or younger, those undergoing cardiovascular surgery, those receiving electroconvulsive therapy, those undergoing magnetic resonance imaging-guided brain surgery, those receiving anesthesia management outside of an operating room, and those in whom no anesthesiologist was involved in anesthesia management.

Protocol

In all patients, factors presumably associated with surgical outcomes, including patient characteristics and ASA-PS scores, were extracted from the Anesthesia Information Management System (AIMS) (MetaVision: FUKUDA DENSHI, Tokyo, Japan). In addition, the lowest heart rate, lowest mean arterial pressure, and estimated volume of blood loss were extracted to calculate the sAs (Table 1). Among the extracted intraoperative biological data, all data showing a lowest heart rate of 40 bpm or lower and a lowest mean arterial pressure of 40 mmHg or lower were individually checked against the corresponding anesthesia chart to determine whether they were outliers due to artifacts. These data were manually corrected and entered. Whether each patient had died within 30 days of surgery was determined from the patients’ medical records. The sAs and ASA-PS scores were calculated from the data extracted from the AIMS. We developed a new scoring system (the SASA) by combining the surgical Apgar score (sAs) with the American Society of Anesthesiologists physical status classification (ASA-PS), using the following equation:

$$\text{SASA} = \text{sAs} + \left( 6 - \text{ASA-PS} \right) \times 2$$

The ASA-PS score increases with severity, whereas the sAs decreases with severity. The ASA-PS is rated on a 5-point scale (apart from ASA-PS VI, which is assigned to patients declared brain dead), whereas the sAs is rated on a 10-point scale. In the equation above, the ASA-PS score is subtracted from 6 so that it runs in the same direction as the sAs, with lower values indicating greater severity, and the result is multiplied by 2 to equalize the point scales of the two scores before they are summed. We excluded ASA-PS VI from this equation because mortality in these patients is 100%. As the primary endpoint of this study, the sAs, ASA-PS, and SASA were compared and analyzed to determine whether the scores were associated with postoperative 30-day mortality. The association of other factors, including patient characteristics, with postoperative 30-day mortality was analyzed as a secondary endpoint of the study.
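As an illustration of how the SASA equation could be evaluated automatically once chart data are entered, the following is a minimal sketch and not the implementation used in this study. The function name `sasa_score` and its input checks are hypothetical; the sketch assumes that the sAs takes integer values from 0 to 10 and that the ASA-PS is supplied as an integer from 1 (class I) to 5 (class V), with class VI rejected as described above.

```python
def sasa_score(sas: int, asa_ps: int) -> int:
    """Combine the surgical Apgar score (sAs) and the ASA-PS into the SASA.

    Implements SASA = sAs + (6 - ASA-PS) * 2.
    Assumptions of this sketch (not stated as code in the paper):
    sAs is an integer from 0 to 10, and ASA-PS is an integer from 1 to 5.
    """
    if not 0 <= sas <= 10:
        raise ValueError("sAs is assumed to lie between 0 and 10")
    if not 1 <= asa_ps <= 5:
        # ASA-PS VI (brain-dead patients) is excluded from the equation.
        raise ValueError("ASA-PS must be between 1 and 5 (class VI is excluded)")
    return sas + (6 - asa_ps) * 2


if __name__ == "__main__":
    # Lowest-risk combination: sAs 10 with ASA-PS I.
    print(sasa_score(10, 1))
    # Highest-risk combination: sAs 0 with ASA-PS V.
    print(sasa_score(0, 5))
```

Under these assumptions the ASA-PS term contributes between 2 and 10 points, so the SASA ranges from 2 (most severe) to 20 (least severe), and each term carries a comparable weight.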

The original identification (ID) numbers assigned to the AIMS data and patient medical records used for each analysis were converted according to certain rules so that the modified ID numbers could not lead to identification of individual patients or provide access to the original data.

Statistical analysis

Data are presented as means and standard deviations, medians with interquartile range, or frequencies. One-way analysis of variance (ANOVA) was used to compare normally distributed continuous variables among groups and the Kruskal–Wallis H test was used for skewed continuous or ordinal discrete variables. The chi-square test was used to compare nominal variables. To evaluate the impact of sAs and ASA-PS on 30-day mortality, univariate and multivariate logistic regression models were used. The interaction and multicollinearity in the model were assessed using regression diagnostic analysis. To compare the diagnostic performance of the three scores, SASA, sAs, and ASA-PS, receiver operating characteristic (ROC) curves were used. Two-tailed P values of less than 0.05 were considered statistically significant. Analyses were performed with the SAS system ver. 9.3 (SAS Institute, Cary, NC) at an independent biostatistics and data center (STATZ Institute, Inc., Tokyo, Japan).
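To make the ROC comparison concrete, the sketch below computes the area under the ROC curve as the probability that a randomly chosen non-survivor has a lower score than a randomly chosen survivor, with ties counted as one half. This is an illustration only: the analyses in this study were performed with the SAS system, the function name `auc_lower_is_worse` is hypothetical, and the example data are invented toy values rather than study data.

```python
from typing import Sequence


def auc_lower_is_worse(scores: Sequence[float], died: Sequence[bool]) -> float:
    """Area under the ROC curve for a score where LOWER values mean higher risk.

    Computed by pairwise comparison (the Mann-Whitney formulation):
    the fraction of (non-survivor, survivor) pairs in which the
    non-survivor has the lower score, counting ties as 0.5.
    """
    if len(scores) != len(died):
        raise ValueError("scores and died must have the same length")
    dead = [s for s, d in zip(scores, died) if d]
    alive = [s for s, d in zip(scores, died) if not d]
    if not dead or not alive:
        raise ValueError("both outcome groups must be non-empty")
    wins = 0.0
    for p in dead:
        for n in alive:
            if p < n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(dead) * len(alive))


if __name__ == "__main__":
    # Invented toy SASA values and 30-day outcomes, for illustration only.
    toy_scores = [4, 12, 12, 18]
    toy_died = [True, False, True, False]
    print(auc_lower_is_worse(toy_scores, toy_died))
```

The function applies directly to the sAs and SASA, for which lower scores indicate greater severity. For the ASA-PS, where higher classes indicate greater severity, the scores can be negated before being passed in. The pairwise loop is quadratic in the group sizes, which is acceptable for a sketch but would normally be replaced by a rank-based computation for large cohorts.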

Results

Of the total 32,555 patients, 2808 with incomplete data and 5429 meeting the exclusion criteria were excluded. The remaining 24,318 patients were analyzed. Patients underwent surgery in the following specialties: gastroenterological surgery, urology, obstetrics and gynecology, neurosurgery, general surgery, plastic reconstructive surgery, orthopedic surgery, endocrine surgery, thoracic surgery, otorhinolaryngology, oral surgery, emergency and critical care center, and other.

Characteristics of the sAs, ASA-PS, and SASA

None of the 3 scoring systems was associated with age, sex, body mass index (BMI), surgical specialty, operative duration, or anesthesia duration. The rate of emergency surgery increased with severity (lower sAs, higher ASA-PS, and lower SASA); however, this trend was not significant (Tables 2, 3, 4).

Table 2 The Surgical Apgar score (sAs) according to perioperative factors and 30-day mortality
Table 3 The American Society of Anesthesiologists physical status (ASA-PS) according to perioperative factors and 30-day mortality
Table 4 The sAs combined with ASA-PS (SASA) according to perioperative factors and 30-day mortality

30-Day mortality of the sAs, ASA-PS, and SASA

Mortality increased significantly as the sAs, ASA-PS, and SASA indicated more severe conditions (P < 0.001) (Tables 2, 3, 4). In addition, the risk of death increased 3.65-fold for every 2-point decrease in the sAs, 6.4-fold for every 1-point increase in the ASA-PS score, and 9.56-fold for every 4-point decrease in the SASA (P < 0.001) (Table 5).
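Because the three estimates above are reported per different step sizes (2, 1, and 4 points), they are not directly comparable. The sketch below rescales an estimate to a per-point value. It rests on two assumptions that are not stated in the text: that the reported figures are odds ratios from the logistic regression model, and that the model is linear in the score on the log-odds scale, so that the odds ratio for a k-point change is the per-point odds ratio raised to the power k. The function name `per_point_odds_ratio` is hypothetical.

```python
def per_point_odds_ratio(odds_ratio: float, step: float) -> float:
    """Rescale an odds ratio reported per `step` points to a per-point value.

    Assumes a logistic model linear in the score, so that
    OR(step points) = OR(1 point) ** step.
    """
    if odds_ratio <= 0:
        raise ValueError("an odds ratio must be positive")
    if step <= 0:
        raise ValueError("step must be positive")
    return odds_ratio ** (1.0 / step)


if __name__ == "__main__":
    # Figures quoted in the text, with the step size each was reported for.
    reported = {"sAs": (3.65, 2), "ASA-PS": (6.4, 1), "SASA": (9.56, 4)}
    for name, (odds_ratio, step) in reported.items():
        print(name, round(per_point_odds_ratio(odds_ratio, step), 2))
```

The per-point values describe a one-point change on scales of different lengths, so they indicate the steepness of each gradient rather than the overall discriminative ability, which is assessed by the ROC analysis.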

Table 5 Multivariate logistic regression model for 30-day mortality

ROC curve of the SASA

The ROC curves showed that the sAs and ASA-PS were each highly valid on their own [area under the curve (AUC) = 0.81 for sAs and 0.79 for ASA-PS, P < 0.001]. There was no significant difference between these 2 scoring systems (P = 0.451). However, the SASA, which combines them, was even more valid (AUC = 0.87, P < 0.001) (Fig. 1).

Fig. 1 ROC curves of the three scores for 30-day mortality

Discussion

Both the sAs and ASA-PS were found to be very useful scoring systems for predicting postoperative 30-day mortality, but the SASA demonstrated a predictive ability superior to that of either system. Our study differs from previous reports on the sAs in several respects. First, the proportion of patients with an ASA-PS score of 3 or higher was small [9, 10]. Second, patients undergoing cardiovascular surgery were excluded. Finally, the National Surgical Quality Improvement Program (NSQIP) [11], on which the sAs calculation is based, excludes patients undergoing endoscopic surgery, whereas such patients were included in our study. These differences might have contributed to the postoperative 30-day mortality being lower in our study than in previous reports [9, 10]. However, the sAs still demonstrated a high predictive ability in our study, and its wide versatility, unaffected by differences in patient characteristics and target facilities, was consistent with previous reports [12]. Meanwhile, the ASA-PS, which was not originally developed as a risk indicator, has been reported to be useful for predicting outcomes [13–16]. In our study, the ASA-PS also demonstrated a high predictive ability that was comparable with that described in previous reports. Although this scoring system has the limitation of reflecting only preoperative patient status [17], our results can help anesthesiologists to recognize that the ASA-PS, which we use routinely, is highly useful for predicting postoperative patient status.

In this study, we developed a new scoring system to evaluate both preoperative and intraoperative patient statuses. The SASA, which combines the sAs and ASA-PS, demonstrated a much higher predictive ability than either the sAs or ASA-PS alone. The accuracy of the sAs and ASA-PS is reportedly improved by the addition of factors such as age, surgical invasiveness, and respiratory complications [18–22]. Although such modifications do not make the calculation of the scores as complex as that of the APACHE and POSSUM, the calculation remains complex, and the modified scoring systems have not been widely adopted. The SASA is easy to calculate and combines the sAs and ASA-PS, each of which is highly useful. While the calculation of the SASA is simple, its predictive ability appears to be comparable with the previously reported predictive ability of the POSSUM and APACHE [23]. In the SASA, the ASA-PS reflects preoperative patient status, and the sAs reflects intraoperative patient status. Thus, the SASA is expected to clearly and comprehensively indicate perioperative risk in patients. Moreover, because the ASA-PS, which is criticized for the predominant influence of subjective elements, is complemented by the sAs, which is calculated only from objective elements, the SASA scoring system is extremely practical. For example, in cases of cesarean delivery, in which hypotension is prevalent, if the volume of blood loss includes amniotic fluid, the severity based on the sAs alone will appear extremely high. However, addition of the ASA-PS reduces such false-positive results.

This study has some limitations. First, it is based on data collected at a single large academic center. Because data compiled by multicenter registries, such as NSQIP and the National Anesthesia Clinical Outcomes Registry (NACOR) [24], were not used, further studies are needed to determine whether the SASA is as useful at other facilities as it was in this study. Although the SASA originates from Japan, we hope that wide international validation will follow; the sAs also originated from a single academic center, and its versatility has been widely reported since then. Moreover, further studies may be needed on the usefulness of the SASA in patients undergoing cardiovascular surgery and in those aged 16 years or younger, who were excluded from this study. Finally, the number of patients classified as ASA-PS IV and V in our study was so small that further studies may be needed in a patient population with an even distribution of ASA-PS classes to show the versatility of the SASA.

In summary, the sAs and ASA-PS were shown to be extremely useful for predicting mortality within 30 days of surgery. An even higher predictive ability was demonstrated by the SASA, which combines these simple and effective scoring systems. We expect that the SASA will be widely used as a new, easy scoring system for predicting prognosis, allowing a comprehensive assessment of perioperative patient status and automatic calculation of scores once data entry into the electronic anesthesia chart is complete.