Perioperative risk prediction in the era of enhanced recovery: a comparison of POSSUM, ACPGBI, and E-PASS scoring systems in major surgical procedures of the colorectal surgeon

Purpose This study aims to determine whether traditional risk models can accurately predict morbidity and mortality in patients undergoing major surgery by colorectal surgeons within an enhanced recovery program. Methods One thousand three hundred eighty patients undergoing surgery performed by colorectal surgeons in a single UK hospital (2008–2013) were included. Six risk models were evaluated: (1) Physiology and Operative Severity Score for the enumeration of Mortality and Morbidity (POSSUM), (2) Portsmouth POSSUM (P-POSSUM), (3) ColoRectal POSSUM (CR-POSSUM), (4) Elderly POSSUM (E-POSSUM), (5) the Association of Coloproctology of Great Britain and Ireland (ACPGBI) score, and (6) modified Estimation of Physiologic Ability and Surgical Stress Score (E-PASS). Model accuracy was assessed by observed to expected (O:E) ratios and area under the receiver operating characteristic curve (AUC). Results Eleven patients (0.8%) died and 143 patients (10.4%) had a major complication within 30 days of surgery. All models overpredicted mortality and had poor discrimination: POSSUM 8.5% (O:E 0.09, AUC 0.56), P-POSSUM 2.2% (O:E 0.37, AUC 0.56), CR-POSSUM 7.1% (O:E 0.11, AUC 0.61), and E-PASS 3.0% (O:E 0.27, AUC 0.46). ACPGBI overestimated mortality in patients undergoing surgery for cancer (4.4%, O:E 0.28, AUC 0.41). Morbidity was also overestimated by POSSUM (32.7%, O:E 0.32, AUC 0.51). E-POSSUM overestimated mortality (3.25%, O:E 0.57, AUC 0.54) and morbidity (37.4%, O:E 0.30, AUC 0.53) in patients aged ≥ 70 years. Conclusion All models overestimated mortality and morbidity. New models are required to accurately predict the risk of adverse outcome in patients undergoing major abdominal surgery, taking into account the reduced physiological and operative insult of laparoscopic surgery and enhanced recovery care. Electronic supplementary material The online version of this article (10.1007/s00384-018-3141-4) contains supplementary material, which is available to authorized users.


Introduction
Colorectal surgery carries inherent perioperative risks, particularly for elderly patients with multi-morbidity [1]. Accurate risk stratification is essential to inform discussions with patients in order to facilitate informed consent, to enable surgical planning, and to anticipate the need for high dependency and intensive care support. Several models have been devised to predict postoperative mortality following general surgical operations.
In 1991, Copeland and colleagues developed the Physiology and Operative Severity Score for the enumeration of Mortality and morbidity (POSSUM) to allow risk adjustment of operations and so enable comparative audit of different centers [2]. However, this model was shown to overestimate mortality in patients undergoing low-risk procedures, which led to the development of a recalibrated version termed Portsmouth-POSSUM (P-POSSUM) using a dataset of 10,000 general surgical operations [3]. Both POSSUM and P-POSSUM incorporate data from the same 12 physiological and 6 operative parameters.
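For orientation, both POSSUM and P-POSSUM convert the physiological and operative scores into a predicted risk through a logistic equation. The sketch below uses the coefficients as they are commonly quoted for the equations published in [2] and [3]. They are not stated in this article, so they should be treated as an assumption and checked against those references before any use. The function names are illustrative only.

```python
import math


def logistic(x: float) -> float:
    """Convert a log-odds value into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


# Coefficients below are ASSUMED from the commonly quoted published equations
# (refs [2] and [3]); they do not appear in this article.
def possum_mortality(physiological_score: float, operative_score: float) -> float:
    return logistic(-7.04 + 0.13 * physiological_score + 0.16 * operative_score)


def possum_morbidity(physiological_score: float, operative_score: float) -> float:
    return logistic(-5.91 + 0.16 * physiological_score + 0.19 * operative_score)


def p_possum_mortality(physiological_score: float, operative_score: float) -> float:
    return logistic(-9.065 + 0.1692 * physiological_score + 0.1550 * operative_score)


if __name__ == "__main__":
    # Lowest possible scores, assuming each of the 12 physiological and
    # 6 operative parameters contributes at least 1 point.
    ps, os_ = 12, 6
    print(f"POSSUM mortality:   {possum_mortality(ps, os_):.4f}")
    print(f"P-POSSUM mortality: {p_possum_mortality(ps, os_):.4f}")
    print(f"POSSUM morbidity:   {possum_morbidity(ps, os_):.4f}")
```

Under these assumed coefficients, P-POSSUM returns a lower predicted mortality than POSSUM for the lowest-risk patient, which is consistent with its stated purpose as a recalibration for low-risk procedures.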
Specific models to determine risk in colorectal surgery have also been created. In 2004, a risk prediction model termed ColoRectal POSSUM (CR-POSSUM) was developed using prospective data collected from 6883 colorectal operations from 15 UK hospitals between 1993 and 2001 [4]. This model had the advantage of only requiring six physiological variables and four operative parameters to predict 30-day mortality. However, once again this model has been shown to have limitations when predicting outcome in older patients. More recently, a French research group recalibrated the original POSSUM algorithm using data from 1186 patients aged 65 or over undergoing colorectal surgery across 41 hospitals, to generate an elderly-specific risk score (E-POSSUM) [5]. To date, POSSUM and Elderly POSSUM are the only scores validated to predict perioperative mortality and morbidity.
The Association of Coloproctology of Great Britain and Ireland (ACPGBI) developed a risk prediction model for patients undergoing colorectal cancer surgery requiring data on five variables: age, cancer resection, ASA grade, Dukes' stage, and operative urgency [6,7]. Although developed from a UK-based population, POSSUM and ACPGBI risk scores have been externally validated and shown to have good predictive power in studies of patients undergoing colorectal surgery in the Netherlands [8] and China [9].
In Japan, a mortality risk score termed Estimation of Physiological Ability and Surgical Stress (E-PASS) was developed [10] using parameters that assess the health status of the patient (ASA score, co-morbidities, performance status) and the stress of surgery (blood loss, operation time, open or laparoscopic surgery). E-PASS has been validated for patients undergoing elective general surgery [11,12] and colorectal surgery [13]. E-PASS has not been validated in a UK-based population.
All these risk models were derived from analyzing large datasets of patients undergoing open surgery with standard perioperative care. Advances in surgical practice, with the increased use of a laparoscopic approach and enhanced recovery pathways, may affect the relevance and value of these datasets. Senagore and colleagues demonstrated that laparoscopic surgery reduces true morbidity and mortality compared to the POSSUM and P-POSSUM values. Furthermore, reducing the operative severity from major to minor, with the resultant reduction of the severity score by three, corrected the overprediction of morbidity by POSSUM and mortality by P-POSSUM in laparoscopic colectomies [14]. Enhanced recovery in turn has been demonstrated to minimize the stress response to surgery, thus reducing complications and expediting recovery [15,16]. We hypothesize that existing risk prediction models may overestimate the incidence of mortality and morbidity following elective colorectal surgery with enhanced recovery care. The aim of this study was to compare the predicted mortality and morbidity in each risk prediction model with the true mortality and morbidity of a large prospective cohort of patients who underwent colorectal surgery with enhanced recovery care.

Methods
Data were obtained from a prospectively maintained database of consecutive patients undergoing colorectal surgery with enhanced recovery care between 2008 and 2014. Ethical approval and individual written patient consent were not required because data were anonymized to the researchers, which conforms to NHS research guidelines. Data on age, gender, surgical approach (laparoscopic, open, or converted), American Society of Anesthesiologists (ASA) grade, and procedure type were prospectively collected together with the data necessary to calculate the risk prediction scores. Complications were entered prospectively, and missing data were added from case notes by either the operating surgeon or a trained specialist nurse. Complications were defined according to POSSUM [17] and by the Clavien-Dindo classification [18]. Mortality was defined as any death that occurred within the first 30 days after surgery, or during the hospital admission if this lasted longer than 30 days. The predicted risk of mortality and morbidity was determined using the POSSUM model. In addition, we calculated predicted mortality using P-POSSUM, CR-POSSUM, and E-PASS for all patients. Normal values were substituted for missing data as per the method adopted by Senagore et al. in 2004 [19]. A separate analysis was performed to determine the accuracy of each model to predict outcome in patients undergoing surgery for colorectal cancer, inflammatory bowel disease, and other benign colorectal conditions. We also analyzed the accuracy of the ACPGBI score to predict mortality for all patients undergoing surgery for colorectal cancer. Finally, we determined the risk of mortality and morbidity in a subgroup of patients aged 70 years or more using the POSSUM and E-POSSUM models. POSSUM, P-POSSUM, CR-POSSUM, and ACPGBI scores were calculated using the published equations [2-4,7]. Elderly POSSUM scores were derived using the algorithm by Tran Ba Loc et al. 2009 [5], and E-PASS scores by the method described by Haga et al. [13]. The algorithms are provided in appendices 1 and 2.

Statistical analysis
To determine the validity of the risk prediction models, discrimination and calibration of each model were calculated. Discrimination is the ability of the model to assign a higher probability of death to the patients who actually died than to those who were alive 30 days after surgery. This is determined by generating receiver operating characteristic (ROC) curves, with sensitivity (y-axis) plotted against 1 − specificity (x-axis). An area under the curve (AUC) of < 0.7 indicates poor discrimination, 0.7-0.8 indicates fair discrimination, and 0.8-1 good to excellent discrimination. AUC was calculated with 95% confidence intervals and compared using non-parametric paired tests in the method described by DeLong et al. (1988) [20]. The Bonferroni correction was used to adjust for pairwise comparisons.
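As an illustration of what the AUC measures, the sketch below computes it directly as the proportion of (died, survived) patient pairs in which the patient who died was assigned the higher predicted risk, with ties counted as half. This is the standard rank-based equivalent of the area under the ROC curve. It is not the procedure used in this study, it does not include the DeLong comparison or confidence intervals, and the function name and example data are hypothetical.

```python
from typing import Sequence


def auc(predicted_risk: Sequence[float], died: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    predicted_risk: model-predicted probability of death for each patient.
    died: 1 if the patient died within 30 days, 0 otherwise.
    """
    risks_died = [r for r, d in zip(predicted_risk, died) if d == 1]
    risks_alive = [r for r, d in zip(predicted_risk, died) if d == 0]
    if not risks_died or not risks_alive:
        raise ValueError("AUC needs at least one death and one survivor")

    concordant = 0.0
    for r_died in risks_died:
        for r_alive in risks_alive:
            if r_died > r_alive:
                concordant += 1.0   # model ranked the death as higher risk
            elif r_died == r_alive:
                concordant += 0.5   # tie counts as half
    return concordant / (len(risks_died) * len(risks_alive))


if __name__ == "__main__":
    # Hypothetical predictions for four patients, two of whom died.
    print(auc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1]))
```

An AUC of 0.5 corresponds to a model that ranks patients no better than chance, which is why values below 0.7 are described as poor discrimination.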
Calibration is the accuracy with which the model predicts the risk of death or complication for an individual patient. The estimated probability of death was calculated for each model and ranked into five equal groups of increasing operative mortality. The true mortality (observed) is then compared with the predicted mortality (expected) in each group. An observed/expected ratio of 1 indicates perfect accuracy, a ratio of < 1 indicates overprediction of the mortality rate, and a ratio of > 1 indicates underestimation. A Hosmer-Lemeshow C goodness-of-fit test is then used to generate a chi-squared test comparing the

All statistical analyses were performed using SPSS version 18.
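The two calibration measures used in this study, the observed to expected ratio and the Hosmer-Lemeshow statistic, can be sketched in a few lines. The code below is an illustration only and is not the SPSS procedure used by the authors. The function names are hypothetical, and the worked example makes the simplifying assumption that every patient carries the mean predicted POSSUM mortality of 8.5% reported in this article. A P value would additionally require the chi-squared distribution, which is omitted here.

```python
from typing import Sequence


def oe_ratio(predicted_risk: Sequence[float], outcome: Sequence[int]) -> float:
    """Observed events divided by the sum of predicted probabilities."""
    expected = sum(predicted_risk)
    if expected == 0:
        raise ValueError("expected number of events is zero")
    return sum(outcome) / expected


def hosmer_lemeshow(predicted_risk: Sequence[float],
                    outcome: Sequence[int],
                    groups: int = 5) -> float:
    """Hosmer-Lemeshow chi-squared statistic over equal-sized risk groups."""
    ranked = sorted(zip(predicted_risk, outcome))  # lowest predicted risk first
    n = len(ranked)
    statistic = 0.0
    for g in range(groups):
        chunk = ranked[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected_events = sum(p for p, _ in chunk)
        observed_events = sum(y for _, y in chunk)
        expected_non_events = size - expected_events
        if expected_events > 0:
            statistic += (observed_events - expected_events) ** 2 / expected_events
        if expected_non_events > 0:
            statistic += ((size - observed_events) - expected_non_events) ** 2 / expected_non_events
    return statistic


if __name__ == "__main__":
    # 1380 patients and 11 deaths are taken from this article; assigning each
    # patient the mean predicted risk of 8.5% is an ASSUMPTION for illustration.
    predicted = [0.085] * 1380
    deaths = [1] * 11 + [0] * 1369
    print(f"O:E ratio: {oe_ratio(predicted, deaths):.2f}")
```

A ratio well below 1, as in this example, is what the text describes as overprediction of mortality.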

Results
We analyzed data from 1380 consecutive patients who underwent elective colorectal surgery performed by three surgeons within an enhanced recovery program. Overall, 0.03% of data were missing from the database, and normal values were substituted. Seven hundred sixty-four patients were male (55.4%) and 616 patients were female (44.6%).

Performance of mortality prediction models in the entire dataset
Table 1 and Fig. 1 show that all mortality risk prediction models demonstrated poor discrimination when applied to our dataset of all patients undergoing colorectal surgery. CR-POSSUM had the highest AUC value of 0.607 (95% CI 0.476-0.738) and the best calibration of all models (Hosmer-Lemeshow 3.601, P = 0.891). All models significantly overpredicted perioperative mortality, as shown by observed to expected ratios of < 1.

Performance of mortality predictive models in subgroups
All models demonstrated poor discrimination to predict mortality in patients undergoing colorectal surgery for colorectal cancer (CRC) (Table 2), benign colorectal disease (Table 3), or in patients aged ≥ 70 years (

Morbidity prediction
In enhanced recovery patients, both POSSUM and E-POSSUM models demonstrated poor discrimination to predict major complications in the first 30 days following surgery (Table 5, Fig. 2). This was also seen when analyzing subgroups by type of surgery and age. E-POSSUM had the most reliable discrimination (AUC 0.572, P = 0.005) and calibration (H-L = 7.962, P = 0.437) of all models to predict complications in individual patients aged 70 years or more (Table 6).

Discussion
This study is the first to evaluate the validity of perioperative risk prediction models in patients who have undergone surgery within a rigorously followed enhanced recovery care program. All the models tested in our study overpredicted mortality and morbidity. Previous studies have shown that patients undergoing colorectal surgery within an enhanced recovery program have fewer complications and faster recovery than those managed with conventional care [21]. These risk stratification models are now used routinely in clinical care, especially P-POSSUM and CR-POSSUM. Both patients and clinicians are making clinical decisions guided by the results extrapolated from online calculators, which use these tools. This study's findings therefore have significant implications when counseling patients regarding perioperative risk in order to guide treatment decisions and help to determine the level of postoperative care required. Surgeons who have adopted laparoscopic techniques and enhanced recovery principles should be aware of the limitations of the current risk models available. Importantly, emphasis should be placed on cautious interpretation of the results of these calculations and on the dangers associated with using them for risk prediction of mortality in individual patients. Additionally, there has been a recent focus on risk stratification in order to evaluate hospital- and surgeon-specific complication rates as part of quality improvement and safety initiatives. We have demonstrated the relatively poor reliability of current risk prediction models in patients undergoing colorectal surgery with enhanced recovery care. Indeed, one can argue that both POSSUM and P-POSSUM, which were initially designed for comparative audit [2,3], are being used inappropriately in clinical practice and that the historical data from which they are derived are outmoded and even redundant in relation to the recent advances in surgical technique and perioperative care.
The traditional risk models evaluated in this study are based on multivariate regression analysis of perioperative variables recorded in large numbers of patients who underwent open surgical procedures and were not managed with enhanced recovery care protocols. Due to the low mortality associated with ERAS, we estimate that developing a new model would require a minimum dataset of approximately 10,000 patients with 100 deaths (10 deaths per independent variable in the model) in order to maintain model integrity [22,23]. This would then require further validation with data from other international institutions practicing ERAS.
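The arithmetic behind this estimate can be made explicit. The sketch below assumes a model with 10 independent variables (implied by 100 deaths at 10 deaths per variable) and scales the required number of deaths by an observed mortality rate. The function name is hypothetical. The first call assumes a round 1% mortality, and the second uses the 11 deaths in 1380 patients observed in this cohort.

```python
def required_cohort(n_variables: int,
                    events_per_variable: int,
                    observed_deaths: int,
                    observed_patients: int) -> tuple[int, int]:
    """Return (deaths needed, patients needed) under an events-per-variable rule.

    The mortality rate is passed as observed_deaths / observed_patients so that
    the calculation stays in exact integer arithmetic.
    """
    deaths_needed = n_variables * events_per_variable
    # Ceiling division: smallest cohort expected to contain deaths_needed deaths.
    patients_needed = (deaths_needed * observed_patients + observed_deaths - 1) // observed_deaths
    return deaths_needed, patients_needed


if __name__ == "__main__":
    # ASSUMED 1% mortality (1 death per 100 patients), 10 model variables.
    print(required_cohort(10, 10, 1, 100))
    # Mortality observed in this cohort: 11 deaths in 1380 patients.
    print(required_cohort(10, 10, 11, 1380))
```

At an assumed 1% mortality this gives the approximately 10,000 patients quoted above. At the slightly lower mortality observed in this cohort the requirement rises further, which illustrates why a multicenter dataset would be needed.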
The models were also poor at predicting mortality when analyzing patients stratified by age or disease pathology. Therefore, new models need to include patients of all ages and encompass a wide breadth of indications for surgery if they are to be applicable to these subgroups. Furthermore, new outcome predictive variables are becoming readily available. For example, sarcopenia, which can be demonstrated on preoperative computed tomography, has been shown to have a negative effect on postoperative outcomes [24-27].
There are several limitations in our study. First, this is a single institution study based in a tertiary colorectal specialist center, and the results may not be generalizable to other hospitals that have adopted ERAS care protocols. Second, the performance of any mortality predictive model is dependent on the number of events, and in particular, deaths. The mortality and morbidity rates of patients undergoing surgery with ERAS were relatively low, and therefore validation through a much larger dataset is required as described above. Third, we substituted normal values for missing data, although this represented less than 0.03% of the overall data. Finally, in order to confirm our hypothesis that traditional risk models do not accurately predict short-term perioperative outcomes, other institutions that practice ERAS, perform a large proportion of their surgery laparoscopically, and maintain prospective databases need to analyze the validity of the models tested in this study using their own datasets.

Conclusions
As surgical technique and postoperative recovery pathways continue to reduce the physiological insult of surgery, the models based on these early data will lose validity. The existing perioperative risk prediction scores overestimate morbidity and mortality in patients undergoing colorectal surgery within an enhanced recovery program. New models are required, based on prospectively collected data from multiple centers performing cases both laparoscopically and with enhanced recovery-based care. This would be more in line with the ACS-NSQIP database and would produce a more reliable and accurate tool for risk prediction in the UK setting.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.