Background

Type A acute aortic dissection (AAD) requires emergency replacement of the ascending aorta and/or the aortic arch, with or without aortic valve replacement, and in-hospital mortality ranges from 7 to 30% in recent series [1, 2]. Among 526 patients enrolled from 1996 to 2001 by the International Registry of AAD (IRAD) investigators, 30-day mortality was 25.1% on average [1]. A long list of pre-, intra- and immediate post-operative factors may independently increase short-term mortality risk (see [2] for an extensive review). These include: history of aortic valve replacement, migrating chest pain, hypotension and/or shock, cardiac tamponade, limb ischemia, the length of extracorporeal circulation and chronic renal failure. There has also been an effort to investigate whether surgical techniques modify the risk; however, inconsistent results were obtained as to the role of retrograde, anterograde or selective cerebral perfusion after circulatory arrest [1, 2]. More recently, anatomo-surgical parameters [3] and biological indexes, such as D-dimer values above a given threshold [4], were assessed as diagnostic tools, but no study has clarified their potential predictive role. Moreover, it is largely unknown whether the assessed short-term risk factors also predict long-term (say 1- to 2-year) mortality in Type A AAD patients.

The aim of the study was therefore to see whether selected risk factors previously assessed for prediction of 30-day mortality risk among Type A AAD patients [1, 2] may also contribute to long-term prediction, using neural networks, which are known to have a larger global accuracy than standard models such as logistic regression [2, 5]. In addition, to improve discrimination between cases and non cases [6], which is essential when new risk equations are tested in general and in cardiac surgical outcome studies [7-10] in particular, support vector machines (SVM) were also used [11, 12], for the first time on this material.

Methods

Cohort and Risk Factors

There were 235 consecutive patients undergoing surgical repair of AAD Type A between January 2002 and late 2008 at the Cardiac Surgical Departments of the University of Rome "La Sapienza" (n = 143, 61%) and Catanzaro Sant'Anna Hospital (n = 92, 39%). Diagnosis was made in emergency with computed tomographic (CT) scan and/or trans-esophageal echocardiography. Anesthesia was induced by propofol (1-1.8 γ/ml) and sufentanil (0.35-1 γ/kg) and maintained by propofol 1-1.8 γ/ml/hr and sufentanil 0.35-0.51 γ/kg/hr.

For each patient there were 32 potential predictors, including demographic characteristics and pre-operative, operative and immediate post-operative variables, with dummies (see Additional File 1 for the definition of mathematical, computational or statistical technicalities) constructed to index operative techniques and related complications. These were selected on the basis of a literature review of studies assessing relatively short-term potential predictors [2]. Thus, year of surgery, hospital location, age, sex and the presence of clinically diagnosed high blood pressure and Marfan habitus were considered. Among AAD onset symptoms we coded shock, intubation at arrival and neurological deficits. Previous cardiac surgery was also coded. Intra-operative variables were cross-clamping and total circulatory arrest times (in min, after extracorporeal perfusion started) along with operative techniques (whether ascending aorta plus arch or hemi-arch or plus aortic valve, and whether by Bentall or Cabrol, all as dummies versus ascending aorta alone). We also coded whether cerebral perfusion was anterograde, retrograde or both. Immediate post-operative complications were noted for each patient and included: total bleeding in ml; limb ischemia, by clinical and CT documentation; renal complications, including oligo-anuria and continuous hemodialysis; gastrointestinal complications such as bleeding and ischemia; other complications requiring medical or surgical treatment; and cerebral accidents, neurological deficits and coma, by clinical and CT documentation. The definitions of the analysed variables followed those reported in previous studies [1, 2].

Follow-up was performed by periodic visits and/or telephone contacts. Death certificates and all pertinent records were reviewed: time and causes of death were considered and patients alive were censored. For the purpose of the study we concentrate here on all-cause mortality.

Statistical Analysis

Data are expressed as means ± SD or SE, as appropriate. The selection of potential predictors was done a priori based on previous knowledge [2, 5, 13]. Linear correlation with the outcome variable and information value (that is, the relative importance of each covariate) were considered. Follow-up data were investigated by modelling the presence (coded 1) or absence (coded 0) of post-operative mortality using Tiberius Data Mining © software (version 6.1.5; see http://www.tiberius.biz) to obtain multilayer perceptron (MLP) neural network solutions. These were from a 3-layer network with 32 input nodes (corresponding to the 32 potential risk factors selected), a hidden layer containing 2 neurons (one linear and one non-linear) and one output unit modelling the dichotomous risk outcome [2, 5]. To prevent over-fitting, MLP were trained on a randomly selected sub-sample (50% of all patients included) [14, 15] and validated on the remaining 50%. Gini's coefficient and graph [16] were produced. Receiver operating characteristic (ROC) areas under the curve (AUC) were compared [17, 18] between solutions using MedCalc software (version 9.6.3.0; see http://www.medcalcsoftware.com). To run SVM [11], cSVM (version 3.1.0; see http://www.smartlab.dibe.unige.it) was used with optimal C search on 50% of the overall sample. There are similarities between neural networks and SVM, since an SVM with a sigmoid kernel is equivalent to a neural network with a sigmoid activation function and one hidden layer, the only difference being the number of neurons, which is selected automatically by the SVM [12]. A value of p < 0.05 was considered statistically significant in all cases.
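The split-sample design and the two discrimination metrics described above can be illustrated with a minimal sketch. It is not the Tiberius or MedCalc implementation: the function names, the random seed and the toy scores are assumptions made for illustration only. The sketch assumes the usual convention that Gini's coefficient equals 2 × AUC - 1, which agrees with the paired values reported in the Results.

```python
import random

def split_half(n_patients, seed=0):
    """Randomly split patient indices into training and validation halves."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary, illustrative choice
    half = (n_patients + 1) // 2
    return indices[:half], indices[half:]

def roc_auc(labels, scores):
    """AUC as the probability that a random case scores higher than a
    random non case (ties count as one half)."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non case")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))

def gini_from_auc(auc):
    """Gini's coefficient under the assumed convention Gini = 2 * AUC - 1."""
    return 2.0 * auc - 1.0

# 235 patients split into two random halves (118 and 117 indices).
train_idx, valid_idx = split_half(235)

# Toy scores (hypothetical, not study data): 3 of 4 case/non-case pairs are ranked correctly.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(len(train_idx), len(valid_idx), toy_auc, round(gini_from_auc(0.871), 3))
```

With this convention, an AUC of 0.871 corresponds to a Gini's coefficient of 0.742 and an AUC of 0.821 to 0.642.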

Results

Univariate contributors

The univariate contribution of the 32 potential risk factors for AAD Type A among the 235 patients studied is shown in Additional File 2, Table S1. These patients were from 2 Cardiac Surgical Centres, one in central and the other in southern Italy, and were followed up from 8 months to 7 years after operation. There were 84 deaths (36%): 81 (95%) of these were of cardiac origin, whereas the remaining 4 (5%) presented mixed causes, from accidents to cancer and suicide. Deaths occurred at 564 ± 48 (mean ± SE) days (95% CI from 470 to 658 days). To index the relative discrimination between cases and non cases (variable = Status) provided individually by these factors, the table shows the information value, Gini's coefficient and linear correlation. A good information value (> 0.5) is provided by chronic renal failure, bleeding in the first 24 post-operative hours, extracorporeal circulation and circulatory arrest times, age, and dummies for post-operative neurological coma and immediate post-operative continuous dialysis. Apart from bleeding in the first 24 post-operative hours, these variables present a high linear correlation and a large Gini's coefficient.

Multivariable contribution by NN

There were 211 (90%) patients who had all variables available for analysis, whereas missing data ranged from 0.4 to 9.4%, depending on the examined variable (Additional File 2, Table S1). The death rate among patients with complete information for all variables (60 of 211, 28%) was slightly lower than, but not significantly different from, that among the overall studied patients. The neural network model correctly classified 44 of 60 (73%) dead patients and 147 of 151 (97%) long-term survivors using 5 covariates: immediate post-operative chronic renal failure, circulatory arrest time, the type of surgery on ascending aorta plus hemi-arch, extracorporeal circulation time and the presence of Marfan habitus. Figure 1 shows a semi-quantitative graphic presentation of these risk factors for training and validation models. The proportions of dead patients identified by the neural network were slightly lower in training and validation runs (respectively 69 and 64%) than in the overall study. However, very similar proportions of long-term survivors were correctly identified (respectively 97 and 100%). Of note, global accuracies (as measured by ROC AUC) were high and practically identical (respectively 0.871 and 0.870).
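The proportions quoted for the neural network follow directly from the reported counts. As a minimal sketch (the function name is illustrative and is not part of the software used in the study), sensitivity, specificity and overall accuracy could be derived as follows. The overall accuracy is not reported in the text and is computed here only from the stated counts.

```python
def classification_summary(true_pos, false_neg, true_neg, false_pos):
    """Sensitivity, specificity and overall accuracy from a 2 x 2 table."""
    n_cases = true_pos + false_neg
    n_non_cases = true_neg + false_pos
    return {
        "sensitivity": true_pos / n_cases,
        "specificity": true_neg / n_non_cases,
        "accuracy": (true_pos + true_neg) / (n_cases + n_non_cases),
    }

# Counts from the text: 44 of 60 deaths and 147 of 151 survivors correctly classified.
nn_summary = classification_summary(true_pos=44, false_neg=16, true_neg=147, false_pos=4)
for name, value in nn_summary.items():
    print(f"{name}: {value:.1%}")
```

Under these counts the sketch gives a sensitivity of about 73%, a specificity of about 97% and an overall accuracy of about 91% (191 of 211 patients).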

Figure 1

Receiver operating characteristic plots by randomly selected training (50%) and validation (50%) neural network models on patients with complete variables (N = 211). A semi-quantitative graphic presentation of the covariates relevance is presented for training and validation models. Full names of coded variables are reported in Additional File 2, Table S1. Keep = 1 means that covariate may stay in the model. Note that Gini's coefficients are practically identical for training and validation neural network models, respectively 0.742 and 0.741 (ROC AUC: 0.871 and 0.870, respectively). Therefore, training and validation neural network models have a very high, yet similar, accuracy and define a set of 5 predictive covariates useful to index long-term mortality in patients operated for Type A ascending aorta dissection.

Multivariable contribution by SVM

Figure 2 shows the results of the SVM run on the overall study group (N = 235), since this method, unlike neural networks, does not require the analysis to be confined to patients with complete data for all variables. SVM provided a somewhat different picture from neural networks. First, SVM made use of a larger number of covariates, some of which provided little if any information individually, yet together they yielded a Gini's coefficient of 1.00 (using 27 of 32 covariates), with no false negative or false positive cases among 118 randomly selected AAD Type A patients (error = 0%). Second, when validation SVM were run on the remaining 117 patients, the Gini's coefficient was 0.642 (ROC AUC = 0.821), a significantly lower (p < 0.01) result than that obtained by the neural network model. There were 15 false negative and 11 false positive cases (error = 22%). Third, validation and training SVM used different covariates to predict outcome, with a relatively different ranked importance.
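The gap between training and validation SVM can be made explicit with simple arithmetic on the reported figures. The sketch below is illustrative only (the function and variable names are assumptions) and uses nothing beyond the counts and Gini's coefficients stated in the text.

```python
def error_rate(false_neg, false_pos, n_patients):
    """Proportion of misclassified patients."""
    return (false_neg + false_pos) / n_patients

# Training SVM: no misclassified cases among 118 patients.
training_error = error_rate(false_neg=0, false_pos=0, n_patients=118)
# Validation SVM: 15 false negatives and 11 false positives among 117 patients.
validation_error = error_rate(false_neg=15, false_pos=11, n_patients=117)
# Loss in Gini's coefficient when moving from training to validation.
gini_drop = 1.00 - 0.642

print(f"{training_error:.0%} {validation_error:.0%} {gini_drop:.3f}")
```

A perfect fit on the training half followed by an error of about 22% on the validation half is the pattern expected when a model with many covariates adapts closely to the training sample.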

Figure 2

Receiver operating characteristic plots by randomly selected training (50%) and validation (50%) support vector machine (SVM) with optimal C search on overall study patients (N = 235). A semi-quantitative graphic presentation of the covariates relevance is presented for training and validation models. Full names of coded variables are reported in Additional File 2, Table S1. Keep = 1 means that covariate may stay in the model. Using 27 of 32 covariates, Gini's coefficient by training SVM was 1.00 and no false negative or false positive cases were identified among 118 randomly selected AAD Type A patients (error = 0%). However, validation SVM on the remaining 117 patients provided 15 false negative and 11 false positive cases (error = 22%) and the Gini's coefficient was 0.642 (ROC AUC 0.821), which is significantly lower (p < 0.01) than the results obtained by the neural network model, shown in Figure 1. Of note, validation and training SVM used different covariates to predict outcome, with a relatively different ranked importance. Nevertheless, with both training and validation SVM, apart from extracorporeal circulation time, the other 4 covariates were also selected by neural network models.

Variables selected in common by NN and SVM

There were 4 covariates (circulatory arrest time, immediate post-operative chronic renal failure, the type of surgery on ascending aorta plus hemi-arch, and the presence of Marfan habitus) selected in common by neural network models and both training and validation SVM. It is important to consider that circulatory arrest and extracorporeal circulation times are correlated (r = 0.31; results not shown).

Discussion

This is the first investigation to adopt neural networks and support vector machines to assess the relatively long-term predictive role of a fairly large series of potential risk factors, including pre-operative, operative and immediately post-operative variables, in AAD Type A patients. The presence of Marfan habitus, the length of circulatory arrest, an intervention on the ascending aorta plus hemi-arch and immediate post-operative chronic renal failure were the risk factors selected in common by these methods, with a high global accuracy (ROC AUC > 0.82). Although the factors selected were not new, their combination might be used in practice to construct risk charts whereby levels of risk might be defined. However, it is clear that the corresponding cells of these charts need to contain a sufficient number of cases and non cases, which is presumably possible only after large multi-centre and/or multinational cooperative efforts are undertaken. The evidence presented here might help to stimulate cooperation to reach this aim.

The presented rules provided very good predictive and discrimination properties; however, only Marfan habitus was a parameter that could be used pre-operatively. Determining a priori which patients are not candidates for surgery is therefore not possible using the evidence of this investigation. Nevertheless, as there were 2 operative parameters contributing to increased long-term mortality risk, it is important to keep the length of circulatory arrest to a minimum and to consider that an intervention on the aorta plus hemi-arch conveys an independent risk of lower survival. In addition, every effort should be made to reduce the incidence of post-operative chronic renal failure.

The incidence of AAD Type A has been estimated at 5 to 30 per million people per year in the United States, that is, 880 to 147 times lower than the incidence of acute myocardial infarction, but it still poses an important clinical problem, and sometimes a dilemma because the two presentations are difficult to differentiate [13]. Although biological thresholds of plasma molecules such as D-dimer are actively sought in order to improve diagnosis [4], this may not have an impact on prediction before the results of larger studies are obtained. Therefore, risk profiling remains crucial. Based on results obtained by the IRAD investigators, short-term mortality could be reduced from as high as 58% in medically treated patients to the current average figure of 25.1% (and sometimes less) when surgery is performed [1]. Risk factors may contribute to better management and a more defined risk assessment [1, 2]: in-hospital mortality was as high as 31.4% in unstable patients presenting with cardiac tamponade, shock, congestive heart failure, cerebro-vascular accident, stroke, coma, acute myocardial and/or mesenteric ischemia and acute renal failure at the time of operation, whereas stable patients may present with a mortality as low as 16.7%.

In a previous report we investigated 30-day mortality among 208 patients from 2 Italian Centres [2] using a series of demographic, pre-operative, operative and post-operative characteristics, selected from 37 such variables considered in the literature as potential predictors of short-term mortality after AAD Type A. When logistic or neural network models were produced in one Centre and applied to the data from the second Centre for external validation [13-15], some predictors were selected in common: the presence of pre-operative shock, intubation and neurological symptoms, immediate post-operative continuous dialysis and the quantity of bleeding in the first 24 hours after operation. By the neural network model only, the length of extracorporeal circulation and post-operative chronic renal failure were detected as independent predictors of 30-day mortality. Unlike the IRAD Registry investigators [1], we showed [2] that operative and immediate post-operative factors should be considered to predict short-term mortality. They contributed significantly to a large overall accuracy, which might be explained in part by these factors being continuous [19]. On the other hand, as in studies investigating the prediction of short-term mortality after coronary artery bypass surgery [9, 10], neural networks performed better than standard methods such as logistic regression [2, 5].

When predictive models have limited performance and/or reliability, or low sensitivity and specificity, their capability to identify high-risk subjects who deserve individualized treatment may be hampered [13]. The appeal of the neural network method stems [14, 15] from its potential for improved predictive performance by exploring hidden layers to find nonlinearities, interactions and nonlinear interactions among predictors. The attraction of neural networks is quite evident from the impressive growth of published results [15]. However, there are relatively few comparative reports on the performance and accuracy of neural networks, which have been assessed only versus multiple logistic function, to predict events in clinical [9] or epidemiological [5, 18] cardiovascular studies.

There has been some controversy as to whether new risk predictors, or series of old and newer ones, can add to the prediction of events, including mortality, in terms of clinical utility, impact or discrimination [6]. Although discrimination metrics (such as ROC AUC) are well established in clinical and epidemiological experience [2, 5, 18, 20, 21], it has been pointed out that ROC AUC is insensitive in comparing models [6], a limitation that may however be circumvented by making comparisons with a fixed number of covariates [5]. New methods have therefore been proposed to evaluate and compare predictive risk models, based primarily on stratification into clinical categories on the basis of risk and on attempts to assess the ability of new models to reclassify individuals more accurately into higher or lower risk strata [22, 23]. Risk reclassification for single factors can then be examined by using models with and without each risk factor in turn, or by measuring the net reclassification improvement, that is, the difference in proportions moving up and down risk strata among case patients versus control participants [6, 23]. Whatever reclassification method is selected, it is important to understand that it may be impossible to apply when length of follow-up differs among individuals (as in the present series) and/or the cohort is relatively small [6]. Moreover, and more importantly, reclassification methods depend on the particular categories used [6]: in our case it is far from established whether 5%, 10%, 20%, 30% or more are adequate categories of long-term risk of AAD Type A. To compare with established experiences in preventive cardiology [20, 24] or coronary by-pass surgery [25], the sensitivity and specificity of the abovementioned thresholds should be accurately assessed, which again calls for large amounts of data to be collected and therefore for improved multi-centre collaboration.
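The net reclassification improvement described above, and its dependence on the chosen risk categories, can be illustrated with a short sketch. Everything in it is hypothetical: the function names, the predicted risks and the cut-offs are invented for illustration and are not study data. The formula used is the usual one, in which upward moves count in favour of the new model among cases and downward moves count in its favour among non cases.

```python
def risk_stratum(risk, cutoffs):
    """Index of the risk stratum (0 = lowest) for a predicted risk."""
    return sum(risk >= c for c in cutoffs)

def net_reclassification_improvement(old_risks, new_risks, is_case, cutoffs):
    """NRI = [P(up | case) - P(down | case)] + [P(down | non case) - P(up | non case)]."""
    up_case = down_case = up_ctrl = down_ctrl = 0
    n_case = n_ctrl = 0
    for old, new, case in zip(old_risks, new_risks, is_case):
        move = risk_stratum(new, cutoffs) - risk_stratum(old, cutoffs)
        if case:
            n_case += 1
            up_case += move > 0
            down_case += move < 0
        else:
            n_ctrl += 1
            up_ctrl += move > 0
            down_ctrl += move < 0
    if n_case == 0 or n_ctrl == 0:
        raise ValueError("need at least one case and one non case")
    return (up_case - down_case) / n_case + (down_ctrl - up_ctrl) / n_ctrl

# Hypothetical predicted risks from an "old" and a "new" model for 6 patients.
old_risks = [0.08, 0.15, 0.25, 0.12, 0.22, 0.35]
new_risks = [0.12, 0.15, 0.35, 0.08, 0.18, 0.35]
is_case = [True, True, True, False, False, False]

# The same predictions give different NRI values under different categories.
nri_three_cutoffs = net_reclassification_improvement(
    old_risks, new_risks, is_case, cutoffs=(0.10, 0.20, 0.30))
nri_one_cutoff = net_reclassification_improvement(
    old_risks, new_risks, is_case, cutoffs=(0.20,))
print(round(nri_three_cutoffs, 3), round(nri_one_cutoff, 3))
```

In this toy example the same pair of models yields a markedly larger value with three cut-offs than with a single 20% cut-off, which is the dependence on categories noted in the text.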

Conclusions

The classification provided by neural network models and related SVM may represent a compromise that addresses the need to assess the clinical relevance of variables used for predictive purposes in AAD Type A patients, and also in different areas of research. These methods may also go beyond a classical constraint of standard predictive models, namely that only predictors that are statistically significant are typically used [6]. Indeed, with SVM a high discrimination is obtained by using a large number of variables, most with little informative content if used alone. As we have shown, however, it is extremely important not only to train but also to validate these methods, which demands further study and the accumulation of very large data sets. Our results may well stimulate these efforts.

An important take-home message for clinicians should be that with neural networks and SVM, by concentrating on a few risk factors such as those described here, it is possible to predict long-term mortality in AAD Type A patients with a good global accuracy. We produced an HTML tool (see Additional File 3) based on the neural network solution reported here, which makes it easy to appreciate that, when the circulatory arrest time is increased from 60 to 80 min, the patient's long-term risk category changes from false (survival) to true (dead) at an assessment strength (roughly the degree of certitude) of 1/3. By further increasing circulatory arrest times to 120 and 180 min, the assessment strengths become 2/3 and almost 1, respectively. Although surgeons have known for decades that this variable is hardly steerable in clinical practice, a dimensional predictive assessment of outcome might be obtained using our tool immediately after the operation is finished, which may have an impact on further clinical decision making. The other variables described in the present study might also be used for predictive assessments, so that a very large combination of clinical presentations could be easily modeled.