Background

Traumatic brain injury (TBI) is defined as brain injury caused by external trauma [1]. TBI causes more death and disability than any other type of trauma [2]. Life expectancy is significantly reduced after TBI, as the mortality rate increases by 30 to 70% [1, 3] compared to other types of injuries. TBI affects millions of people worldwide every year, imposing a major global burden [3]. An estimated 64–74 million individuals sustain TBI every year, with the greatest burden of disease in the Southeast Asian and Western Pacific regions [2]. Mortality is associated with advanced patient age and the severity of TBI. The 14-day in-hospital mortality after severe TBI reaches up to 24.5% in adults between 16 and 65 years and exceeds 40% in patients over 65 years old [4]. Several published and widely used prognostic models have demonstrated good predictive and discriminative power. Table 1 shows some of the widely used prognostic models and their performance as measured by the Area Under the Curve (AUC) [5,6,7,8,9,10,11,12,13,14,15].

Table 1 Examples of popular TBI prognostic models

In addition, scholars have designed several predictive models to help clinicians and researchers predict TBI prognosis and outcomes. Jacobs et al. [16] designed a model to predict the outcomes of moderate to severe TBI using demographics, clinical data (e.g. vital signs, pupil reaction, and Glasgow Coma Scale (GCS)) and radiological parameters (brain computed tomography (CT) scan findings). The study found that age, pupil responses, GCS score, the occurrence of a hypotensive episode post-injury, and several CT scan findings are good predictors of TBI outcomes.

The use of machine learning techniques to predict disease outcomes has grown significantly in the last decade. Several studies have shown that machine learning predictive techniques outperform classical multivariate techniques [17, 18]. In a systematic review of 30 studies that used machine learning techniques to predict several neurosurgical outcomes, including mortality following TBI, machine learning techniques outperformed several well-known classical predictive tools and performed similarly to or better than field experts in some instances [19]. Rau et al. [6] used machine learning techniques to predict mortality in moderate to severe TBI. The authors used age, sex, helmet use, co-morbidities, GCS, and vital signs as predictors. They used Logistic Regression (LR), Artificial Neural Network (ANN), Decision Tree (DT), Support Vector Machine (SVM), and Naïve Bayes Classifier (NB) to classify the patients based on survival outcomes, and compared the models in terms of accuracy, sensitivity, specificity and the area under the curve (AUC). ANN yielded the best performance, with 96.8% AUC, 92% accuracy, 84.4% sensitivity and 92.8% specificity. Hale et al. [12] used an ANN to predict 6-month favorable/unfavorable outcomes, including mortality, in 565 pediatric patients who sustained TBI. They used GCS, pupil reactivity to light, blood glucose level, blood hemoglobin concentration, mass lesion, traumatic subarachnoid hemorrhage (tSAH), cistern status, and midline shift to build the predictive model. Further, they compared the performance of the ANN-based predictive model with three known classical predictive models, namely Helsinki, Rotterdam, and Marshall. The machine learning model not only achieved high accuracy (> 94%), but also outperformed the three classical predictive tools. This finding supports Eftekhar et al. [17], who found that ANN significantly outperformed logistic regression based predictive models in predicting disease outcomes, with an AUC of 96.5% vs. 95.4%.

This study aims to design a supervised machine learning model for the early prediction of in-hospital mortality in adult patients who sustained TBI and were admitted to the level 1 trauma center of Hamad General Hospital (HGH), Hamad Medical Corporation (HMC), a governmental not-for-profit healthcare organization.

Methodology

The study was conducted in accordance with the Cross-Industry Standard Process for Data Mining (CRISP-DM), which defines the typical phases of a data mining project. CRISP-DM breaks the data mining process into six phases: business understanding, data understanding, data preparation, modeling, evaluation and deployment [20]. Figure 1 provides a summary of the methodology.

Fig. 1 Research methodology

Business and data understanding

Not all the registry data were usable in this study. Therefore, to better understand and choose meaningful variables, the authors explored the definition of each variable in the trauma registry data dictionary. In addition, the authors reviewed the literature to determine which of the many variables should be considered predictors and which should be imputed if they had missing values [21]. Pediatric patients (< 14 years old) were excluded. This was important for understanding and interpreting the results, as some important parameters (e.g. vital signs) differ between the pediatric and adult groups.

Data preparation

The study was approved by the Institutional Review Board (IRB MRC-01-19-106) of HMC. This retrospective study targeted all adult patients who were admitted to the trauma center at HGH in the period from January 2014 to February 2019 and registered in the trauma registry. A total of 2318 patients with TBI were registered in the trauma registry for the given period.

Only adult patients (≥ 14 years old) who sustained TBI were included in the study. All variables that have no predictive power (e.g. health record number, date of admission and date of disposition) or that were severely imbalanced (e.g. gender, where female patients were less than 6%) were excluded. Missing data may seriously impair the performance of predictive models [22]. Several approaches to handling missing data have been used in the literature, such as eliminating incomplete records [6] or imputing the missing values, which is a widely used approach [22]. Due to the criticality of the subject, records with missing data were eliminated. Subsequently, 1620 eligible patients were included in the study.
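The elimination of incomplete records described above (complete-case analysis) can be illustrated with a minimal sketch. The records and field names below are hypothetical and are not taken from the trauma registry; the study itself performed this step in its own tooling.

```python
# Hypothetical records; None marks a missing value. Field names are illustrative only.
records = [
    {"age": 34, "gcs": 8, "heart_rate": 110, "died": 1},
    {"age": 52, "gcs": None, "heart_rate": 95, "died": 0},
    {"age": 27, "gcs": 14, "heart_rate": 88, "died": 0},
    {"age": 41, "gcs": 6, "heart_rate": None, "died": 1},
]


def complete_cases(rows):
    """Keep only rows in which every field has a recorded value."""
    return [row for row in rows if all(value is not None for value in row.values())]


kept = complete_cases(records)
print(f"{len(kept)} of {len(records)} records retained")
```

This approach trades sample size for certainty about every retained value, which is the trade-off the text makes when reducing the registry extract to the eligible sample.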

The retrieved data included the following variables: age, gender, mechanism of injury, mode of arrival, blood alcohol level, blood pressure, heart rate, Glasgow Coma Scale (GCS) score, CT findings, intubation status and location, date/time of injury, time of admission to the Emergency Department (ED), patients' known comorbidities, performed procedures, blood transfusion, administration of Venous Thromboembolism (VTE) prophylaxis, in-hospital complications, outcome and date of disposition.

An additional variable was derived from the retrieved variables: shift of admission (D: 7 am to 6:59 pm; N: 7 pm to 6:59 am) [23, 24].
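As an illustration of how the shift variable could be derived from the admission time, the following sketch applies the day/night boundaries stated above. The function name `admission_shift` is hypothetical and is not part of the study's tooling.

```python
from datetime import time


def admission_shift(admitted_at: time) -> str:
    """Return 'D' for admissions from 07:00 to 18:59 and 'N' for 19:00 to 06:59."""
    day_start, night_start = time(7, 0), time(19, 0)
    return "D" if day_start <= admitted_at < night_start else "N"


print(admission_shift(time(18, 59)))  # last minute of the day shift
print(admission_shift(time(19, 0)))   # first minute of the night shift
```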

Outcome measure

The outcome measure is the in-hospital mortality during the initial hospitalization post moderate to severe TBI. It is a dichotomous variable (0 = alive and 1 = dead). Patients who were discharged from the trauma surgery section or transferred to another hospital were considered alive.

Prediction models

Two powerful supervised machine learning techniques were used so that their performance could be compared with each other and with previous studies, in order to recommend the model that achieves the best performance and the highest practicality in supporting clinical decisions. Artificial neural networks (ANN) and support vector machines (SVM) are widely used in predicting in-hospital mortality. Therefore, they were selected to provide baseline comparative performance. SPSS Modeler 18.1 was used to conduct the analysis.

To prevent overfitting and to validate the models’ performance, we partitioned the data into training and testing sets and the overfit prevention was set at 30%. Table 2 explains the data partitions.
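The partitioning idea can be sketched as follows. This is an illustration under assumptions, not the study's SPSS Modeler configuration: the 30% test fraction is hypothetical (the actual split is given in Table 2), and the overfit prevention set is interpreted here as a share of the training data held out to monitor error during training.

```python
import random


def partition(n_records, test_fraction, holdout_fraction, seed=0):
    """Shuffle record indices and split them into build, holdout and test sets."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)

    n_test = round(n_records * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    # The holdout (overfit prevention) set is carved out of the training data only.
    n_holdout = round(len(train_idx) * holdout_fraction)
    holdout_idx, build_idx = train_idx[:n_holdout], train_idx[n_holdout:]
    return build_idx, holdout_idx, test_idx


# test_fraction=0.3 is an assumed value for illustration only.
build_idx, holdout_idx, test_idx = partition(1620, test_fraction=0.3, holdout_fraction=0.3)
print(len(build_idx), len(holdout_idx), len(test_idx))
```

Keeping the test records entirely separate from both the build and holdout sets is what allows the test metrics to serve as an estimate of performance on unseen patients.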

Table 2 Data partitions

Artificial neural networks (ANN)

ANN is a widely used machine learning technique that performs well in classification and pattern identification [25]. When used for classification, an ANN is seen as a set of connected input/output units in which each connection has a weight associated with it. This weight represents the strength of the connection between the units [26]. Scholars consider ANN a black-box analytical model. Nonetheless, its great potential for supporting clinical practice through engagement with evidence-based medicine is undeniable [12]. Usually, the performance of the neural network is optimized by partitioning the data into training and test sets, which helps prevent overfitting. Training continues until the error can be reduced no further [27]. Once trained, the ANN can be used for future cases where the outcome is unknown [28].
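To make the notion of weighted connections concrete, the following minimal sketch computes the output of a small network with one hidden layer. The architecture, the sigmoid activation and all weight values are assumptions chosen for illustration; they do not describe the network fitted in this study.

```python
import math


def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Propagate inputs through one hidden layer to a single output unit."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Hypothetical, hand-picked weights (not fitted to any data).
probability = forward(
    inputs=[0.5, -1.0, 2.0],
    hidden_weights=[[0.4, -0.2, 0.1], [-0.3, 0.8, 0.5]],
    hidden_biases=[0.0, 0.1],
    output_weights=[1.2, -0.7],
    output_bias=0.05,
)
print(round(probability, 3))
```

Training adjusts the weights and biases so that the output moves closer to the observed outcome; the forward pass itself stays the same.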

Support vector machines (SVM)

SVM is a powerful classification algorithm that can be used for linearly and non-linearly separable data sets [25]. When using SVM for classification, it is very important to decide which kernel function best achieves the optimal hyperplane that separates the classes [29]. The linear kernel was used in this study because it gave the best predictive performance in the preliminary assessment compared with the other kernel functions (polynomial, sigmoid and radial basis function).
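The kernel functions named above can be written out in a few lines. This is a sketch using their standard textbook forms; the parameter names (`gamma`, `coef0`, `degree`) and default values are assumptions and do not reflect the settings used in SPSS Modeler. The last function shows how a fitted linear SVM classifies a case by the side of the hyperplane on which it falls; the weights passed to it here are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, gamma=1.0, coef0=1.0, degree=3):
    return (gamma * dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma=1.0):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def sigmoid_kernel(u, v, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(u, v) + coef0)


def linear_svm_predict(x, weights, bias):
    """Return 1 if x lies on the positive side of weights.x + bias = 0, else 0."""
    return 1 if dot(weights, x) + bias >= 0 else 0


# Hypothetical hyperplane for illustration only.
print(linear_svm_predict([2.0, 0.0], weights=[1.0, 1.0], bias=-1.0))
```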

Results

Among the 1620 patients who were included in this study, 203 (12.5%) died in the hospital during their initial hospitalization. Mean age was 34.4 years and mean age at death was 37.2 years. The most common mechanism of injury was fall from height (34%) followed by motor vehicle crash (30%). The most common CT finding/mass lesion was subdural hemorrhage (28.1%) followed by extradural hemorrhage (22.9%), and 22% of the patients in the sample sustained midline shift. Tables 3 and 4 show the sample characteristics and the descriptive statistics for the study sample.

Table 3 Sample characteristics- continuous variables
Table 4 Sample characteristics - Nominal and ordinal variables

Performance of the data mining techniques

To calculate the models’ performance metrics, we first constructed the confusion matrix that displays the relationship between the actual observations and the predicted conditions.

Table 5 shows the performance evaluation metrics for the two machine learning techniques in the test data partition. Both models achieved accuracy greater than 91%. Nevertheless, since accuracy alone is an insufficient measure of model performance, AUC, precision, negative predictive value (NPV), sensitivity, specificity and F-score were also taken into consideration. SVM achieved the best performance (Table 5).
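For reference, the metrics named above can be derived from the four cells of a confusion matrix, and AUC can be computed from predicted scores as the probability that a randomly chosen positive case is ranked above a randomly chosen negative one. The sketch below uses hypothetical counts and scores, not the values from Table 5, and does not guard against empty classes.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common metrics from confusion matrix counts (positive = died)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    npv = tn / (tn + fn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "npv": npv,
        "f_score": f_score,
    }


def auc(positive_scores, negative_scores):
    """Share of positive/negative pairs ranked correctly; ties count as half."""
    wins = sum(
        (p > n) + 0.5 * (p == n)
        for p in positive_scores
        for n in negative_scores
    )
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical counts and scores for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=430, fn=20)
example_auc = auc([0.9, 0.3], [0.5, 0.1])
print(metrics, example_auc)
```

With imbalanced outcomes such as in-hospital mortality, a model can reach high accuracy by favoring the majority class, which is why sensitivity, precision and F-score are reported alongside it.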

Table 5 Performance of the classification models

In-hospital mortality risk factors

SVM used all 21 variables in predicting in-hospital mortality. In machine learning, the contribution of each predictor to the model's overall capacity to produce accurate predictions is usually presented in the form of predictor importance (Fig. 2) [30]. The first predictor is usually the most important to the model's capacity, and the importance values of the other predictors are ranked in relation to it. SVM revealed that receiving endotracheal intubation during resuscitation plays the most important role in predicting in-hospital mortality.

Fig. 2 Importance of predictors in Support Vector Machines

Discussion

The early prediction of in-hospital mortality in patients with traumatic brain injury is of utmost importance. Early and powerful prediction of mortality helps clinicians and healthcare managers to optimize the management of medical resources, initiate appropriate diagnostics and interventions in a timely manner, conduct comparative audits and ensure that the patients' families and others receive appropriate guidance [3, 6]. However, predicting disease prognosis and outcomes requires good prognostic models that are built on large samples and have high external and internal validity, so that they generalize beyond a specific research setting [31]. Many prognostic models have been published over the years, but only a few of them met sample validity requirements [32]. Usually, clinicians use certain prognostic factors such as GCS to guide their therapeutic decisions and to estimate prognostic outcomes [3, 32]. Nonetheless, such predictors may be affected by several factors, such as alcohol use, which reduce the prediction success and the discriminative power of the model [11, 12]. Thus, for accurate outcome prediction, multiple risk factors (e.g. age, GCS and others) need to be considered jointly when developing a prognostic model [32, 33].

In terms of models’ performance, SVM outperformed the ANN in all the performance evaluation metrics (Table 5). Therefore, SVM is the chosen model for deployment.

More broadly, the SVM in this study outperformed the conventional multivariate LR-based TBI prognostic models reported in Table 1. The highest reported AUC for the conventional prognostic models was 92% [10, 12, 16]. Furthermore, when comparing the performance of this study's machine learning models with the published literature on TBI, we found that the performance of the SVM model was higher than or similar to that of the machine learning models in similar studies [6, 19]. This comparison is crucial when considering the external validity of this study.

This study ranked intubation as the most important predictor of post-TBI in-hospital mortality. Almost 26% of patients who were intubated in the first 24 h post injury died during their initial hospitalization compared to 0.1% of those who were not intubated. This could be attributed to the severity of TBI, as the more severe the injury, the higher the likelihood of being intubated. Moreover, intubation increases the length of stay in the hospital and the risk of in-hospital complications, e.g. ventilator-associated pneumonia, which contributes significantly to increased mortality [34]. The need for blood transfusion during resuscitation has a significant relationship with in-hospital mortality: 29% of patients who received blood transfusion during resuscitation died compared to 2.4% of those who did not receive blood. The need for and the consequences of blood transfusion in TBI are still debated. Several studies reported that blood transfusion in TBI is associated with unfavorable outcomes [35, 36]. This could also be explained by the fact that patients who needed blood transfusion were those who had more severe injuries and had lost significant amounts of blood. Therefore, these patients were already prone to poor TBI outcomes.

Consistent with the previous literature, this study found that patients who received venous thromboembolism (VTE) prophylaxis had a better survival rate than those who did not [37]. Almost 18% of those who did not receive VTE prophylaxis died during their initial hospitalization compared to 8.7% of those who received it. Also, this study found that 54.4% of patients who developed cerebral edema following the primary TBI died in hospital compared to 9.7% of those who did not develop cerebral edema. This finding is consistent with Jha et al., who reported that cerebral edema is a leading cause of in-hospital mortality as it occurs in more than 60% of patients with mass lesions including post-TBI hemorrhage [38]. Cerebral edema is a secondary complication of TBI in which brain tissue water content increases following the injury. Hence, significant efforts in TBI management are directed at preventing secondary brain injury and maintaining adequate cerebral perfusion pressure (CPP) [39, 40]. Midline shift is a major post-traumatic complication that leads to serious unfavorable effects including mortality [8, 12, 41]. Around 27% of patients who had midline shift died compared to 8.3% of those who had no midline shift reported in their CT scan. The TBI diagnosis as per the brain CT scan result plays an integral role in predicting in-hospital mortality post TBI. Interestingly, 25% of those who had subarachnoid hemorrhage (SAH) following the TBI died compared to 17.5% and 16.5% of those with diffuse axonal injury (DAI) and subdural hemorrhage (SDH), respectively. It is documented in the literature that traumatic SAH has a significant effect on in-hospital mortality [8, 12, 19]. Further, presenting heart rate (HR) is an indicator of the patient's hemodynamic stability following any type of trauma, particularly TBI. High HR (> 100 bpm) [42], especially when associated with low SBP (< 90 mmHg) [40], may indicate a hypovolemic shock state, which leads to poor CPP. This study found a positive relationship between HR and in-hospital mortality. The HR in this study was collected upon arrival at the ED following trauma. The mean HR upon admission was 93 bpm. The mean HR for those who survived was 90.8 bpm, while it was 108.5 bpm for those who later died in the hospital. Interestingly, the mortality rate increases significantly when patients with TBI have associated abdominal injuries [43]. Mortality among those with an associated abdominal injury was 31.5% compared to 9.8% among those with no associated abdominal injury.

Finally, the 10th ranked variable by importance was the arrival mode. Patients who arrived at the trauma center via ambulance had higher mortality than those who arrived via another mode (13.5% vs. 7.4%). This is consistent with previous literature, which found that mortality patterns are affected by the mode and the time of arrival at the emergency room following TBI or polytrauma [44, 45]. This could be because the time between injury, arrival of the ambulance and arrival at the hospital is relatively longer than the time to arrival by private vehicle [45], or simply because the more severe the injury, the higher the likelihood that a patient is transported to the hospital via ambulance.

Limitations

One of the most important limitations of this study was encountered during data processing and preparation. Several variables in the registry are recorded as free text, which complicates the data preparation process. Data were abstracted from the Qatar national trauma registry, which contributes data to the National Trauma Data Bank (NTDB) and the Trauma Quality Improvement Program (TQIP) of the American College of Surgeons-Committee on Trauma (ACS-COT). Therefore, several potentially useful predictors, such as laboratory results and administered medications, were unobtainable. The deployment of the model to support clinical decision making is another significant challenge. This is due to several reasons, such as the questionable reliability of non-traditional predictive techniques, which stems to a certain extent from clinicians' lack of awareness of the potential of artificial intelligence in supporting the clinical decision-making process. Very importantly, unlike in logistic regression, for instance, the standardized coefficients and the odds ratios pertaining to each predictor are not obtainable in the SVM. This makes interpreting the results more complex than with traditional computational techniques.

Conclusions

This study demonstrated that the performance of the machine learning techniques is superior to that of the conventional multivariate models. Furthermore, the results were consistent with the known body of knowledge. Thus, with the availability of massive data sets in electronic medical records and other structured registries, clinical evidence could be made available quickly and with less effort.

From another perspective, the results of this study may encourage decision makers in trauma surgery to integrate machine learning techniques with the trauma registry and the electronic medical records. This may help clinicians plan their preventive efforts and mobilize the necessary resources at an early stage of patient treatment, which could improve care outcomes.