Key Summary Points

We present a novel deep neural network-based artificial intelligence prediction model to help identify a subgroup of patients undergoing carotid artery stenting who are at risk for short-term unplanned readmissions.

Prior studies have attempted to develop prediction models but have relied mainly on logistic regression and have shown low predictive ability.

The novel model presented in this study achieves an AUROC of 0.79 in predicting unplanned readmission within 30 days of discharge after carotid artery stenting.

Digital Features

This article is published with digital features, including a summary slide and plain language summary, to facilitate understanding of the article. To view digital features for this article, go to https://doi.org/10.6084/m9.figshare.14198939.

Introduction

Strokes secondary to thromboembolism from an atherosclerotic plaque at the carotid bifurcation or the internal carotid artery (ICA) account for 10–15% of all strokes worldwide, and timely carotid artery stenting (CAS) may help prevent such strokes [1, 2]. Unplanned readmissions due to all causes are an important aspect of patient care and a quality metric under the Patient Protection and Affordable Care Act. Prediction of early readmission is crucial for planning the delivery of healthcare services, identifying high-risk patients for intervention strategies to reduce readmissions, and providing cost-conscious care to the community [3, 4].

Most prior studies evaluating carotid revascularization have focused on comparing CAS with carotid endarterectomy (CEA) in terms of outcomes, including stroke, major adverse cardiovascular and cerebrovascular events (MACCE), death, or readmission. Furthermore, these studies and others evaluating readmissions used data from before 2014, when the International Classification of Diseases, 9th Revision (ICD-9) was in use. To the best of our knowledge, no study has specifically evaluated CAS for early readmissions using the strength of the International Classification of Diseases, 10th Revision (ICD-10) diagnosis codes and more recent patient-level data [5, 6].

The specificity of diagnostic and procedure codes has improved immensely with the advent of the ICD-10 codes, which can give better insight into the causes of, and associations with, unplanned readmissions [7]. With the advent of machine learning methods, there has been a gradual adoption of these methods in healthcare, and they have been found superior to standard prediction rules for hospital readmissions [8]. We present a nationwide evaluation of early unplanned readmission and an artificial intelligence-assisted readmission prediction model for patients undergoing CAS.

The purpose of this study was to describe the rates and causes of unplanned readmissions within 30 days after CAS and to apply artificial intelligence machine learning analysis to develop a robust prediction model for short-term readmissions.

Methods

The US Nationwide Readmissions Database (NRD) is a nationally representative sample of all-age, all-payer discharges from US nonfederal hospitals produced by the Healthcare Cost and Utilization Project (HCUP) of the Agency for Healthcare Research and Quality (AHRQ) [9]. The database comprises discharge-level hospitalization data from 28 geographically dispersed states across the USA, with approximately 18 million discharges for the year 2017 (a weighted estimate of roughly 36 million discharges). The dataset used in the present study represents 60% of the US population and 58.2% of all US hospitalizations.

Individual patients in the NRD are assigned up to 40 diagnosis codes and 25 procedure codes for each hospitalization. We defined CAS using the ICD-10 procedure codes listed in Table 1 of the supplementary material. The primary outcome was the first unplanned readmission within 30 days of the index discharge after the first CAS. If a patient underwent multiple CAS procedures in a year, only the first intervention was used in the analysis. Patients whose index admission occurred in December were also excluded, as they may not have had a full 30 days of follow-up, which would lead to immortal time bias. All ICD-10 and Clinical Classification Software Refined (CCSR) codes used in this study are presented in Table 2 of the supplementary material. Statistical analysis was performed using IBM SPSS Statistics V26 with two-sided tests and a significance level of 0.05. We used the Pearson chi-square test for categorical variables, the Mann–Whitney U test for continuous variables (with no readmission as the reference group), and logistic regression for predictions.
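
For illustration, a hedged pandas sketch of these cohort rules (together with the exclusions detailed under Results); PATIENT_ID, ADMISSION_MONTH, DIED, and LOS are hypothetical placeholders, not actual NRD field names:

```python
import pandas as pd

def build_cas_cohort(cas_admissions: pd.DataFrame) -> pd.DataFrame:
    """Index CAS admissions eligible for 30-day readmission follow-up."""
    cohort = (cas_admissions
              .sort_values("ADMISSION_MONTH")
              .drop_duplicates("PATIENT_ID"))           # first CAS per patient only
    cohort = cohort[cohort["DIED"] == 0]                # discharged alive
    cohort = cohort[cohort["ADMISSION_MONTH"] != 12]    # December lacks full 30-day follow-up
    return cohort[cohort["LOS"].notna()]                # require length-of-stay data
```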

Compliance with Ethics Guidelines

This study was performed utilizing publicly available datasets and hence does not require IRB review under 45 CFR 46. In addition, the Institutional Review Board (IRB) at the University of South Alabama approved the study for exempt status.

Model Development

A prediction model was developed in accordance with the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) initiative guidelines [10].

Data Preprocessing

CAS readmission records were preprocessed so that the data represent nationwide readmission rates. The dataset comprised 9019 individual patients (weighted to 16,827 patients for national representation).

Description of Outcome Variable

The outcome of interest was readmission (yes/no) within 30 days of discharge after CAS. Among all subjects undergoing CAS, 7.4% were readmitted, while 92.6% were not.

Description of Predictor Variables

A total of 42 clinically pertinent variables were included for model development, covering patient demographics, insurance status, hospital bed size, hospital teaching status, length of stay, and relevant comorbidities. We performed exploratory analysis (Table 1) of all categorical and continuous variables. We also checked for missing data and zero variance, finding no zero-variance categorical predictors and no missing values among the predictors. Continuous variables were standardized to zero mean and unit variance, and categorical variables were dummy coded using one-hot encoding; the resulting features were used as inputs to the prediction model.
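
A minimal preprocessing sketch, assuming scikit-learn and pandas; the column names are hypothetical placeholders rather than actual NRD field names:

```python
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column groupings; the real analysis used 42 variables.
continuous_cols = ["age", "length_of_stay"]
categorical_cols = ["payer", "hospital_bed_size", "teaching_status"]

preprocess = ColumnTransformer([
    # Standardize continuous variables to zero mean and unit variance.
    ("num", StandardScaler(), continuous_cols),
    # Dummy code categorical variables via one-hot encoding.
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

# X = preprocess.fit_transform(cohort_df)  # cohort_df: the cohort DataFrame
```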

Table 1 Baseline characteristics and procedure-related factors during index admission for CAS

Model Specification

A deep neural network (DNN) is a machine learning algorithm that processes data with complex architectures to perform classification and build prediction models. Briefly, a DNN is an artificial neural network with one input layer, at least four hidden layers, and one output layer [11]. A fully connected DNN conventionally uses a feed-forward learning mechanism in which information flows from the input layer, through the hidden layers, to the output layer to learn complex data patterns, extract important features, and make predictions. DNNs use nonlinear transformations to construct such prediction models [12, 13]. To reveal patterns in large datasets and relate those patterns to the biological questions of interest, a key factor is setting up the DNN model properly. To this end, we chose hyper-parameters, such as the number of hidden layers and the optimizer, via a grid search (GridSearchCV [14]) over a range of hyper-parameter values, selecting the combination that yielded the best accuracy. In our model setup: (1) we searched over hidden layers ranging from 1 to 100 in steps of 1 and over two optimizers, "adam" and "rmsprop"; "adam" performed best, with four hidden layers. (2) Because there were 91 input variables after dummy coding, we chose 128 nodes per hidden layer, following the rule of thumb that the number of nodes should be less than twice the size of the input. (3) The rectified linear unit (relu) was used as the nonlinear activation function for the hidden layers, and the sigmoid function as the output layer's activation function.
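
To make the architecture concrete, here is a minimal sketch assuming a Keras/TensorFlow implementation (the text does not name the framework used); the layer counts, node counts, activations, and optimizer follow the description above:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_dnn(n_features: int = 91) -> keras.Model:
    """Four hidden layers of 128 relu units with a sigmoid output, as described."""
    model = keras.Sequential()
    model.add(layers.Input(shape=(n_features,)))         # 91 one-hot/standardized inputs
    for _ in range(4):                                   # four hidden layers
        model.add(layers.Dense(128, activation="relu"))  # 128 nodes per layer
    model.add(layers.Dense(1, activation="sigmoid"))     # binary readmission output
    model.compile(
        optimizer="adam",  # "adam" outperformed "rmsprop" in the grid search
        loss="binary_crossentropy",
        metrics=[keras.metrics.AUC(name="auroc"),
                 keras.metrics.AUC(curve="PR", name="auprc")],
    )
    return model
```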

Evaluation Strategy

Addressing Class Imbalance

Only 7.4% of the CAS patients in our study were readmitted within 30 days, a very low readmission rate, as is often seen with medical data. In particular, the proportion of non-readmitted patients (92.6%) was roughly 13 times that of readmitted patients (7.4%), making our data highly class imbalanced. To address this, we set the weight of the minority class (patients who were readmitted) to 13 times that of the majority class (patients who were not readmitted). This adjustment inherently yields a model with better generalization. The class weights are used during DNN training to increase the misclassification cost of minority class samples, making the training process pay more attention to the minority class and thus increasing the sensitivity of the prediction model.
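
A minimal sketch of this weighting, assuming the Keras model from the earlier sketch and the train/validation split described below (Keras's fit accepts a per-class weight dictionary; the 13:1 ratio follows the text):

```python
# 0 = not readmitted (majority class), 1 = readmitted (minority class).
class_weight = {0: 1.0, 1: 13.0}  # misclassifying a readmission costs ~13x more

model = build_dnn(n_features=91)  # from the earlier architecture sketch
model.fit(X_train, y_train,
          validation_data=(X_val, y_val),
          class_weight=class_weight,  # applied to the loss during training
          epochs=50, batch_size=256)
```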

Incorporating AUPRC as a Metric of Performance Evaluation

Inferring model performance from the area under the receiver operating characteristic curve (AUROC) can be deceptive in imbalanced datasets like ours, where only 7.4% of patients were readmitted and 92.6% were not. Specifically, in an imbalanced setup, because the number of negative samples (non-readmitted patients) is very large, the false positive rate rises slowly: the large count of true negatives keeps its denominator high and the metric small. A receiver operating characteristic (ROC) curve can therefore look favorable even when most or all of the minority class is misclassified. Precision, however, is not affected by a large number of negative samples, because it measures the number of true positives among the samples predicted as positive. Precision focuses on the positive class rather than the negative class; it measures the probability that a positive prediction is correct, while the false positive rate (FPR) and true positive rate (TPR) (the ROC metrics) measure the ability to distinguish between the classes. In contrast, area under the precision-recall curve (AUPRC) scores are specifically designed to detect rare events and are appropriate in these scenarios, as they show low performance when a classifier misclassifies most or all of the minority class [15, 16].
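
The toy example below, assuming scikit-learn and synthetic data (not study data), illustrates the gap: with roughly 7.4% positives, a weak scorer can post a moderate AUROC while its AUPRC stays close to the positive-class prevalence.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

rng = np.random.default_rng(0)
# Synthetic labels with ~7.4% positives, mirroring the readmission rate.
y_true = (rng.random(10_000) < 0.074).astype(int)
# A weak scorer: positives score only slightly higher than negatives.
y_score = rng.normal(loc=0.5 * y_true, scale=1.0)

print("AUROC:", roc_auc_score(y_true, y_score))            # looks moderately good
print("AUPRC:", average_precision_score(y_true, y_score))  # stays near prevalence
```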

Splitting Data for Training Models

We divided the dataset into three parts: 70% for training, 10% for validation, and 20% for testing. First, we built the model on the 70% training split by optimizing the weight of each node. Second, to evaluate the fit of the built model, we applied it to the validation set to check prediction accuracy, which we found to be 90.86%. Finally, the remaining 20% of the data was used to test the performance of the model, which was evaluated using accuracy, precision, sensitivity, specificity, and AUROC. Additionally, we used ELI5 to estimate the importance weight of each feature. To determine variable importance, ELI5 uses permutation importance, which randomly shuffles the values of a single feature and makes predictions on the shuffled dataset. The prediction accuracies on the shuffled and original datasets are then compared to quantify the performance degradation. This process is repeated until the importance of every feature has been computed.
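
The sketch below illustrates both steps, assuming scikit-learn and the feature matrix X and labels y produced by the preprocessing above; the manual loop mirrors the permutation-importance idea that ELI5 implements and is not the study's exact code.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# 70/10/20 split: hold out 20% for testing, then carve 10% of the total
# (0.125 of the remaining 80%) out as a validation set.
X_tmp, X_test, y_tmp, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_tmp, y_tmp, test_size=0.125, stratify=y_tmp, random_state=0)

def permutation_importance(predict, X_val, y_val, n_repeats=5, seed=0):
    """Importance of a feature = drop in accuracy after shuffling its column."""
    rng = np.random.default_rng(seed)
    baseline = accuracy_score(y_val, predict(X_val))
    importances = np.zeros(X_val.shape[1])
    for j in range(X_val.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X_val.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])  # shuffle feature j only
            drops.append(baseline - accuracy_score(y_val, predict(X_perm)))
        importances[j] = np.mean(drops)  # mean performance degradation
    return importances
```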

Results

Population Characteristics and Descriptive Results

The NRD included 18,882 individual hospitalized patients aged 18 years and older who underwent CAS between January and December 2017. We excluded patients who died during the index hospitalization (n = 251), those discharged after November 30, and those with missing data on length of stay. The final study sample included 16,745 patients who were discharged alive after index CAS from January through November 2017.

After index CAS, 7.4% of patients (n = 1233) had a 30-day readmission. Notably, among these readmissions, 8.6% were for acute cerebral infarction and 4.2% for acute hemorrhagic cerebrovascular disease, and 4.7% of readmitted patients died during the unplanned 30-day readmission encounter. Table 1 gives a detailed synopsis of the baseline and procedure-related factors associated with early readmission. A forest plot analysis of comorbidities and procedure-related factors affecting 30-day readmission is presented in Fig. 1.

Fig. 1
figure 1

Forest plot analysis of comorbidities and procedure-related factors affecting 30-day readmission after carotid artery stenting

Cardiac causes made up 39.8% of all readmissions. The ten leading causes and frequencies of primary diagnosis categories for readmission encounters, based on Clinical Classification Software Refined (CCSR) categories, are presented in Table 2. Total charges of care for the index and 30-day unplanned admissions amounted to over $1.9 billion and $66 million, respectively.

Table 2 Causes and frequencies of primary diagnosis category for readmissions encounters [based on the primary Clinical Classification Software Refined (CCSR)]

Performance Comparison of Deep Neural Network with Other Machine Learning Models

We compared the proposed DNN model's performance with that of four other machine learning algorithms frequently used for medical data: logistic regression, random forest, decision tree, and support vector machine. For logistic regression, accuracy was 92.57%, with an AUROC of 0.68 and AUPRC of 0.14. For random forest, accuracy was 55.26%, with AUROC and AUPRC of 0.611 and 0.367, respectively. For the decision tree, accuracy was 78.19%, with an AUROC of 0.588 and AUPRC of 0.269. For the support vector machine, accuracy was 70.35%, with an AUROC of 0.67 and AUPRC of 0.14. DNN, on the other hand, produced an accuracy of 87.43% but notably had a higher AUROC of 0.79 (validation 0.73) and AUPRC of 0.383 than all other models (Table 3). A graphical comparison of the performance (AUROC and AUPRC) of the proposed DNN prediction model with the other classification models on the 30-day readmission data for CAS subjects is shown in Fig. 2.
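
As an illustration of this kind of baseline comparison, here is a hedged scikit-learn sketch; it runs on synthetic imbalanced data standing in for the NRD cohort, so the numbers it prints will not reproduce the reported results.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, roc_auc_score, average_precision_score

# Synthetic stand-in with ~7% positives to mimic the class imbalance.
X, y = make_classification(n_samples=5000, n_features=20, weights=[0.93],
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y,
                                                    random_state=0)

baselines = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(random_state=0),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "support vector machine": SVC(probability=True, random_state=0),
}
for name, clf in baselines.items():
    clf.fit(X_train, y_train)
    proba = clf.predict_proba(X_test)[:, 1]
    print(f"{name}: accuracy={accuracy_score(y_test, clf.predict(X_test)):.3f} "
          f"AUROC={roc_auc_score(y_test, proba):.3f} "
          f"AUPRC={average_precision_score(y_test, proba):.3f}")
```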

Table 3 Machine learning algorithms and accuracy in predicting early readmission post CAS
Fig. 2
figure 2

ROC and AUPRC analysis of DNN prediction model with other classification models on 30-day readmission data for CAS subjects. Plot of prediction capability of machine learning models

Predictors of 30-Day Readmission and Costs

To identify the variables that contributed most to our DNN model's predictive power, we applied ELI5. Specifically, using ELI5 we obtained importance weights for each of the 42 model variables and ranked them by their respective percentile scores. Variables whose importance weight ranked in the top 20 percentiles (ranging from 0.0116 to 0.0067) were selected as the most influential variables in our model (Table 4). Following this procedure, we obtained the ten most important variables for predicting 30-day readmission (Fig. 3).
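
A minimal sketch of this selection step, assuming NumPy; the importance weights would come from a permutation-importance routine such as the one sketched earlier, and the threshold follows the top-20-percentile rule described above.

```python
import numpy as np

def top_percentile_variables(importances, names, pct=80):
    """Keep variables whose importance weight is at or above the given
    percentile (pct=80 keeps the top 20 percentiles)."""
    cutoff = np.percentile(importances, pct)
    keep = importances >= cutoff
    order = np.argsort(importances[keep])[::-1]  # most important first
    return [(names[i], importances[i]) for i in np.flatnonzero(keep)[order]]
```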

Table 4 Important score of each variable in early readmission post CAS dataset
Fig. 3
figure 3

Bar graph diagram showing relative importance of predictors for unplanned readmission

Discussion

This study, which examines the rate and costs of 30-day readmissions after index CAS, finds that 7.4% of patients are readmitted within 30 days of discharge after undergoing CAS. The major causes of 30-day unplanned readmission were septicemia and cerebral infarction/hemorrhagic cerebrovascular bleed. Using machine learning approaches, we developed a risk prediction model that can identify patients at high risk of unplanned readmissions with a C-statistic of 0.802 using DNN, to the best of our knowledge the first of its kind in many ways. This is the most contemporary analysis of 30-day unplanned readmissions and the first study to use nationally representative data to develop risk prediction models for CAS with advanced machine learning such as DNNs. On the basis of AUROC and AUPRC metrics, DNN shows superior performance to commonly used statistical and machine learning methods in modeling CAS readmission rates.

Prior studies looking at readmissions have shown a variety of readmission rates, including vascular interventions in Medicare patients (24%) [17], endovascular aortic aneurysm repair (10.2%) [18], lower extremity bypass (14.8%) [19], endovascular or surgical revascularization for chronic mesenteric ischemia (19.5%) [20], and revascularization for critical limb ischemia (20.4%) [21]. Looking specifically at the readmission rates in the CAS population, most prior studies have compared CEA readmission rates versus CAS. These studies have demonstrated rates of 12.0% and 8.3% for Medicare-only and nationally representative data, respectively, for the CAS cohorts [5, 22]. Other studies have also shown similar readmission rates for CAS patients in the range of 9.6% in the Pennsylvania Health Care Cost Containment Council study by Hintze et al. [23], 10.75% by Greenleaf et al. [24], and 11.11% by Galinanes et al. [25]. All these prior studies have used patient-level data from before 2015 when ICD-9 was in use. Our study gives a glimpse into the most contemporary nationally representative data using ICD-10 codes. In addition, our national evaluation of unplanned 30-day readmissions after CAS has several key findings. Our observation showed that 7.4% of patients undergoing CAS had unplanned readmissions within 30 days of hospital discharge. The decline in readmission rates observed in our study, as compared to aforementioned prior studies, may be related to increased operator and/or hospital experience or may be due to strict inclusion/exclusion criteria employed in our study [26, 27].

In our study, sepsis was one of the leading causes of readmission after CAS. Interestingly, a literature review showed that postoperative surgical-site infection, sepsis/septic shock, pneumonia, and urinary tract infection are known associations with readmission after CEA [28]. Quiroz et al. examined hidden readmissions after CEA and CAS and found that infectious etiologies accounted for 9.9% of readmissions (wound complication 3.7%, sepsis 3.1%, urinary tract infection/pyelonephritis 0.5%, and other infections 2.6%). This indicates that the infection/sepsis rates found in our study were not in excess of those in the existing literature [29].

There are over 40 models for predicting short- and long-term outcomes after carotid revascularization [30]. However, the available prediction models have several potential limitations. First, most of the models have used patient databases that are not representative of the national population. Second, the models have relied on logistic regression, and none have used artificial intelligence to improve the quality of predictions. Third, the existing models have used data from the ICD-9 era, when the specificity of diagnostic codes was significantly inferior to ICD-10 codes. Fourth, none of the short-term models have examined all-cause readmissions; they can only predict stroke or death. The C-statistic (or area under the curve) is considered an important measure of a prediction model's discrimination, with scores of ≤ 0.50, > 0.50–0.70, > 0.70–0.80, > 0.80–0.90, and > 0.90 representing no, poor, acceptable, excellent, and outstanding discrimination, respectively [31, 32]. Volkers et al. presented an excellent external validation study evaluating 30 prediction models for CEA, CAS, or both and found that not a single model had a C-statistic over 0.67 (poor predictive capability) on external validation, proving that, although there are many models to choose from, none truly qualifies as substantially useful to clinicians [30].

Despite the plethora of existing state-of-the-art statistical and machine learning methods for modeling readmission rates, it is noteworthy that our implementation of DNN yielded a model with increased predictive power while also facilitating the identification of features that are clinically relevant and important in their association with readmission after CAS. It was also interesting that, although DNN had lower or similar accuracy compared with traditional methods, our model scored significantly higher on performance metrics such as AUROC and AUPRC. One plausible explanation is that the accuracy of existing models such as logistic regression is heavily biased towards the majority class's proportion as opposed to the minority class samples. Although this inflates the accuracy score, the inherent bias has downstream consequences in high misclassification rates, subsequently resulting in low predictive power [33]. In contrast, DNN is agnostic to such biases: it uses the entire dataset to learn the complex patterns between the variables and then utilizes these patterns to classify the outcome labels even when the data are highly imbalanced. Therefore, on the basis of our evidence, DNN should serve as a premium choice for building more robust and adaptive predictive models for accurate predictions in complex data such as 30-day readmission after CAS.

Comorbidities including depression, heart failure, cancer, in-hospital bleeding, and coagulation disorders were the strongest predictors of readmission based on logistic regression, as shown in Fig. 1. The logistic regression analysis did not include hospital-level factors such as teaching status, control/ownership, or hospital location. It was also evident that most comorbidities had overlapping predictive power, which would make it challenging to develop a robust prediction model from this analysis alone; logistic regression-based prediction models would therefore be unlikely to perform well in a clinical setting. Our study improved upon this by using DNN and hospital-level data to identify novel predictors of early readmission, as shown in Fig. 3. The DNN model provided a more refined view, showing that factors such as hospital rural–urban designation, control/ownership, and teaching status are among the strongest predictors, in addition to newly identified comorbidities, for identifying patients at risk of early readmission after CAS. Patients with these comorbidities were more likely to be readmitted, which may reflect disease progression. Further prospective research is needed to determine real impact and causal associations.

Reasons for readmission based on the primary diagnosis code were septicemia (8.6%), cerebral infarction (8.6%), heart failure (5.9%), acute hemorrhagic cerebrovascular disease (4.2%), acute renal failure (4.2%), and gastrointestinal hemorrhage (3.6%). These findings further affirm the need for robust prediction models to help decrease unplanned readmissions and comorbidities. Interestingly, in our study, insurance status (Medicare, Medicaid, or private) and hospital bed size had no significant impact on unplanned readmissions. In contrast, patients with higher scores on the mortality/severity-of-illness subclasses of the APR-DRG had worse outcomes in terms of all-cause readmission. Patients treated at private hospitals, compared with government nonfederal hospitals, and those at metropolitan hospitals were at increased risk of unplanned readmission. This may be attributable to differences in practice patterns across hospital locations or subtypes, or secondary to the case volume and physician experience at a given location [26, 27].

Multiple prediction models have been developed in the past, mostly examining outcomes such as recurrent stroke, myocardial infarction, or death. Unfortunately, none of them offers a tool for predicting short-term all-cause readmission risk. These studies also lack predictive power, especially on external validation: Volkers et al., in an excellent external validation study of 30 prediction models for CEA, CAS, or both, found that not a single model had an AUC over 0.67 on external validation [30]. This further proves the point that, although there are many models to choose from, none truly qualifies as substantially useful in current-day practice. Our prediction model is novel in many ways: it is the first to use nationally representative data in the contemporary ICD-10 era and the first to use machine learning to predict all-cause unplanned short-term readmissions. The AUC score of 0.79 for DNN is very robust in predicting our primary outcome.

Study Limitations

The NRD is annualized data with a maximum follow-up of 1 year. As with any observational data, the results do not imply a causal relationship, as there may be unmeasured confounders. The NRD does not provide pharmacological or lesion-level data that may affect readmissions. Lastly, the presented risk scores have not been externally validated and currently apply only to the US population.

Conclusion

Our analysis suggests that 7.4% of patients are readmitted within 30 days of discharge after undergoing CAS, with septicemia and cerebral infarction/hemorrhagic cerebrovascular bleed as the major causes of unplanned readmission. We demonstrate that, using machine learning approaches, we are able to develop a risk prediction model that identifies patients at high risk of unplanned readmission with a C-statistic of 0.802 using DNN. Our work is an exemplar of how machine learning techniques can be used to identify patients at high risk of unplanned readmission for targeted interventions, which, if efficacious, may represent significant savings to the wider healthcare economy. We plan to acquire funding to develop an easy-to-use online tool and a software plug-in for existing electronic medical record software to allow quick assessment of readmission risk in patients undergoing CAS.