Introduction

Multidrug-resistant organisms (MDROs) are bacteria that are simultaneously resistant to three or more classes of antibiotics. An infection caused by such bacteria is called an MDRO infection [1]. Intensive care unit (ICU) patients are in critical condition and require various invasive procedures [2]. Thus, the ICU is regarded as the hospital area hardest hit by MDRO infection [3]. Approximately 50% of ICU patients in developing countries suffer from at least one hospital-acquired infection; the corresponding rate for developed countries is 25% [4]. Antibiotic abuse and bacterial mutation have increased both the number of MDROs and the level of drug resistance [5]. The World Health Organization stated that bacterial resistance can cause a massive burden of disease, including social and medical expenditure, and will lead to a decline in global gross domestic product of 1.40–1.60% [6]. A multicenter prospective cohort study showed that the 30-day mortality rates of patients infected with carbapenem-resistant Klebsiella pneumoniae in China, the United States, and South America were 12.00%, 23.00%, and 28.00%, respectively [7]. However, there is no specific therapy for MDRO infection.

If the risk of MDRO infection is predicted early and appropriate interventions are taken in time, the MDRO colonization rate of ICU patients can be effectively reduced, as can the chance of self-infection and of cross-infection between patients and healthcare workers [8]. However, drug sensitivity tests and microbial cultures take 24 to 72 h to return results, which creates a "lag" in determining the infection status of patients. Given the potential benefits of predictive models for MDRO infection, many researchers have developed models based on logistic regression (LR) to predict the risk of MDRO infection [9,10,11]. Unfortunately, LR has the following shortcomings. It requires a specific linear relationship between the independent variables and the transformed dependent variable. Moreover, the LR model lacks the ability for self-learning and iteration, so once the time period or population characteristics change, the model tends to underperform [12].

The backpropagation neural network (BPNN), one of the most widely used neural network methods, is a multilayer feedforward neural network trained with the error backpropagation algorithm [13]. Compared with traditional LR, the advantage of BPNN is that it needs no prior knowledge of the mapping between independent and dependent variables. As long as sufficient samples are provided for training, it can learn the nonlinear mapping from input to output variables. BPNN can accept all kinds of independent variables simultaneously without any form of variable transformation, which preserves the information in the data to the greatest extent [14]. In addition, BPNN has strong self-learning and adaptive ability and can continue to update and improve its performance during use [12]. The BPNN model has been used to construct disease diagnosis and prognosis prediction models and has achieved sound predictive performance [15, 16]. Nevertheless, as far as we know, no study has used it to predict the risk of MDRO infection in ICU patients. Therefore, this study aims to establish an MDRO infection model based on BPNN that identifies high-risk factors and high-risk groups for MDRO infection early and guides the implementation of interventions to reduce the incidence of MDRO infection in ICU patients.

Methods

Study population

We retrospectively collected data from patients who received treatment in the ICU of the Affiliated Hospital of Qingdao University from July 2021 to January 2022. The primary cohort enrolled 688 critically ill patients. For external validation, patients treated at the same study center from May 2022 to July 2022 formed the validation set.

All adults (aged ≥ 18 years) who had at least one microbial culture performed during ICU hospitalization were enrolled in this study. Patients who died or left the ICU within 48 h, had incomplete case data, or were diagnosed with MDRO infection prior to ICU admission were excluded. For patients with multiple ICU admissions during hospitalization, only the first admission was included in the analysis.

This study was approved by the Ethics Committee of Qingdao University Medicine (QDU-HEC-2021173). As this study was retrospective and the data were anonymized, informed consent was waived.

Data collection

We obtained patient information through the hospital infection surveillance and electronic medical records systems. Initial candidate factors that may be associated with MDRO infection included general data, invasive procedures, medication, laboratory indicators, and clinical scores. General data included gender, age, body mass index, length of hospitalization, length of ICU stay, and comorbid diseases (diabetes, hypertension, chronic lung disease, liver disease, chronic renal disease, congestive heart failure, and cerebrovascular disease). Invasive procedures included surgical situations, mechanical ventilation, central venous catheters, gastrointestinal decompression, peripherally inserted central venous catheters, extracorporeal membrane oxygenation, urinary catheters, and other drainage tubes in the ICU. Medication included antibiotic use, hormone therapy, and nutritional support therapy during the ICU stay. Laboratory indicators included albumin, prealbumin, C-reactive protein, procalcitonin, white blood cells, blood urea nitrogen, and creatinine within the first 24 h of the ICU stay. The scores included the APACHE II score, Glasgow coma scale, and nutrition risk screening (NRS)-2002 score within 24 h of ICU admission. Comorbid diseases were diagnosed according to the International Classification of Diseases, 10th Revision (ICD-10) codes [17].

Specimens for microbiologic culture were obtained from blood, urine, sputum, pus, drainage fluid, and secretions. The VITEK 2 Compact automated microbial identification and drug sensitivity analysis system was used to identify cultured strains, and the Kirby–Bauer disk diffusion method was used for drug sensitivity testing. The definition of MDRO was based on the provisional standard definition published by Magiorakos and other experts [18]. Long-term bed rest was defined as being bedridden for at least 15 days, with more than 90% of each day spent in bed. The surgical situation included the grade of the operation, the classification of the incision, and the healing of the incision.

Screening for risk factors

Patients were categorized into MDRO-infected and non-MDRO-infected groups according to the presence or absence of MDRO infection during the ICU stay. We combined Lasso and stepwise regression to screen risk factors. Lasso regression used tenfold cross-validation to select the optimal penalty coefficient (lambda). Variables whose coefficients were not shrunk to zero were considered related to the dependent variable and were retained [19]. Lasso avoids feeding too many independent variables into the BPNN model, thereby reducing the network's complexity and computation and improving the model's prediction accuracy. Then, stepwise regression was applied to further select the optimal combination of independent variables. In this method, variables are introduced one at a time. After each new variable is introduced, the variables already selected in the regression model are tested one by one, and those that are no longer meaningful are removed [20]. This process continues until no new variables are introduced and no old variables are removed. Variables with two-sided P < 0.05 were identified as independent risk factors for MDRO infection.

Development and validation of the BPNN model

The confirmed independent risk factors for MDRO infection were used as input variables to construct a BPNN model. The BPNN algorithm uses gradient descent to continuously adjust the weights and thresholds between layers through backpropagation so as to minimize the network's sum of squared errors [21].
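The training loop described above can be illustrated with a minimal sketch. It is written in plain Python for readability and is not the study's implementation (the study used the R "nnet" package). The toy XOR data, learning rate, weight initialization, and number of epochs are assumptions chosen only for illustration; the single hidden layer with three logistic nodes mirrors the structure reported for the model.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_network(n_inputs, n_hidden, seed=0):
    """Small random weights; the last entry of each weight list is the bias (threshold)."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)] for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output

def forward(network, x):
    """Return hidden activations and the predicted probability for one sample."""
    hidden, output = network
    h = [sigmoid(sum(w * v for w, v in zip(ws[:-1], x)) + ws[-1]) for ws in hidden]
    y = sigmoid(sum(w * v for w, v in zip(output[:-1], h)) + output[-1])
    return h, y

def sum_squared_error(network, X, Y):
    return 0.5 * sum((forward(network, x)[1] - t) ** 2 for x, t in zip(X, Y))

def train_epoch(network, X, Y, lr):
    """One full-batch gradient descent step using backpropagated errors."""
    hidden, output = network
    grad_hidden = [[0.0] * len(ws) for ws in hidden]
    grad_output = [0.0] * len(output)
    for x, t in zip(X, Y):
        h, y = forward(network, x)
        delta_out = (y - t) * y * (1.0 - y)          # error signal at the output node
        for j, hj in enumerate(h):
            grad_output[j] += delta_out * hj
            delta_h = delta_out * output[j] * hj * (1.0 - hj)  # error propagated back
            for i, xi in enumerate(x):
                grad_hidden[j][i] += delta_h * xi
            grad_hidden[j][-1] += delta_h
        grad_output[-1] += delta_out
    for j in range(len(hidden)):
        for i in range(len(hidden[j])):
            hidden[j][i] -= lr * grad_hidden[j][i]
    for j in range(len(output)):
        output[j] -= lr * grad_output[j]

# Toy, made-up data (XOR), not patient data: two inputs, one binary outcome.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
Y = [0.0, 1.0, 1.0, 0.0]

network = init_network(n_inputs=2, n_hidden=3)
initial_sse = sum_squared_error(network, X, Y)
for _ in range(2000):
    train_epoch(network, X, Y, lr=0.5)
final_sse = sum_squared_error(network, X, Y)
print(f"SSE before training: {initial_sse:.4f}, after: {final_sse:.4f}")
```

Each epoch accumulates the gradient of the squared error over all samples before updating, so the weights and biases move together in the direction that lowers the total error.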

The data of the primary cohort were randomly divided into a training set and a test set in an 8:2 ratio. The training set was used to construct the model, and the test set was used to evaluate the model's ability to discriminate new samples. To further evaluate the generalizability of the model, external validation was performed by temporal validation, that is, with patients from the same study center in a different time period. At this stage, patient data were collected mainly for the independent risk factors confirmed during model construction.
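As a small illustration of the 8:2 split, the sketch below shuffles patient indices and cuts them at 80%. The seed and the choice to round the training size down are assumptions; the source does not describe how the random split was generated. With 688 patients this reproduces the reported set sizes.

```python
import random

def split_train_test(patient_ids, train_fraction=0.8, seed=2023):
    """Shuffle the IDs reproducibly and cut at the training fraction (rounded down)."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # the seed is a hypothetical choice
    n_train = int(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

train_ids, test_ids = split_train_test(range(688))
print(len(train_ids), len(test_ids))  # 550 138
```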

Statistical analysis

All variables in this study had less than 5% missing values, which were filled by mean imputation. Outliers were defined as values below the first quartile minus 1.5 times the interquartile range (IQR) or above the third quartile plus 1.5 times the IQR. Outliers in the data were replaced with mean values [22].
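The quartile-based outlier rule and mean replacement described above can be sketched as follows. Two details are assumptions, since the source does not specify them: the quartiles use Python's "inclusive" interpolation method, and the replacement mean is computed from the observed values that are not outliers. The albumin values are made up for illustration.

```python
import statistics

def iqr_fences(values):
    """Return (lower, upper) fences: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def clean_column(column):
    """Replace missing values (None) and outliers with the mean of the remaining values."""
    observed = [v for v in column if v is not None]
    lower, upper = iqr_fences(observed)
    inliers = [v for v in observed if lower <= v <= upper]
    fill = statistics.mean(inliers)  # assumption: mean excludes the outliers themselves
    return [v if v is not None and lower <= v <= upper else fill for v in column]

# Hypothetical albumin values (g/L) with one missing entry and one implausible entry.
albumin = [31.0, 29.5, None, 33.0, 30.5, 95.0, 32.0]
cleaned = clean_column(albumin)
print(cleaned)
```

Here the missing entry and the value 95.0 are both replaced by the mean of the five remaining values.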

Continuous data were described as mean ± standard deviation or median and interquartile range (IQR), and group comparisons were performed using Student's t test or the Mann–Whitney U test. Categorical data were expressed as frequency and percentage, and groups were compared using the Chi-square test or Fisher's exact test.

In this study, Lasso and stepwise regression were performed using the "glmnet" and "MASS" packages of R 4.2.3. The BPNN model was constructed with the "nnet" package of R 4.2.3. The model's predictive performance was evaluated in terms of discrimination and calibration. Discrimination was assessed by accuracy, sensitivity, specificity, and area under the curve (AUC). Calibration was examined with calibration curves.
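The discrimination measures named above can be computed as in the following sketch. The 0.5 classification threshold is an assumption (the source does not state the cut-off used), the AUC is computed as the probability that a randomly chosen infected patient receives a higher score than a randomly chosen non-infected patient, and the labels and probabilities are made up.

```python
def discrimination_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a fixed probability threshold."""
    tp = sum(1 for t, p in zip(y_true, y_prob) if t == 1 and p >= threshold)
    fn = sum(1 for t, p in zip(y_true, y_prob) if t == 1 and p < threshold)
    tn = sum(1 for t, p in zip(y_true, y_prob) if t == 0 and p < threshold)
    fp = sum(1 for t, p in zip(y_true, y_prob) if t == 0 and p >= threshold)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

def auc(y_true, y_prob):
    """Rank-based AUC: share of positive/negative pairs ranked correctly (ties count half)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical outcomes (1 = MDRO infection) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

metrics = discrimination_metrics(y_true, y_prob)
print(metrics, auc(y_true, y_prob))
```

Unlike accuracy, sensitivity, and specificity, the AUC does not depend on the chosen threshold, which is why it is usually reported alongside them.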

Results

Baseline characteristics

Figure 1 shows the flow chart of patient screening. A total of 3673 patients were screened, including 2764 in the primary cohort and 909 in the validation set. In the primary cohort, 2031 patients were excluded because they did not meet the inclusion and exclusion criteria. After a further 46 patients with community-acquired MDRO infection were ruled out, a total of 688 patients were identified, including 550 patients in the training set and 138 in the test set. The incidence of MDRO infection in the ICU was 15.84% (109/688), with carbapenem-resistant Acinetobacter baumannii (CR-AB) having the highest detection rate. The detection rates of other types of drug-resistant bacteria are shown in Additional file 1: Table S1. There were 259 (37.65%) females and 429 (62.35%) males. The median age was 65.50 (IQR, 53.00–74.00) years. The median body mass index was 23.88 (IQR, 21.48–26.44) kg/m². The median length of hospitalization and ICU stay were 19.00 (IQR, 11.00–29.00) days and 9.00 (IQR, 5.00–16.00) days, respectively (Table 1).

Fig. 1

Flowchart of patient selection

Table 1 Demographic and clinical characteristics at baseline in the primary cohort

After excluding 658 patients who did not meet the inclusion and exclusion criteria and 13 patients with non-first ICU admissions, 238 patients were enrolled for external validation. The prevalence of nosocomial MDRO infection was 13.03% (31/238). The specific characteristics of the patients are summarized in Table 2. Comparisons of parameters between the primary cohort and the validation set are presented in Additional file 1: Table S2.

Table 2 Demographic and clinical characteristics of patients in the validation set

Independent risk factors for MDRO infection

In our study, Lasso used tenfold cross-validation and took the largest lambda whose mean error was within one standard error of the minimum (lambda.1se) as the optimal lambda. As shown in Fig. 2, the optimal lambda was 0.033, corresponding to 11 variables with non-zero coefficients: NRS-2002 score, APACHE II, number of antibiotics, duration of combination, chronic lung disease, hypoproteinemia, invasive operation before ICU, antibiotic use before ICU, length of hospitalization, length of ICU stay, and long-term bed rest.

Fig. 2

Feature selection by Lasso. A Tenfold cross-validation for selecting the optimal lambda (λ) parameter in the LASSO model. The two dashed lines in the cross-validation plot mark the λ with the minimum cross-validated error (lambda.min) and the largest λ whose error is within one standard error of that minimum (lambda.1se). We took lambda.1se as the optimal λ. B Binomial deviance plotted against log(λ), where λ is the tuning parameter, from the Lasso cross-validation results. LASSO: least absolute shrinkage and selection operator
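The lambda.1se rule mentioned in the text can be made concrete with a short sketch. It follows the convention used by glmnet: find the lambda with the lowest mean cross-validated error, add that point's standard error to get a limit, and take the largest lambda whose mean error does not exceed the limit. The lambda grid and error values below are invented for illustration and are not the study's cross-validation results.

```python
def select_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from per-lambda cross-validation summaries."""
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    limit = cv_mean[i_min] + cv_se[i_min]
    # Largest penalty (sparsest model) whose error is still within one SE of the best.
    lambda_1se = max(lam for lam, err in zip(lambdas, cv_mean) if err <= limit)
    return lambdas[i_min], lambda_1se

# Hypothetical cross-validation curve (binomial deviance per lambda).
lambdas = [0.20, 0.10, 0.05, 0.02, 0.01]
cv_mean = [0.880, 0.840, 0.815, 0.800, 0.806]
cv_se = [0.020, 0.019, 0.018, 0.018, 0.019]

lambda_min, lambda_1se = select_lambdas(lambdas, cv_mean, cv_se)
print(lambda_min, lambda_1se)  # 0.02 0.05
```

Choosing lambda.1se instead of lambda.min trades a small loss in cross-validated error for a stronger penalty, which typically leaves fewer variables with non-zero coefficients.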

On this basis, variables were further analyzed using backward stepwise regression. APACHE II (OR 1.06, CI 1.02–1.10; P = 0.002), quantity of antibiotics (OR 1.81, CI 1.18–2.78; P = 0.002), chronic lung disease (OR 2.02, CI 1.02–3.97; P = 0.04), hypoproteinemia (OR 3.59, CI 1.21–10.35; P = 0.01), invasive operation before ICU (OR 2.20, CI 1.17–4.11; P = 0.01), antibiotic use before ICU (OR 2.95, CI 1.58–5.53; P < 0.001), length of hospitalization (OR 1.04, CI 1.02–1.10; P < 0.001), length of ICU stay (OR 1.02, CI 1.00–1.05; P = 0.04), and long-term bed rest (OR 3.69, CI 1.80–8.12; P < 0.001) were risk factors for MDRO infection (Table 3).

Table 3 Multivariable logistic analysis for MDRO infection
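For the continuous predictors, the reported odds ratios can look small because, under the usual reading of logistic regression, they apply to a one-unit increase (one APACHE II point or one day). The sketch below shows how such an odds ratio compounds over a larger difference. It assumes the ORs are per-unit and that the effect is linear on the log-odds scale; the 10-point and 7-day differences are arbitrary examples.

```python
import math

def scaled_odds_ratio(or_per_unit, units):
    """Odds ratio implied for a difference of `units`, given a per-unit odds ratio."""
    return math.exp(units * math.log(or_per_unit))

# Per-unit ORs taken from the text; the unit differences are illustrative choices.
apache_10_points = scaled_odds_ratio(1.06, 10)
icu_stay_7_days = scaled_odds_ratio(1.02, 7)
print(f"APACHE II, +10 points: OR {apache_10_points:.2f}")
print(f"ICU stay, +7 days: OR {icu_stay_7_days:.2f}")
```

Under these assumptions, a 10-point difference in APACHE II corresponds to an odds ratio of roughly 1.79, which is comparable in size to several of the binary predictors.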

Construction and evaluation of the BPNN model

The nine independent risk factors screened above were used as input variables to develop the BPNN model (Fig. 3). The model parameters were as follows: Activation (nonlinear function), logistic; hidden_layer (number of hidden layers), 1; sizes (number of hidden layer nodes), 3; max_iter (number of iterations), 10; and linout (output function), logistic.

Fig. 3

BPNN model for predicting MDRO infection. BPNN: backpropagation neural network; MDRO: multidrug-resistant organism

The model's predictive performance was assessed by AUC, accuracy, sensitivity, and specificity, as shown in Table 4. The AUCs of the training set and test set were 0.889 and 0.919, respectively. The validation set also showed good discrimination, with a somewhat lower AUC of 0.811. Comparisons of the AUCs for the training set, test set, and validation set are depicted in Fig. 4. Calibration curves of the test and validation sets showed that the model was well calibrated (Fig. 5).

Table 4 Performance of the BPNN model in the training, test and validation set
Fig. 4

AUCs of the BPNN model of MDRO infection. The x-axis represents 1 − specificity, and the y-axis represents sensitivity. The areas under the red, green, and blue lines are the AUCs of the training, test, and validation sets, respectively. AUC: area under the curve; BPNN: backpropagation neural network; MDRO: multidrug-resistant organism

Fig. 5

Calibration curves of the BPNN model. The x-axis represents the predicted probability of MDRO infection. The y-axis represents the actual (observed) probability of MDRO infection. The blue solid line represents perfect prediction, where the predicted probability equals the actual probability. The black line represents the performance of the BPNN model. The closer the model's calibration curve is to the blue line, the better the model's predictions. A Calibration curve of the test set. B Calibration curve of the validation set. BPNN: backpropagation neural network; MDRO: multidrug-resistant organism
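The idea behind a calibration curve can be shown with a short sketch: patients are grouped by predicted probability, and within each group the mean prediction is compared with the observed proportion of infections. The equal-width binning and the number of bins are assumptions, since the source does not describe how its curves were constructed, and the data are made up.

```python
def calibration_bins(y_true, y_prob, n_bins=5):
    """Return (mean predicted probability, observed event rate, count) for each non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for outcome, prob in zip(y_true, y_prob):
        index = min(int(prob * n_bins), n_bins - 1)  # a probability of 1.0 goes in the last bin
        bins[index].append((outcome, prob))
    rows = []
    for members in bins:
        if members:
            mean_predicted = sum(prob for _, prob in members) / len(members)
            observed_rate = sum(outcome for outcome, _ in members) / len(members)
            rows.append((mean_predicted, observed_rate, len(members)))
    return rows

# Hypothetical outcomes (1 = MDRO infection) and predicted probabilities.
y_true = [0, 0, 0, 0, 1, 1, 1]
y_prob = [0.05, 0.10, 0.15, 0.30, 0.35, 0.70, 0.90]

for mean_predicted, observed_rate, count in calibration_bins(y_true, y_prob):
    print(f"predicted {mean_predicted:.3f}  observed {observed_rate:.3f}  n={count}")
```

A well-calibrated model produces points that lie close to the diagonal where the mean predicted probability equals the observed rate.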

Feature importance ranking in the BPNN model

In the BPNN model, the top 5 risk factors affecting MDRO infection were the length of hospitalization, the length of ICU stay, long-term bed rest, antibiotics use before ICU, and APACHE II (Fig. 6).

Fig. 6

Ranking of feature importance in the BPNN model. BPNN: backpropagation neural network

Discussion

In this study, we developed and validated an MDRO infection prediction model for ICU patients based on the BPNN algorithm. The model included nine scientifically sound and clinically accessible independent risk factors: length of hospitalization, length of ICU stay, long-term bed rest, antibiotic use before ICU, APACHE II, invasive operation before ICU, quantity of antibiotics, chronic lung disease, and hypoproteinemia. Using only a handful of variables, the BPNN model achieved good performance, with high accuracy and sensitivity for predicting MDRO infection in ICU patients. Furthermore, we found that the drug-resistant bacteria causing infection in ICU patients were mainly Gram-negative bacteria, especially CR-AB. CR-AB can survive for several days on dry surfaces and can also asymptomatically colonize the skin, respiratory tract, and intestines. Therefore, active monitoring of CR-AB should be strengthened in the population at high risk of MDRO infection.

A prediction model can forecast an individual's risk of MDRO infection based on predictors, providing theoretical support for the early identification of high-risk groups and better guidance for formulating MDRO infection management strategies [23, 24]. A growing number of scholars have begun to explore the construction of MDRO infection prediction models. Wang et al. collected data from 331 patients and applied univariate analysis followed by multivariate analysis. Three risk factors were finally integrated to build an MDRO infection prediction model with an AUC of 0.77 (95% CI 0.70–0.84) [11]. However, that model's limited performance in predicting MDRO infection risk may be due to other valuable independent variables being overlooked during data analysis. The relationships between variables in the ICU are complex and may be linear or nonlinear. LR, however, by default models linear relationships between the independent and dependent variables and may oversimplify complex nonlinear relationships. BPNN has been widely applied in the medical field because of its unique advantages, including in disease diagnosis, disease classification, and prognosis prediction. In this study, the MDRO infection prediction model was constructed using BPNN. The AUCs of the training and test sets were 0.889 and 0.919, respectively. Compared with previous MDRO infection models [25,26,27], the predictive performance of our BPNN model was improved. In addition, we collected data from 238 ICU patients for external validation. The AUC, accuracy, sensitivity, and specificity were 0.811, 0.852, 0.806, and 0.715, respectively. These results demonstrated that the BPNN model had good discrimination. They also suggested that our model had good external applicability and could be used clinically for early prediction of MDRO infection in ICU patients.

In the BPNN model, length of hospitalization, length of ICU stay, long-term bed rest, antibiotic use before ICU, and APACHE II score were the top 5 predictors of MDRO infection. Length of hospitalization and ICU stay were correlated with MDRO infection in ICU patients, which agreed with the conclusions of previous studies [28]. Compared with the non-ICU environment, there are more bacterial isolates in the ICU environment, and their antibiotic susceptibility is generally lower. ICU patients are therefore more likely to be directly or indirectly exposed to MDROs [29]. In line with previous evidence, this study also found that long-term bed rest was an independent risk factor for MDRO infection [30]. A meta-analysis showed that prior use of antibiotics, especially third-generation cephalosporins, was more common in the multidrug-resistant Gram-negative infection group than in the non-infected group and significantly increased Gram-negative resistance [31]. This study showed the same result: antibiotic use before ICU was an independent risk factor for MDRO infection. The APACHE II score is a tool for evaluating the severity of a patient's disease and predicting prognosis. A previous study found that the higher the APACHE II score, the greater the likelihood of MDRO infection and mortality [32]. This study similarly found that the APACHE II score was positively associated with MDRO infection. Previous studies have also shown associations between MDRO infection and major surgery before ICU admission [33], quantity of antibiotics [34], chronic lung disease [35], and hypoproteinemia [36].

This study has the following advantages. We combined Lasso and stepwise regression to screen for risk factors in order to avoid multicollinearity and overfitting. In addition, compared with LR, the BPNN algorithm has strong fault tolerance, nonlinear mapping ability, self-learning and adaptive ability, and generalization ability. Thus, BPNN was employed to mine the characteristics of the data and develop our MDRO infection model for ICU patients.

However, our study undeniably has some limitations. First, it was a single-center retrospective modeling study, which prevented us from determining causal relationships between predictors and outcomes. Therefore, further prospective clinical studies are needed to verify the validity of our model. Second, the retrospective and observational data may have introduced selection bias. Finally, although external validation was performed in this study, it was limited to data from the same center. In subsequent studies, the sample size can be further expanded, and multicenter data can be added to optimize the structure of the BPNN model.

Conclusion

We combined Lasso and backward stepwise regression to screen out nine predictors and built a BPNN model for MDRO infection in ICU patients based on them. The model showed good predictive performance and may be an effective instrument for identifying groups at high risk of MDRO infection at an early stage and for helping medical personnel intervene early to reduce the rate of MDRO infection in critically ill patients.