Introduction

Urinary sepsis due to upper urinary tract obstruction is most commonly caused by ureteral stones [1] and has a high risk of potentially serious complications such as septic shock and/or disseminated intravascular coagulopathy [2,3,4]. The high mortality rate of up to 26% emphasizes the importance of immediate treatment of concomitant urinary tract infection (UTI) in patients presenting with symptomatic ureterolithiasis. Accurate identification of these patients could therefore help to guide clinical decision-making and reduce the rate of sepsis development. Moreover, particularly in an era of increasing antibiotic resistance, the number of unnecessary antibiotic treatments could be reduced [5].

However, early diagnosis and treatment of concomitant UTI is challenging, as urine cultures, the gold standard for diagnosis of UTI, usually require an incubation time of at least 24 h. For diagnosis of a concomitant UTI in patients with symptomatic ureterolithiasis, clinicians therefore frequently base their diagnostic and clinical assessment on the subjective interpretation of symptoms and of physical and laboratory findings. This approach bears the risk of over- or undertreatment with antibiotics and/or emergent surgical interventions [6]. While previous studies identified several blood and urine laboratory biomarkers for the prediction of UTI, a combined analysis incorporating multiple biomarkers into a multivariable predictive model has not yet been performed [7,8,9,10,11,12,13,14,15,16,17,18].

We therefore analyzed a consecutive cohort of patients and used a machine-learning-based approach to identify the variables that offer the highest discriminatory power. We hypothesized that this approach would enable the development of a logistic regression model that can accurately identify patients at risk of concomitant UTI by predicting a positive midstream urine culture at the time of admission and thereby guide clinical decision-making.

Patients and methods

Patient population

We retrospectively reviewed data from a consecutive cohort of patients who presented at our tertiary care emergency department due to symptomatic ureterolithiasis between 2011 and 2017. Exclusion criteria were missing follow-up, age under 18 years, nephrolithiasis only, reported ongoing antibiotic therapy, anatomical aberrations, solitary kidney, missing laboratory values, chronic bacteriuria due to an indwelling transurethral catheter, ureteric stent, and/or incontinence. The review of the patient cohort included the following variables: patient’s age, gender, symptoms on admission (e.g., nausea, chills, abdominal pain, flank pain, dysuria, pollakiuria, and gross hematuria), medical conditions (e.g., immunosuppression including organ transplant recipient, diabetes mellitus, human immunodeficiency virus (HIV) infection, and autoimmune/rheumatoid disease), vital signs on admission (e.g., body temperature, blood pressure, and heart rate), laboratory work-up on admission, including extended blood sample analysis [including serum levels of creatinine, C-reactive protein (CRP), hemoglobin, platelets, neutrophils, lymphocytes, and leukocytes], dipstick urine analysis on admission (including urine erythrocytes, leukocyte esterase, pH, and nitrite), as well as radiological findings on admission (e.g., renal pelvis ectasia, perirenal stranding, and fornix rupture). Continuous blood laboratory values were converted to categorical variables (decreased, normal, or increased) based on local laboratory reference ranges. Urine dipstick values were classified as negative/normal or elevated based on the local laboratory report. Urinary tract infection was defined as a positive midstream urine culture with > 10⁴ colony forming units per milliliter (CFUs/mL), excluding bacteria that are clinically non-relevant or indicate contamination (e.g., Lactobacillus spp., Gardnerella vaginalis, and/or Streptococcus spp.). The study was approved by the local ethics committee (BASEC-Nr. 2017-02036).
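
The conversion of continuous laboratory values into the three categories described above can be sketched as follows. This is a minimal illustration in Python, not the code used in the study (which was carried out in R). The function name `categorize_lab_value` and the numeric range are hypothetical placeholders and do not correspond to any local laboratory reference range.

```python
def categorize_lab_value(value, lower, upper):
    """Map a continuous laboratory value to 'decreased', 'normal', or 'increased'.

    `lower` and `upper` are the bounds of the reference range (inclusive).
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if value < lower:
        return "decreased"
    if value > upper:
        return "increased"
    return "normal"


# Placeholder range for illustration only, NOT a reference range from the study.
reference_range = (10.0, 20.0)
for value in (8.5, 15.0, 23.1):
    print(value, categorize_lab_value(value, *reference_range))
```

Values on the boundary are treated as normal in this sketch; a real implementation would follow the convention of the reporting laboratory.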

Statistical analysis

Continuous normally distributed variables are expressed as mean ± standard deviation (SD), continuous non-normally distributed variables are presented as median with interquartile range (IQR), and categorical variables are presented as percentage. To simulate external validation and to perform a true model performance evaluation, we randomly divided patients into a training cohort (80%) and a testing cohort (20%). Patients’ characteristics in the training set and testing data set were compared using Wilcoxon rank-sum test, Chi-square test of independence, Kruskal–Wallis test, or Fisher's exact test, as appropriate.
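
A random 80/20 split of this kind can be sketched in a few lines. The example below is an assumption-laden illustration in Python rather than the R code used for the analysis: the function name `split_train_test`, the seed, and the use of integer patient identifiers are all hypothetical. With 705 patients, an 80% training fraction yields cohorts of 564 and 141 patients.

```python
import random


def split_train_test(patient_ids, train_fraction=0.8, seed=42):
    """Randomly partition patient identifiers into training and testing cohorts."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed so that the split is reproducible
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


# Hypothetical identifiers 0..704 standing in for the 705 patients.
train_ids, test_ids = split_train_test(range(705))
print(len(train_ids), len(test_ids))
```

Fixing the seed keeps the cohort assignment stable between runs, which matters when baseline characteristics of the two cohorts are compared afterwards.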

For fitting of the prognostic model, the least absolute shrinkage and selection operator (LASSO) approach with tenfold cross-validation was used to select the most relevant predictors from all available variables. Predictive mean matching was used to impute missing values in the training data set. During the LASSO procedure, a penalty (the sum of the absolute values of the regression coefficients multiplied by a tuning parameter, lambda (λ)) is used to shrink the absolute values of the regression coefficients of the assessed variables. Following this approach, some regression coefficients are shrunk to zero. The corresponding variables hold little-to-no discriminatory power and were not used during the fitting of the final model. The optimal value of λ was determined by tenfold cross-validation in the training set. To do so, the area under the curve (AUC) across the cross-validation folds was calculated for different values of λ. The value of λ that minimizes the cross-validated error is denoted λmin. However, the value of λ that has empirically been shown to create the most parsimonious yet informative model is λ1.se, which was used during the fitting of the final model. λ1.se is defined as the largest value of λ for which the mean cross-validated error is within one standard error of the minimum [19]. Variables whose LASSO coefficients were not equal to zero at λ1.se were subsequently extracted and used during the fitting of the final model. This cross-validation process reduces the risk of overfitting and is a way of assessing how a model will perform in an independent dataset. In summary, the LASSO procedure allows a machine-learning-based variable selection for the fitting of predictive or prognostic models. It has been suggested to be particularly well suited for variables that show high levels of multicollinearity [20, 21].
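
The choice between λmin and λ1.se can be illustrated with a short sketch of the one-standard-error rule. It is written in Python for illustration and is not the R routine used in the analysis. The function name `select_lambda_1se` and the per-fold error values are hypothetical, and the error is assumed to be a quantity where lower is better (for example 1 − AUC).

```python
from math import sqrt
from statistics import mean, stdev


def select_lambda_1se(cv_errors):
    """Return (lambda_min, lambda_1se) from per-fold cross-validated errors.

    `cv_errors` maps each candidate lambda to a list of per-fold errors,
    where a lower error indicates a better model.
    """
    summary = {}
    for lam, errors in cv_errors.items():
        standard_error = stdev(errors) / sqrt(len(errors))
        summary[lam] = (mean(errors), standard_error)

    # Lambda with the lowest mean cross-validated error.
    lambda_min = min(summary, key=lambda lam: summary[lam][0])
    min_error, min_se = summary[lambda_min]

    # Largest penalty (sparsest model) whose mean error is still within
    # one standard error of the minimum.
    lambda_1se = max(
        lam for lam, (err, _) in summary.items() if err <= min_error + min_se
    )
    return lambda_min, lambda_1se


# Invented per-fold errors for four candidate penalties (illustration only).
cv_errors = {
    0.001: [0.20, 0.22, 0.18, 0.21, 0.19],
    0.01: [0.17, 0.19, 0.15, 0.18, 0.16],
    0.05: [0.18, 0.19, 0.16, 0.18, 0.17],
    0.2: [0.30, 0.33, 0.28, 0.31, 0.29],
}
print(select_lambda_1se(cv_errors))
```

In this invented example the lowest mean error occurs at λ = 0.01, but the stronger penalty λ = 0.05 performs within one standard error of it and is therefore preferred as the more parsimonious choice.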

The selected variables were then used to fit a logistic regression model for prediction of UTI. To evaluate the discrimination ability of this model, the AUC of receiver-operating characteristic (ROC) curves was calculated for both the training and the testing cohort. AUCs were statistically compared using DeLong’s test. The differences between predicted probabilities and the observed proportions were assessed using calibration plots. The Hosmer–Lemeshow test was used to check the goodness-of-fit of the final logistic regression model. Internal validation was performed using 200 bootstrap re-samples as a means of calculating the most unbiased predictive accuracy. Based on the logistic regression model, a nomogram was developed to guide clinical decision-making. Finally, decision curve analysis (DCA) was used to evaluate the clinical net-benefit of the model. All reported p values were two-sided, and statistical significance was set at p < 0.05. All statistical analyses were performed using R (Version 4.0.3, Vienna, Austria, 2020).
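
For readers less familiar with the AUC, the following sketch shows how it can be computed directly from predicted probabilities and observed outcomes. It uses the rank-based interpretation of the AUC (the probability that a randomly chosen culture-positive patient receives a higher predicted risk than a randomly chosen culture-negative patient). The code is an illustrative Python sketch with invented data, not the R implementation used here, and it does not include the bootstrap correction.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    `labels` contains 1 for a positive outcome and 0 for a negative outcome.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        (p > n) + 0.5 * (p == n) for p in positives for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Invented predicted risks and outcomes for six patients (illustration only).
scores = [0.05, 0.10, 0.35, 0.40, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 1]
print(auc(scores, labels))  # 0.875
```

The pairwise formulation is quadratic in the number of patients, which is acceptable for cohorts of a few hundred; rank-based implementations are preferable for larger data sets.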

Results

After applying the exclusion criteria, a cohort of 705 patients was available for analysis. Patient characteristics, clinical and radiological findings, treatment, and outcomes of all patients, stratified by occurrence of UTI and by training/testing cohort, are summarized in Table 1. The laboratory findings and their corresponding reference ranges or cut-off values are displayed in Table 2. In the total cohort, UTI was observed in 40 patients (5.7%). These patients had a significantly higher rate of dysuria (28 vs. 12%, p = 0.008), higher pulse rate [86 bpm (IQR 64–96) vs. 72 bpm (IQR 64–83), p = 0.009], lower diastolic blood pressure [median 82 mmHg (IQR 71–90) vs. 88 mmHg (IQR 78–96), p = 0.006], and higher body temperature [36.9 °C (IQR 36.4–37.6 °C) vs. 36.7 °C (IQR 36.4–37.0 °C), p = 0.046]. Furthermore, patients with UTI more often had elevated serum levels of CRP (57 vs. 23%, p < 0.001), leukocytes (55 vs. 42%, p = 0.021), neutrophil granulocytes (63 vs. 39%, p = 0.04), and creatinine (40 vs. 22%, p = 0.015). On urinary dipstick analysis, UTI patients had significantly higher rates of positive nitrite (30 vs. 0.5%, p < 0.001) and positive leukocyte esterase (70 vs. 13%, p < 0.001). Patients with UTI also required longer inpatient stays compared to patients without UTI (median 4.5 vs. 0 days, p < 0.001). While the rate of development of sepsis was higher in UTI patients, this did not reach statistical significance (5 vs. 0.8%, p = 0.055). Patients with UTI underwent significantly more subsequent surgical interventions (75 vs. 27%, p < 0.001) and more often received empirical antibiotic treatment (72 vs. 12%, p < 0.001). With the exception of the position of the biggest stone on CT scan as well as hemoglobin and thrombocyte levels, all baseline characteristics and the rate of UTI were equally distributed between the training and the testing cohort.

Table 1 Association of urinary tract infection with patient characteristics, clinical/radiological findings, treatment and outcome in 705 patients and stratification by training/testing cohort
Table 2 Association of urinary tract infection with laboratory findings in 705 patients and stratification by training/testing cohort

Model development, nomogram assessment, and performance evaluation

From all included variables, LASSO regression selected the variables elevated serum CRP level as well as positive nitrite and positive leukocyte esterase on urinary dipstick analysis in the training cohort for fitting of the model with the highest discriminatory ability. Positive nitrite on urinary dipstick was found to offer the highest discriminatory power for prediction of UTI. The final logistic regression model showed that all three variables remained significantly associated with risk of UTI on multivariable analysis (Fig. 1). Assessment of the nomogram axes indicated that the model demonstrates a wide range of predicted probabilities (5–90%) with positive nitrite contributing the highest number of points. In the training, testing and entire cohort, model performance evaluation showed a 200-fold bootstrap corrected AUC of 85.3% (95% CI 75.7–93.5%), 81.6% (95% CI 71.5–95.7%), and 85.8% (95% CI 78.7–92.2%), respectively (Fig. 2). In the testing cohort, the model demonstrated a negative predictive value of 98.1% and a positive predictive value of 27.6%.
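
The predictive values reported above follow from the cross-tabulation of predicted and observed culture results. The sketch below shows the calculation in Python. The counts are invented for illustration and are not the counts of the testing cohort, and the function name `predictive_values` is hypothetical.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from the four cells of a 2x2 classification table."""
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("need at least one predicted positive and one predicted negative")
    ppv = tp / (tp + fp)  # share of predicted positives that are culture-positive
    npv = tn / (tn + fn)  # share of predicted negatives that are culture-negative
    return ppv, npv


# Invented counts (illustration only, NOT the study data).
ppv, npv = predictive_values(tp=6, fp=14, tn=78, fn=2)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Both values depend on the prevalence of the outcome in the cohort, so they are not directly transferable to populations with a different rate of positive urine cultures.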

Fig. 1

Uni- and multivariable logistic regression model for prediction of positive midstream urine culture (left). The model was fitted using LASSO regression with tenfold cross-validation. Nomogram predicting risk of positive midstream urine culture based on the logistic regression model (n = 705, right). CRP C-reactive protein, OR Odds ratio, 95%CI 95% confidence interval

Fig. 2

Receiver-operating characteristic curves and model performance evaluation for the prediction of positive midstream urine culture based on the logistic regression model (left: training cohort n = 564; middle: testing cohort n = 141, right: full cohort n = 705). AUC area under the curve, 95%CI 95% confidence interval

Model calibration and decision curve analysis

The calibration plot showed that the model’s calibration curve ran very close to the diagonal reference line. This suggests near-optimal agreement between predicted and observed outcomes (Fig. 3A). Correspondingly, the Hosmer–Lemeshow test was non-significant for all cohorts. DCA showed that the model offers a clinical net-benefit relative to the treat-all approach across threshold probabilities of 0–80%. Furthermore, the net-benefit provided by the novel logistic regression model was higher than the net-benefit provided by any of its individual components (Fig. 3B).
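
The net-benefit plotted in a decision curve is commonly computed as the proportion of true positives minus the proportion of false positives, the latter weighted by the odds of the threshold probability. The sketch below implements this standard formula in Python for a model and for the treat-all strategy. The data and function names are invented for illustration and this is not the R code used for the analysis.

```python
def net_benefit(probabilities, labels, threshold):
    """Net benefit of treating every patient with predicted risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    true_pos = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 0)
    odds = threshold / (1 - threshold)  # harm of a false positive relative to a true positive
    return true_pos / n - false_pos / n * odds


def net_benefit_treat_all(labels, threshold):
    """Net benefit of treating all patients regardless of predicted risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Invented predicted risks and outcomes for ten patients (illustration only).
probabilities = [0.05, 0.05, 0.10, 0.10, 0.35, 0.60, 0.05, 0.90, 0.05, 0.10]
labels = [0, 0, 0, 0, 0, 1, 0, 1, 0, 0]
print(net_benefit(probabilities, labels, 0.2))
print(net_benefit_treat_all(labels, 0.2))
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all value and zero, the latter being the net benefit of treating no one.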

Fig. 3

A Calibration plots of the logistic regression model predicting positive midstream urine culture, 200-fold bootstrap corrected (left: training cohort, n = 564; middle: testing cohort, n = 141; right: entire cohort, n = 705). B Decision curve analyses for the evaluation of the clinical net-benefit using the novel logistic regression model for prediction of positive midstream urine culture (n = 705). CRP C-reactive protein

Discussion

In the current study, we developed and internally validated an accurate and easy-to-use nomogram for prediction of concomitant UTI in patients presenting with symptomatic ureterolithiasis. Using a machine-learning approach, we found that elevated levels of serum CRP as well as positive nitrite and leukocyte esterase on urinary dipstick analysis can accurately identify patients at risk of developing UTI. A 200-fold bootstrap corrected AUC of 81.6% (95% CI 71.5–95.7%) was demonstrated during our internal validation in a cohort that was not used during the development of our model.

As all predictive parameters identified in our model are part of the routine examination in patients presenting with symptomatic ureterolithiasis, we feel that our model is not only easy-to-use but will also be accessible to a broad community of physicians. This is of clinical importance, as there are no existing recommendations on predictive parameters and their thresholds for immediate diagnosis of concomitant UTI. However, individual and subjective interpretation of laboratory markers by clinicians may expose patients to unnecessary empirical antibiotic therapy and/or emergent surgical intervention, while increasing sepsis-related morbidity in case of missed diagnosis.

Results from our multivariable logistic regression analysis confirm the previous univariable findings for early detection of UTI [8, 12, 13, 15]. Nitrite has been shown to have a low sensitivity of 16.7% but a high specificity of 99.5% for the diagnosis of UTI [8, 15]. The low sensitivity of nitrite can be explained by the exclusive detection of Gram-negative rod bacteria [22], which is also reflected in the relatively low probability of approximately 0.35 in our nomogram for an existing UTI when nitrite is found positive. This highlights the need for additional predictors to detect non-nitrite-producing bacteria. Indeed, the combination of nitrite and urinary leukocyte esterase has previously been postulated as the optimal combination to rule out UTI with high reliability [8, 12]. Similarly to urine parameters, elevated CRP levels have been reported to be associated with an 18-fold increase in UTI in case of obstructive pyelonephritis compared to patients without UTI and dilated renal pelvis [7, 17]. To the best of our knowledge, our model and the corresponding nomogram are the first to incorporate all three parameters to allow accurate prediction of UTI and guide clinical decision-making.

While an experienced urologist will not always require a nomogram to identify patients with concomitant UTI, our findings are still clinically important, as we were able to demonstrate that patients who do not exhibit the findings shown in our nomogram are indeed very unlikely to develop UTI. Hence, with an NPV of 98.1%, our model offers a very reliable method for physicians unfamiliar with obstructive urolithiasis to rule out urinary tract infection. Even though the threshold for early renal decompression should remain low, safe exclusion of UTI could help to reduce the rate of empirical antibiotic therapy, especially in the era of increasing antibiotic resistance. Validated decision-making tools are necessary, as we have found that even in our specialized urological department, 12% of all patients received an antibiotic therapy, which in fact was not necessary due to a negative urine culture. Considering the overall high incidence of symptomatic urolithiasis [23,24,25], this amounts to a substantial amount of unnecessary antibiotic therapy that could be omitted. As genuine clinical applicability has previously been proposed for validated models/nomograms that exhibit AUC/C-indices > 0.75, we feel that our nomogram offers the potential to guide clinical decision-making and could be used by non-urologists for early decision-making and triage [26].

As calibration and validation of nomograms are paramount before implementation in clinical practice, we performed a statistically rigorous evaluation of the proposed model [27]. Indeed, our model showed nearly perfect calibration properties. Furthermore, the nomogram demonstrates a wide range of predicted probabilities. Finally, the inclusion of only three readily available variables offers a very low level of complexity for our nomogram, suggesting that it is easily reproducible. To allow a realistic model performance evaluation, we aimed to imitate external validation by splitting our cohort into two different cohorts of patients. While true external validation with separate cohorts remains the best assessment of a model’s accuracy and a crucial step before transferring models into clinical practice [27], we found that, encouragingly, all results from the training cohort could be reproduced in the testing cohort.

Although our current study uses a statistically rigorous validation and calibration process, several limitations exist. First are the limitations inherent to the retrospective study design. Thus, it is impossible to determine whether laboratory parameters were affected by existing comorbidities and whether a patient had already received an unreported antibiotic treatment prior to evaluation. Second is the single-center approach and the limited sample size, as reflected by high odds ratios and wide confidence intervals in our multivariable logistic regression model. Third, our endpoint was positive urine culture, which, however, consisted of a single urine collection at patient admission, giving the potential for missing a positive urine culture during the further clinical course. Additionally, patients with atypical infections or organisms that are hard to culture might have been falsely excluded from our analysis. It should also be considered that a midstream urine culture could be negative, while the urine proximal to an obstructing ureteral stone may be infected. This has been shown in previous studies where urine cultures taken from the renal pelvis were significantly more often positive compared to midstream urine cultures [28]. Furthermore, patients with an infected stone could also have a negative urine culture and might develop a urinary tract infection during the further clinical course (e.g., by manipulation intraoperatively) [29]. It is therefore important to note that our nomogram predicts a positive midstream urine culture but not a urinary tract infection. Fourth, our results are limited by the failure to control for additional potential predictive parameters such as levels of cytokines or procalcitonin. Finally, radiological findings concerning pyelonephritis or fornix rupture in mostly unenhanced computed tomography are of limited utility. External validation in a larger patient population is needed to verify our findings and help identify patients who require early renal decompression and antibiotic treatment.

Conclusions

We developed and internally validated a highly accurate, easy-to-use nomogram for prediction of concomitant positive midstream urine culture in patients presenting with symptomatic ureterolithiasis. External validation in a larger patient population is needed to verify our findings and help identify patients who require antibiotic treatment and immediate renal decompression.