Abstract
Prediction and avoidance of intraoperative hypotension (IOH) can lead to less postoperative morbidity. Machine learning (ML) is increasingly being applied to predict IOH. We hypothesize that incorporating demographic and physiological features in an ML model will improve the performance of IOH prediction. In addition, we added a “dial” feature to alter prediction performance. An ML prediction model was built based on a multivariate random forest (RF) trained algorithm using 13 physiologic time series and patient demographic data (age, sex, and BMI) for adult patients undergoing hepatobiliary surgery. A novel implementation was developed with an adjustable, multi-model voting (MMV) approach to improve performance in the challenging context of a dynamic, sliding window for which the preponderance of data is normal (negative for IOH). The study cohort included 85% of subjects exhibiting at least one IOH event. Males constituted 70% of the cohort, median age was 55.8 years, and median BMI was 27.7. The multivariate model yielded average AUC = 0.97 in the static context of a single prediction made up to 8 min before a possible IOH event, and it outperformed a univariate model based on MAP-only (average AUC = 0.83). The MMV model demonstrated AUC = 0.96, PPV = 0.89, and NPV = 0.98 within the challenging context of a dynamic sliding window across 40 min prior to a possible IOH event. We present a novel ML model to predict IOH with a distinctive “dial” on sensitivity and specificity to predict first IOH episode during liver resection surgeries.
1 Introduction
Worldwide, more than 300 million surgeries are performed annually, a number expected to increase with the aging population and the consequent rise in comorbid conditions [1, 2]. In patients having non-cardiac surgery, intraoperative hypotension (IOH) under general anesthesia is a frequent physiological derangement [3]. In those undergoing liver resections, IOH has been reported in nearly half of the cases because of their complexity, positioning, substantial blood loss, and fluid shifts [4,5,6]. Brief episodes of IOH, ranging from moderate to profound, have been shown to carry clinically significant consequences [7]. Studies have shown that IOH is associated with a higher risk of postoperative morbidity due to acute renal failure, delirium, and myocardial injury after non-cardiac surgery [3, 4, 8,9,10,11,12,13,14]. Therefore, avoiding IOH is important in procedures such as liver resections.
Machine learning (ML) in healthcare has emerged in recent decades as a tool for prognostication, clinical decision support, and predicting complications. Machine learning strategies provide a framework from which to estimate intraoperative events of interest from observed time-series data (i.e., blood pressure or oxygen saturation) [15,16,17,18]. In that regard, significant research efforts are underway to apply ML in the complex interplay between surgery, patient demographics, anaesthetic factors, and their relationship with perioperative outcomes [15,16,17].
For example, several recent ML approaches have been applied to predict IOH events [15,16,17,18,19,20,21,22]. In a randomized clinical trial, an ML-derived early-warning system showed a decrease in the incidence and duration of IOH, suggesting that prediction of IOH is superior to prompt management of the event [20]. This study used a commercially available ML-derived tool called the Hypotension Prediction Index (HPI, Edwards Lifesciences, Irvine, CA) [20]. The index estimates the occurrence of IOH using an algorithm derived from the dynamic variations of an arterial line waveform. Potential limitations in previous work with HPI include the need for high-frequency arterial waveform data (which adds cost to patient care and is not used in most anaesthetics delivered), selection bias, selection of prediction-outcome pairs, reduced performance between backwards and forward case analysis as recently indicated by Davies et al., and complicated or inaccurate treatment recommendations [20,21,22,23,24,25]. It has also been suggested that the HPI overestimates the prediction of IOH [26]. Another notable gap in previous work using HPI is that liver resections were excluded in some HPI analyses due to possibly lower IOH prediction performance in that context, providing a strong point of motivation for the work presented below, which focuses on liver surgery.
Furthermore, in the clinical setting, anaesthesiologists routinely rely on additional factors to predict the individual thresholds of hypotension in the operating room. For example, the amount of anaesthetic being administered could affect the patient’s blood pressure [27]. Several factors are often considered to determine the optimal blood pressure range, such as age and amount of anaesthetic being delivered.
To more closely mirror the clinician’s complex decision-making process at the bedside, we investigated the performance of a novel prediction algorithm that incorporates multiple dimensions of physiological monitoring data (cardiac, respiratory, and neuro monitoring) in combination with patient-specific factors (age, sex, and BMI) to forecast the occurrence of IOH. Initial work specifically tackled the challenging scenario of major liver surgeries, which are hemodynamically complex and carry strong potential for substantial blood loss and fluid shifts that expose patients to the risk of IOH [5, 6]. We hypothesized that our multivariate ML algorithm would yield high performance for IOH prediction in liver surgery. Features of this work distinct from previous reports in the literature include: (i) the incorporation of multivariate physiological time series data and patient-specific demographic information; (ii) a multi-model approach comprising an ensemble of ensembles hypothesized to improve performance and practicality; and (iii) the ability to control the sensitivity and specificity of the prediction in real-time.
2 Methods
2.1 Data curation, pre-processing, and truth definition
The study drew from retrospective data under Institutional Review Board approval (IRB# 2023-0656) for patients undergoing hepatobiliary surgery at our institution. Data were extracted from a database of physiological monitoring variables gathered from the electronic medical record for surgeries between 3/22/2016 and 2/18/2023. With respect to guidelines reported in the multidisciplinary review of ML-based predictive models, the work was retrospective in nature with internal validation [28]. Inclusion criteria were open partial or total liver lobectomy, age > 18 years at the time of surgery, availability of demographic data, and intraoperative physiological signals, as noted below. Patients undergoing emergency or minimally invasive surgeries and those having multiple surgical procedures utilizing other surgical specialties were excluded.
A key aspect of the model described below is the combination of demographic and multivariate physiological time-series data incorporated into the prediction model. Demographic data included the subject age at the time of surgery, body mass index (BMI), and sex. Physiological data drawn from anaesthesia monitoring included 13 time-series signals recorded at 1-min intervals: diastolic arterial line (Art Line D), systolic arterial line (Art Line S), mean arterial pressure (MAP), heart rate as measured by pulse oximeter (HR, Oximeter), inspired oxygen fraction (FiO2), delivered oxygen flow (O2 flow), bispectral index (BIS Monitor), inspired desflurane, expired desflurane, minute volume ventilation (Minute Volume), respiratory rate (Resp), peak inspiratory pressure (PIP), and tidal volume (Vt). The saturated haemoglobin concentration from the pulse oximeter reading (SpO2) was originally included, but analysis of feature importance (Sect. 2.2.2) showed little influence on IOH prediction, and it was subsequently excluded.
Time series data were considered 30 min after arterial line placement, when all physiological signals mentioned above were present, and were low-pass filtered with a 3-min averaging window width to reduce noise. Signals exhibiting missing data of 1–2 min were median filtered to impute minor gaps in the data. Training cases were classified as positive for IOH if they exhibited a drop in MAP below 65 mm Hg sustained for at least 2 min. Only the first IOH event was considered for each case, beyond which the subject was assumed to be under medical intervention and was not further considered in training or testing. Negative cases were those for which MAP was above 65 mm Hg throughout the observation period. Selection bias was minimized by ensuring forward analysis of physiological variables and future IOH (up to 8 min in advance, as detailed below) and not by the way the data was assembled [29].
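The pre-processing and truth definition described above can be illustrated with a minimal Python sketch. It is not the study's code: the function names, the centered (rather than trailing) averaging window, the use of neighbouring samples for gap imputation, and the choice to apply the 65 mm Hg rule to whichever series is passed in are all assumptions made for illustration.

```python
from statistics import median

IOH_THRESHOLD_MMHG = 65.0  # MAP threshold stated in the text
MIN_DURATION_MIN = 2       # minimum sustained duration stated in the text


def fill_short_gaps(signal, max_gap=2):
    """Impute runs of up to `max_gap` missing samples (None) with the median
    of up to two valid neighbours on each side (hypothetical imputation rule)."""
    signal = list(signal)
    out = list(signal)
    i, n = 0, len(signal)
    while i < n:
        if signal[i] is None:
            j = i
            while j < n and signal[j] is None:
                j += 1
            if j - i <= max_gap:
                around = signal[max(0, i - 2):i] + signal[j:j + 2]
                valid = [v for v in around if v is not None]
                if valid:
                    fill = median(valid)
                    for k in range(i, j):
                        out[k] = fill
            i = j
        else:
            i += 1
    return out


def moving_average(signal, width=3):
    """Centered moving average; edge samples average over what is available."""
    half = width // 2
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out


def first_ioh_onset(map_series, threshold=IOH_THRESHOLD_MMHG,
                    min_duration=MIN_DURATION_MIN):
    """Index (minutes) where the first run of at least `min_duration`
    consecutive samples below `threshold` begins, or None if there is none."""
    run = 0
    for i, value in enumerate(map_series):
        if value < threshold:
            run += 1
            if run >= min_duration:
                return i - min_duration + 1
        else:
            run = 0
    return None
```

A case would be labelled positive when `first_ioh_onset` returns an index and negative when it returns `None`; a single sub-threshold minute does not count as an event under the 2-min rule.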
Figure 1 illustrates the segmentation of time series data into “data intervals” (i.e., the input data from which a prediction of IOH is made) and “forecast intervals” (i.e., the time period between the data interval and a possible IOH event). For illustration purposes, the time stamp in Fig. 1 denotes time-shifted series relative to the start of the data interval and does not describe the actual time during surgery. Throughout this work, the data interval was taken to be a 10-min interval prior to a possible IOH event. As a starting point, the current work focused on a relatively short forecast interval up to 8 min in duration, selected in part due to the relatively fast hemodynamic swings that have been described in liver surgery. Thus, the proposed interval along with the sliding window approach described below is fairly realistic with respect to clinical decision making in the operating room and could in turn provide flexibility to monitor, react, and promptly intervene to prevent IOH. Shorter (~ 5 min) and longer (~ 15 min) intervals are possible subjects of future work. The forecast interval duration was randomly varied from 3 to 8 min according to a uniform distribution, and as shown in Fig. 1, for a data interval spanning t = 1–10 min, the IOH event could therefore occur anywhere from t = 13 min to t = 18 min.
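The segmentation into a data interval and a randomly drawn forecast interval can be sketched as follows. This is an illustrative sketch only: the function name, the zero-based indexing, and the handling of cases with too little history are assumptions, while the 10-min data interval and the uniform 3 to 8 min forecast interval come from the text.

```python
import random

DATA_INTERVAL_MIN = 10           # length of the input window (from the text)
FORECAST_MIN, FORECAST_MAX = 3, 8  # forecast interval range in minutes (from the text)


def segment_before_event(series, event_index, rng):
    """Return (data_interval, forecast_minutes) for a 1-min sampled series.

    `event_index` is the zero-based index of the possible IOH event. The last
    sample of the data interval lies `forecast_minutes` before the event, so a
    window covering t = 1..10 with a 3-min forecast places the event at t = 13.
    Returns None when there is not enough history before the event.
    """
    forecast = rng.randint(FORECAST_MIN, FORECAST_MAX)  # inclusive, uniform
    last = event_index - forecast
    start = last - DATA_INTERVAL_MIN + 1
    if start < 0:
        return None
    return series[start:last + 1], forecast


# Usage: a toy series whose value equals its 1-based time stamp.
series = list(range(1, 41))
window, forecast = segment_before_event(series, event_index=39,
                                        rng=random.Random(0))
```

Passing an explicit `random.Random` instance keeps the draw reproducible, which matters when the same segments must be regenerated across cross-validation folds.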
2.2 Predictive models
A number of model architectures were considered for initial investigation of IOH prediction, including artificial neural networks (ANN), gradient boosted trees (GBT), and random forest (RF) classifiers. While ANNs are a potentially powerful approach (as in, for example [16]), they typically require large training sets and can challenge interpretability. Simpler ML approaches such as GBTs and RFs can yield a reasonable degree of predictive performance, a high degree of interpretability and explainability, and have widely available shared software libraries that facilitate implementation and reproducibility (e.g., the sktime ML Python library for time series data). For these reasons, the initial studies reported below were based on RF supervised classifiers with 100 decision trees combined using the ColumnEnsembleClassifier() function, and future work will certainly consider alternative architectures.
Several variations in model training and testing were investigated to rigorously assess the potential for bias and to evaluate performance under a broad variety of conditions. The differences in accuracy were analyzed using a non-parametric Mann–Whitney U test, with p-value < 0.05 interpreted as evidence of statistical significance. Models were trained with and without Z-score signal normalization. Two variations in train:test proportion were considered (90:10 and 80:20) to investigate the tradeoffs between a larger training set and fewer test samples, hypothesizing slight improvement for the former due to a larger volume of training data. As an alternative to the 8-min forecast interval (randomly varied from 3–8 min), models with a fixed 5-min interval were investigated. Finally, three variations in class balance were trained and tested: (i) the natural imbalance of the data (as shown below ~ 6× in favor of positive cases); (ii) balancing negative and positive datasets by sampling three negative datasets (non-overlapping, separated by at least 20 min) from each negative case plus three negative datasets (similarly non-overlapping) from positive cases sampled at least 30 min prior to the IOH event (referred to as “3× sampling”); and (iii) balancing achieved by sampling six non-overlapping datasets from negative cases (referred to as “6× sampling”). Models (ii) and (iii) therefore involved patient-level combined with segment-level splitting to achieve class balance, which is recognized to carry potential bias. The bias is partly mitigated by sampling non-overlapping data intervals separated by at least 20 min and assumed to be independent, recognizing that age, sex, and BMI are common at the level of segment and therefore not strictly partitioned.
2.2.1 Application scenarios: static and dynamic
Two application scenarios were considered for analysis of model predictions. The first was a relatively simple “static” scenario in which a single 10-min data interval was taken as input for each case, occurring up to 8 min prior to a possible IOH event, as illustrated in Fig. 1. The static scenario was hypothesized to represent an optimistic upper bound on model performance under the idealized condition that half of the validation data were positive or negative for IOH.
A second, more realistic, and challenging scenario was considered in which the 10-min data interval advanced in 1-min steps through an extended period of truly negative data prior to a possible IOH event. With the forecast interval up to 8 min and a possible IOH event time-shifted to t = 40 min, the duration of the truly negative period ranged from 32 min (for an 8-min forecast interval) to 37 min (for a 3-min forecast interval). Sliding the 10-min data window in 1-min steps through a preponderance of truly negative instances and forming a prediction at each step is referred to below as the “dynamic” sliding window scenario, illustrated in Fig. 2. We hypothesized that model performance would be challenged (specifically, a reduction in PPV) in the dynamic scenario due to the high prevalence of negative instances—analogous to the more realistic clinical scenario of an early warning system for which most of the samples are truly negative, and a high level of PPV with few false alarms is required for the system to be useful. Testing in the “sliding window” scenario involved 20 cases (10 positive + 10 negative) that were held out and previously unseen in training at either the patient-level or segment-level.
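The dynamic scenario can be sketched as a window generator plus a labelling rule. The generator follows the text (10-min window, 1-min steps). The labelling rule is an assumption made for illustration: a window is treated as truly positive when the possible IOH event falls within the maximum 8-min forecast horizon after the last sample of the window. The study's exact per-window truth assignment may differ.

```python
def sliding_windows(series, width=10, step=1):
    """Yield (start_index, window) pairs for a 1-min sampled series."""
    for start in range(0, len(series) - width + 1, step):
        yield start, series[start:start + width]


def label_windows(n_samples, event_index, horizon=8, width=10):
    """Label each window of a pre-event series of length `n_samples`.

    A window is labelled True when the event at `event_index` occurs 1 to
    `horizon` minutes after the window's last sample (hypothetical rule).
    Pass event_index=None for a case with no IOH event.
    """
    labels = []
    for start in range(0, n_samples - width + 1):
        last = start + width - 1
        if event_index is None:
            labels.append(False)
        else:
            gap = event_index - last
            labels.append(1 <= gap <= horizon)
    return labels


# Usage: 39 pre-event minutes with a possible event at index 39 (t = 40).
labels = label_windows(n_samples=39, event_index=39)
n_positive = sum(labels)
n_negative = len(labels) - n_positive
```

Under this rule most windows in a case are negative, which is the prevalence condition that the dynamic scenario is designed to stress.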
2.2.2 Model variations: MAP-only, single-model, and multi-model
Three main variations in the predictive model were developed and tested. First was a RF model trained as detailed above but with only MAP as an input feature—referred to as the univariate “MAP-only” model, analogous to previously reported models [15].
To evaluate the importance of multi-dimensional inputs that more closely reflect the actual clinical considerations of an anaesthesiologist, a multivariate RF model was developed that takes the 3 demographic variables and 13 physiologic time series signals described above as input—referred to as the “single” multivariate model (in contrast to “multi-model,” described below). The importance of individual features contributing to multivariate model predictions was evaluated via ablation—i.e., removing a given feature, retraining, and retesting in the absence of the ablated feature. The reduction in model performance yields a surrogate for feature importance, and the features were rank-ordered accordingly.
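The ablation procedure can be written generically as a short Python sketch. The callable `train_and_score` is a hypothetical stand-in for retraining the RF on a feature subset and returning a performance score such as AUC; the toy scorer in the usage lines is invented purely to make the example runnable and carries no clinical meaning.

```python
def ablation_importance(features, train_and_score):
    """Rank features by the drop in score when each is removed in turn.

    `train_and_score(feature_list)` must retrain and evaluate a model on the
    given features and return a single score (higher is better).
    Returns (feature, score_drop) pairs, largest drop first.
    """
    baseline = train_and_score(list(features))
    drops = {}
    for feature in features:
        reduced = [f for f in features if f != feature]
        drops[feature] = baseline - train_and_score(reduced)
    return sorted(drops.items(), key=lambda item: item[1], reverse=True)


# Usage with a toy additive scorer (weights are arbitrary placeholders).
toy_weights = {"age": 0.3, "bmi": 0.2, "map": 0.1}


def toy_score(feature_list):
    return sum(toy_weights[f] for f in feature_list)


ranking = ablation_importance(["map", "age", "bmi"], toy_score)
```

Because each ablation requires a full retrain, the cost scales linearly with the number of features, which is manageable for the 16 inputs described here.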
In light of the challenge presented by a large prevalence of truly negative instances in the “dynamic” scenario, a “multi-model voting” (MMV) approach was developed that runs multiple, separately trained RFs in parallel, each contributing a vote. Whereas the conventional “single” multivariate model represents a single RF resulting from an ensemble of decision trees, the MMV approach comprises multiple RFs. Given that an RF is itself an ensemble approach, the MMV approach can be considered an “ensemble of ensembles.” With MMV, each of the (11) RF models from 11-fold cross validation described above contributes a prediction that is taken as a “vote” in forming a final prediction—e.g., by simple majority (Nvotes ≥ 6). Figure 2 illustrates the MMV approach for a single case in the dynamic scenario, with the number of votes (Nvotes) shown on the right axis and True-Negative (TN), True-Positive (TP), False-Negative (FN), and False-Positive (FP) predictions marked. The MMV approach was further investigated with Nvotes taken as an adjustable parameter, allowing the anaesthesiologist to control the sensitivity and specificity of the model by dialling the Nvotes threshold lower (for greater sensitivity) or higher (for greater specificity).
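The voting rule and the adjustable threshold can be illustrated with a minimal Python sketch. The function name and the representation of each RF output as a boolean vote are assumptions; the 11 models and the simple-majority default of 6 votes come from the text.

```python
N_MODELS = 11  # one RF per cross-validation fold, as described in the text


def mmv_predict(model_votes, n_votes_threshold=6):
    """Combine per-model binary predictions into one MMV prediction.

    `model_votes` is an iterable of truthy/falsy values, one per RF model.
    Returns (is_positive, n_votes). Lowering the threshold favours
    sensitivity; raising it favours specificity.
    """
    n_votes = sum(1 for vote in model_votes if vote)
    return n_votes >= n_votes_threshold, n_votes


# Usage: 7 of 11 models predict IOH for the current window.
votes = [True] * 7 + [False] * 4
majority_call, n_votes = mmv_predict(votes)                    # threshold 6
strict_call, _ = mmv_predict(votes, n_votes_threshold=8)       # dialled up
sensitive_call, _ = mmv_predict(votes, n_votes_threshold=3)    # dialled down
```

Returning the raw vote count alongside the decision lets a display show how close a window is to the alarm threshold rather than only the binary outcome.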
2.2.2.1 Performance evaluation
Predictive performance was evaluated in terms of standard binary hypothesis-testing metrics of TP predictions (correctly predicting an IOH event within the forecast interval), TN predictions (correctly predicting the absence of an IOH event), FP predictions (incorrectly predicting an IOH event), and FN predictions (incorrectly predicting that an IOH event will not occur). Corresponding metrics of Accuracy, Sensitivity, Specificity, positive predictive value (PPV), and negative predictive value (NPV) were computed. Receiver operating characteristic (ROC) curves were evaluated by varying the probability threshold (from 0 to 1) in model predictions to be considered positive, and the area under the ROC curve (AUC) was evaluated by numerical integration. The resulting distributions were analyzed using a non-parametric Mann–Whitney U test computed in Python to compare the observed difference in mean value between distributions, with p-value < 0.05 taken as statistically significant.
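The metrics above follow directly from the four confusion-matrix counts, and the AUC from a threshold sweep with trapezoidal integration. The sketch below is a plain-Python illustration with hypothetical function names; it assumes labels are coded 0/1 with at least one example of each class, and is not the study's evaluation code.

```python
def summary_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity, PPV, and NPV from counts.
    A metric is None when its denominator is zero."""
    def ratio(num, den):
        return num / den if den else None
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def roc_auc(y_true, scores):
    """Area under the ROC curve by sweeping the decision threshold over the
    observed scores and integrating with the trapezoidal rule."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and not y)
        points.append((fp / n_neg, tp / n_pos))
    auc = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        auc += (x1 - x0) * (y0 + y1) / 2
    return auc


# Usage on toy values (not study data).
metrics = summary_metrics(tp=8, tn=9, fp=1, fn=2)
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```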
3 Results
3.1 Study cohort
The study cohort is summarized in Fig. 3. Inclusion criteria yielded a total of 918 cases undergoing open liver lobectomy. Of these, 723 were partial lobectomy, 98 were left lobectomy, and 97 were right lobectomy. Median age at the time of surgery was 55.8 years (range 20–90, Fig. 3a), median BMI was 27.7 (range 20.0–35.2, Fig. 3b), and males constituted 70% of cases (642/918, Fig. 3c). At least one IOH event was observed (at least 30 min after arterial line placement and sustained for at least 2 min) in 85% of cases (783/918, Fig. 3d), and the remaining 15% (135/918) exhibited MAP > 65 mm Hg for the duration of their case. Among the 783 positive cases, 73% (570/783) were male, approximately consistent with the male:female proportion in cases overall.
3.2 Parameter selection and model variants
Among the basic model variations investigated, there was no statistically significant difference in performance with and without signal normalization (p = 0.132), consistent with scale invariance of the underlying RF approach; therefore, models were trained without signal normalization. As anticipated, a statistically significant improvement was observed in performance between 90:10 and 80:20 split in train:test datasets (p = 0.002) due to a somewhat larger training set, and the former was used throughout. There was no evidence of a statistically significant difference in performance between the 8-min forecast interval (i.e., variable 3–8 min) and the (fixed) 5-min interval (p = 0.242), and the former was used throughout in the interest of increased variety in the training set. The three variations in class balance resulted in: (i) 6× imbalance in favor of positive cases (783) vs negative cases (135); (ii) balanced datasets via “3× sampling” of 135 negative cases (giving 270 negative datasets) plus 513 negative datasets drawn from positive cases sampled at least 30 min prior to the IOH event; and (iii) balanced datasets via “6× sampling” of negative cases to yield 783 negative and positive datasets. As expected, the imbalanced dataset exhibited a statistically significant reduction in performance compared to the 3× and 6× sampling (average AUC = 0.84 compared to 0.91 (p = 0.002) and 0.97 (p = 0.003), respectively); the 6× sampling dataset was used throughout to achieve class balance while mitigating bias in multiple sampling of negative datasets assumed to be independent.
3.3 Single-model performance
As a starting point, the performance of the MAP-only univariate model (i.e., a single RF model with MAP as the sole input feature in the relatively simple static scenario) is shown in Fig. 4. Performance overall is modest, as previously reported, exhibiting average AUC = 0.83 (over 11-fold) and median Accuracy = 0.73, Sensitivity = 0.69, Specificity = 0.79, PPV = 0.77, and NPV = 0.71 [30].
By way of comparison, the performance of the multivariate RF model is shown in Fig. 5, also in the static scenario. Performance overall is markedly improved, demonstrating AUC = 0.97 (average over 11-fold) and median Accuracy = 0.95, Sensitivity = 0.86, Specificity = 0.93, PPV = 0.94, and NPV = 0.88. The importance of individual features in the model is shown in Fig. 5c, where the horizontal axis denotes the ablated feature. Age and BMI showed the greatest importance in the prediction, followed by a combination of hemodynamic features (Art Line S, Art Line D, and MAP), respiratory features (Tidal Volume and Resp), and anaesthetic delivery (Inspired Desflurane). Other features individually exhibited less influence on the model overall, but were maintained in the model, since they may contribute to the aggregate.
While the accuracy exhibited in Fig. 5 is promising, deployment of an IOH prediction model as a real-time early-warning system must operate with a high degree of PPV to avoid false alarms. Performance of the multivariate RF model in the more challenging, dynamic sliding window scenario is shown in Fig. 6, where testing spanned a prolonged period, t = 1–40 min, within which the first 30 min was truly negative, prior to a forecast interval up to 8 min and possible instance of IOH at t = 40 min. Analysis was performed on ten positive test cases, amounting to 11 × 30 = 330 negative samples and 11 × 10 = 110 positive samples. A substantial drop in overall performance is evident: average AUC dropped from 0.97 to 0.84, median accuracy from 0.95 to 0.86, specificity from 0.93 to 0.88, and PPV from 0.94 to 0.63. The large drop in PPV in the dynamic scenario particularly motivated development of the MMV approach, below.
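Part of the drop in PPV follows from prevalence alone, which a short calculation makes concrete. The sketch below applies Bayes' rule with sensitivity and specificity held fixed at the static-scenario medians (0.86 and 0.93). Holding them fixed is a simplifying assumption, so the output illustrates the prevalence effect only and is not a reproduction of the reported dynamic PPV of 0.63, which also reflects the lower specificity observed in that scenario.

```python
def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value implied by Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Balanced static scenario: half of the samples are positive.
ppv_balanced = ppv_from_prevalence(0.86, 0.93, prevalence=0.50)

# Dynamic scenario sample counts quoted in the text: 110 positive of 440 total.
ppv_dynamic_prevalence = ppv_from_prevalence(0.86, 0.93, prevalence=110 / 440)
```

With these inputs the implied PPV falls from roughly 0.92 to roughly 0.80 purely because negatives outnumber positives three to one.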
3.4 Multi-model (MMV) performance
The MMV approach runs 11 separately trained RF models in parallel and evaluates the resulting 11 votes to yield a prediction. Moreover, the sensitivity and specificity of the MMV approach can be controlled by adjusting (“dialing”) the Nvotes threshold for positive prediction. Figure 7 summarizes the MMV performance and influence of the adjustable threshold in the dynamic sliding window scenario. The 11 ROC curves in Fig. 7a show the dependence on Nvotes (largely unaffected over the range Nvotes = 1–5 and decreasing for Nvotes > 6), illustrated further in terms of AUC in Fig. 7b. Figure 7c and d show the anticipated trade-offs in sensitivity and specificity with adjustment of Nvotes. In the dynamic scenario for which negative samples outnumber positives by at least a factor of 3, Figs. 7f–g show PPV and NPV to be optimal in the range Nvotes = 6–8.
Finally, the performance of the MMV approach was evaluated in the dynamic scenario with a nominal value of Nvotes = 6 (out of 11, i.e., a simple majority), as summarized in Fig. 8. Compared to Fig. 5, this scenario presents a more challenging, realistic context wherein the preponderance of instances is truly negative. Compared to the single-model approach in Fig. 6, MMV demonstrated improved AUC (0.96) and median Accuracy (0.98), Sensitivity (1.0), Specificity (0.96), PPV (0.89), and NPV (0.98). The improvements are statistically significant (p < 0.05) compared to the single-model predictions (Fig. 6).
4 Discussion
Liver resections have a high incidence of hemodynamic disturbances, including IOH, as indicated by our work and others, likely related to large fluid shifts, extreme positioning, and high use of vasopressors [4, 31,32,33]. Machine learning methods have been successfully employed to predict IOH in various clinical settings outside of liver surgery. Although a recent study demonstrated that the HPI system offers potential benefits to reduce IOH compared to goal-directed hemodynamic therapy, liver surgeries were excluded from the analysis [34]. The research reported above represents a novel use of multivariate ML models using forward analysis of variables (as recommended) to predict, with the aid of invasive blood pressure monitoring, the first episode of IOH during open liver resections by integrating demographic data (age, BMI, and sex) with 13 physiological time series signals recorded at 1-min intervals [22].
In this research, the multivariate model outperformed a univariate (MAP-only) model, and a fairly wide variety of hemodynamic and respiratory time-series signals were important to reliable prediction. This finding contrasts with a recent study indicating that MAP may perform as well as the HPI at set thresholds of 72 or 73 mmHg [30]. The dynamic sliding window scenario presented the real-world prevalence challenge of sequential data for which a preponderance of the data is truly negative for IOH, and a novel MMV technique was developed to maintain PPV under such conditions. The MMV approach also presents the intriguing capability for the anaesthesiologist to dial the sensitivity and specificity of the model according to their judgment with respect to factors related to the patient or phase of the procedure, which is a subject of future work.
Based on AUC, the HPI in the internal validation cohort has shown the highest performance at 5 min, with the lowest being in the Jacquet-Lagrèze’s model [16, 17, 20, 35]. Compared to those models, our MMV strategy demonstrated a relatively high performance (AUC = 0.96). However, a fair comparison between available models and our strategy is difficult because of differences in the clinical data used as predictors, the type of surgical patient population, and ML strategies [19, 36].
Davies et al. used a cohort of patients from nine previous studies and compared data using either a backward approach with a gray zone or a forward approach without a gray zone [22]. The latter strategy (forward without a gray zone) is similar to the approach taken in our work; however, there are important distinctions. First, we used only data relating to an intraoperative context, whereas Davies et al. used a mixed population of operative and ICU patients. Second, Davies et al. included data from invasive and non-invasive arterial waveform analysis using HPI, whereas our work only used recording from invasive arterial blood pressure without waveform patterns. Lastly, Davies et al. observed in their forward approach without a gray zone a low PPV (0.52), whereas the MMV approach reported above maintains high PPV and suggests that our results could be more clinically actionable in predicting whether a particular patient will truly develop IOH.
Additionally, previous work used intraoperative arterial waveform analysis, such as the HPI model, to predict IOH [17, 20]. More recently, Hwang et al. used local trends of arterial blood pressures to show how certain waveform shapes were associated with IOH. The study showed good predictive performance based on the reported AUC (> 0.9) [37]. The rationale for using arterial waveform contour lies in identifying abnormal compensatory mechanisms before IOH develops via changes in the waveform. However, pulse contour analysis technology needs frequent calibration in patients with low systemic vascular resistance or after changes in vasopressor dosage [38]. We have not used waveform contour analysis in the current work. Rather, the multivariate ML model described above incorporated demographic data in combination with arterial blood pressure values, respiratory parameters, and anaesthetic time series data for prediction instead of a single physiological parameter [17, 20].
Using a deep learning convolutional neural network, Lee et al. constructed several prediction models of IOH in a conglomerate of non-cardiac surgical patients employing (a) an arterial-pressure-only model, (b) an invasive multichannel model (arterial blood pressure, electrocardiograph, photoplethysmography, and capnography), (c) a photoplethysmography-only-model, (d) a non-invasive-multichannel-model and (e) a non-invasive hybrid model [16]. Lee’s invasive multichannel model shares some of the physiological parameters used in the prediction model reported above, demonstrating AUC = 0.91, sensitivity = 0.86, and specificity = 0.86 [16]. The MMV model reported here yielded a somewhat higher AUC (0.96), recognizing the need for external validation. Hwang et al. [37] constructed a model with an interpretable predictor and interpretable methods. The model showed high performance in the internal validation phase, which was somewhat reduced in external validation. Significant differences in the work reported by Hwang et al. from that reported above include use of a mixed population of patients, waveform shape analysis and Fourier transform for each blood pressure cycle, and use of a “gray zone” between MAP = 65 mmHg and 75 mmHg [37].
Feature ablation analysis shed light on the importance of the multiple intraoperative variables contributing to the model developed in this work. The three highest-ranking parameters in the prediction model were age, BMI, and diastolic pressure. Age and BMI variables have been previously shown to provide value in predicting postinduction hypotension; of course, they do not lend themselves to real-time monitoring/early warning systems in the operating room [39].
Accurate alerts on potentially imminent IOH should allow anaesthesiologists to intervene and improve postsurgical clinical outcomes. It could also be of value to report the relative uncertainty associated with the prediction (e.g., a score from 1 to 11 corresponding to Nvotes in the MMV approach). The ability to control the sensitivity and specificity of the model by adjusting the Nvotes threshold opens an interesting possibility that warrants further investigation. A clinical application of such an approach could allow the anaesthesiologist to “dial up” the number of votes in a relatively healthy patient for surgery, reflecting some tolerance of hypotension, while “dialing down” for a patient with multiple co-morbidities and a lower threshold to intervene early.
Limitations of the work reported above include (a) retrospective study design, which may lead to predictive bias, (b) the arbitrary but well-accepted definition of IOH (< 65 mm Hg sustained for at least 2 min), (c) ML methods that can be subject to bias introduced in training data, (d) the prediction of the first IOH event (only), and (e) limitation to internal validation, with testing on external datasets recognized as an essential area of future work. Class imbalance intrinsic to the data (6× in favor of positive cases, consistent with prevalence of IOH in these surgeries) necessitated a combination of patient-level and segment-level splitting of the data, with 6× sampling of negative data intervals (non-overlapping, separated by at least 20 min) necessary to achieve class balance. Segment-level splitting in training for the “Static” scenario is acknowledged to carry potential bias, and future work warrants larger datasets to enable a larger volume of training data balanced at the level of the patient. However, the “sliding window” scenario involved testing with strict patient-level splitting in 20 unseen cases, confirming performance from the “static” scenario and suggesting that possible leakage effects were minimal.
Previous work on perioperative outcomes suggests an interaction between IOH, deep levels of hypnosis, and low minimum alveolar concentration (MAC) [27]. Our work included “anaesthetic dose” and depth of hypnosis as variables; however, we did not have MAC data available in our data registry. Furthermore, we only included patients receiving desflurane as the main volatile anaesthetic for the maintenance of hypnosis. In addition, despite including heart rate as an input variable in the model, our algorithm does not differentiate between endotypes of IOH [40].
Our model was capable of predicting an IOH event up to 8 min before its onset, which we acknowledge to be a somewhat shorter window overall than that used by the HPI (5, 10, and 15 min) [15]. Recognizing the flexibility of the MMV approach described above, future work could consider the development of short-, medium-, and long-range predictors, e.g., with short-range models optimized for sensitivity (reduced Nvotes threshold) and long-range models optimized for specificity (higher Nvotes threshold). Future work will also address how the strategy can be adapted to multiple instances of IOH, to the response to treatment, and to specific patient populations.
In conclusion, the rate of IOH is high in patients undergoing liver resection surgery—85% in this cohort. The overall performance of the MMV predictive model was high, achieving AUC = 0.96, median PPV = 0.89, and median NPV = 0.98 within the challenging context of a dynamic sliding window for which the propensity of data is normal (truly negative for IOH). Further research is needed to compare with other proprietary algorithms, such as that used for the HPI, and apply external validation in a larger, more complex population of patients. Furthermore, research is needed to determine whether our IOH prediction model would be superior to prompt management of IOH.
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
References
Weiser TG, Haynes AB, Molina G, et al. Estimate of the global volume of surgery in 2012: an assessment supporting improved health outcomes. Lancet. 2015;385:S11.
Moore BJ, White S, Washington R, Coenen N, Elixhauser A. Identifying increased risk of readmission and in-hospital mortality using hospital administrative data. Med Care. 2017;55:698–705.
Salmasi V, Maheshwari K, Yang D, et al. Relationship between intraoperative hypotension, defined by either reduction from baseline or absolute thresholds, and acute kidney and myocardial injury after noncardiac surgery: a retrospective cohort analysis. Anesthesiology. 2017;126:47–65.
Liao P, Zhao S, Lyu L, et al. Association of intraoperative hypotension with acute kidney injury after liver resection surgery: an observational cohort study. BMC Nephrol. 2020;21:456.
Alkozai EM, Lisman T, Porte RJ. Bleeding in liver surgery: prevention and treatment. Clin Liver Dis. 2009;13:145–54.
Tranchart H, O’Rourke N, Van Dam R, et al. Bleeding control during laparoscopic liver resection: a review of literature. J Hepatobiliary Pancreat Sci. 2015;22:371–8.
Gregory A, Stapelfeldt WH, Khanna AK, et al. Intraoperative hypotension is associated with adverse clinical outcomes after noncardiac surgery. Anesth Analg. 2020;132:1654–65.
Mascha EJ, Yang D, Weiss S, Sessler DI. Intraoperative mean arterial pressure variability and 30-day mortality in patients having noncardiac surgery. Anesthesiology. 2015;123:79–91.
Roshanov PS, Sessler DI, Chow CK, et al. Predicting myocardial injury and other cardiac complications after elective noncardiac surgery with the revised cardiac risk index: the VISION study. Can J Cardiol. 2021;37:1215–24.
Mathis MR, Naik BI, Freundlich RE, et al. Preoperative risk and the association between hypotension and postoperative acute kidney injury. Anesthesiology. 2020;132:461–75.
Bijker JB, Persoon S, Peelen LM, et al. Intraoperative hypotension and perioperative ischemic stroke after general surgery. Anesthesiology. 2012;116:658–64.
Maheshwari K, Ahuja S, Khanna AK, et al. Association between perioperative hypotension and delirium in postoperative critically ill patients. Anesth Analg. 2020;130:636–43.
Walsh M, Devereaux PJ, Garg AX, et al. Relationship between intraoperative mean arterial pressure and clinical outcomes after noncardiac surgery. Anesthesiology. 2013;119:507–15.
Roshanov PS, Sheth T, Duceppe E, et al. Relationship between perioperative hypotension and perioperative cardiovascular events in patients with coronary artery disease undergoing major noncardiac surgery. Anesthesiology. 2019;130:756–66.
Davies SJ, Vistisen ST, Jian Z, Hatib F, Scheeren TWL. Ability of an arterial waveform analysis-derived hypotension prediction index to predict future hypotensive events in surgical patients. Anesth Analg. 2020;130:352–9.
Lee S, Lee H-C, Chu YS, et al. Deep learning models for the prediction of intraoperative hypotension. Br J Anaesth. 2021;126:808–17.
Lee S, Lee M, Kim S-H, Woo J. Intraoperative hypotension prediction model based on systematic feature engineering and machine learning. Sensors. 2022;22:3108.
Connor CW. Artificial intelligence and machine learning in anesthesiology. Anesthesiology. 2019;131:1346–59.
Wijnberge M, Geerts BF, Hol L, et al. Effect of a machine learning-derived early warning system for intraoperative hypotension vs standard care on depth and duration of intraoperative hypotension during elective noncardiac surgery: the HYPE randomized clinical trial. JAMA. 2020;323:1052–60.
Hatib F, Jian Z, Buddi S, et al. Machine-learning algorithm to predict hypotension based on high-fidelity arterial pressure waveform analysis. Anesthesiology. 2018;129:663–74.
Simjanoska M, Gjoreski M, Gams M, Madevska Bogdanova A. Non-invasive blood pressure estimation from ECG using machine learning techniques. Sensors. 2018;18:1160.
Davies SJ, Sessler DI, Jian Z, et al. Comparison of differences in cohort (forwards) and case control (backwards) methodological approaches for validation of the Hypotension Prediction Index. Anesthesiology. 2024;141:443–52.
Maheshwari K, Shimada T, Yang D, et al. Hypotension prediction index for prevention of hypotension during moderate- to high-risk noncardiac surgery: a pilot randomized trial. Anesthesiology. 2020;133:1214–22.
Enevoldsen J, Vistisen ST. Performance of the hypotension prediction index may be overestimated due to selection bias. Anesthesiology. 2022;137:283–9.
Yang SM, Cho HY, Lee HC, Kim HS. Performance of the Hypotension Prediction Index in living donor liver transplant recipients. Minerva Anestesiol. 2023;89:387–95.
Mulder MP, Harmannij-Markusse M, Donker DW, Fresiello L, Potters JW. Is continuous intraoperative monitoring of mean arterial pressure as good as the hypotension prediction index algorithm?: Research letter. Anesthesiology. 2023;138:657–8.
Sessler DI, Sigl JC, Kelley SD, et al. Hospital stay and mortality are increased in patients having a “triple low” of low blood pressure, low bispectral index, and low minimum alveolar concentration of volatile anesthesia. Anesthesiology. 2012;116:1195–203.
Luo W, Phung D, Tran T, et al. Guidelines for developing and reporting machine learning predictive models in biomedical research: a multidisciplinary view. J Med Internet Res. 2016;18:e323.
Wijeysundera DN, McIsaac DI, London MJ. The promise and challenges of predictive analytics in perioperative care. Anesthesiology. 2022;137:275–9.
Mulder MP, Harmannij-Markusse M, Fresiello L, Donker DW, Potters JW. Hypotension Prediction Index is equally effective in predicting intraoperative hypotension during non-cardiac surgery compared to a mean arterial pressure threshold: a prospective observational study. Anesthesiology. 2024;141:453–62.
Jongerius IM, Mungroop TH, Uz Z, et al. Goal-directed fluid therapy vs. low central venous pressure during major open liver resections (GALILEO): a surgeon- and patient-blinded randomized controlled trial. HPB (Oxford). 2021;23:1578–85.
Dunki-Jacobs EM, Philips P, Scoggins CR, McMasters KM, Martin RCG. Stroke volume variation in hepatic resection: a replacement for standard central venous pressure monitoring. Ann Surg Oncol. 2014;21:473–8.
Macacari RL, Coelho FF, Bernardo WM, et al. Laparoscopic vs. open left lateral sectionectomy: an update meta-analysis of randomized and non-randomized controlled trials. Int J Surg. 2019;61:1–10.
Lorente JV, Ripollés-Melchor J, Jiménez I, et al. Intraoperative hemodynamic optimization using the hypotension prediction index vs. goal-directed hemodynamic therapy during elective major abdominal surgery: the Predict-H multicenter randomized controlled trial. Front Anesthesiol. 2023;2:1193886.
Jacquet-Lagrèze M, Larue A, Guilherme E, et al. Prediction of intraoperative hypotension from the linear extrapolation of mean arterial pressure. Eur J Anaesthesiol. 2022;39:574–81.
Arina P, Kaczorek MR, Hofmaenner DA, et al. Prediction of complications and prognostication in perioperative medicine: a systematic review and PROBAST assessment of machine learning tools. Anesthesiology. 2024;140:85–101.
Hwang E, Park YS, Kim JY, Park SH, Kim J, Kim SH. Intraoperative hypotension prediction based on features automatically generated within an interpretable deep learning model. IEEE Trans Neural Netw Learn Syst. 2023. https://doi.org/10.1109/TNNLS.2023.3273187.
Biais M, Mazocky E, Stecken L, et al. Impact of systemic vascular resistance on the accuracy of the pulsioflex device. Anesth Analg. 2017;124:487–93.
Kendale S, Kulkarni P, Rosenberg AD, Wang J. Supervised machine-learning predictive analytics for prediction of postinduction hypotension. Anesthesiology. 2018;129:675–88.
Kouz K, Brockmann L, Timmermann LM, et al. Endotypes of intraoperative hypotension during major abdominal surgery: a retrospective machine learning analysis of an observational cohort study. Br J Anaesth. 2023;130:253–61.
Acknowledgements
None
Funding
The authors have not disclosed any funding.
Author information
Contributions
JPC, BS, SB, JHS, JMS wrote the main manuscript text, created figures, and revised it critically for important intellectual content. All authors made substantial contributions to the conception or design of the work; or the acquisition, analysis, or interpretation of data, and agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. All authors reviewed the manuscript and approved the version to be published.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Informed Consent
The authors received a waiver of consent through the Institutional Review Board (IRB protocol #2023-0656) at the University of Texas MD Anderson Cancer Center, Houston, TX, USA, due to the retrospective nature of this study.
Ethical approval
The research was conducted according to ethical standards in compliance with institutional and international guidelines.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Cata, J.P., Soni, B., Bhavsar, S. et al. Forecasting intraoperative hypotension during hepatobiliary surgery. J Clin Monit Comput (2024). https://doi.org/10.1007/s10877-024-01223-5