1 Introduction

Late recognition of deteriorating patients is a major problem in hospital wards, contributing to in-hospital mortality and intensive care unit admissions [1, 2]. Several countries have implemented routine manual monitoring of vital signs at fixed intervals, using tools such as the National Early Warning Score (NEWS) [3, 4]. However, clinical deterioration also occurs between observations, which might explain the lack of documented effect of NEWS on morbidity and mortality [5,6,7]. Thus, early recognition is crucial to allow for timely treatment and potentially prevent further deterioration. Recent technological advances allow continuous vital sign monitoring on the general ward, which may detect early clinical deterioration and thereby decrease the risk of serious adverse events (SAEs) [8], but traditional vital signs may not consistently predict imminent SAEs [9, 10]. This technology also enables continuous evaluation of heart rate variability (HRV), which is determined from variations in electrocardiographic (ECG) R-R intervals and considered an important indicator of autonomic nervous system (ANS) activity [11]. HRV can be affected by physical, mental, and surgical stress in a clinical setting [12, 13]. HRV is often altered in elderly patients and patients with chronic or acute diseases [14]. A reduction in autonomic regulation of heart rate, resulting in lower 24-h measurements of the standard deviation of normal-to-normal R-R intervals (SDNN) and root mean square differences of successive R-R intervals (RMSSD), is associated with adverse cardiac events and mortality [15,16,17,18]. Consequently, the utilization of HRV alerts, which offer a more refined assessment of autonomic functions, potentially holds substantial predictive value [14]. As an example, HRV depression, defined as a decrease in R-R interval variation and thus lower RMSSD, is a marker of increased sympathetic tone and stress, associated with poor outcomes in patients with sepsis [19,20,21,22].
As complications can arise without clinical recognition in the general ward [23, 24], deviations in HRV monitored with continuous wireless monitoring systems may serve as a prognostic marker of upcoming critical complications.

Previous studies have generally investigated long term outcomes in specific medical populations and focused on predefined thresholds for a limited number of HRV parameters, without systematically evaluating multiple thresholds across various parameters [14, 18, 25,26,27]. This study aimed at exploring the prognostic ability of HRV-derived measures for subsequent SAEs when measured continuously with a wearable single-lead ECG device in patients hospitalized for acute medical disease or after major surgery. We investigated the predictive ability of several HRV parameters for SAEs by calculating the area under the curve (AUC) based on the last 24 h of continuous monitoring before the occurrence of SAEs.

2 Methods

2.1 Included studies

Data were collected from four prospective observational studies (NCT03660501, NCT03491137, NCT04473001, and NCT04628858) and two randomized controlled trials (NCT04661748 and NCT04640415). All studies were approved by the Danish Data Protection Agency (P-2019-690) and the regional ethics committee (H-20033246, H-20034555, H-18026653, H-17033535, H-20002220, and H-19086583). One trial was ongoing at the time of data extraction (NCT04661748), and only patients with registered outcomes were included. The inclusion and exclusion criteria of all studies are listed in Supplemental Table 1.

2.2 Data collection

Data were collected using the clinically validated CE- and FDA-approved Isansys Lifetouch (Isansys Lifecare, Oxfordshire, United Kingdom) [28, 29]. Isansys Lifetouch is an ECG patch attached to the patient’s chest with two electrodes placed 15 cm apart over the anterior left aspect of the thorax at an angle of 45°, with its base in the fourth intercostal space approximately 2 cm lateral to the sternum. The device recorded a single-lead ECG and registered heart and respiratory rates. ECG segments of approximately 10 s each minute were transferred and saved directly via Bluetooth to the Isansys gateway. When patients were out of range for Bluetooth connection, data were stored automatically on the device and transferred when the connection was re-established. Patient demographic data were collected including height, weight, age, smoking status, and pre-existing medical conditions through the Charlson Comorbidity Index [30]. Informed consent was obtained from all patients included in the study.

2.3 Exposures

Exposure variables were accumulated time in minutes below specific thresholds for each of the different HRV-derived measures (see HRV-derived measures) in the last 24 h of measuring before the first SAE. If no SAE occurred, the last 24 h of HRV measurements in each patient were used for the primary analysis. Up to 1000 thresholds for each measure were investigated for the prognostic ability (see Statistical analysis). Only patients with at least 18 h of ECG monitoring in the 24 h of interest were included in the analysis. Patients were excluded from the primary analysis if an SAE occurred within 18 h after initiating monitoring to give patients with and without SAE similar exposure time.
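As an illustration only, the exposure definition above can be sketched in Python (the study itself used R). The function names, the use of None to mark minutes without usable ECG, and the strict "less than" comparison are assumptions made for this sketch and are not details reported in the text.

```python
from typing import Dict, Optional, Sequence

# Hypothetical representation: one HRV value per minute of the 24-h window,
# with None for minutes where no usable ECG segment was available.
MinuteSeries = Sequence[Optional[float]]


def minutes_below(values: MinuteSeries, threshold: float) -> int:
    """Accumulated time (in minutes) with the HRV measure below `threshold`."""
    return sum(1 for v in values if v is not None and v < threshold)


def has_enough_data(values: MinuteSeries, min_minutes: int = 18 * 60) -> bool:
    """At least 18 h of monitoring within the 24 h of interest."""
    return sum(1 for v in values if v is not None) >= min_minutes


def exposure_profile(values: MinuteSeries,
                     thresholds: Sequence[float]) -> Dict[float, int]:
    """Accumulated minutes below each candidate threshold."""
    return {t: minutes_below(values, t) for t in thresholds}
```

Each patient then contributes one accumulated-time value per threshold, and that value serves as the continuous predictor in the later analysis.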

The secondary analysis included only patients with an SAE after at least 48 h of measurements. We aimed to compare the last 24 h of measurements before an SAE to the period measured 24 to 48 h prior to the SAE. We conducted this analysis to determine if HRV changes during monitoring could predict development of SAEs. Thus, the primary analysis assessed whether HRV differed from a control group without SAEs, while the secondary analysis investigated if HRV changed in the period leading up to an SAE compared to earlier monitoring.

A tertiary analysis compared the first 24 h to the last 24 h of HRV measurements in patients without an SAE, to investigate differences between the start and end of the monitoring period. This analysis is relevant as a wide range of factors can alter HRV [14]. Given that all study participants were admitted for major surgery or acute medical conditions, a degree of HRV alteration is expected [14]. This analysis aimed to evaluate changes in HRV during the monitoring period in patients who did not develop an SAE.

The same requirements for missing data were applied to all the analyses and only HRV measurements preceding the first SAE were included to avoid bias of vital signs and ECG deviations resulting from the first SAE and medical interventions to correct it.

2.3.1 HRV-derived measures

HRV was evaluated using time-domain and frequency-domain measures [14]. Time-domain measures were used to quantify variability in the time between successive heartbeats, and included SDNN, RMSSD, the standard deviation of successive R-R interval differences (SDSD), the percentage of adjacent R-R intervals that differ from each other by more than 50 ms (pNN50), and the mean of R-R intervals (RRMean), with higher values reflecting greater variation in R-R intervals [14].
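For orientation, the time-domain measures listed above can be sketched from a series of R-R intervals using their conventional textbook definitions. The exact definitions used in this study are given in Supplemental Table 2, so the choices below (sample rather than population standard deviation, pNN50 expressed as a percentage, no artefact filtering) are assumptions of this sketch.

```python
import math
from statistics import mean, stdev
from typing import Dict, Sequence


def time_domain_measures(rr_ms: Sequence[float]) -> Dict[str, float]:
    """Conventional time-domain HRV measures from R-R intervals in ms.

    Requires at least three intervals so that both standard deviations
    are defined.
    """
    # Differences between successive R-R intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return {
        "RRMean": mean(rr_ms),
        "SDNN": stdev(rr_ms),                      # SD of the intervals
        "RMSSD": math.sqrt(mean([d * d for d in diffs])),
        "SDSD": stdev(diffs),                      # SD of the differences
        "pNN50": 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs),
    }
```

In the study these measures were calculated for one-minute intervals; the sketch applies to whichever window of R-R intervals is passed in.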

Frequency-domain measures can be calculated as absolute power or normalized power for different frequency bands including very-low-frequency (vLF, ≤ 0.04 Hz); low-frequency (LF, 0.04–0.15 Hz); and high-frequency (HF, 0.15–0.4 Hz) [14]. Power is defined as signal energy within a frequency band with the absolute power of a frequency band calculated as milliseconds squared divided by cycles per second [14]. Normalized power was reported in normalized units (n.u.), calculated as the absolute power of the specific frequency band divided by the summed absolute power of the vLF, LF, and HF bands. The normalized R-R interval power of the vLF, LF, and HF bands was included for statistical analysis [31].
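The normalization described above amounts to dividing each band's absolute power by the summed power of the three bands. A minimal sketch follows, assuming a precomputed power spectrum supplied as (frequency, power) bins; the handling of band edges (upper edge inclusive) and the function names are assumptions of this sketch, and spectral estimation itself is not shown.

```python
from typing import Dict, Iterable, Tuple


def band_powers(spectrum: Iterable[Tuple[float, float]]) -> Dict[str, float]:
    """Sum absolute power per band from (frequency_hz, power) bins.

    Band limits follow the text: vLF <= 0.04 Hz, LF 0.04-0.15 Hz,
    HF 0.15-0.4 Hz. Bins above 0.4 Hz are ignored.
    """
    totals = {"vLF": 0.0, "LF": 0.0, "HF": 0.0}
    for freq, power in spectrum:
        if freq <= 0.04:
            totals["vLF"] += power
        elif freq <= 0.15:
            totals["LF"] += power
        elif freq <= 0.40:
            totals["HF"] += power
    return totals


def normalized_powers(absolute: Dict[str, float]) -> Dict[str, float]:
    """Band power divided by the summed vLF + LF + HF power (n.u.)."""
    total = absolute["vLF"] + absolute["LF"] + absolute["HF"]
    if total <= 0:
        raise ValueError("total power must be positive")
    return {band: power / total for band, power in absolute.items()}
```

By construction the three normalized values sum to 1, so each describes the relative contribution of its band rather than its absolute magnitude.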

24-h measurements of SDNN are considered the gold standard for medical stratification of cardiac risk and prediction of both morbidity and mortality, hence SDNN was the primary exposure variable [11, 14]. Time-domain and frequency-domain measures were calculated for one-minute intervals and are defined in Supplemental Table 2 [14].

2.4 Outcomes

The primary outcome was any SAE within 30 days. SAEs were defined as any medical life-threatening complication, resulting in death, hospitalization, prolongation of existing hospitalization, or significant disability according to the International Conference on Harmonisation—Good Clinical Practice guideline [32]. The secondary outcomes were all-cause mortality and non-fatal SAEs within the following classifications: cardiovascular, respiratory, infectious, neurologic, and other SAEs. All outcomes were based upon a standardized outcome manual including international defined criteria, such as acute renal failure, myocardial infarction, and sepsis [33,34,35].

2.5 Statistical analysis

Descriptive statistics were used to analyse baseline characteristics and frequency of patients with and without SAEs. Categorical variables are presented as numbers with percentages and continuous variables as medians with interquartile ranges.

For each individual HRV parameter, we calculated the accumulated time below the threshold, for up to 1000 different thresholds. As all patients had an accumulated time below the various thresholds, this was used to evaluate the prognostic ability of each individual HRV parameter. We calculated the area under the receiver operating characteristic curve (AUROC) and the corresponding 95% confidence interval (CI), using stratified bootstrapping, with the pROC package [36]. The binary response variable was the occurrence of SAEs, and the continuous predictor variable was the accumulated time below each specific threshold. These calculations were repeated across all thresholds for each individual HRV parameter. AUROC quantified the prognostic ability, and was interpreted as representing ‘no better than chance’ (~ 0.5), low prognostic ability (0.5–0.7), moderate prognostic ability (0.7–0.9), and high prognostic ability (> 0.9) [37].
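The AUROC calculation with a stratified bootstrap can be illustrated with a short Python sketch. It is not the pROC implementation used in the study: the pairwise (Mann-Whitney) estimator, the percentile interval, the number of resamples, and the assumption that more minutes below a threshold indicate higher risk are all choices made here for illustration.

```python
import random
from typing import List, Sequence, Tuple


def auroc(pos: Sequence[float], neg: Sequence[float]) -> float:
    """Probability that a patient with an SAE has a higher predictor value
    than a patient without one, counting ties as one half."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def stratified_bootstrap_ci(pos: Sequence[float], neg: Sequence[float],
                            n_boot: int = 2000, alpha: float = 0.05,
                            seed: int = 0) -> Tuple[float, float]:
    """Percentile CI; each resample keeps the original group sizes."""
    rng = random.Random(seed)
    stats: List[float] = sorted(
        auroc(rng.choices(pos, k=len(pos)), rng.choices(neg, k=len(neg)))
        for _ in range(n_boot)
    )
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

Here `pos` and `neg` hold the accumulated minutes below one threshold for patients with and without an SAE, respectively. Resampling within each group separately is what makes the bootstrap stratified, and repeating the calculation per threshold gives one AUROC and CI for each.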

Additional calculations of the optimal cut-off and corresponding sensitivity and specificity were performed for thresholds with a lower 95% CI of moderate prognostic ability or above (AUROC > 0.7); if none had a lower 95% CI above moderate prognostic ability, the calculations were performed for the threshold of each HRV parameter that achieved the highest AUROC. This was performed with both the primary outcome and each specific group of secondary outcomes as the binary response variable. The optimal cut-offs were defined as the values of the predictor variable that maximize Youden’s Index, i.e. the maximum sum of specificity and sensitivity [38].
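The selection of an optimal cut-off by Youden's Index (J = sensitivity + specificity - 1) can be sketched as follows. The function name, the "predictor at or above the cut-off counts as positive" rule, and the tie-breaking (the lowest cut-off among equal J values is kept) are assumptions of this sketch rather than details reported for the study.

```python
from typing import Sequence, Tuple


def youden_cutoff(scores: Sequence[float],
                  labels: Sequence[int]) -> Tuple[float, float, float]:
    """Return (cut-off, sensitivity, specificity) maximizing Youden's J.

    `scores` is the continuous predictor (e.g. minutes below a threshold)
    and `labels` is 1 for patients with an SAE and 0 otherwise.
    """
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    best = None
    # Every observed predictor value is a candidate cut-off.
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= cut)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s < cut)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, cut, sens, spec)
    return best[1], best[2], best[3]
```

For example, with predictor values 1 to 8 and SAEs in the patients with values 4, 5, 7, and 8, the sketch returns a cut-off of 4 with sensitivity 1.0 and specificity 0.75.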

For the primary outcome of any SAE, the threshold with the highest AUROC for each HRV parameter, along with the specific HRV parameter that had the highest AUROC for each specific secondary outcome are presented in tables with the AUC, the corresponding 95% CI, the threshold, the optimal cut-off, and the corresponding sensitivity and specificity.

Subgroup analyses were performed for medical patients and surgical patients. All statistical analyses were performed using the statistical software R version 4.2.1 (R Core Team, Vienna, Austria).

3 Results

A total of 1402 patients from six studies and trials were assessed for eligibility. A total of 923 patients were included for analysis, while 479 were excluded as the requirement for duration of recording was not met (Fig. 1), and 297 included patients (32%) had one or more SAEs. There were 27 instances of all-cause mortality, 30 cardiovascular SAEs, 45 respiratory SAEs, 72 infectious SAEs, 9 neurologic SAEs, and 114 other SAEs. The baseline characteristics of the patients are presented in Table 1.

Fig. 1

CONSORT diagram of patient inclusion and data analysis. RCT randomized controlled trial, WARD Wireless Assessment of Respiratory and Circulatory Distress, CGM continuous glucose monitoring, VASC vascular surgery

Table 1 Baseline characteristics

3.1 Primary analysis

When comparing the last 24 h before any SAE with the last 24 h of monitoring in those without SAEs, the optimal threshold for the primary exposure variable SDNN demonstrated an AUC of 0.57 (95% CI 0.53–0.61), sensitivity of 0.47 (95% CI 0.41–0.53), and specificity of 0.65 (95% CI 0.62–0.69). Similarly, the other HRV parameters presented with a low prognostic ability (Fig. 2). For the specific outcomes, RMSSD had the largest point estimates of AUC 0.67 (95% CI 0.63–0.71) for predicting cardiovascular SAEs, with a sensitivity and specificity of 0.03 (95% CI 0–0.17) and 0.63 (95% CI 0.59–0.67), respectively (Table 2).

Fig. 2

Best performing thresholds in the primary analysis. AUC area under the curve, 95% CI 95% confidence interval, SDNN standard deviation of R-R intervals, RMSSD root mean square differences of successive R-R intervals, RRMean mean of R-R intervals, SDSD standard deviation of successive R-R interval differences, pNN50 percentage of adjacent R-R intervals that differ from each other by more than 50 ms, HF high-frequency; 0.15–0.4 Hz, LF low-frequency; 0.04–0.15 Hz, vLF very-low-frequency; ≤ 0.04 Hz

Table 2 Best performing thresholds for the primary analysis and secondary analysis

In the surgical subgroup, the best performing thresholds for any SAEs and the specific SAEs had a low prognostic ability. In the medical subgroup, the optimal thresholds for predicting any SAE demonstrated a low prognostic ability. When investigating specific SAEs, multiple thresholds presented with moderate prognostic ability for all-cause mortality, cardiovascular, infectious, and neurologic SAEs. HF demonstrated the largest point estimate of AUC with statistical significance for predicting neurologic SAEs (AUC: 0.85; 95% CI 0.76–0.95). RMSSD and SDSD had the largest point estimates of AUC with statistical significance for predicting cardiovascular SAEs (AUC: 0.84; 95% CI 0.73–0.95) (Table 3).

Table 3 Best performing thresholds for the primary analysis in the medical and surgical subgroups

3.2 Secondary analysis

When comparing the last 24 h with the 24 to 48 h of measurements before an SAE, the best performing threshold for RRMean demonstrated the largest point estimate of AUC 0.70 (95% CI 0.64–0.76), with sensitivity and specificity of 0.70 (95% CI 0.63–0.76) and 0.88 (95% CI 0.82–0.92), respectively (Fig. 3). RMSSD and SDSD also had moderate prognostic ability with AUC 0.70 (95% CI 0.64–0.75). The remaining HRV parameters for any SAEs presented with a low prognostic ability. The best performing thresholds for specific SAEs all presented with point estimates of moderate prognostic ability, except for other SAEs. LF, RMSSD, SDSD, and vLF presented with the largest point estimates of AUC for predicting all-cause mortality (AUC: 0.8; 95% CI 0.60–0.99) (Table 2).

Fig. 3

Best performing thresholds in the secondary analysis. AUC area under the curve, 95% CI 95% confidence interval, SDNN standard deviation of R-R intervals, RMSSD root mean square differences of successive R-R intervals, RRMean mean of R-R intervals, SDSD standard deviation of successive R-R interval differences, pNN50 percentage of adjacent R-R intervals that differ from each other by more than 50 ms, HF high-frequency; 0.15–0.4 Hz, LF low-frequency; 0.04–0.15 Hz, vLF very-low-frequency; ≤ 0.04 Hz

As most patients included for the secondary analysis were surgical patients, the results in this subgroup were nearly identical with similar HRV parameters demonstrating moderate prognostic ability for any and specific SAEs. For any SAEs in the medical subgroup pNN50 demonstrated the largest point estimate of AUC 0.70 (95% CI 0.50–0.90) with sensitivity and specificity of 0.61 (95% CI 0.36–0.83) and 1.0 (95% CI 0.81–1.00), respectively. The remaining HRV parameters demonstrated a low prognostic ability for any SAE prediction. For the specific SAEs, the best performing thresholds demonstrated moderate or high prognostic ability, but none had a lower confidence limit > 0.7. Detailed subgroup results for the secondary analysis are presented in Supplemental Table 3.

3.3 Tertiary analysis

When comparing the first 24 h with the last 24 h of monitoring in patients without SAEs, the optimal thresholds demonstrated low prognostic ability, including in the subgroup analyses. HF demonstrated the largest point estimate of AUC 0.61 (95% CI 0.57–0.65) (Fig. 4). Detailed results from the tertiary analysis are presented in Supplemental Table 4.

Fig. 4

Best performing thresholds in the tertiary analysis. AUC area under the curve, 95% CI 95% confidence interval, SDNN standard deviation of R-R intervals, RMSSD root mean square differences of successive R-R intervals, RRMean mean of R-R intervals, SDSD standard deviation of successive R-R interval differences, pNN50 percentage of adjacent R-R intervals that differ from each other by more than 50 ms, HF high-frequency; 0.15–0.4 Hz, LF low-frequency; 0.04–0.15 Hz, vLF very-low-frequency; ≤ 0.04 Hz

4 Discussion

4.1 Summary of findings

In this study, predicting SAEs based on the accumulated time below threshold values for individual HRV parameters demonstrated overall low prognostic ability. RRMean was the overall best performing parameter, having the highest AUC point-estimate across the most thresholds for both any and specific SAEs. Certain HRV measures had moderate prognostic ability for specific SAEs. In the medical subgroup, thresholds for all-cause mortality, cardiovascular, infectious, and neurologic SAEs had moderate prognostic ability when comparing the last 24 h before an SAE with the last 24 h of measurements in those without SAE. When comparing the last 24 h with the 24 to 48 h of measurements before an SAE, RMSSD, RRMean, and SDSD had moderate prognostic ability for predicting any SAEs in all patients and the surgical subgroup. The best performing thresholds in both subgroups had moderate or high prognostic ability for all specific SAEs except other and infectious SAEs in the surgical subgroup, but the limited number of SAEs challenged the statistical power. When comparing HRV measurements during the first 24 h to the last 24 h of monitoring in patients without development of SAEs, all thresholds and parameters demonstrated low prognostic ability or limited discriminative ability. This indicated minimal changes in HRV during the monitoring period for patients without development of an SAE.

4.2 Comparisons with previous studies

In the medical subgroup, parameters for all-cause mortality (RMSSD, SDSD, pNN50, HF, SDNN, RRMean, LF) and cardiovascular SAEs (RMSSD, SDSD, pNN50, LF, SDNN) had moderate prognostic ability when comparing the last 24 h before an SAE with the last 24 h of measurements in those without SAE. Specifically, RMSSD, SDSD, pNN50, and LF had a lower confidence limit > 0.7 for prediction of cardiovascular SAEs. Other studies have associated reduced HRV with long-term mortality and cardiovascular SAEs in post-myocardial infarction patients [14, 18, 25,26,27]. Specifically, in a meta-analysis by Fang et al., lower HRV was associated with a pooled hazard ratio of 2.27 (95% CI 1.72, 3.00) for all-cause mortality and 1.41 (95% CI 1.16, 1.72) for cardiovascular events [18]. As multiple parameters, reflecting all different parts of the ANS, had moderate prognostic ability [14], our study suggests that autonomic dysfunction might have predictive value for mortality and cardiovascular SAEs beyond post-myocardial infarction populations, even though our study had less statistical power owing to a shorter follow-up time and fewer included patients with these outcomes.

For infectious SAEs, most HRV parameters, reflecting all different parts of the ANS [14], had moderate prognostic ability. Garrad et al. found significantly reduced LF and sympathetically mediated HRV during sepsis [19]. Likewise, Korach et al. reported a likelihood ratio of 6.47 for an LF/HF ratio < 1, indicating sympathetic failure, in patients with sepsis compared to those without sepsis [20]. The varied results may arise from Garrad et al. and Korach et al. assessing sepsis in a small number of ICU patients. In our study, both sepsis and urinary tract infections were categorized as infectious SAEs; the latter, being less severe and more common, could predominantly impact parasympathetic nervous system (PNS) or overall ANS activity.

HF, vLF, RMSSD, SDSD, and RRMean had moderate prognostic ability for neurologic SAEs in the medical subgroup, which included patients with stroke and syncope. Specifically, HF, RMSSD, and SDSD had a lower confidence limit > 0.7. Dütsch et al. found parasympathetic deficit in post-ischemic stroke patients, specifically with reduced HF [39]. Holmegard et al. found significantly lower overall HRV in patients with cardioinhibitory type of syncope compared to patients with vasopressor syncope during head-up tilt test and active standing, and cardioinhibitory patients showed dominance of sympathetic modulation [40]. This aligns with our results, as HF, RMSSD, and SDSD, reflecting short-term suppression of PNS and overall ANS activity, had significant predictive ability [14]. Directly comparing physiological responses before and after a complication is likely unreasonable due to anticipated variations in HRV measurements.

When comparing the last 24 h with the 24 to 48 h of measurements before an SAE, RMSSD, RRMean, and SDSD had moderate prognostic ability for predicting any SAE in all patients and the surgical subgroup. Previous research proposed HRV as an indicator of surgical stress [13, 41,42,43]. Frandsen et al. continuously monitored HRV in total hip arthroplasty patients, revealing decreased HRV for at least the subsequent nine days [41], but comparability is limited due to the different study design and population. As multiple thresholds had moderate or high prognostic ability for nearly all specific SAE groups in both the surgical and medical subgroup, our study may indicate that HRV changes in patients experiencing an SAE could be a sensitive prediction method. The limited statistical power must be acknowledged as no thresholds had a lower confidence limit > 0.7 and the best performing thresholds had low prognostic ability for infectious and other SAEs in the surgical subgroup, which specifically had the highest number of SAEs.

The primary exposure variable of SDNN did not demonstrate superiority as the best HRV parameter for predicting SAEs. Furthermore, in contrast to previous studies, not a single threshold and cut-off for an HRV parameter stood out as the best predictor of SAE [11, 14, 25]. RRMean demonstrated the largest point estimates of AUC for multiple specific SAEs in both subgroup analyses. This HRV parameter is less investigated compared to RMSSD and SDNN [14], but its inverse relationship with heart rate allows comparison of our findings to other studies that have likewise described an association between heart rate and complications [44,45,46], although generally in small or different populations from ours. Finally, when comparing the last 24 h with the 24 to 48 h of measurements before an SAE, the best performing thresholds for all patients and the surgical subgroup were consistently the largest thresholds investigated. This may suggest that the true optimal thresholds for SAE prediction were outside the range investigated. Similarly, previous studies have described that longer total or average exposure time with lower HRV in 24-h measurements indicates higher risk of mortality or complications [14, 26, 47]. The clinical relevance of the largest thresholds and beyond should be considered if only a limited number of patients were exposed to them.

4.3 Strengths and limitations

This study included a large sample size with a substantial number of both medical and surgical patients, even after exclusions due to SAEs occurring within the first 18 h of monitoring or patients not meeting requirements of ECG monitoring time. We were unable to find other studies that systematically explored the optimal thresholds for multiple time and frequency domain measures of HRV variables, and comprehensively evaluated HRV’s predictive performance for SAEs in hospitalised patients.

It is important to acknowledge the risk of type 1 errors, considering the explorative nature of this study that involves conducting multiple analyses. The population was included in prospective studies with specific inclusion and exclusion criteria, which limits the generalizability. The rather high exclusion of patients, especially within the medical population, emphasizes that we cannot exclude stronger associations of HRV variables with SAEs. The prognostic value of duration below thresholds for individual HRV parameters was assessed, but combining these parameters in a machine learning model could potentially enhance predictive performance.

4.4 Perspectives

While it is recognized that traditional vital signs may not consistently demonstrate accurate predictive capability for upcoming SAEs [9, 10], the utilization of accumulated time below thresholds for individual HRV parameters did not appear to have significant relevance for SAE prediction within our study population either. The moderate predictive capabilities for neurologic SAEs and cardiovascular SAEs suggest that future studies may describe this association more comprehensively, as has previously been done with long-term cardiovascular SAEs and mortality [18, 25]. Additional studies are required to validate the predictive capability of HRV for cardiovascular and neurologic SAEs in populations without pre-existing cardiovascular conditions or post-stroke [18, 25, 39]. The consistent predictive performance observed for multiple thresholds and HRV parameters in the surgical subgroup requires validation in larger studies with a more homogeneous structure [48].

Larger studies including more patients with an SAE could demonstrate enhanced predictive performance by evaluating HRV changes preceding SAEs, as indicated by our secondary analysis comparing the last 24 h with the 24 to 48 h of measurements before an SAE. Our study indicated a potential association, but the existing uncertainty prevented us from reaching a definitive conclusion.

The traditional 24-h HRV measurement, while well-validated [14], poses challenges for real-time SAE prediction, as the relevant 24-h window prior to an SAE is unknown. This method may help clinicians stratify patient risk but is less effective for immediate interventions. Integration with real-time HRV monitoring might be feasible. This approach would adaptively set thresholds based on analysis of continuous measurements, offering a more dynamic tool that could potentially allow for interventions before development of certain SAEs. This method requires much larger databases with more patients who develop SAEs during the monitoring period, because the broad variation in pathophysiology associated with SAEs complicates the development of a universal model capable of accurately predicting all SAEs or even SAEs within specific groups [49].

Future studies utilizing machine learning models to analyse multiple HRV parameters simultaneously or combine HRV data with other demographic variables may significantly enhance the predictive performance [49,50,51].

5 Conclusion

Predicting SAEs based on the accumulated time below thresholds for individual continuously measured HRV parameters demonstrated overall low prognostic ability in high-risk hospitalized patients, and no HRV parameter consistently demonstrated superiority. Multiple thresholds in the medical subgroup moderately predicted all-cause mortality, cardiovascular, infectious, and neurologic SAEs when comparing the last 24 h before an SAE with the last 24 h of measurements in those without SAE. Various thresholds had moderate prognostic ability when comparing the last 24 h with the 24 to 48 h of measurements before an SAE, suggesting HRV changes over time could be a potentially sensitive method for predicting SAEs.