Introduction

Heart rate variability (HRV) is a physiological phenomenon characterising variation in the time interval between consecutive R waves (RR-intervals). It is considered to be intricately modulated by several mechanisms including respiration, thermoregulation, hormonal activity, and the interaction of the sympathetic and parasympathetic divisions of the autonomic nervous system [1]. HRV may be considered a useful predictive marker for diverse adverse clinical outcomes. Low HRV is associated with increased mortality after myocardial infarction [2, 3], increased ICU mortality [4], poor prognosis after traumatic injury [5], and multiple organ dysfunction in patients with sepsis [6]. It is even considered to be one of the vital signs [5].

HRV can be analysed in the time domain, the frequency domain, and using non-linear methods [1, 7], with time and frequency domain analyses being the most common in the literature. Time domain analysis is normally reported as standard deviation of all normal-to-normal R-R intervals (SDNN) and root mean square of successive differences between normal heartbeats (RMSSD). Frequency-domain power spectral density analysis can be used to study cardiac autonomic balance [8]. The power spectrum of HRV consists of four components: high frequency (HF) (0.15–0.4 Hz), low frequency (LF) (0.04–0.15 Hz), very low frequency (VLF) (0.0033–0.04 Hz) and ultra-low frequency (ULF) (<0.0033 Hz). HF power has been proposed to be a marker of parasympathetic activity, but there is disagreement regarding the LF component: some studies suggest that LF power reflects sympathetic activity [8, 9], but others propose that LF power reflects both sympathetic and parasympathetic activity as well as baroreflex activity [1, 7]. Nevertheless, the interaction between the sympathetic and parasympathetic divisions is complex and can be modified by multiple stimuli [1]. VLF HRV is thought to be generated intrinsically from the heart and afferent activity of the sympathetic nervous system which is more highly activated with physical activity, while stress may modulate its amplitude and frequency [1, 10]. ULF HRV is thought to be due to very slow-acting biological processes such as circadian rhythms [11].

Spinal cord injury (SCI) leads to an imbalance in cardiogenic autonomic control. This change leads to various complications such as autonomic dysreflexia, arrhythmia and orthostatic hypotension [12, 13]. Moreover, cardiovascular disease is a major problem that leads to morbidity and mortality in individuals with long-term SCI [14]. Previous studies have shown that HRV is altered following SCI; for example, LF power was lower in individuals after SCI compared to able-bodied persons [15,16,17]. Additionally, persons with paraplegia with a sedentary lifestyle had lower HRV than those with active lifestyles [15]. These findings support the concept that HRV may provide additional objective information about cardiovascular risk and may help to raise awareness of the importance of a healthy lifestyle.

Previous HRV studies showed promising results with good to excellent reliability in able-bodied subjects or in patients after myocardial infarction [18,19,20]. Considering that reliability is not a fixed characteristic of the variable being measured, but depends on the characteristics of the individuals under investigation [21], it is necessary to determine the reliability of HRV in patients with SCI before HRV can be considered as a practical outcome measure in SCI. Although studies have previously investigated the reproducibility of HRV in individuals with SCI, the measurement time was limited to 5–10 min [22, 23]; these durations are too short to allow estimation of ULF power. Additionally, taking into consideration that wearable HR monitoring technology is now within reach of everyone, data collected in normal daily conditions without any restrictions on activity may be useful for future analysis.

The aim of this study was to investigate test-retest reliability of HRV metrics in individuals with SCI with no restrictions on activity over a long duration (24 h) and with sub-analysis of shorter durations of measurement (5-min, 10-min, 1-h, 3-h and 6-h).

Methods

Subjects

We studied individuals with SCI who were admitted to Srinagarind Hospital, the largest public hospital in the Northeast region of Thailand, from October 2019 to August 2020. Inclusion criteria were SCI of more than 3 months' duration and age ≥18 years. Exclusion criteria were abnormal breathing pattern (respiratory rate >20 breaths/min or <10 breaths/min), fever (body temperature ≥37.8 °C), concomitant cardiac disease as well as endocrine disorders including diabetes mellitus and thyroid disease. Ethical approval for this study was obtained from the Khon Kaen University Committee for Ethics in Human Research (ref. HE621279). Written informed consent was obtained from all participants before the study. The study participants were admitted for annual urological surveillance, which generally comprises a urodynamics study or bedside cystometry, ultrasound of the urological system, and voiding cystourethrography. 24-h HRV recordings were done following admission on the day prior to urological check-up. Because every participant was in the same inpatient setting, all participants had a similar daily routine: get up at 5.30–6.00 am, meals served at 8.00 am, 12.00 noon and 4.00 pm. The lights were turned off for bedtime at 9.00 pm. During the hospitalisation, the patients had both physical therapy and occupational therapy sessions, but during their free time they could do as they wished, such as moving around the hospital area or staying in bed.

Study protocol

HRV was measured over a period of 24 h starting at approximately 8 am. Individuals were required to refrain from smoking, and from drinking caffeine or alcohol for 24 h before the study. They were instructed to perform their normal daily physical activities as usual. Each measurement session was separated by at least 24 h.

HRV measurements

Raw RR intervals were obtained using a wearable heart rate monitor comprising a wrist watch receiver (Polar V800; Polar Electro Oy, Kempele, Finland) and chest belt sensor (Polar H10). Data recorded from the 24-h measurements were used for a 24-h test-retest reliability analysis. Additionally, HRV outcomes were analysed from recording durations on sub-intervals of 1, 3 and 6 h from specified periods each day (9 am–10 am, 9 am–12 noon and 9 am–3 pm) to determine the shorter duration inter-day test-retest reliability. The 5-min and 10-min segments to be used for short-term analysis were obtained from 9 am–9.05 am and 9 am–9.10 am.

Outcomes and data processing

Following each measurement, the raw RR intervals stored in the V800 receiver were uploaded to the Polar Flow application, and then exported as a text file to custom-written HRV analysis software implemented in Matlab (The Mathworks, Inc., USA). Some recordings were deemed invalid because of poor signal quality. The remaining data sets were preprocessed for artefact detection and removal. Artefact detection was performed using two methods: (i) maximal and minimal values for plausible RR values were defined (min = 400 ms; max = 1650 ms), (ii) the difference between two successive RR intervals was set to be at a maximum of ±20% of the previous value. For the removal of the detected artefacts, special care was taken not to add new information to the original data sets by removing any artificially introduced combinations of two successive RR intervals from the analysis.
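The two artefact-detection rules described above can be illustrated with a minimal Python sketch. It is not the authors' Matlab implementation: the function names are hypothetical, the ±20% rule is applied literally against the immediately preceding raw value (so the beat following an artefact is also flagged), and the handling of "artificially introduced combinations" is interpreted here as keeping only those successive pairs in which both intervals were originally adjacent and artefact-free.

```python
def flag_artefacts(rr_ms, rr_min=400.0, rr_max=1650.0, max_rel_change=0.20):
    """Return one boolean per RR interval; True marks a suspected artefact."""
    flags = []
    for i, rr in enumerate(rr_ms):
        # Rule (i): plausible physiological range, limits taken from the text.
        out_of_range = rr < rr_min or rr > rr_max
        # Rule (ii): change of more than 20% relative to the previous raw value.
        # The first interval has no predecessor, so only rule (i) applies to it.
        jump = i > 0 and abs(rr - rr_ms[i - 1]) > max_rel_change * rr_ms[i - 1]
        flags.append(out_of_range or jump)
    return flags


def valid_successive_pairs(rr_ms, flags):
    """Keep only originally adjacent pairs in which both intervals are clean.

    Assumption: this is one way to avoid creating new, artificial neighbours
    when flagged intervals are dropped from the series.
    """
    return [
        (rr_ms[i], rr_ms[i + 1])
        for i in range(len(rr_ms) - 1)
        if not flags[i] and not flags[i + 1]
    ]


# Toy RR series in ms with two implausible values (1700 and 300).
rr = [800, 810, 1700, 805, 790, 300, 795, 800]
flags = flag_artefacts(rr)
pairs = valid_successive_pairs(rr, flags)
print(flags)
print(pairs)
```

Comparing each interval against the last accepted value instead of the previous raw value would flag fewer beats after an artefact; which variant was used in the study is not stated.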

The outcome parameters consisted of both time domain and frequency domain parameters. In the time domain, the HRV metrics SDNN and RMSSD were computed. For the frequency-domain analysis, power in the ULF, VLF, LF, and HF frequency bands was calculated, together with total power (TP). The Lomb-Scargle least squares spectral analysis method for spectral density estimation was used, as it is specifically designed and optimised for non-uniformly spaced data sets such as RR time series. A recent review provides a systematic analysis of the applicability of Lomb-Scargle in the clinical HRV analysis setting [24].
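As an illustration of these outcome parameters, the following Python sketch computes SDNN, RMSSD and a classic Lomb-Scargle periodogram on a synthetic RR series. It is a simplified stand-in for the study's Matlab software, not a reproduction of it. All function names are hypothetical, SDNN is assumed to use the sample (n − 1) standard deviation, and the band values are plain sums of unnormalised periodogram ordinates, so they are only meaningful relative to each other and are not calibrated in ms².

```python
import math


def sdnn(rr_ms):
    """Sample standard deviation of the RR intervals (ms)."""
    mean = sum(rr_ms) / len(rr_ms)
    return math.sqrt(sum((x - mean) ** 2 for x in rr_ms) / (len(rr_ms) - 1))


def rmssd(rr_ms):
    """Root mean square of successive RR differences (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def lomb_scargle(t_s, y, freqs_hz):
    """Classic (unnormalised) Lomb-Scargle periodogram for uneven sampling."""
    mean = sum(y) / len(y)
    yc = [v - mean for v in y]
    power = []
    for f in freqs_hz:
        w = 2.0 * math.pi * f
        # tau shifts the time origin so the sine and cosine terms are orthogonal.
        tau = math.atan2(sum(math.sin(2 * w * t) for t in t_s),
                         sum(math.cos(2 * w * t) for t in t_s)) / (2 * w)
        c = [math.cos(w * (t - tau)) for t in t_s]
        s = [math.sin(w * (t - tau)) for t in t_s]
        yc_c = sum(a * b for a, b in zip(yc, c))
        yc_s = sum(a * b for a, b in zip(yc, s))
        power.append(0.5 * (yc_c ** 2 / sum(v * v for v in c)
                            + yc_s ** 2 / sum(v * v for v in s)))
    return power


# Band limits in Hz as given in the Introduction.
# ULF is omitted because the toy series below is far too short to resolve it.
BANDS = {"VLF": (0.0033, 0.04), "LF": (0.04, 0.15), "HF": (0.15, 0.4)}


def band_sums(freqs_hz, power):
    """Sum periodogram values per band (relative units only)."""
    return {name: sum(p for f, p in zip(freqs_hz, power) if lo <= f < hi)
            for name, (lo, hi) in BANDS.items()}


# Synthetic series: 800 ms mean RR with a 0.1 Hz (LF-band) oscillation.
t, beat_times, rr_series = 0.0, [], []
for _ in range(300):
    rr = 800.0 + 50.0 * math.sin(2 * math.pi * 0.1 * t)
    beat_times.append(t)
    rr_series.append(rr)
    t += rr / 1000.0  # beats are unevenly spaced in time

freqs = [k / 100 for k in range(1, 41)]  # 0.01 to 0.40 Hz
spectrum = lomb_scargle(beat_times, rr_series, freqs)
peak_freq = freqs[spectrum.index(max(spectrum))]
bands = band_sums(freqs, spectrum)
print(peak_freq, bands)
```

Because the beat times are used directly as the sampling instants, no resampling or interpolation of the RR series is needed, which is the property that motivates the use of Lomb-Scargle for RR data.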

Statistical analysis

Continuous parameters are presented as medians (with 25th and 75th percentiles) because the data were not normally distributed. Wilcoxon signed-rank tests were used to test paired differences of the repeated measurements within each participant, with the significance level set to α = 0.05. Relative test-retest reliability was analysed using the intraclass correlation coefficient ICC(3,1) and is presented as ICC and 95% confidence interval (CI). ICC ≥ 0.75 represents excellent reliability, ICC < 0.4 is poor reliability and ICC between these values is regarded as moderate to good reliability [25]. Absolute reliability was evaluated using the coefficient of variation (CV) [26] and Bland–Altman limits of agreement (LoA) [27]. When the data were heteroscedastic, the data were analysed using log-transformation. The LoA were then back transformed and are presented as ±bx̄, where x̄ is the mean and b is the slope of the LoA [28]. The statistical analyses were performed using SPSS (IBM SPSS Statistics for Windows, Version 28.0. Armonk, NY: IBM Corp).
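The reliability measures named above can be sketched in a few lines of Python for a two-session design. This is an illustrative sketch on made-up values, not the SPSS procedure used in the study. ICC(3,1) is computed from the standard two-way ANOVA mean squares; the CV definition (typical error, i.e. SD of the differences divided by √2, relative to the grand mean) is an assumption, since the exact formula from [26] is not given in the text; the log-transformation for heteroscedastic data and the ICC confidence interval are not covered.

```python
import math


def icc_3_1(test, retest):
    """ICC(3,1): two-way mixed effects, consistency, single measurement."""
    n, k = len(test), 2
    rows = list(zip(test, retest))
    grand = (sum(test) + sum(retest)) / (n * k)
    ss_total = sum((x - grand) ** 2 for row in rows for x in row)
    ss_rows = k * sum((sum(row) / k - grand) ** 2 for row in rows)
    ss_cols = n * sum((sum(col) / n - grand) ** 2 for col in (test, retest))
    ms_rows = ss_rows / (n - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)


def classify_icc(icc):
    """Thresholds as stated in the text [25]."""
    if icc >= 0.75:
        return "excellent"
    if icc < 0.4:
        return "poor"
    return "moderate to good"


def bland_altman(test, retest):
    """Return (bias, lower LoA, upper LoA) using bias +/- 1.96 * SD of differences."""
    diffs = [a - b for a, b in zip(test, retest)]
    bias = sum(diffs) / len(diffs)
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (len(diffs) - 1))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def cv_percent(test, retest):
    """Assumed CV definition: typical error divided by the grand mean, in percent."""
    diffs = [a - b for a, b in zip(test, retest)]
    bias = sum(diffs) / len(diffs)
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (len(diffs) - 1))
    grand = (sum(test) + sum(retest)) / (2 * len(test))
    return 100.0 * (sd / math.sqrt(2)) / grand


# Hypothetical SDNN values (ms) for five participants on two days.
day1 = [40, 55, 62, 48, 70]
day2 = [42, 53, 65, 50, 68]

icc = icc_3_1(day1, day2)
bias, loa_low, loa_high = bland_altman(day1, day2)
cv = cv_percent(day1, day2)
print(round(icc, 3), classify_icc(icc), round(cv, 2), round(loa_low, 2), round(loa_high, 2))
```

The ICC expresses how well participants keep their rank between sessions relative to the between-subject spread, whereas the CV and LoA express the size of the day-to-day difference in the units of the measurement; this is why both relative and absolute reliability are reported.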

Results

Seventy-two individuals participated in this study. During HRV data processing, some data sets were deemed invalid, leading to data from 45 individuals for further analysis. In 17 participants with tetraplegia (34 HRV recordings, 17 data pairs), 6 data pairs were invalid (some single data were valid but because the other was invalid the pair had to be excluded). These were due to 7 inadequate signal durations, 6 noisy signals and 1 signal gap (some data had more than one problem). In 55 participants with paraplegia (110 HRV recordings, 55 data pairs), 21 data pairs were excluded due to 12 noisy signals, 11 inadequate signal durations, 7 multiple skipped heart rate measurements and 3 signal gaps (Supplementary Material 1). Our study had 11 participants with tetraplegia and 34 participants with paraplegia; 71% were male. The mean age was 48.6 years and the median duration after SCI was 5 years (Table 1).

Table 1 Demographic data of participants with SCI (n = 45).

There were no statistically significant differences (p > 0.05) in any pairs of HRV values for any recording duration (5-min, 10-min, 1-h, 3-h, 6-h and 24-h) except for LF for the 10-min duration (Table 2).

Table 2 Test-retest reliability of HRV for each duration (5-min, 10-min, 1-h, 3-h, 6-h and 24-h) in all participants (n = 45).

Relative reliability

HRV values for the 5-min duration showed poor reliability of SDNN (ICC of 0.34) and moderate to good reliability of RMSSD, HF, LF and TP (ICC of 0.40–0.72). The 10-min duration showed poor reliability in VLF and TP (ICC of 0.16 and 0.22), moderate to good reliability in SDNN, HF and LF (ICC of 0.43–0.65), and excellent reliability in RMSSD (0.76). HRV outcomes from the 1-h duration showed excellent reliability for LF (ICC = 0.76), moderate to good reliability for SDNN, RMSSD, HF and VLF (ICC of 0.46–0.74), but ULF and TP showed poor reliability (ICC of 0.06 and 0.30, respectively). ULF and TP did however demonstrate markedly increased reliability for 3-h duration (ICC of 0.70 and 0.74, respectively). Relative reliability was excellent (ICC of 0.77–0.92) in all HRV parameters for the 6-h and 24-h durations (Table 3 and Fig. 1).

Table 3 Summary of test-retest reliability of heart rate variability for each duration (5-min, 10-min, 1-h, 3-h, 6-h and 24-h) in all participants (n = 45).
Fig. 1: ICCs of each HRV parameter.

SDNN, RMSSD, HF, LF, VLF, ULF and TP are shown for each time interval (5-min, 10-min, 1-h, 3-h, 6-h and 24-h) in participants with SCI (n = 45).

Absolute reliability

Better absolute reliability was found for longer durations. CVs were in the range of 40.6–144.1% for HRV values of 5-min and 10-min duration and decreased to 14.9–42.5% for the 24-h duration. Generally, CV decreased by more than half in all recorded HRV parameters towards the 24-h duration (Fig. 2). There was better CV in the time domain outcomes compared to frequency domain outcomes (Table 3). Overall, Bland–Altman plots showed narrower limits of agreement in all HRV parameters as the observation period increased (Figs. 3 and 4).

Fig. 2: Coefficient of variation of each HRV parameter.

SDNN, RMSSD, HF, LF, VLF, ULF and TP are shown for each time interval (5-min, 10-min, 1-h, 3-h, 6-h and 24-h) in participants with SCI (n = 45).

Fig. 3: Bland–Altman plot.

Mean differences and 95% limits of agreement (LoA) among time domain HRV measures (SDNN and RMSSD) in 5-min, 10-min, 1-h, 3-h, 6-h and 24-h in participants with SCI (n = 45). The diagonal lines represent the 95% LoA.

Fig. 4: Bland–Altman plot.

Mean differences and 95% LoA among frequency domain HRV measures (HF, LF, VLF, ULF and TP) in 5-min, 10-min, 1-h, 3-h, 6-h and 24-h in participants with SCI (n = 45). The diagonal lines represent the 95% LoA.

Test-retest reliability across groups based on risk of autonomic dysreflexia

Eighteen participants were in the high AD risk group (SCI level at or above T6). There were no significant differences between any pairs of HRV values for any duration in either group. Participants with high AD risk showed lower test-retest reliability of all HRV metrics compared to participants with low AD risk for the 5-min and 10-min durations. Additionally, they had lower test-retest reliability of HF and LF values compared to participants with low AD risk for all durations. The ICCs of HF were 0.31, 0.43, 0.43, 0.26, 0.59 and 0.66 for participants with SCI level at or above T6, while the ICCs were 0.84, 0.84, 0.76, 0.82, 0.87 and 0.89 for participants with lesion below T6 (for the 5-min, 10-min, 1-h, 3-h, 6-h and 24-h durations, respectively). ICCs of LF were 0.23, 0.32, 0.36, 0.58, 0.75 and 0.82 for participants with SCI level at or above T6, while ICCs were 0.64, 0.81, 0.90, 0.92, 0.94 and 0.95 for participants with lesion below T6. For these outcomes, participants with lesion level at or above T6 therefore had lower reliability for all durations. However, participants with a lesion level at or above T6 showed better relative reliability in SDNN, ULF and TP for the 3-h, 6-h and 24-h durations. HF and ULF were the least reliable HRV outcomes based on the lowest absolute reliability in both groups. The highest absolute reliability, as classified by smallest CV and narrowest limits of agreement, was found in the time domain metrics and for the 24-h duration in both groups. The CVs were 12.9% and 13.6% for SDNN and RMSSD in participants with SCI level at or above T6, and 17.3% and 15.7% in participants with SCI level below T6 (Supplementary Material 2, Tables 1 and 2).

Test-retest reliability across groups based on tetraplegia or paraplegia

The group of persons with paraplegia (n = 34) showed a significant difference in LF in the 10-min pair (p = 0.022). Persons with tetraplegia (n = 11) showed a significant difference between ULF pairs for the 1-h duration (p = 0.007). Participants with tetraplegia demonstrated lower relative test-retest reliability than those with paraplegia in most HRV metrics for every duration. For example, participants with tetraplegia exhibited lower test-retest reliability of HF and LF than participants with paraplegia: the ICCs of HF were 0.34, 0.50, 0.53, 0.20, 0.57 and 0.51 for participants with tetraplegia, while the ICCs were 0.67, 0.72, 0.66, 0.81, 0.86 and 0.89 for participants with paraplegia (for the 5-min, 10-min, 1-h, 3-h, 6-h and 24-h durations, respectively). The ICCs of LF were 0.17, 0.41, 0.73, 0.37, 0.64 and 0.66 for participants with tetraplegia, while the ICCs were 0.45, 0.60, 0.77, 0.92, 0.93 and 0.93 for participants with paraplegia. Thus, the group with tetraplegia had lower reliability of these outcomes for all durations. The highest absolute reliability, as classified by lowest CV and narrowest limits of agreement, was found in the time domain metrics and for the 24-h duration in both groups (CVs of 13.3% and 11.6% for SDNN and RMSSD in participants with tetraplegia, and CVs of 16.4% and 15.9% in participants with paraplegia) (Supplementary Material 2, Tables 3 and 4).

Discussion

This study aimed to investigate test-retest reliability of HRV metrics in individuals with SCI with no restrictions on activity over a long duration (24-h) and with sub-analysis of shorter durations of measurement (5-min, 10-min, 1-h, 3-h and 6-h). Based on ICC values ranging from 0.77 to 0.92, excellent relative reliability was found in all HRV parameters derived from 6-h and 24-h periods. Overall, the time-domain parameters were more reliable than the frequency domain parameters.

Regarding 5-min HRV metrics, La Fountaine et al. conducted a test-retest reliability study of HRV in seven participants with tetraplegia in the supine position and found moderate to good relative reliability of HF and LF (ICC of 0.66 and 0.44) [23]. Our study also showed that HF exhibited better relative test-retest reliability compared to LF for the 5-min duration, but our ICC values (ICC of 0.34 and 0.17 for HF and LF) were lower than those reported in the previous study. The difference in position and activity of participants may have played a role in this discrepancy. Ditor et al. [22] examined the test-retest reliability of 10-min HRV and found that the reliability of HF was poorer than that of LF: the ICCs for HF and LF were 0.53 and 0.84 in ten participants with SCI (all levels), and 0.66 and 0.82 in six participants with tetraplegia. In contrast to the results of Ditor et al., we found better relative reliability of HF compared to LF for the 10-min duration. The ICCs of HF and LF in the overall SCI population were 0.65 and 0.60, and the ICCs of HF and LF were 0.50 and 0.41 in participants with tetraplegia. There seem to be conflicting results among studies regarding whether LF or HF showed better relative reliability, so this may need to be interpreted with caution [22, 23]. In our study, however, it should be noted that LF showed better absolute reliability, as evidenced by a smaller CV and narrower LoA compared to HF.

The reliability of long-term duration was comparable to previous studies in healthy individuals, patients with coronary disease and patients with hypertensive disease, which showed moderate to very good correlations of 0.60 and 0.98 [29, 30]. Those authors gave a cautionary note that some individuals without heart disease had considerably higher day-to-day variation in heart rate variability, so that care should be exercised when interpreting HRV outcomes from healthy individuals [29, 30]. These previous studies explored long term recording of HRV, but they used only correlation analysis and other reliability outcomes such as ICC or CV were not reported.

Regarding the relative reliability, different ranges of ICC are well defined and recognised for interpretation as poor, moderate to good, or excellent. However, CV and LoA were interpreted differently among previous studies. For example, the CV regarded as good reliability varied among studies in the range 10–30% [31, 32]. Based on the ICC, together with the CV and LoA, we found that the reliability for long term measurement was excellent especially for time domain parameters compared to frequency domain parameters. This finding was consistent with previous studies in patients with cardiac disease, patients with hypertension as well as in able-bodied subjects [18, 29, 33, 34].

The CV for 24-h recording of HRV observed in individuals with SCI in this study (14.9–42.5%) was comparable to those of healthy subjects (6–88%), but the CV for short-term HRV values in our study of 32–123% was much higher than elsewhere [34, 35]. The finding of lower reliability may have several causes. Firstly, it has been found that in clinical populations HRV is less reliable compared to able-bodied subjects. For example, Lord et al. found CV of LF power of 45% in controls and 76% in heart transplant patients [36]. Secondly, as our subjects have unique cardiogenic autonomic balance, especially those with injury level at or above T6 [37], higher day to day variation can occur. This is demonstrated by the lower test-retest reliability we found in the group with high AD risk and especially in participants with tetraplegia. Additionally, HRV is known to be affected by factors including physical activity level, rate and depth of respiration, postural change and acute psychological factors, as described in previous studies [18, 38,39,40]; since our study did not limit activity, it is possible that HRV varied more than in controlled conditions.

In general, a data recording period of 24 h is recommended when ULF power is to be analysed [1]. However, in our study, the shortest period over which we examined all HRV parameters including ULF power was 1 h. According to the Nyquist-Shannon theorem, in order to gain an adequate waveform to analyse the data, the sampling frequency has to be at least twice the frequency of interest [41]. In practice, however, measurement noise makes it necessary to increase the sampling rate to at least 10 times the theoretical lower bound. Since the upper frequency bound of ULF is 0.0033 Hz, corresponding to a period of approximately 5 min, a recording interval of tenfold this period, i.e. at least approximately one hour, is required.
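The arithmetic behind this one-hour figure can be checked with a short Python calculation. The variable names are illustrative only; the tenfold factor and the 0.0033 Hz bound are taken from the text.

```python
ULF_UPPER_HZ = 0.0033  # upper frequency bound of the ULF band, from the text
SAFETY_FACTOR = 10     # tenfold margin stated in the text

# Period of the fastest ULF oscillation, converted from seconds to minutes.
ulf_period_min = 1.0 / ULF_UPPER_HZ / 60.0

# Recording length spanning ten such periods.
min_recording_min = SAFETY_FACTOR * ulf_period_min

print(f"ULF period: {ulf_period_min:.2f} min")
print(f"Minimum recording: {min_recording_min:.1f} min")
```

One period at 0.0033 Hz lasts about 303 s, so ten periods span roughly 50 min, which is rounded up to the 1-h segment used as the shortest duration for ULF analysis.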

A limitation of our study was the reduced sample size caused by rejection of multiple data sets, principally because the HR chest belt sensor shifted during normal daily activities. The high percentage of invalid recordings implies that 24-h recordings with wearable devices, specifically for the purpose of HRV analysis within the SCI population, are challenging. Care should therefore be exercised by patients and their carers to ensure, as far as possible, that the chest belt remains in position. The data in participants with tetraplegia were mainly excluded due to inadequate signal duration. This may be due to the removal of the HR belt before the proposed time by the patients or their relatives. There were no HRV data from participants with complete tetraplegia. Therefore, the generalisation of the data may be limited in this regard. It should be noted that although medications affect HRV, they should not affect the repeatability of HRV as the patients were taking the same medications every day [42,43,44,45].

Future work should focus on improving methods for HR measurement to achieve acceptable reliability in all HRV parameters in a shorter period, e.g. during measurement at rest, with controlled breathing or with limited postural change. The test-retest reliability data in this study were obtained with no restriction of activity; thus, improvement in reliability under more-controlled conditions can be expected.

Conclusions

Relative reliability of HRV was excellent for 6 and 24-h recording durations and the best absolute reliability values were for 24-h recordings. Taking into consideration both relative and absolute reliability, longer-duration recording led to progressively better reliability. Time-domain HRV outcomes were more reliable than frequency domain outcomes. Participants with high risk of AD, particularly those with tetraplegia, showed lower reliability, especially for HF and LF. Additionally, there were challenges in acquiring long-duration recordings using the wearable devices without any restriction in activity in participants with SCI. Care should be taken to ensure that the chest belt remains in position.