Introduction

Epileptic seizures are potentially dangerous as they can lead to complications, including injury, status epilepticus, and sudden unexpected death in epilepsy (SUDEP) [1]. Adequate seizure detection may help to minimize these complications and to improve treatment evaluation, as seizures, particularly those at night, are often underreported [2,3,4,5]. Detection devices may also help to improve the independence and quality of life of people with epilepsy and their caregivers [3, 6].

Several parameters, including movement, sound, and autonomic nervous system changes, can be used to detect seizures. This review focuses on changes in autonomic function, including cardiovascular, respiratory, and transpiration changes [7]. Seizures can alter autonomic function, particularly if the central autonomic network is involved. The most common expression is a sudden increase in sympathetic tone [7, 8]. Ictal tachycardia (IT) is a very frequent sign, with prevalence rates ranging from 80 to 100% [9, 10]. IT is a hallmark of convulsive seizures (i.e., focal to bilateral tonic–clonic as well as generalized tonic–clonic seizures), and more common in temporal lobe vs. extratemporal lobe seizures [9]. Changes in autonomic function can precede ictal electroencephalographic (EEG) discharges by several seconds [10,11,12]. Preictal tachycardia has an incidence rate of approximately one-third of seizures [13]. Autonomic alterations may therefore provide an adequate tool for early seizure detection and facilitate timely interventions. Ictal arrhythmias and desaturations are more common but are thought to be self-limiting, while postictal arrhythmias and apneas may lead to SUDEP [14,15,16,17]. SUDEP usually occurs several minutes after a convulsive seizure (mean 10 min, range 2–17 min) [18]. Raising an alarm at seizure onset may be sufficient to allow timely intervention.

We aimed to systematically review different seizure detection algorithms based on autonomic function changes.

Methods

This systematic review was conducted in accordance with the preferred reporting items for systematic reviews and meta-analyses (PRISMA) guideline [19].

The PubMed and Embase databases were systematically searched through May 2018 for original studies validating an algorithm for automatic seizure detection based on heart rate (HR), heart rate variability (HRV), oxygen saturation (SpO2), electrodermal activity (EDA, reflecting changes in transpiration), or a combination of the aforementioned. A sequence of synonyms for ‘autonomic variables,’ ‘seizures,’ and ‘detection’ was used as search terms (see Table S1 in the Electronic supplementary material, ESM). Studies were included if they met the following criteria: (1) human studies; (2) written in English; (3) reporting on children or adults with any type of epilepsy; (4) validating an algorithm for automatic seizure detection using autonomic parameters; (5) reporting at least one performance measure [sensitivity, positive predictive value (PPV), false alarm rate (FAR), or detection latency (DL)]. Studies on neonates only were excluded, because both seizure and autonomic function characteristics differ greatly at this age compared with older ages. Pilot studies lacking performance data, as well as conference abstracts and reviews, were also excluded (Fig. 1).
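The four performance measures named above can be made concrete with a minimal sketch. It assumes that an alarm counts as a true detection when it falls within a fixed window around a reference seizure onset; the window length, the function name `evaluate`, and the `Performance` container are illustrative assumptions and do not come from the included studies, which used varying definitions.

```python
from dataclasses import dataclass
from math import nan
from statistics import fmean
from typing import List, Optional


@dataclass
class Performance:
    sensitivity: float               # detected seizures / all seizures
    ppv: float                       # true alarms / all alarms
    far_per_hour: float              # false alarms per recorded hour
    mean_latency_s: Optional[float]  # mean (first alarm - onset); negative = before onset


def evaluate(onsets_s: List[float], alarms_s: List[float],
             recording_hours: float, window_s: float = 60.0) -> Performance:
    """Score alarms against reference onsets (all times in seconds).

    Assumption: an alarm matches a seizure if it lies within +/- window_s
    of the onset. This matching rule is hypothetical.
    """
    def matches(alarm: float, onset: float) -> bool:
        return abs(alarm - onset) <= window_s

    # One latency per detected seizure, taken from its earliest matching alarm.
    latencies = []
    for onset in onsets_s:
        hits = [a for a in alarms_s if matches(a, onset)]
        if hits:
            latencies.append(min(hits) - onset)

    true_alarms = sum(1 for a in alarms_s
                      if any(matches(a, o) for o in onsets_s))
    false_alarms = len(alarms_s) - true_alarms

    return Performance(
        sensitivity=len(latencies) / len(onsets_s) if onsets_s else nan,
        ppv=true_alarms / len(alarms_s) if alarms_s else nan,
        far_per_hour=false_alarms / recording_hours,
        mean_latency_s=fmean(latencies) if latencies else None,
    )
```

Because the matching window directly affects sensitivity, PPV, and FAR, results from studies that define it differently are not directly comparable.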

Fig. 1 Flowchart of the search for applicable studies

One author (AvW) screened all titles and abstracts, as well as the full texts of the remaining studies. For each article included, the following parameters were recorded: method of automatic seizure detection, type of autonomic variable, individual characteristics, number and types of seizures analyzed, prospective or retrospective validation, total recording time and performance of the algorithm (including sensitivity, PPV, FAR, and DL). We compared algorithm performance using multimodal autonomic parameters versus those using single modalities, provided that the studies (1) had a similar design (prospective vs. retrospective) and (2) reported both sensitivity and FAR.

The quality of the included studies was evaluated using the QUADAS-2 [20]. This tool consists of four domains (patient selection, index test, reference standard, and flow and timing) and different signaling questions to assist in judgments of the risk of bias and applicability. Additionally, we assessed all included studies according to the recently proposed standards for clinical validation of seizure detection devices (SDDs) [21].

Results

Out of the 638 articles identified, 86 studies were selected on the basis of title and abstract. After full-text screening, 21 studies were included for further analysis. Most of the excluded articles lacked the validation of a seizure detection algorithm (see Fig. 1). The characteristics of the included studies are summarized in Table 1. Most of the studies (n = 15) focused on ictal cardiac changes as a tool for seizure detection algorithms, including HRV (n = 10) [8, 22,23,24,25,26,27,28,29,30], HR (n = 4) [31,32,33,34], and changes in QRS morphology (n = 1) [35]. Six studies used multimodal algorithms, including combinations of HR, corrected QT interval (QTc), SpO2, EDA, and accelerometry (ACC) [2, 36,37,38,39,40]. None of the included studies validated an algorithm based on oxygen saturation or EDA alone. Most studies were conducted in adults, but two studies included a pediatric population [23, 40], and six studies included both children and adults [22, 25, 35,36,37, 39]. Fourteen studies prospectively enrolled their participants [8, 22, 23, 26, 28, 30,31,32,33, 36,37,38,39,40], but only two studies prospectively validated their algorithm [31, 33].

Table 1 Characteristics of included studies

Most studies had small sample sizes (median population size 14, IQR 7–26). The number of seizures analyzed per patient tended to be low (median number of seizures per participant 3, IQR 2–7). The total recording time used to validate the algorithm varied from 7 min to 158 h per person (median recording time per participant 34 h, IQR 3–86 h), but was not specified in two studies. Seizure onset was mostly focal (n = 14) [8, 22, 24,25,26, 28, 30, 31, 33, 34, 37, 39, 40, 42], but was focal and generalized in some (n = 4) [2, 23, 35, 42] or not specified in others (n = 3) [32, 36, 38].

All four performance measures (sensitivity, PPV, FAR, and DL) were only reported in three out of 21 studies [22, 33, 39]; eight studies reported three [2, 23,24,25, 28, 30, 31, 42], eight studies reported two [8, 26, 34, 36,37,38, 40, 43], one study reported one [41], and one study only reported sensitivity and PPV data for some of the subjects [32].

Heart rate analysis

Heart rate was monitored using single or multiple lead electrocardiography (ECG) in 14 of 18 studies [8, 22,23,24,25,26, 28, 32, 34,35,36,37, 42, 43]. Alternative methods included photoplethysmography (PPG) in a wearable sensor (n = 2) [2, 30] and an implanted heart rate sensor (AspireSR) (n = 2) [31, 33].

Heart rate measurement was done using various methods of R-peak detection, including those proposed by Pan and Tompkins [30, 41], Kohler [28], Yeh and Wang [22,23,24], or unspecified methods [8, 25, 26, 31,32,33,34, 42]. Some studies applied noise filtering techniques to diminish false R-peak detection, including high- and low-pass noise filters [8, 22,23,24, 26, 30] or a specific algorithm (baseline estimation and denoising with sparsity) [42].

One case study prospectively assessed an HR algorithm using a vagal nerve stimulation (VNS) device with a fixed HR sensitivity threshold [33]. Alarms were generated when the HR augmentation exceeded 50% of the baseline HR. Eleven out of 12 seizures were detected (sensitivity 92%), together with 128 false alarms (FAR 1.88/h; 68 h recordings). A second prospective validation study of the same VNS device compared different HR thresholds (≥ 20%, ≥ 40%, and ≥ 60% increases from baseline) in 16 adults with refractory epilepsy [31]. Lower thresholds resulted in higher sensitivity and higher FAR than higher thresholds (e.g., sensitivity 59.3% and FAR 7.2/h for threshold ≥ 20% vs. sensitivity 18.8% and FAR 0.5/h for threshold ≥ 60%).
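The trade-off between threshold and false alarms can be illustrated with a minimal sketch of a relative HR threshold detector. It is not the algorithm of the VNS device: the baseline is assumed here to be the mean of the preceding samples, and the names `hr_threshold_alarms`, `baseline_len`, and `rel_increase` are hypothetical.

```python
from statistics import fmean
from typing import List


def hr_threshold_alarms(hr_bpm: List[float], baseline_len: int = 30,
                        rel_increase: float = 0.5) -> List[int]:
    """Return sample indices at which an alarm is raised.

    Assumption: baseline HR is the mean of the preceding `baseline_len`
    samples. An alarm is raised once per excursion, at the first sample
    where HR >= baseline * (1 + rel_increase).
    """
    alarms = []
    above = False
    for i in range(baseline_len, len(hr_bpm)):
        baseline = fmean(hr_bpm[i - baseline_len:i])
        exceeded = hr_bpm[i] >= baseline * (1.0 + rel_increase)
        if exceeded and not above:  # rising edge only
            alarms.append(i)
        above = exceeded
    return alarms


# Synthetic trace: a mild HR rise (75 bpm) and a marked rise (100 bpm)
# on a 60 bpm background.
trace = [60.0] * 40 + [75.0] * 5 + [60.0] * 40 + [100.0] * 5 + [60.0] * 10
low_threshold = hr_threshold_alarms(trace, rel_increase=0.2)
high_threshold = hr_threshold_alarms(trace, rel_increase=0.5)
```

On this synthetic trace the 20% threshold fires on both excursions, whereas the 50% threshold fires only on the larger one, which mirrors the reported pattern of higher sensitivity and higher FAR at lower thresholds.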

Similar effects of varying the thresholds (for both the relative HR increase and the duration of HR increase) were reported in two studies on retrospectively validated HR algorithms [32, 34]. A follow-up using the same dataset examined different factors that may influence the probability of seizure detection [44]. The best regression model was created with variables including age, gender, etiology, seizure class, and years with epilepsy.

Heart rate variability (HRV)

All of the HRV-focused studies performed retrospective validations [8, 22,23,24,25,26, 28, 30, 41, 42]. Different HRV features were selected and specific feature thresholds were classified as ‘ictal’ or ‘interictal.’ Nine out of ten HRV studies applied linear analysis [8, 22,23,24,25, 28, 30, 41, 42] using time domain [22,23,24,25, 28, 30, 41, 42] and frequency domain [8, 25, 28, 41, 42] features. Time domain analysis focuses on the instantaneous HR; the interval between two normal QRS complexes, abbreviated to ‘NN.’ Different time domain features, such as the mean NN interval or the distribution of NN have been used for seizure detection. Four studies extracted and classified these time domain features using a support vector machine (SVM) classifier and validated the same HRV algorithm in different populations [22,23,24, 30]. The first retrospective study of 17 people with temporal lobe epilepsy found a mean sensitivity of 83.2% with a FAR of 2.01/h [22]. The second study extracted ECG or PPG data from three different heart rate sensors worn by 11 adults with temporal lobe epilepsy [30]. The best performance was obtained using a wearable ECG device, with a sensitivity of 64% and a FAR of 2.35/h. A third study tested the algorithm in 28 children and showed a higher overall sensitivity (81.3%) and a lower FAR (0.75/h) [23]. Performance, particularly FAR, improved when applying a patient-specific heuristic classifier. The latter was confirmed in the fourth study of data from 19 people with temporal lobe epilepsy from a pre-existing epilepsy database [24]. The authors also proposed an adaptive seizure detection algorithm, and showed that similar results were obtained with simulated ‘real-time’ user feedback.
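As an illustration of time domain analysis, the sketch below computes a few standard features from a series of NN intervals. The mean NN interval is mentioned above; SDNN and RMSSD are widely used HRV measures added here as examples and are not necessarily the features selected in the cited studies. The function name is hypothetical.

```python
from math import sqrt
from statistics import fmean, stdev
from typing import Dict, List


def time_domain_features(nn_ms: List[float]) -> Dict[str, float]:
    """Compute basic time domain HRV features from NN intervals (ms).

    Requires at least two intervals.
    """
    successive_diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]
    mean_nn = fmean(nn_ms)
    return {
        "mean_nn_ms": mean_nn,
        "mean_hr_bpm": 60000.0 / mean_nn,  # 60,000 ms per minute
        "sdnn_ms": stdev(nn_ms),           # spread of the NN distribution
        "rmssd_ms": sqrt(fmean([d * d for d in successive_diffs])),
    }
```

In a detection pipeline such features would typically be computed over a sliding window and passed to a classifier, such as the SVM described above.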

Frequency domain analysis is used to extract the frequency components of the HR signal, each with its own physiological footprint: low frequency (LF 0.04–0.15 Hz), high frequency (HF 0.15–0.40 Hz), very low frequency (VLF 0.0001–0.04 Hz), and very high frequency (VHF 0.4–0.5 Hz). Different frequencies were identified by power spectral density analysis of HRV in four studies [8, 25, 28, 41], and two studies sped up this process by applying an efficient algorithm [fast Fourier transform (FFT)] [8, 28]. The LF/HF ratio, reflecting the balance of sympathetic and parasympathetic function, was examined in two studies [25, 41]. One of these studies tested a seizure detection algorithm combining both time and frequency domain features on 11 focal seizures upon awakening [25]. Ten of the 11 seizures were detected prior to seizure onset (sensitivity 91%, DL − 494 ± 262 s). Another study of seven adults with focal epilepsy that used time–frequency analysis of HRV based on a combination of the matching-pursuit and Wigner–Ville distribution algorithms reported a sensitivity of 96.4% with high FAR (5.4/h) [42]. Combining ECG and EEG algorithms yielded better performance (sensitivity 100%, FAR 1.6/h).
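A minimal sketch of the LF/HF computation follows. It assumes the NN series has already been resampled onto an evenly spaced time grid (the resampling step is omitted), and it uses a plain discrete Fourier transform rather than an FFT for brevity. The function names are hypothetical, and the band edges are those given above.

```python
import cmath
import math
from typing import List


def band_power(x: List[float], fs_hz: float, f_lo: float, f_hi: float) -> float:
    """Sum unnormalized periodogram power for f_lo <= f < f_hi.

    Assumption: `x` is evenly sampled at `fs_hz`. The mean is removed
    before the transform. Normalization is omitted because only the
    ratio between bands is used.
    """
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    power = 0.0
    for k in range(1, n // 2 + 1):
        f = k * fs_hz / n
        if f_lo <= f < f_hi:
            coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                        for i, v in enumerate(centered))
            power += abs(coeff) ** 2
    return power


def lf_hf_ratio(x: List[float], fs_hz: float) -> float:
    """Ratio of LF (0.04-0.15 Hz) to HF (0.15-0.40 Hz) power."""
    lf = band_power(x, fs_hz, 0.04, 0.15)
    hf = band_power(x, fs_hz, 0.15, 0.40)
    return lf / hf if hf > 0 else math.inf
```

For a synthetic series with an oscillation of amplitude 20 ms at 0.1 Hz and one of amplitude 10 ms at 0.25 Hz, the ratio equals the squared amplitude ratio, i.e., 4.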

To assess the dynamic properties of ictal HR changes, nonlinear analysis can be applied, such as a Lorenz (or Poincaré) plot. This method plots the current R–R interval against the next R–R value. Standard deviations in the transverse (SD1) and longitudinal (SD2) directions of these plots can be calculated, and higher ratios of SD2/SD1 reflect increased sympathetic tone. These ratios can be used in seizure detection algorithms, since an increase in sympathetic tone is often seen during the preictal and early ictal phases. One small retrospective study proposed the modified cardio sympathetic index (mCSI) as a new measure in seizure detection that reflects the sympathetic tone [26]. A seizure detection algorithm based on changes in mCSI yielded a sensitivity of 88% in five people with temporal lobe epilepsy (FAR not reported). A larger follow-up study of adults with focal epilepsy compared frequency domain analysis with Lorenz plot analysis [8]. mCSI appeared more sensitive, but FARs were not reported.
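The Lorenz plot descriptors can be sketched as follows, using the common definition in which SD1 and SD2 are the standard deviations of the point cloud after a 45-degree rotation. The mCSI itself is not computed, because its formula is not given in this review. The function names are hypothetical.

```python
from math import inf, sqrt
from statistics import stdev
from typing import List, Tuple


def poincare_sd(rr_ms: List[float]) -> Tuple[float, float]:
    """Return (SD1, SD2) of the plot of RR[n+1] against RR[n].

    SD1: spread perpendicular to the line of identity (transverse).
    SD2: spread along the line of identity (longitudinal).
    Requires at least three intervals.
    """
    pairs = list(zip(rr_ms[:-1], rr_ms[1:]))
    sd1 = stdev([(nxt - cur) / sqrt(2) for cur, nxt in pairs])
    sd2 = stdev([(nxt + cur) / sqrt(2) for cur, nxt in pairs])
    return sd1, sd2


def sd2_sd1_ratio(rr_ms: List[float]) -> float:
    """Higher values are interpreted above as increased sympathetic tone."""
    sd1, sd2 = poincare_sd(rr_ms)
    return sd2 / sd1 if sd1 > 0 else inf
```

A steadily shortening or lengthening RR series yields points along the line of identity (large SD2, small SD1), whereas beat-to-beat alternation yields the opposite pattern.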

The two remaining studies of HRV combined linear and nonlinear analysis [28, 41]. The first retrospective study of seven people with focal epilepsy reported an overall sensitivity of 88.3% with a specificity of 86.2% after selecting an optimal performance threshold for each patient [41]. The second study combined time–frequency and Lorenz plot analysis with a second nonlinear analysis of ‘sample entropy’ [28]. This parameter quantifies the regularity and complexity of a time series, and entropy decreases can be seen during the ictal phase. Applying all of these methods together to ECG data from twelve temporal lobe epilepsy patients resulted in overall sensitivity of 94.1% with a FAR of 0.49/h.
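Sample entropy can be sketched with its standard definition: the negative logarithm of the ratio between the number of matching template pairs of length m + 1 and of length m. The defaults below (m = 2, an absolute tolerance r, inclusive matching) are assumptions for illustration and are not the settings of the cited study [28].

```python
import math
from typing import List


def sample_entropy(x: List[float], m: int = 2, r: float = 0.2) -> float:
    """Sample entropy of `x` with embedding dimension m and tolerance r.

    Assumptions: r is an absolute tolerance in the units of `x` (it is
    often chosen as a fraction of the standard deviation), templates are
    compared with the Chebyshev distance, and self-matches are excluded.
    Returns math.inf when no matches are found.
    """
    n = len(x)

    def count_matches(length: int) -> int:
        count = 0
        # The same n - m template start positions are used for both lengths.
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if max(abs(x[i + k] - x[j + k]) for k in range(length)) <= r:
                    count += 1
        return count

    b = count_matches(m)      # matches of length m
    a = count_matches(m + 1)  # matches of length m + 1
    if a == 0 or b == 0:
        return math.inf
    return -math.log(a / b)
```

A perfectly periodic series gives a sample entropy of zero, and irregular series give higher values. This is consistent with the interpretation above that a drop in entropy reflects a more regular signal.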

Another retrospective study reported two different seizure detection algorithms based on changes in QRS morphology (algorithm 1) and cardiorespiratory interactions (algorithm 2) [35]. The first algorithm captured five consecutive QRS complexes, aligned them with respect to the R peak, and assembled them into one QRS matrix. Principal component analysis was used to select different features from this QRS matrix. This process was repeated for every heart beat, which resulted in a sensitivity of 89.5–100% for detecting focal onset seizures and 86% for generalized onset seizures. The second algorithm was based on the well-known modulatory effects of respiration on HRV. These cardiorespiratory changes were quantified using phase-rectified signal averaging—a methodology used to detect quasi-periodicities in nonstationary signals such as the resampled RR interval time series—and were used for seizure detection. Slightly better performance was achieved by the second algorithm, which yielded a sensitivity of 100% for focal onset seizures and 90% for generalized onset seizures. In this study, 10.4–90% of the generated alarms were false, and this percentage was lower for the second algorithm.

Combining autonomic parameters

All multimodal autonomic algorithms were retrospectively validated. A combination of three biosignals, measured by two different devices, was used for seizure detection in a study of ten subjects with focal epilepsy [2]. An algorithm based on a specific seizure pattern of increased HR, decreased SpO2, and increased EDA was able to detect all seizures in six out of ten patients with a low FAR of 0.015/h. Specific thresholds of HR, QTc, and SpO2 were combined in an algorithm tested on a larger study population of 45 people with refractory epilepsy [37]. Only half of the collected data was used for analysis, and a sensitivity of 81–94% was found for focal to bilateral tonic–clonic seizures, while focal seizures without bilateral spreading showed worse performance, with a sensitivity of 25–36%. Overall FAR ranged from 0.4 to 2.4/h.

Three other retrospective validation studies combined EDA and accelerometry (ACC), measured with one device [38,39,40]. Different classifiers were used to select features of EDA and ACC. The first study tested two machine learning algorithms, the k-nearest neighbor (kNN) and random forest classifiers. The kNN classifier achieved the best results with 11 features, and was most sensitive for nonmotor seizures (sensitivity 97.1%, FAR not reported). The random forest classifier selected 26 features and showed its best performance with motor seizures (sensitivity 90.5%, FAR not reported). A second study used an SVM classifier to extract 19 features (16 ACC and 3 EDA) [40]. Fourteen out of 16 focal onset seizures with bilateral spreading were detected (sensitivity 88%) and FAR was 0.04/h. The same feature set was used in the third study and compared to a larger (40 ACC and 6 EDA) and a reduced (22 ACC and 3 EDA) feature set [39]. Retrospectively tested on 24 children and 45 adults with focal epilepsy, the reduced set showed the best performance (sensitivity 94.6%, FAR 0.20/day).

A multicenter study combined HR and ACC measures in 95 people with nocturnal major motor seizures [36]. Data from only 23 patients could be used to retrospectively validate three different algorithms based on changes in HR, ACC, and ‘HR or ACC.’ Clinically urgent seizures were detected well (sensitivity 71–87%), but FAR was relatively high (2.3–6.3/night), with wide variation between subjects.

Quality of the included studies

According to the QUADAS-2 criteria, the overall quality of the included studies was medium–high (Table 2). Seventeen out of 21 studies were at risk of bias, mainly due to an undefined patient selection process and fitting of the algorithm [2, 8, 22,23,24,25,26, 30, 32, 34, 37,38,39,40,41,42,43]. There was concern regarding the applicability of the selected patients in three studies, because the populations consisted of children only and/or were not well described [23, 25, 33]. Concerns about the applicability of the index test (i.e., the tested algorithm) arose in nine studies, mainly because the algorithm was fitted to one dataset [2, 8, 23, 25, 28, 30, 32, 36, 37].

Table 2 Quality of the included studies according to QUADAS-2

Based on the standards for the clinical validation of SDDs proposed by Beniczky and Ryvlin [21], most studies were classified as phase 1 proof-of-principle studies, whereas three were classified as phase 0 initial studies [34, 41, 42], and only one as a phase 2 study on a dedicated SDD [31] (Table 3). Seven other studies also tested a dedicated device but included small population sizes or did not address the safety of the device and were therefore classified as phase 1 [2, 30, 33, 36, 38,39,40]. Ten studies trained and tested their algorithm on the same dataset [2, 8, 22, 26, 32, 34, 37, 40,41,42], and only four used a predefined algorithm or cutoff values [30, 31, 33, 36]. Eighteen studies used video-EEG as reference standard; the remaining three used EEG or ECoG without video recordings [34, 41, 42].

Table 3 Quality of validation studies of seizure detection, as assessed using standards proposed by Beniczky and Ryvlin

Discussion

The overall quality of studies on seizure detection using autonomic parameters is low. Small population sizes, short follow-up periods, and high study heterogeneity raise concerns about the applicability of the results. Available studies are mainly initial or proof-of-principle studies that lack long-term and real-time ambulatory monitoring, which is needed to obtain more reliable performance data and usability outcomes.

HR- or HRV-based algorithms are most frequently applied, but it is hard to compare the results of different studies due to wide variation in the detection techniques used and a lack of FAR data (Table 4). Additionally, FAR, when mentioned, is high for these studies and exceeds acceptable limits for daily practice. We could not compare the performance of HR- and HRV-based algorithms due to the wide variety of study designs employed. HRV-based algorithms seem attractive given their short detection latency, but they still require prospective validation. HRV is, however, situation dependent and affected by exercise, stress, respiration, and sleep stage [45,46,47]. These confounding factors make it more challenging to distinguish ictal patterns from non-ictal ones, resulting in lower accuracy [48]. Also, similar activation of the autonomic nervous system can occur before physiological arousal or other sleep-related movements [49].

Table 4 Performance of seizure detection algorithms grouped according to dataset size

Multimodal algorithms might help to lower FARs. A retrospective study of seven children with tonic–clonic seizures validated different unimodal and multimodal algorithms on the same dataset. All combinations of multimodal sensors, including ECG, EMG, and ACC, showed at least 75% lower FAR [50]. Studies differentiating outcome according to seizure type showed diverse results, indicating that different seizure types may require different detection techniques. Multimodal techniques can provide a solution to this problem [51]. Another solution could be personalizing or tailoring the algorithm. One group studied two different personalization strategies and calculated the number of seizures required for accurate tailoring [52]. The authors proposed an initialization phase to tailor an existing predefined algorithm to a patient-specific algorithm. Six to eight seizures seemed sufficient to set individual thresholds [52]. Another retrospective multicenter study proposed an automatic adaptive HRV algorithm and tested it on a database of 107 nocturnal seizures from 28 children [23]. After an initialization phase of five seizures, the personalized algorithm resulted in lower FARs compared to those obtained with the patient-independent algorithm. A follow-up study proposed an adaptive classifier with real-time user feedback that presented similar performance; this method might be better accepted in daily practice [24].

Conclusion

Autonomic function alterations seem to represent an attractive tool for timely seizure detection. Unimodal autonomic algorithms cannot, however, reach acceptable performance: while most algorithms are quite sensitive, false alarm rates are still too high. Multimodal algorithms and personalization of the algorithm are important strategies to improve performance. Larger, prospective, home-based studies with long-term follow-up are needed to validate these methods and to demonstrate the added value of SDDs in clinical care.