1 Introduction

Epilepsy is one of the most common neurological diseases and is caused by abnormal electrical activity in the brain. Seizures are often accompanied by short-term abnormal behavior and cognition. The disease affects nearly 50 million people worldwide and causes great distress in the lives of patients [1]. Epileptic seizures can be diagnosed with the help of the electroencephalogram (EEG), which records the electrical activity of the brain as wave patterns [2]. EEG can accurately record spike waves or irregular spikes and thus guides clinical experts. Collecting EEG signals from patients requires long-term continuous monitoring over several days or weeks, and analyzing such long recordings creates a huge workload for the expert. Therefore, automatic seizure detection methods that can identify seizure events in long-term recordings are necessary.

Traditional analysis of EEG signals is based on time domain, frequency domain, time-frequency, and nonlinear methods [3]. As EEG is a time series signal, some metrics from the time domain, such as mean, variance, or peak value, are computed in a specific time window, without providing any information regarding frequency. The frequency analysis methods are based on the Fourier transform or wavelet transform (WT), which transfers the time series into the frequency domain and extracts frequency patterns [4,5,6]. Time-frequency domain analysis employs time-frequency distributions and images of EEG signals to extract features [7,8,9]. Nonlinear methods are used to characterize the complexity and fractal nature of EEG signals, including entropy, correlation dimension, and the largest Lyapunov exponent [10,11,12]. However, these methods cannot extract the non-stationary features of time series signals and are sensitive to noise [13]. Due to the limitations of traditional EEG signal analysis methods, graph theory-based analysis provides a key research direction for seizure detection with the help of graph parameters.

Many researchers have recently used graph theory to analyze multi-channel EEG signals. Molla et al. [14] used a graph eigendecomposition-based method to select the features for classification in a feedforward neural network. Zhao et al. [15] constructed a graph according to the correlation matrix to enhance the feature embedding of EEG signals without manually designed features. Raeisi et al. [16] proposed a graph convolutional neural network and considered long-range spatial information to extract features from the time domain and the frequency domain. The spatial information was calculated from the functional connections among the EEG channels. Jiang et al. [17] constructed functional brain networks and combined the Pearson correlation coefficient and mutual information to extract features from a graph for seizure detection. The results based on two public datasets were competitive with the state-of-the-art methods. Overall, graph-based seizure detection methods offer new insights and promising results.

Brain connectivity is the correlation between different regions of the brain that can reflect the transmission of information. It includes three different types of connectivity: structural brain connectivity (SBC), functional brain connectivity (FBC), and effective brain connectivity (EBC). SBC includes the physical connections between neurons. The FBC represents the statistical interdependence between the physiological time series recorded in different regions of the brain. SBC and FBC cannot measure the causal relationships between brain regions. EBC is determined by sampling recorded signals at multiple time points, which provides a better understanding of brain function. Common metrics used to calculate EBC are directed coherence (DC), partial directed coherence (PDC), generalized PDC (GPDC), directed transfer function (DTF), and direct DTF (dDTF) [18, 19].

The EBC contributes significantly to characterizing the influence of one neural region on the rest of the neuronal regions. The EBC method can measure the directed effects between each channel via Granger causality (GC) for EEG signals. Kose et al. [20] used EBC to analyze the dynamic mental workload condition based on EEG signals. Khan et al. [21] proposed a novel technique to estimate effective connectivity between EEG channels. The results verified that the method gave a better estimate of directed causality. The advantage of effective connectivity is that it can extract the causal relationship directly without prior knowledge. The relationship between channels can be used to construct a graph and extract features for machine learning.

Fig. 1 The flowchart of the proposed method for automatic seizure detection

Machine learning methods based on feature extraction have been widely used in seizure detection, such as linear discriminant analysis (LDA), support vector machine (SVM), random forest (RF), and K-nearest neighbor (KNN) [22]. Wang et al. [23] introduced a novel RF model combined with grid search optimization for seizure detection based on multiple time-frequency analysis methods. Li et al. [24] applied empirical mode decomposition to EEG signals to obtain time-frequency features. Then, a common spatial pattern was used to reduce the dimension of the features, and a classifier consisting of ten SVMs was adopted to identify the onset of seizures. Wang et al. [25] extracted multi-domain features and used LDA, SVM, and RF to validate the classification performance. Tapani et al. [26] validated an SVM-based method for neonatal seizure detection; generalizability, non-inferiority, and clinical efficacy were tested in their work to show the performance of the seizure detection algorithm. Deep learning algorithms can process data directly, learn feature information automatically from the raw data, and classify the resulting feature vectors. Cimr et al. [27] designed an automatic computer-aided diagnosis system that uses a deep convolutional neural network (CNN) to optimize the complexity of the seizure detection method. Wei et al. [28] took a 12-layer CNN as the baseline model and introduced the merger of the increasing and decreasing sequences for cross-patient seizure detection, identifying 90.57% of seizure events with a latency of 4.68 s. Akyol [29] focused on a stacking ensemble-based deep neural network for seizure detection, which was superior to the base model. Shoka et al. [30] first converted the EEG signals into a 2D image, then encrypted the image, and finally fed it to a CNN model. Zhao et al. [31] proposed a novel seizure detection model based on a graph convolution network, in which the Pearson correlation matrix was calculated.
In this work, we use a graph theory-based approach to convert the raw EEG signal data into graph form and calculate the weight value of each edge in the graph. The flowchart of the proposed method is depicted in Fig. 1. The features of the signal are extracted based on graph metrics, and machine learning algorithms are used to perform the classification. The performance of three classifiers trained on the graph theory-based features is compared. Furthermore, we illustrate the ability of the proposed method to achieve seizure detection for the patient-specific model and the cross-patient model. The main contributions of this work are as follows.

  1. A weighted directed graph-based method with EBC is proposed for seizure detection. The weighted directed graph captures the directed effect and relationship between brain regions.

  2. Various machine learning algorithms are employed to classify the extracted features. The selected RF classifier computes feature importance and generates a ranking of the features based on their significance. Additionally, RF conducts feature selection, eliminating the need for an extra feature selection process.

  3. To validate the effectiveness and generalizability of the proposed method, it is evaluated in both the patient-specific model and in the cross-patient model.

The remainder of this paper is organized as follows. In Sect. 2, the data preprocessing and the proposed method are described. In Sect. 3, the experimental results for the patient-specific model and the cross-patient model are presented. In Sect. 4, a comparison with the existing literature related to this study is discussed. Finally, in Sect. 5, the conclusion is presented.

2 Methods and materials

2.1 Dataset

The dataset used in this work is a public dataset collected by Children’s Hospital Boston and the Massachusetts Institute of Technology (CHB-MIT). The dataset contains long-term, multi-channel EEG recordings from 23 pediatric patients with refractory epilepsy [32, 33]. Data acquisition is performed using electrodes placed on the scalp of patients, following the international standard 10-20 system. The dataset contains a total of 24 cases (chb01, chb02,... chb24) from 23 pediatric patients aged 1.5 to 22 years. The first 23 cases are from 22 patients (17 female and 5 male). The sex and age of the 24th case are not provided. Each case contains continuous files in .edf format from a single subject. The sample rate of the collected EEG signals is 256 Hz with a 16-bit resolution. In our work, 23 channels of EEG signals are used. Due to the discontinuity of the electrode montages, we could not read the data of some channels in chb15. Therefore, data from this patient are not used.

2.2 Preprocessing of EEG signals

A fourth-order Butterworth band-pass filter with a pass band of 0.01–32 Hz was applied to discard high-frequency components and physiological artifacts that confound seizure detection. EEG recordings usually have a long duration, so the original signal needs to be segmented before the next step. However, long segments do not effectively remove artifacts, and short segments are unable to capture epileptic events. Therefore, each channel was segmented with a non-overlapping window of 1 s in this work. To ensure a balanced distribution of samples across classes, all available samples from ictal periods were used, and the interictal samples were randomly subsampled to obtain an equal number of positive and negative samples, with 9023 samples in each category. To avoid large gradient updates during training, the data were normalized to a distribution with a mean of 0 and a variance of 1.
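The segmentation, normalization, and class balancing steps can be sketched as follows. This is a minimal illustration in plain Python, not the MATLAB code used in this work. The function names are hypothetical, the band-pass filter is omitted, and normalizing each window separately is an assumption, since the text does not state over which axis the mean and variance are computed.

```python
import random
import statistics

def segment(signal, fs=256, win_s=1.0):
    """Split a 1-D signal into non-overlapping windows; an incomplete tail is dropped."""
    n = int(fs * win_s)
    return [signal[k:k + n] for k in range(0, len(signal) - n + 1, n)]

def zscore(window):
    """Rescale a window to zero mean and unit (population) variance."""
    mu = statistics.fmean(window)
    sd = statistics.pstdev(window)
    if sd == 0:
        return [0.0] * len(window)  # constant window: assumption, map to zeros
    return [(x - mu) / sd for x in window]

def balance(ictal, interictal, seed=0):
    """Keep every ictal window and randomly subsample as many interictal windows."""
    rng = random.Random(seed)
    return ictal, rng.sample(interictal, k=len(ictal))
```

Note that `balance` raises a `ValueError` if fewer interictal than ictal windows are available, which is the opposite of the situation described above.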

2.3 Effective connectivity in epilepsy

EEG-based data analysis methods include single-channel analysis and multi-channel analysis. Multi-channel analysis provides more data and information. Single-channel analysis methods ignore the structure–function relationships between different regions of the brain, while multi-channel methods such as EBC use information from all channels. Identifying these structure–function relationships can effectively characterize the underlying dynamics in EEG signals.

The autoregressive (AR) model is the core of the parametric GC method. To consider the entire multivariate structure of the random process with m channels, the multivariate autoregressive model (MVAR) is introduced [34,35,36]. For a signal \(X(t)=(X_{1}(t), X_{2}(t),..., X_{m}(t))^T\) with m channels, where T denotes matrix transposition, the MVAR model of order p can be represented as

$$\begin{aligned} X(t)=\sum _{r=1}^pA(r)X(t-r)+E(t) \end{aligned}$$
(1)

where A(r) is an \(m\times m\) matrix of coefficients and E is an uncorrelated noise vector with a covariance of \(\Sigma \). The MVAR model uses the same framework to obtain the current values of the variables, using not only past instances of the same variables but also past instances of the other variables in the model. The coefficient matrix A and the noise covariance \(\Sigma \) can be estimated by solving the Yule–Walker equations [37]. The model order p is usually estimated by minimizing the Akaike information criterion (AIC) [38] to achieve the optimal fitting parameters of the model.

Referring to MVAR in Eq. (1), the relations in the frequency domain can be described as follows:

$$\begin{aligned} A(f)X(f)=E(f) \end{aligned}$$
(2)

where, with I denoting the \(m\times m\) identity matrix,

$$\begin{aligned} A(f)=I-\sum _{r=1}^pA(r){\text {exp}}(-i2\pi fr) \end{aligned}$$
(3)

Rewriting Eq. (2),

$$\begin{aligned} X(f)=H(f)E(f) \end{aligned}$$
(4)

where \(H(f)=A^{-1}(f)\) is the transfer matrix. In DTF, the causal effect of channel j on channel i is represented as follows:

$$\begin{aligned} {\text {DTF}}_{j\rightarrow i}^2=\frac{|H_{ij}(f)|^2}{\sum _{l=1}^m|H_{il}(f)|^2} \end{aligned}$$
(5)

The DTF is the ratio of the causal effect of channel j on channel i to the joint effect of all channels on channel i, which has the desirable property of taking a value between 0 (no causal effect) and 1 (a strong causal effect). As a multivariate method, DTF has the advantage that only a single model fitting is required for all channels. Another widely used effective connectivity method is PDC.

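$$\begin{aligned} {\text {PDC}}_{j \rightarrow i}(f)=\frac{|A_{ij}(f)|}{\sqrt{\sum _{k=1}^m|A_{kj}(f)|^2}} \end{aligned}$$
(6)

where the sum in the denominator runs over all entries of the \(j^{th}\) column of \(A(f)\), so that PDC relates the outflow from channel j to channel i to the total outflow from channel j.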

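In addition, GPDC is introduced to show the direct influence between channels. It extends PDC by scaling each coefficient with the noise standard deviation of the corresponding channel, where \(\sigma _{k}^2\) denotes the \(k^{th}\) diagonal element of \(\Sigma \). The GPDC is defined as

$$\begin{aligned} {\text {GPDC}}_{j \rightarrow i}(f)=\frac{\frac{1}{\sigma _{i}}|A_{ij}(f)|}{\sqrt{\sum _{k=1}^m\frac{1}{\sigma _{k}^2}|A_{kj}(f)|^2}} \end{aligned}$$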
(7)

The eMVAR MATLAB toolbox was used to calculate DTF, PDC, and GPDC [39]. When calculating PDC, DTF, and GPDC, the frequency values are set to [1, 32] with a step size of 0.5. This results in a total of 63 points for f. Averaging over these frequency points yields a matrix W of \(23\times 23\), where \(w_{ij}\) denotes the causal effect of channel j on channel i. In the next subsection, such an EBC network is represented in the form of a graph, where W is used as the weight matrix of the graph. The features of each signal segment are then extracted from the measures of this graph.
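To make the computation of DTF and PDC and the averaging over the 63 frequency points concrete, the following sketch evaluates both measures for a given set of MVAR coefficient matrices. It is plain Python written for illustration and is not the eMVAR toolbox. The frequency-domain matrix is obtained by transforming Eq. (1), frequencies are assumed to be in Hz and are normalized by the sampling rate of 256 Hz, and the two-channel coefficients `A_toy` are hypothetical.

```python
import cmath

def a_of_f(A, f, fs):
    """Frequency-domain coefficient matrix I - sum_r A[r-1] * exp(-i*2*pi*f*r/fs)."""
    m = len(A[0])
    out = [[(1 + 0j) if i == j else 0j for j in range(m)] for i in range(m)]
    for r, A_r in enumerate(A, start=1):
        z = cmath.exp(-2j * cmath.pi * f * r / fs)
        for i in range(m):
            for j in range(m):
                out[i][j] -= A_r[i][j] * z
    return out

def inverse(M):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(M)
    aug = [list(row) + [(1 + 0j) if i == j else 0j for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                k = aug[r][c]
                aug[r] = [vr - k * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def dtf(A, f, fs):
    """D[i][j] = |H_ij|^2 / sum_l |H_il|^2, the squared DTF from channel j to i."""
    H = inverse(a_of_f(A, f, fs))
    rows = [[abs(h) ** 2 for h in row] for row in H]
    return [[v / sum(row) for v in row] for row in rows]

def pdc(A, f, fs):
    """P[i][j] = |A_ij(f)| / sqrt(sum_k |A_kj(f)|^2), the PDC from channel j to i."""
    Af = a_of_f(A, f, fs)
    m = len(Af)
    norm = [sum(abs(Af[k][j]) ** 2 for k in range(m)) ** 0.5 for j in range(m)]
    return [[abs(Af[i][j]) / norm[j] for j in range(m)] for i in range(m)]

def frequency_grid(f_lo=1.0, f_hi=32.0, step=0.5):
    """Frequencies f_lo, f_lo + step, ..., f_hi (63 points for the defaults)."""
    n = int(round((f_hi - f_lo) / step)) + 1
    return [f_lo + step * k for k in range(n)]

def mean_over_frequencies(measure, A, fs=256.0):
    """Average a connectivity measure (dtf or pdc) over the frequency grid."""
    freqs = frequency_grid()
    m = len(A[0])
    W = [[0.0] * m for _ in range(m)]
    for f in freqs:
        C = measure(A, f, fs)
        for i in range(m):
            for j in range(m):
                W[i][j] += C[i][j] / len(freqs)
    return W

# Hypothetical 2-channel MVAR(1) model in which channel 0 drives channel 1.
A_toy = [[[0.5, 0.0],
          [0.4, 0.5]]]
W_dtf = mean_over_frequencies(dtf, A_toy)
W_pdc = mean_over_frequencies(pdc, A_toy)
```

In this toy model, the entry for the influence of channel 1 on channel 0 vanishes for both measures, while the entry for the influence of channel 0 on channel 1 is positive, which matches the direction of coupling built into `A_toy`.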

2.4 The measures of the graph

Given a graph \(G\!=\!(V, E, W)\), \(V\!=\!\{i|i \) is a vertex of \(G\}\) represents the set of the vertices of the graph and \(E\!=\!\{e_{ij} | i, j \in V\}\) indicates the set of the directed edges. If there is a directed edge from i to j, \(e_{ij}=1\); otherwise \(e_{ij}=0\). \(W\!=\!\{w_{ij}| i, j \in V\}\) is the set of the weights, and \(w_{ij}\) represents the weight value of the directed edge \(e_{ij}\). For EEG signals, the vertices indicate the channels, and the weight of the edges is the effective connectivity between the channels.

Directed edges indicate that information flows in only one direction and that the activity of one node depends on the other (such as the causal effect). In contrast, undirected edges indicate that information flows in both directions along the edge of the connection. The weight of the edge between two nodes reflects the strength of their connection, which distinguishes strong from weak connections. Weak connections can be removed by thresholding. Specifically, the threshold was varied from 0.1 to 0.9, and the model achieved its best performance when the threshold was set to 0.3. Consequently, the threshold value is 0.3 in our experimental setup.
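As a minimal sketch of the thresholding step, the function below turns a weight matrix into a binary directed adjacency matrix and a thresholded weight matrix. The function name is hypothetical, and three details are assumptions not stated in the text: weights equal to the threshold are kept, self-connections on the diagonal are discarded, and the adjacency matrix uses the same index order as W.

```python
def threshold_graph(W, tau=0.3):
    """Return (E, W_kept): binary adjacency and weights with weak links set to zero."""
    m = len(W)
    # Assumption: keep a link when its weight is at least tau; ignore the diagonal.
    E = [[1 if i != j and W[i][j] >= tau else 0 for j in range(m)] for i in range(m)]
    W_kept = [[W[i][j] if E[i][j] else 0.0 for j in range(m)] for i in range(m)]
    return E, W_kept
```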

In the graph, the minimum number of edges connecting two given vertices is defined as the path length between them, and the average of the path lengths over all vertex pairs in the graph is defined as the characteristic path length (CPL).

$$\begin{aligned} {\text {CPL}}&=\frac{1}{M}\sum _{i \in V}\frac{\sum _{j \in V,j \ne i}d_{ij}}{M-1}\nonumber \\ d_{ij}&=\sum _{e_{ij} \in S_{i \rightarrow j}}e_{ij} \end{aligned}$$
(8)

where M is the number of vertices, \(d_{ij}\) is the shortest path length from vertex i to vertex j, which measures the integration of the graph, and \(S_{i \rightarrow j}\) is the shortest path from vertex i to j.

Global efficiency (GE) is the average inverse shortest path length between any two nodes, indicating how efficiently information is transmitted through the entire network. The GE is calculated by

$$\begin{aligned} GE=\frac{1}{M}\sum _{i \in V}\frac{\sum _{j \in V,j \ne i}d_{ij}^{-1}}{M-1} \end{aligned}$$
(9)
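Both CPL and GE follow from the matrix of shortest path lengths, which for a binary directed graph can be obtained with a breadth-first search from every vertex. The sketch below is illustrative only. The treatment of unreachable vertex pairs is an assumption: they are skipped in the CPL average and contribute zero to GE.

```python
from collections import deque

INF = float("inf")

def shortest_paths(E):
    """Hop-count distances d[i][j] in a binary directed graph (BFS from each vertex)."""
    M = len(E)
    D = [[INF] * M for _ in range(M)]
    for s in range(M):
        D[s][s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(M):
                if E[u][v] and D[s][v] == INF:
                    D[s][v] = D[s][u] + 1
                    queue.append(v)
    return D

def cpl_and_ge(E):
    """Characteristic path length and global efficiency over ordered pairs i != j."""
    D = shortest_paths(E)
    M = len(E)
    pairs = [(i, j) for i in range(M) for j in range(M) if i != j]
    finite = [D[i][j] for i, j in pairs if D[i][j] != INF]
    # Assumption: unreachable pairs are left out of the CPL average.
    cpl = sum(finite) / len(finite) if finite else INF
    # Unreachable pairs have 1/d = 0 and therefore add nothing to GE.
    ge = sum(1.0 / d for d in finite) / len(pairs)
    return cpl, ge
```

For a strongly connected graph, every pair is reachable and the CPL value coincides with the average over all M(M-1) ordered pairs.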

Transitivity (T) is the ratio of triangles to the total number of triplets in the network. T is defined as

$$\begin{aligned} T\!&=\!\frac{\sum _{i \in V}t_{i}}{\sum _{i \in V}[(k_{i}^{\text {out}}\!+\!k_{i}^{\text {in}})(k_{i}^{\text {out}}\!+\!k_{i}^{\text {in}}\!-\!1)\!-\!2\sum _{j \in V}e_{ij}e_{ji}]}\nonumber \\ t_{i}&=\frac{1}{2}\sum _{j,h \in V}(e_{ij}+e_{ji})(e_{ih}+e_{hi})(e_{jh}+e_{hj})\nonumber \\ k_{i}^{\text {out}}&=\sum _{j \in V}e_{ij}\nonumber \\ k_{i}^{\text {in}}&=\sum _{j \in V}e_{ji} \end{aligned}$$
(10)

where \(t_{i}\) is the number of triangles of vertex i; \(k_{i}^{\text {out}}\) and \(k_{i}^{\text {in}}\) are out-degree and in-degree of i, respectively.

Modularity (MD) measures how well the network can be subdivided into modules or communities. MD is measured as

$$\begin{aligned} MD&=\frac{1}{l}\sum _{i,j \in V}[e_{ij}-\frac{k_{i}^{\text {out}}k_{j}^{\text {in}}}{l}]\delta (C_{i},C_{j})\nonumber \\ l&=\sum _{i,j \in V}e_{ij} \end{aligned}$$
(11)

where l is the number of links in a graph. \(C_{i}\) and \(C_{j}\) are the clusters of vertices i and j, respectively. \(\delta (C_{i},C_{j})\) equals 1 if the two vertices belong to the same community and 0 otherwise.
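For a fixed assignment of vertices to communities, the directed modularity can be evaluated directly. The sketch below is an illustration under assumptions: it uses the standard directed null-model term, the product of the out-degree of i and the in-degree of j divided by l, and it takes the community labels as given, since the community detection procedure is not specified in the text.

```python
def modularity(E, community):
    """Directed modularity of a binary graph for a given community label per vertex."""
    M = len(E)
    k_out = [sum(E[i]) for i in range(M)]
    k_in = [sum(E[j][i] for j in range(M)) for i in range(M)]
    n_links = sum(k_out)
    if n_links == 0:
        return 0.0  # assumption: an empty graph has zero modularity
    q = sum(E[i][j] - k_out[i] * k_in[j] / n_links
            for i in range(M) for j in range(M)
            if community[i] == community[j])
    return q / n_links
```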

Assortativity coefficient (AC) examines whether nodes with a similar degree tend to be connected to each other. If the AC is positive, this means that nodes in the network tend to be connected to other nodes with similar degrees.

$$\begin{aligned}{} & {} AC\!=\nonumber \\{} & {} \!\frac{l^{-1}\sum _{(i,j) \in E}k_{i}^{\text {out}}k_{j}^{\text {in}}-[l^{-1}\sum _{(i,j) \in E}\frac{1}{2}(k_{i}^{\text {out}}+k_{j}^{\text {in}})]^2}{l^{\!-\!1}\sum _{(i,j) \in E}\frac{1}{2}[(k_{i}^{\text {out}})^2+(k_{j}^{\text {in}})^2]\!-\![l^{\!-\!1}\sum _{(i,j) \in E}\frac{1}{2}(k_{i}^{\text {out}}\!+\!k_{j}^{\text {in}})]^2} \end{aligned}$$
(12)
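Eq. (12) can be evaluated by iterating once over the edge list, as in the following sketch. The function name is hypothetical, and returning 0.0 when the denominator vanishes (for example, in a regular graph where all degrees are equal) is an assumption, because the coefficient is undefined in that case.

```python
def assortativity(E):
    """Out-degree/in-degree assortativity coefficient of a binary directed graph."""
    M = len(E)
    k_out = [sum(E[i]) for i in range(M)]
    k_in = [sum(E[j][i] for j in range(M)) for i in range(M)]
    edges = [(i, j) for i in range(M) for j in range(M) if E[i][j]]
    n_links = len(edges)
    if n_links == 0:
        return 0.0
    mean_prod = sum(k_out[i] * k_in[j] for i, j in edges) / n_links
    mean_half = sum(0.5 * (k_out[i] + k_in[j]) for i, j in edges) / n_links
    mean_sq = sum(0.5 * (k_out[i] ** 2 + k_in[j] ** 2) for i, j in edges) / n_links
    denom = mean_sq - mean_half ** 2
    # Assumption: report 0.0 when the coefficient is undefined (zero variance).
    return (mean_prod - mean_half ** 2) / denom if abs(denom) > 1e-12 else 0.0
```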

The clustering coefficient is a statistical feature of a graph that measures the degree to which nodes tend to cluster together. The average clustering coefficient AvgC is defined as

$$\begin{aligned}{} & {} AvgC\!=\nonumber \\{} & {} \!\frac{1}{M}\sum _{i \in V}\frac{t_{i}}{(k_{i}^{\text {out}}+k_{i}^{\text {in}})(k_{i}^{\text {out}}+k_{i}^{\text {in}}-1)\!-\!2\sum _{j \in V}e_{ij}e_{ji}} \end{aligned}$$
(13)
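Transitivity and the average clustering coefficient share the per-vertex triangle count \(t_{i}\) and the per-vertex triplet count in the denominator, so both can be sketched together. The code below assumes a binary adjacency matrix without self-loops, and it assigns a clustering value of zero to vertices whose triplet count is zero, which is an assumption.

```python
def triangle_counts(E):
    """t_i = 1/2 * sum_{j,h} (e_ij+e_ji)(e_ih+e_hi)(e_jh+e_hj) for every vertex i."""
    M = len(E)
    S = [[E[i][j] + E[j][i] for j in range(M)] for i in range(M)]
    return [0.5 * sum(S[i][j] * S[i][h] * S[j][h]
                      for j in range(M) for h in range(M))
            for i in range(M)]

def triplet_counts(E):
    """(k_out+k_in)(k_out+k_in-1) - 2*sum_j e_ij*e_ji for every vertex i."""
    M = len(E)
    counts = []
    for i in range(M):
        k = sum(E[i]) + sum(E[j][i] for j in range(M))
        reciprocal = sum(E[i][j] * E[j][i] for j in range(M))
        counts.append(k * (k - 1) - 2 * reciprocal)
    return counts

def transitivity(E):
    """Ratio of the summed triangle counts to the summed triplet counts."""
    t, d = triangle_counts(E), triplet_counts(E)
    return sum(t) / sum(d) if sum(d) else 0.0

def average_clustering(E):
    """Mean over vertices of t_i divided by the triplet count of vertex i."""
    t, d = triangle_counts(E), triplet_counts(E)
    return sum(ti / di if di else 0.0 for ti, di in zip(t, d)) / len(E)
```

The two measures differ only in where the averaging happens: transitivity normalizes once over the whole graph, whereas the average clustering coefficient normalizes per vertex and is therefore more sensitive to low-degree vertices.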

The sum of all link weights connected to a node is the node strength. The node strength \(NS_{i}\) of vertex i is defined as

$$\begin{aligned} NS_{i}=\sum _{j \in V}w_{ij} \end{aligned}$$
(14)

Graph entropy can measure the similarity of two graphs and is computed as the sum of the entropies of the vertices in G. Given a vertex i belonging to V, the entropy \(e_{i}\) of i is calculated by

$$\begin{aligned} e_{i}=-\sum _{j=1,j\ne i}^{M}w_{ij} {\text {log}} w_{ij} \end{aligned}$$
(15)

where M represents the number of vertices and \(w_{ij}\) represents the causal effect from j to i. Thus, the graph entropy of G with M vertices is formulated as

$$\begin{aligned} e_{G}=\sum _{i \in V}e_{i} \end{aligned}$$
(16)

The directed features are CPL, GE, T, MD, AC, and AvgC. The weighted directed features are \(NS_{i}\), \(e_{i}\), and \(e_{G}\). According to the graph theory measures, the extracted feature vector is [CPL, GE, T, MD, AC, AvgC, \(NS_{i}\), \(e_{i}\), \(e_{G}\)]. \(NS_{i}\) and \(e_{i}\) are local features with one value per channel, whereas the others are global features. Therefore, the dimension of the feature vector is 53 \((2\times 23+7)\).
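The weighted features and the assembly of the 53-dimensional feature vector can be sketched as follows. The function names are hypothetical. The natural logarithm is assumed, and terms with a zero weight are skipped under the usual convention that 0 log 0 equals 0. The six directed global features are passed in as already computed values.

```python
import math

def node_strength(W):
    """NS_i: sum of the weights in row i of W."""
    return [sum(row) for row in W]

def node_entropy(W):
    """e_i = -sum_{j != i} w_ij * log(w_ij), skipping zero weights."""
    return [-sum(w * math.log(w) for j, w in enumerate(row) if j != i and w > 0.0)
            for i, row in enumerate(W)]

def feature_vector(global_feats, W):
    """Concatenate [CPL, GE, T, MD, AC, AvgC], all NS_i, all e_i, and e_G."""
    if len(global_feats) != 6:
        raise ValueError("expected the six directed global features")
    e = node_entropy(W)
    return list(global_feats) + node_strength(W) + e + [sum(e)]
```

With 23 channels, the result has 6 + 23 + 23 + 1 = 53 entries, which agrees with the dimension stated above.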

3 Results

The preprocessing and feature extraction of EEG signals are performed using MATLAB 2020a. The classification is implemented in Python 3.9 on a ThinkPad T14 with a 10th-generation Intel i5 processor and 16 GB of RAM.

3.1 Metrics

The evaluation metrics used in this work are accuracy (Acc), specificity (Spe), sensitivity (Sen), and area under the curve (AUC). Moreover, statistical analysis is also provided. The receiver operating characteristic (ROC) curve does not rely on a single fixed threshold; instead, it sweeps all possible thresholds (cutoff points) and plots the resulting pairs of sensitivity and (1-specificity). Models are compared by their AUC: the higher the AUC value, the better the classifier separates the two classes. The performance metrics are defined as follows:

$$\begin{aligned} {\text {Acc}}=\frac{{\text {TP}}+{\text {TN}}}{{\text {TP}}+{\text {FN}}+{\text {TN}}+{\text {FP}}}\times 100\% \end{aligned}$$
(17)
$$\begin{aligned} {\text {Spe}}=\frac{{\text {TN}}}{{\text {TN}}+{\text {FP}}}\times 100\% \end{aligned}$$
(18)
$$\begin{aligned} {\text {Sen}}=\frac{{\text {TP}}}{{\text {TP}}+{\text {FN}}}\times 100\% \end{aligned}$$
(19)

where TP denotes the number of positive samples correctly predicted. TN denotes the number of negative samples correctly predicted. FN denotes the number of positive samples predicted as negative. FP denotes the number of negative samples predicted as positive.
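As a small worked example of these metrics, the sketch below computes Acc, Spe, and Sen from the four confusion counts. It also computes the AUC as the fraction of positive-negative sample pairs in which the positive sample receives the higher classifier score, with ties counted as one half, which is equivalent to the area under the ROC curve. The function names and the numbers used in any call are hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (Acc, Spe, Sen) in percent from the confusion counts."""
    acc = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    spe = 100.0 * tn / (tn + fp)
    sen = 100.0 * tp / (tp + fn)
    return acc, spe, sen

def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison of classifier scores."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))
```

For instance, with 90 true positives, 10 false negatives, 95 true negatives, and 5 false positives, the function returns an accuracy of 92.5%, a specificity of 95%, and a sensitivity of 90%.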

Table 1 The classification results of the proposed method
Fig. 2 Classification results for the EEG signals with the graph theory method. (a) A graph was constructed with DTF, PDC, and GPDC methods. (b) The classification performance of the three classifiers. (c) Classification performances for the DTF, PDC, and GPDC methods. (d) The ROC curve for the classification result of the DTF. (e) The ROC curve for the classification result of the PDC. (f) The ROC curve for the classification result of the GPDC. The AUC is presented for each ROC curve

3.2 Experimental results

In this work, we convert the EEG signal into the form of a graph for each patient. The nodes of the graph represent the channels, and the edges represent the directed influence between the channels, which is captured by EBC. The analysis of EEG signals includes single-channel and multi-channel methods. Single-channel analysis neglects the structural information between channels and cannot characterize the underlying dynamics. Bhattacharyya et al. [40] selected five channels for multivariate analysis. Amiri et al. [41] used a set of optimal channels to discriminate seizure events. Multi-channel analysis can consider the interaction between channels and provide more information for the classification of seizures. To compare the recognition performance of different EBC methods for EEG signals, DTF, PDC, and GPDC are used to determine the directed influence between the nodes in the graph, as well as the weights of the edges. To obtain a reliable and stable automatic seizure detection model, fivefold cross-validation is used. A summary of the classification results is provided for the 23 patients in Table 1. For the extracted graph-based features, three classifiers, SVM, RF, and KNN, are used to identify seizure and non-seizure signals. The sklearn library is used for the experiments. The radial basis function (RBF) is used as the kernel function for SVM, the number of trees in the RF is set to 25, and the value of K in the KNN classifier is set to 5. All other parameters are kept at their default values.

The results show that all three classifiers identify seizure and non-seizure signals well for the DTF method, with RF achieving the best accuracy of 99.97%. For the PDC method, the RF classifier achieves the highest accuracy of 99.72%, and the KNN classifier has the highest sensitivity of 100%. For the GPDC method, the RF classifier achieves the highest accuracy of 99.78%, and the KNN classifier has a sensitivity of 99.97%. With all three EBC methods, the ictal state is successfully distinguished from the interictal state. Overall, the RF classifier achieves the best average accuracy, with 99.97%, 99.72%, and 99.78% for DTF, PDC, and GPDC, respectively. This is because each decision tree in the RF considers a random subset of the feature vector and selects the optimal features within that subset. Constructing multiple decision trees increases the diversity of the ensemble and thus improves classification performance. The results also indicate that the DTF method achieves better results than the other two methods.

The visualization of the classification results with the DTF, PDC, and GPDC methods using three classifiers is provided in Fig. 2. The RF classifier outperforms the others (Fig. 2b). From Fig. 2c, we can see that the DTF achieves the highest accuracy among the three EBC methods. We also obtained a sensitivity and specificity of 99.95% and 99.99%, respectively, with the RF classifier. Furthermore, a statistical analysis of the proposed method is provided. ROC curves are drawn on the basis of the RF classifier. The ROC curves of the three EBC methods are given in Fig. 2d–f, and all of them indicate good classification performance. Seizure events are classified with high specificity and sensitivity by the proposed method. The ROC curve of the DTF has the largest AUC of the three methods, with AUC=100%.

Extracting graph features can capture the relation of nodes in the network at both global and local levels. Global features measure the shortest paths within the network, representing the degree of integration in the network’s communication. On the other hand, local features quantify the interactions between neighboring nodes. Specifically, they are based on the shortest paths between each node’s neighbors, reflecting the efficiency of communication between the node’s immediate spatial neighbors. Modularity reflects how the global node network is partitioned into highly connected subnetworks or modules, which often correspond to underlying neural processes. In highly modular networks, nodes within the same module are considered to play a role in common processes. The RF classifier ranks the importance of the features. From Fig. 3, we can see that the features with a score above 0.1 are node strength, graph entropy, and characteristic path length. The node strength has a relatively high importance score and contributes more to the classification of the EEG signals. The sorting results also reveal that the features considering both weight and direction have relatively high importance, and other directed features also contribute to the classification.

Fig. 3 The importance of the selected features with the RF classifier

Fig. 4 Degree rank and degree histogram of non-seizure (a) and seizure (b) signals

Fig. 5 Results for the LOO model with the RF classifier. Patients in the x-axis are sorted by decreasing accuracy (red star marker). (a) LOO evaluation of DTF. (b) LOO evaluation of PDC. (c) LOO evaluation of GPDC

The graph structures of seizure and non-seizure events are given in Fig. 4, where at least one short path exists between most of the node pairs. The presence of high-degree nodes (hub nodes) in the graph shortens the path length between the nodes. The number of these hub nodes is small, while the other nodes have a low degree. Numerous studies have shown that neural networks exhibit small-world properties. The experimental results show that the brain network built using EBC in this work has a low average shortest path length and a high clustering coefficient, which is consistent with the properties of a small-world network.

To assess whether the seizure detection method generalizes well, a cross-patient analysis is also performed, and its results are provided in Fig. 5a–c. We use the leave-one-patient-out (LOO) model to train and evaluate the proposed method. The metrics obtained with the DTF, PDC, and GPDC methods for each patient are given in Fig. 5. Generally, all methods achieve 100% specificity, with no false positives during the classification. The average accuracy for DTF, PDC, and GPDC is 98.28%, 95.37%, and 95%, respectively. The sensitivity of the three methods varies greatly. For the DTF method, almost all sensitivity results are greater than 91.95% except patient 12. About 43% of patients have a sensitivity of less than 90%. The average accuracy of the cross-patient model with LOO evaluation is lower than that of the patient-specific model. Because brain wave patterns differ between patients, a patient-specific model can achieve high performance. However, more data are used to train the machine learning algorithm in the cross-patient model, which also yields good classification results. Therefore, the proposed method has the capacity to detect seizures in unseen patients.
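The leave-one-patient-out protocol can be sketched as a generator of index splits in which all samples of one patient form the test set and the samples of the remaining patients form the training set. This is a minimal illustration. The function name and the `patient_ids` list, which holds one patient label per sample, are hypothetical, and the training and scoring of the classifier on each split are omitted.

```python
def leave_one_patient_out(patient_ids):
    """Yield (held_out, train_idx, test_idx), holding out each patient exactly once."""
    for held_out in sorted(set(patient_ids)):
        train_idx = [k for k, p in enumerate(patient_ids) if p != held_out]
        test_idx = [k for k, p in enumerate(patient_ids) if p == held_out]
        yield held_out, train_idx, test_idx
```

Because no sample of the held-out patient is seen during training, the scores obtained on each split estimate the performance on an unseen patient, in contrast to fivefold cross-validation within a single patient.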

Table 2 Comparisons with state-of-the-art seizure detection methods using the same dataset. NR stands for not reported values

4 Discussion

The proposed method based on a weighted directed graph and EBC is evaluated using the publicly accessible CHB-MIT scalp EEG database for seizure detection. In recent years, many promising automatic EEG analysis techniques have been used for seizure detection. To put the proposed technique into context, Table 2 compares the method proposed in this work with other seizure detection algorithms using the same EEG database. A comparison of the metrics reported in Table 2 shows that the proposed algorithm outperforms most previous work, which we attribute to the directed influence captured by EBC and to the graph theory-based features. As shown in the table, both the patient-specific model and the cross-patient model are covered. The performance of the proposed method is superior to the existing work for the patient-specific model [17, 27, 30, 31, 40,41,42]. The results obtained by our method are 99.94%, 99.99%, and 99.97% for sensitivity, specificity, and accuracy, respectively. For the cross-patient model, the classification results are inferior to those of the patient-specific model [5, 43]. Among these cross-patient methods, the method of [5] achieves the highest sensitivity, 96.81%.

Mutual information between channels can help to obtain a valid relationship. Jiang et al. [17] presented a seizure detection method with the functional brain network. The correlation between channels is characterized by the Pearson correlation coefficient and mutual information. The results they obtained demonstrated that the method is competitive with others. In this work, the relationship between channels is described using the EBC method to represent the connectivity of noisy data. Then, the feature extraction method based on graph theory is used to obtain the network topology and detect the seizure events. This work captures the abnormalities of the brain network that characterize the occurrence of seizures. Although the proposed method effectively realizes seizure detection, it has the following limitations, which motivate future work:

  1. The proposed model can detect seizures from EEG signals, but it cannot yet predict seizures before they occur. The next step is to accurately identify the characteristics of a preictal signal to give early warning before a seizure occurs.

  2. The experiments were performed with a small amount of data. To verify the clinical importance of the automatic seizure detection algorithm, a larger dataset and various epilepsy syndromes are required.

5 Conclusion

In this work, an automatic seizure detection method based on EBC and graph theory is proposed, which builds a directed weighted graph by discovering the relationships among the multi-channel EEG signals. Three classifiers are used to distinguish seizure and non-seizure events. Furthermore, the patient-specific model and the cross-patient model are provided to test the generalization of the seizure detection method. The results demonstrate that the DTF method with RF can achieve high-performance classification. The proposed method is superior to the existing work on the same dataset in terms of accuracy, specificity, and sensitivity. Therefore, the directed effect is essential for automatic seizure detection methods of multi-channel signals. Our method can reduce the workload of reading EEG signals and help assess the strategy for seizure detection.