1 Introduction

Epilepsy is characterized by spontaneous, debilitating seizures and currently affects 2.9 million individuals in the USA and roughly 70 million people worldwide. It results in severe co-morbidities and a tremendous loss of quality of life, while also costing the US healthcare system $15.5 billion annually [69]. There are around 2.4 million new cases of epilepsy around the world. Given the aging of society, forecasts indicate that mental and neurological diseases, including epilepsy, will be the main health problem in the world in the near future. For this reason, it is extremely important to develop appropriate computer-aided diagnosis tools that support neurologists and psychiatrists.

Automatic identification and detection of epilepsy and other brain-related disorders are major challenges for both clinicians and research scientists. Generally, neurological experts or brain surgeons inspect EEG recordings to detect epileptic activity. Therefore, it is essential to develop a reliable and automated technique to detect epileptic seizures. Automatic interpretation of EEG signals in the diagnosis and treatment of brain diseases is one of the most important areas of study [46]. These systems, which are built under the guidance of expert physicians, play a significant role in diagnosis and treatment given the long working hours and heavy workload of physicians. The following advantages of EEG signals have led to their widespread use in automatic identification systems [3, 64].

  • Recording in digital devices,

  • High resolution,

  • Low cost,

  • Applicable to people of all ages,

  • Noninvasiveness.

EEG signal analysis is a complex and tedious process. Despite the many methods presented in the literature, they still do not show sufficient effectiveness or are characterized by high computational complexity, which makes them difficult to implement in practice. That is why there is no alarm system for impending epilepsy attacks. Such a system would be very valuable for people with epilepsy because proper preparation for an impending attack is crucial.

Therefore, the goal of our research was to create an effective and easy-to-use (computationally inexpensive) method for the automatic recognition of epileptic seizures that can also run on mobile devices. To this end, a feature generation model based on a texture feature generator (the octal pattern, OP) and the discrete wavelet transform (DWT) is presented. Texture feature generators can be implemented easily. However, they cannot reach high performance for some problems because they cannot extract high-level, mid-level and low-level features together. To generate these types of features, levels are created by deploying DWT. The most discriminative features are then selected using neighborhood component analysis (NCA). By combining these methods, high accuracy rates are attained.

The characteristics of this method are listed below.

  • We propose a novel 1D OP for EEG signals classification.

  • The multi-level wavelet transform is applied onto a signal to extract distinctive features.

  • Seven experimental cases are presented to obtain numerical results.

  • Four classifiers are used to test the performance. By using them, a comprehensive benchmark is obtained.

The contributions of this method are as follows. Epilepsy detection is a crucial problem for patients because seizures can occur anywhere and suddenly. Therefore, automated epilepsy detection models are needed. Classification, transformation, and pattern-based methods for EEG signals have been presented in the literature, and DWT and local binary pattern (LBP) like methods are known to have achieved satisfactory results. However, these methods did not achieve high classification capability for 5 classes (Z-O-N-F-S). Hence, multi-level DWT and 1D-OP based feature extraction are proposed to achieve high performance. Likewise, maximum pooling and NCA based feature reduction methods are presented to select distinctive features. The results illustrate the success of this method. We present a lightweight and basic model for automated epilepsy detection. The presented model can be applied simply and can be used in any medical center. Briefly,

  • A new feature generation function is presented to generate discriminative features of the EEG signals.

  • DWT is used directly and takes the role that pooling (for instance maximum, average or minimum pooling) plays in feature extraction.

  • By using a traditional classifier (kNN), high accuracies were obtained. This clearly shows the effectiveness of the presented OP and DWT feature generation method.

  • This method has demonstrated high classification accuracy.

2 Literature review

Owing to these advantages, various methods have been presented in the literature. Some of them are given as follows. Ibrahim et al. (2018) [34] analyzed a method based on wavelet and Shannon entropy for epilepsy. In their study, SVM, LDA, artificial neural networks, and k-nearest neighbor were used as classifiers. Gruszczyńska et al. (2019) [25] suggested a study based on recurrence quantification analysis to classify EEG signals. Principal component analysis was used to visualize the results, and SVM was utilized for classification. Anuragi and Sisodia (2018) [9] presented an approach based on machine learning methods and the wavelet transform for alcohol use disorder. The aim of this study was to classify alcoholic and healthy individuals automatically. SVM and Naïve Bayes methods were utilized in the study. Motamedi-Fakhr et al. (2014) [47] investigated the effect of EEG signals on sleep. Signal processing techniques were utilized for this purpose. Dose et al. (2019) [17] suggested a method using a convolutional neural network for motor imagery electroencephalograms. Comparative results were presented according to time and accuracy parameters. Mutlu (2018) [48] proposed an approach using Hilbert vibration decomposition for epilepsy. In the study, the least squares SVM was selected as the classifier. The aim of the study was to distinguish the signals of normal and epileptic individuals. Time, accuracy, and receiver operating characteristics were selected as the evaluation parameters. Acharya et al. (2018) [4] presented a study using a neural network for EEG signal classification. This study was evaluated according to specificity, sensitivity, and accuracy. Yilmaz et al. (2018) [70] suggested a novel method to classify EEG signals. A 2-D signal representation method was used in this study. Specificity, f-measure, sensitivity, and accuracy were selected as evaluation criteria.
Kocadagli and Langari (2017) [42] proposed a model using wavelet transforms and ANN. In this study, comparative results were presented according to the accuracy parameter. Al-Salman et al. (2019) [7] proposed a study using wavelet Fourier analysis for the detection of sleep spindles from EEG signals. Least squares SVM was used as the classifier. The results were presented according to accuracy, kappa coefficient, f-score, sensitivity, and specificity. Moreover, many intelligent classification methods have been proposed in the medical literature for signal (electrocardiography, electromyography, electrooculography) [13, 18, 24, 27, 37, 44, 49, 57] and image or data (breast cancer, brain tumors, stomach cancer, kidney cancer) [1, 2, 11, 16, 29, 55, 65, 74] processing. In addition, there are many different studies on machine learning and artificial intelligence in the literature [5, 10, 12, 14, 15, 26, 28, 30, 31, 35, 36, 38, 52, 53, 59, 60, 61, 62, 63, 66]. Some of these studies, which include machine learning techniques, are on forecasting outside of medical studies [19, 20, 21, 22, 32, 45, 71, 72].

3 The proposed 1D octal pattern

In this study, a novel method is presented for the feature extraction of EEG signals. This method is called the 1D octal pattern (1D-OP). Firstly, the average, median and standard deviation values of the signal are calculated. These are widely used statistical moments for feature generation. The purposes of the presented 1D-OP are:

  • The most widely used texture feature generator for signals is the one-dimensional binary pattern (1D-BP). 1D-BP uses overlapping blocks of size 9 and extracts 256 features. The generated feature vector is therefore large, which increases memory complexity. To decrease memory complexity, a center-symmetric model and smaller overlapping blocks are used. The presented 1D-OP generates 128 features using overlapping blocks of size eight.

  • Hand-crafted features are divided into textural feature generation and statistical feature generation. Statistical features are generated employing statistical moments. Three statistical moments have been used to generate salient features.

The EEG signal is separated into overlapping blocks of length 8. Center-symmetric signal values and the signum function are used to extract binary features. The average, median and standard deviation values of each block are calculated. Global and local statistical values are also used to extract binary features. Finally, 1D-OP extracts a 7-bit feature value from each block: 4 bits are extracted from the block using the signum function, and 3 bits are extracted using statistical attributes. The feature signal is constructed from the extracted feature values, and the histogram of the feature signal is used as the feature set. The steps of the proposed 1D-OP are presented below.

  1. Step 1:

    Load EEG signal.

  2. Step 2:

    Calculate the global statistical values of the EEG signals.

$$ av{r}_g=\frac{1}{L}\sum \limits_{i=1}^L{S}_i $$
(1)
$$ me{d}_g={\left(\frac{L+1}{2}\right)}^{th}\ \mathrm{value\ of\ the\ sorted\ signal}\ S $$
(2)
$$ st{d}_g=\sqrt{\sum \limits_{i=1}^L\frac{{\left({S}_i- av{r}_g\right)}^2}{L}} $$
(3)

where S is the EEG signal of length L, and avr_g, med_g and std_g describe the average, median and standard deviation of the signal.

  3. Step 3:

    Divide the signal into overlapping blocks with a length of 8 (Fig. 1).

Fig. 1 A sample block with a size of 8

Overlapping blocks of size eight are chosen so that a 7-bit feature value can be extracted from each block. Four bits are extracted using the signum function, and three bits are calculated using statistical attributes.

  4. Step 4:

    Calculate the statistical values of the block using Eqs. 1–3.

  5. Step 5:

    Use the signum function to extract binary features. The signum function is given in Eq. 4. Algorithm 1 also explains the bit extraction process.

$$ Signum\left({S}_c,{S}_i\right)=\left\{\begin{array}{c}0, if\ {S}_i-{S}_c\le 0\\ {}1, if\ {S}_i-{S}_c>0\end{array}\right. $$
(4)

Algorithm 1: Bit extraction process of the OP.

figure a

In this algorithm, std_g, mdn_g and avr_g express the standard deviation, median and average values of the input EEG signal, respectively. std_b, mdn_b and avr_b are the standard deviation, median and average values of each overlapping block of size 8, respectively. Statistical and textural features are used together to generate salient/informative features from an EEG signal. By applying Algorithm 1, seven bits are generated for each overlapping block of size eight.

  6. Step 6:

    Calculate the map signal values mv using binary-to-decimal conversion.

$$ mv(i)=\sum \limits_{j=1}^7 bit(j)\ast {2}^{7-j},i=\left\{1,2,\dots, Ln-7\right\} $$
(5)
  7. Step 7:

    Extract the histogram of the feature signal.

$$ feat=H(mv) $$
(6)

where feat is the feature set, H(.) is the histogram extraction function, and mv represents the feature (map) signal.

A numerical example of the proposed 1D-OP is shown in Fig. 2 for better understanding.
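Steps 1 to 7 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: since Algorithm 1 is available only as a figure, the pairing of center-symmetric samples (first with last, second with seventh, and so on) and the rule for the three statistical bits (block value compared with global value through the signum function) are assumptions, as are the names `signum` and `octal_pattern`.

```python
import statistics

def signum(s_c, s_i):
    """Eq. 4: return 1 if s_i - s_c > 0, otherwise 0."""
    return 1 if s_i - s_c > 0 else 0

def octal_pattern(signal):
    """Hedged sketch of 1D-OP: returns a 128-bin histogram of 7-bit codes."""
    # Step 2: global statistics (Eqs. 1-3; pstdev divides by L as in Eq. 3).
    avr_g = statistics.fmean(signal)
    mdn_g = statistics.median(signal)
    std_g = statistics.pstdev(signal)
    hist = [0] * 128
    # Step 3: overlapping blocks of length 8 with a stride of one sample.
    for start in range(len(signal) - 7):
        block = signal[start:start + 8]
        # Step 5, textural part: four center-symmetric pairs (assumed pairing).
        bits = [signum(block[j], block[7 - j]) for j in range(4)]
        # Step 5, statistical part: block vs. global statistics (assumed rule).
        bits.append(signum(avr_g, statistics.fmean(block)))
        bits.append(signum(mdn_g, statistics.median(block)))
        bits.append(signum(std_g, statistics.pstdev(block)))
        # Step 6: binary-to-decimal conversion, first bit most significant (Eq. 5).
        mv = sum(bit << (6 - j) for j, bit in enumerate(bits))
        # Step 7: accumulate the histogram of the map signal (Eq. 6).
        hist[mv] += 1
    return hist

feat = octal_pattern(list(range(16)))  # toy ramp signal, 16 - 7 = 9 blocks
```

Each block is visited once and costs a constant amount of work, so the sketch is linear in the signal length; the histogram always has 2^7 = 128 bins regardless of the input.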

Fig. 2 An illustration of the proposed 1D-OP with a numerical example

4 The proposed method

As known from previously presented works, DWT has a positive effect on EEG classification. Textural feature extraction methods, for instance 1D LBP and 1D TP, have already been found to be successful in EEG classification. Statistical features have also been used in texture classification. To exploit these positive effects directly, we propose a novel 1D-OP method coupled with multi-level DWT. We use seven-level DWT to extract in-depth features. 1D-OP is applied to the signal and to the low-pass filter coefficients of the signal, and the extracted features are concatenated to obtain a 1024-dimensional feature set. NCA is employed to select the most discriminative features, and the length of the final feature vector is 128. The selected 128 features are forwarded to a kNN classifier. The graphical outline of this model is shown in Fig. 3.

Fig. 3 Graphical outline of the proposed method

As can be seen in Fig. 4, seven-level DWT is applied to the EEG signal. The Symlet 4 mother wavelet function is used for DWT. The presented 1D-OP generates 128 features from the EEG signal and from each set of low-pass filter coefficients (L bands); the extracted features are merged, and a final feature vector with a length of 1024 is created. The 128 most valuable features are selected by deploying NCA, and the selected features are fed to the kNN classifier. As known from the literature, various models have been presented to classify, detect or predict EEG signals. Our main objective is to reach high classification capability with low computational complexity. Deep learning models achieve high performance, but their time complexity is very high. Deep learning models use various layers to extract both low-level and high-level features. Moreover, they need millions of parameters to be set. We present an EEG classification model based on a feed-forward feature creation method. Layers/levels are needed to generate multilevel features. Therefore, DWT is applied seven times, and a multilevel transformation is created (see Fig. 4). In the feature generation phase, a feature generator that is both effective and lightweight is needed. Therefore, OP is presented to generate discriminative features. In the recommended OP, statistical moments are used. Moreover, the time complexity of the presented OP is calculated as O(n). Feature selection is one of the critical phases of multilayered/multileveled machine learning models. By applying NCA, the performance of the model is increased and the execution time of the classifier is decreased. Briefly, simple and effective methods are used together to create an effective EEG classification model.

Fig. 4 The calculated F1-scores for each class

The steps of the feature extraction method are as follows.

  1. Step 1:

    Load the raw EEG signal.

  2. Step 2:

    Apply 7-level DWT to the EEG signal and calculate the L1, L2, …, L7 sub-bands using the sym4 filter [56].

$$ \left[{L}_1\ {H}_1\right]= DWT(S) $$
(7)
$$ \left[{L}_2\ {H}_2\right]= DWT\left({L}_1\right) $$
(8)
$$ \left[{L}_3\ {H}_3\right]= DWT\left({L}_2\right) $$
(9)
$$ \left[{L}_4\ {H}_4\right]= DWT\left({L}_3\right) $$
(10)
$$ \left[{L}_5\ {H}_5\right]= DWT\left({L}_4\right) $$
(11)
$$ \left[{L}_6\ {H}_6\right]= DWT\left({L}_5\right) $$
(12)
$$ \left[{L}_7\ {H}_7\right]= DWT\left({L}_6\right) $$
(13)

where DWT(.) represents the DWT function with the sym4 filter. L1, L2, …, L7 are the low-pass filter coefficients of the seven-level DWT, and H1, H2, …, H7 are the high-pass filter coefficients. Here, DWT is utilized as the decomposition method. As stated, to reach high classification accuracy, high-level, mid-level and low-level features must be generated. To generate features at these levels, a transformation/decomposition method must be used. In deep networks, pooling-based decomposition methods such as maximum, average and minimum pooling are used. This work employs DWT as the decomposition method. According to the experiments, the best results were obtained by using 7-level DWT. Therefore, this method was developed using 7-level DWT.

  3. Step 3:

    Extract features of the EEG signal and the low-pass filter coefficients using 1D-OP.
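The cascade in Eqs. 7 to 13 can be illustrated with a short sketch in which each level decomposes only the low-pass band of the previous level. To keep the example free of external libraries, it swaps in the simple Haar filter in place of the sym4 filter named in Step 2, so the coefficient values and band lengths differ from those of the paper; the function names and the padding rule for odd-length inputs are assumptions made for illustration.

```python
import math

def haar_dwt(x):
    """One-level Haar DWT (stand-in for sym4): returns (low, high) bands."""
    x = list(x)
    if len(x) % 2:                 # assumed rule: repeat the last sample
        x.append(x[-1])
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return low, high

def multilevel_low_bands(signal, levels=7):
    """Return [L1, ..., L7]; level k decomposes the low band of level k-1."""
    bands = []
    current = list(signal)
    for _ in range(levels):
        current, _high = haar_dwt(current)  # high band is not used for features
        bands.append(current)
    return bands

bands = multilevel_low_bands([1.0] * 4097)
band_lengths = [len(b) for b in bands]
```

Because every level roughly halves the number of coefficients, the total work over all seven levels stays proportional to the signal length. A longer filter such as sym4 generally yields slightly different band lengths, depending on how the signal boundaries are handled.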

$$ {f}_1= OP(signal) $$
(14)
$$ {f}_2= OP\left({L}_1\right) $$
(15)
$$ {f}_3= OP\left({L}_2\right) $$
(16)
$$ {f}_4= OP\left({L}_3\right) $$
(17)
$$ {f}_5= OP\left({L}_4\right) $$
(18)
$$ {f}_6= OP\left({L}_5\right) $$
(19)
$$ {f}_7= OP\left({L}_6\right) $$
(20)
$$ {f}_8= OP\left({L}_7\right) $$
(21)

where OP(.) represents 1D OP feature generation function and f1, f2, …, f8 are features of the EEG signal and low-pass filter coefficients.

Step 4: Merge the features to obtain a 1024D feature set. Equation 22 represents feature fusion.

$$ fs= conc\left({f}_1,{f}_2,\dots, {f}_8\right) $$
(22)

where fs is feature set and conc(.) represents a merge function.

Step 5: Choose the most discriminative 128 features using NCA. Algorithm 2 demonstrates the used feature selection process.

Algorithm 2. The pseudo-code of the NCA based feature reduction.

figure b

The classification abilities of the chosen features are evaluated in the experiments.
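The selection in Step 5 amounts to ranking the 1024 features by a learned weight and keeping the 128 highest-ranked ones. The sketch below shows only this ranking step and assumes that some NCA routine has already produced one non-negative weight per feature; the weight values, the function name `select_top_features` and the tiny example matrix are hypothetical, and the details of Algorithm 2 are not reproduced here.

```python
def select_top_features(weights, feature_matrix, k=128):
    """Keep the k feature columns with the largest weights.

    weights        -- one weight per feature (assumed output of an NCA routine)
    feature_matrix -- list of rows, one row of feature values per EEG signal
    """
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    chosen = order[:k]
    reduced = [[row[i] for i in chosen] for row in feature_matrix]
    return chosen, reduced

# Hypothetical toy example: 4 features, keep the 2 with the largest weights.
chosen, reduced = select_top_features(
    [0.1, 0.9, 0.0, 0.5],
    [[1, 2, 3, 4], [5, 6, 7, 8]],
    k=2,
)
```

The indices in `chosen` are computed once on the training data and can then be reused to reduce any new feature vector to the same 128 columns.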

Step 6: Use kNN classifiers for classification of the selected features. The parameters of the used kNN classifier are given in Table 1.

Table 1 The parameters of the used four kNN classifier

As seen in Table 1, the constant parameters of the classifier are given, and these parameters are utilized to evaluate this method and obtain comparisons.
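For readers unfamiliar with kNN, the classification in Step 6 can be sketched as a majority vote among the nearest training vectors. The values of k and the distance metric used below (k = 1 or 3, city-block distance) are placeholders chosen for illustration and are not taken from Table 1; the function name and the toy data are likewise hypothetical.

```python
from collections import Counter

def knn_predict(train_x, train_y, query, k=1):
    """Classify `query` by majority vote among its k nearest training vectors."""
    def manhattan(a, b):
        # City-block distance; an assumed metric, not necessarily the paper's.
        return sum(abs(p - q) for p, q in zip(a, b))

    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: manhattan(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-dimensional toy data with two classes.
train_x = [[0, 0], [0, 1], [5, 5], [6, 5]]
train_y = ["Z", "Z", "S", "S"]
label = knn_predict(train_x, train_y, [5, 6], k=3)
```

kNN has no training phase beyond storing the feature vectors, which is consistent with the lightweight design goal of the method.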

As demonstrated in Fig. 4, this method was applied to an EEG signal, and the obtained components and feature vectors are shown.

Moreover, pseudocode of the presented model is shown in Algorithm 3.

Algorithm 3. Pseudocode of the presented OP and NCA based model.

figure c

5 Experimental results

The proposed study uses the EEG dataset of Bonn University, which is widely used in the literature. There are 500 EEG signals of 5 classes in this dataset [8]. The classes are called S, F, N, O, and Z, and each class has 100 signals. Each file consists of 4097 samples. The sampling rate of these EEG signals is 174 Hz; therefore, the length of each signal is 23.5 s. The Z and O classes belong to healthy people: O and Z signals were recorded while the eyes were closed and open, respectively. EEG signals of the N, F and S classes belong to epilepsy patients. The N and F signals were recorded during seizure-free intervals (F from within the epileptogenic zone, N from the hippocampal formation of the opposite hemisphere), and the S signals were recorded during seizure activity. This dataset can be downloaded from http://epileptologie-bonn.de/cms/front_content.php?idcat=193&lang=3.

In this section, the computational time, numerical results, confusion matrix, and comparisons are provided. We use MATLAB 2018a and perform the simulation on a PC with 32 GB RAM and an Intel Core i7–7700 CPU running the Windows 10.1 operating system. We did not use any graphical processing units or cores. The average execution time is listed in Table 2.
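As a quick check of the stated signal length, dividing the number of samples per file by the sampling rate gives

$$ T=\frac{4097}{174\ \mathrm{Hz}}\approx 23.5\ \mathrm{s} $$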

Table 2 The average time of the proposed method for 500 EEG signals

As seen in Table 2, the proposed method is lightweight because it has a short execution time. The execution time of the OP and NCA based method for a single EEG segment was calculated as 71.42 milliseconds (35.71 s / 500). This indicates that real-time systems can be constructed using the proposed method. As is known, the testing time of such methods is shorter than the training time. The time complexity of the presented OP and NCA based classification model is given in Table 3.
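Assuming that 35.71 denotes the total time in seconds for all 500 signals, the per-signal figure follows from

$$ \frac{35.71\ \mathrm{s}}{500}=0.07142\ \mathrm{s}=71.42\ \mathrm{ms} $$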

Table 3 The time complexity calculation of the presented model using Big O notation

Accuracy was utilized as the main performance evaluation parameter. Specificity, sensitivity, and g-means are also calculated to show the performance of this method clearly. These parameters are defined in Eqs. 23–26.

$$ Acc\left(\%\right)=\frac{TP+ TN}{TP+ TN+ FN+ FP}\ \mathrm{x}\ 100 $$
(23)
$$ \mathrm{Sen}\left(\%\right)=\frac{TP}{TP+ FN}\ \mathrm{x}\ 100 $$
(24)
$$ \mathrm{Spe}\left(\%\right)=\frac{TN}{TN+ FP}\ \mathrm{x}\ 100 $$
(25)
$$ \mathrm{g}-\mathrm{m}=\sqrt{Sen\ \mathrm{x}\ Spe} $$
(26)

where Acc, FN, FP, TN, TP, Spe, Sen, and g − m are accuracy, false negative, false positive, true negative, true positive, specificity, sensitivity, and g-means respectively.
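The four measures can be computed directly from the confusion counts of a two-class case. The sketch below uses the standard definition of accuracy, in which correct decisions (TP + TN) are divided by all decisions; the function name and the example counts are hypothetical and are not results from the paper.

```python
import math

def binary_metrics(tp, tn, fp, fn):
    """Return accuracy, sensitivity, specificity (all in %) and the g-mean."""
    acc = 100.0 * (tp + tn) / (tp + tn + fp + fn)  # correct / all decisions
    sen = 100.0 * tp / (tp + fn)                   # true positive rate
    spe = 100.0 * tn / (tn + fp)                   # true negative rate
    g_mean = math.sqrt(sen * spe)                  # geometric mean of the two
    return acc, sen, spe, g_mean

# Hypothetical counts: 100 positives (1 missed) and 100 negatives (none missed).
acc, sen, spe, g_mean = binary_metrics(tp=99, tn=100, fp=0, fn=1)
```

The g-mean drops sharply when either sensitivity or specificity is low, which makes it a useful complement to accuracy.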

Seven cases were defined in this work to obtain results and comparisons. Our experimental cases are ZS, ZF, OS, FS, NS, ZFS, and ZONFS. These cases are explained in Table 4 [67].

Table 4 The cases defined for testing the proposed method

The calculated accuracy rates for the experimental cases are demonstrated in Table 5.

Table 5 Accuracy rates of the proposed method

The best results in terms of false positives (FP) and false negatives (FN) were achieved using kNN and SVM.

The widely used cases in the literature are Z-S, Z-F, O-S, F-S, N-S, and Z-F-S. The comparative results are listed in Table 6.

Table 6 Comparison result for Z-S, Z-F, O-S, F-S, N-S and Z-F-S class recognition

In the literature, the 1D local binary pattern (LBP) and 1D ternary pattern (TP) are used for EEG classification, but various studies did not report classification results for Z-O-N-F-S class recognition. 1D TP and LBP are used for comparison because, like 1D-OP, they are textural feature extractors. 1D-OP extracts a 128-dimensional feature set. The best calculated success rates are listed in Table 7.

Table 7 Comparison result for Z-O-N-F-S cluster

The confusion matrix of the Z-O-N-F-S case is given in Table 8.

Table 8 The calculated confusion matrix for Z-O-N-F-S classes

The geometric mean, unweighted average recall, unweighted average precision and F1-score were also calculated for all five classes, and these results are given in Table 9.

Table 9 The results of selected parameters for Z-O-N-F-S cluster
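Unweighted (macro) averages of this kind can be derived from a multi-class confusion matrix as sketched below. The sketch assumes that rows hold the true classes and columns the predicted classes; the function name and the two-class example matrix are hypothetical, and the geometric mean is omitted because its multi-class definition is not stated in the text.

```python
def macro_scores(cm):
    """cm[i][j] = number of class-i samples predicted as class j (assumed layout).

    Returns (unweighted average recall, unweighted average precision,
    list of per-class F1-scores).
    """
    n = len(cm)
    recalls, precisions, f1s = [], [], []
    for c in range(n):
        tp = cm[c][c]
        actual = sum(cm[c])                          # samples truly in class c
        predicted = sum(cm[r][c] for r in range(n))  # samples assigned to c
        rec = tp / actual if actual else 0.0
        pre = tp / predicted if predicted else 0.0
        f1 = 2 * pre * rec / (pre + rec) if pre + rec else 0.0
        recalls.append(rec)
        precisions.append(pre)
        f1s.append(f1)
    return sum(recalls) / n, sum(precisions) / n, f1s

# Hypothetical 2-class matrix, used only to exercise the function.
uar, uap, f1s = macro_scores([[9, 1], [2, 8]])
```

Because every class contributes equally to the averages, these measures are not dominated by the largest class, although the five Bonn classes are balanced in any case.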

The calculated F1 scores for each class are also shown in Fig. 4.

10-fold cross validation was used to obtain the testing accuracies. Fold-by-fold accuracies for the Z-O-N-F-S classification are shown in Fig. 5.

Fig. 5 The calculated accuracies for each fold using Z-O-N-F-S classes
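The 10-fold protocol can be sketched as follows: the signal indices are shuffled once and dealt into ten folds, and each fold serves as the test set exactly once. Whether the authors' folds were stratified by class is not stated, so this unstratified split, the fixed seed and the function name are assumptions for illustration only.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)          # reproducible shuffle
    folds = [idx[i::k] for i in range(k)]     # deal indices into k folds
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

splits = list(k_fold_indices(500))  # 500 signals, as in the five-class case
```

With 500 signals each fold holds 50 test signals and 450 training signals, and the reported accuracy is the average over the ten folds.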

Moreover, the ROC curve of this result is plotted in Fig. 6.

Fig. 6 ROC curves of the best result for all 5 classes

The ROC curves of the best results for all 5 classes are demonstrated in Fig. 6, and the average AUC value of the proposed method is computed as 0.976.

6 Discussion

In this paper, a novel and stable EEG feature extraction method is proposed. The proposed feature extraction network is lightweight because its computational complexity was calculated as O(nlogn). This method is similar to a deep network such as a CNN. A CNN uses a convolution operator together with weight assignment and weight updating methods. These CNN-like methods have high computational complexity; hence, the execution time of a CNN is long. In this study, the proposed 1D-OP plays the role of convolution and 1D-DWT plays the role of maximum pooling. In the feature reduction phase, the NCA selector is chosen. There are no weight assignment or updating methods. Also, the comparisons demonstrate that this method has higher classification accuracy than 1D-CNN. In this work, a novel feature extraction method is presented using basic and effective methods. This paper shows that patterns like 1D-OP can be used in deep networks instead of convolution operators. Further points of discussion are given below.

  • By using kNN classifier, high accuracy rates were yielded for seven cases.

  • By using a traditional classifier (kNN), successful results were attained.

  • NCA is used for feature selection and 128 features are selected.

  • Separable features are extracted using this method. The statistical analyses of the extracted features are demonstrated in Fig. 7.

    Fig. 7 Statistical analysis of the features for the selected 128 features (a) Z (b) O (c) N (d) F (e) S

In Fig. 7, the blue boxes show the range of the features, the red lines represent the average value of the features, and the red stars are the boundary values of these features. Boxplot analysis shows the distributions and statistical attributes of the features clearly. Moreover, we applied the t-test to the extracted features. The calculated minimum p values are listed in Table 10.

Table 10 The calculated minimum p values for features of classes

Table 10 shows the statistical strength of the generated features. Moreover, the calculated p-values are visualized in Fig. 8.

Fig. 8 The calculated p values for class pairs

  • Figures 7–8 and Table 9 show the separation of the extracted feature sets. Therefore, the proposed method achieves high classification capability.

  • The experiments illustrate that this method achieves more successful results than 1D-CNN for the five-class case.

  • The comparisons illustrate that the proposed method has high classification ability and is more successful than other state-of-the-art methods for the ZF, OS, FS, NS, ZFS and ZONFS experimental cases (class recognition).

  • Comprehensive experimental results are given.

  • Universal features were extracted from EEG signals using the proposed method.

  • This method has a simple mathematical structure, and it can be applied to real-world problems, e.g., on mobile devices and in cloud computing. Novel mobile applications can be developed using this method, and such applications can be used to diagnose epilepsy.

The advantages of this method are:

  • This method has a short execution time.

  • A novel signal pattern is presented, and high success rates are achieved using this pattern.

  • High classification accuracy was obtained using this method.

  • This method has wide applicability because it has a short execution time and high classification capability.

  • High classification accuracies were calculated for seven cases. This shows that the presented 1D-OP based method is generally successful.

The disadvantages and limitations of this method are summarized below.

This method is only implemented on the EEG signal, and the used dataset is small because the Bonn EEG dataset consists of 500 EEG signals with 5 classes. The Bonn dataset is widely used in the literature for EEG classification. Therefore, this dataset is chosen. The used dataset is also homogeneous.

7 Conclusions and future works

The aim of this method is to classify EEG signals with a high accuracy rate. Multilevel DWT and a novel 1D-OP are used together. 1D-OP extracts 128 features from each signal. Seven-level DWT is utilized in this paper. The original signal and seven sets of low-pass filter coefficients are utilized to extract features. In the feature fusion phase, the 128 features extracted from each component (the original signal and the low-pass filter coefficients) are concatenated, and a feature set of size 1024 is obtained. After feature selection, 128 features are obtained, and these features are fed to the kNN classifier. Seven experimental cases were selected from the Bonn EEG dataset. Accuracies of 99.5%, 100.0%, 100.0%, 99.0%, 100.0%, 99.3% and 96.0% were calculated for the ZS, ZF, OS, FS, NS, ZFS and ZONFS cases, respectively. The calculated accuracy for ZONFS is 7.3% higher than the best result of the other methods. These results and comparisons demonstrate the effectiveness of this method. Therefore, the goal of the research has been met, and this method can be utilized for the automatic recognition of epilepsy attacks, including on mobile devices. Moreover, this model uses basic methods/functions to classify EEG signals. Therefore, this model is ready to be used in medical centers.

In future work, real-time applications can be implemented using the proposed method. IoT and cloud-based EEG systems will be constructed using this system. Bigger datasets can be collected to create a successful model. Moreover, deep learning methods are known to have high computational complexity. Novel deep learning methods can be proposed using the proposed method and novel 1D patterns like the proposed octal pattern.