1 Introduction

The human brain can carry out two or more cognitive activities simultaneously. These activities are termed mental tasks. In the brain–computer interface (BCI) domain, the majority of research has focused on differentiating between two mental tasks at a given time. Researchers have pointed out, however, that the human mind can perform more than two actions simultaneously [35], and only a handful of studies deal with multi-mental task classification [1, 13, 30, 36, 41, 45]. There is therefore a need for a system that can segregate more than two cognitive activities at a given instant of time; such a system can be termed a multi-mental task classification system. Classifying a test sample correctly becomes harder as the number of classes in a dataset increases. The computational effort required to solve a multi-class problem is also larger than for a binary problem, as it involves learning several hyperplanes for the classification model. Electroencephalography (EEG) is the dominant monitoring technique for capturing the brain activity corresponding to a given mental task. EEG signals in their raw form, however, do not provide significant features for distinguishing among different brain activities. For these reasons, both binary and multi-mental task classification are considered challenging problems.

In the literature, numerous methods for feature representation (feature extraction) from EEG signals have been studied and suggested for BCI research [3]. The feature extraction methods can be grouped into the following major categories: (i) temporal domain [44]; (ii) frequency (spectral) domain; (iii) hybrid of temporal and frequency domain methods; (iv) band power [28]; and (v) principal component analysis and linear discriminant analysis [14].

Temporal-domain methods are predominantly adopted to describe accurate and precise time information of a neuro-physiological signal. The extracted features capture how the signal changes with respect to time. In the temporal domain, the amplitude of the signal or statistical measures such as the absolute mean, standard deviation and kurtosis are used to represent the EEG signal [4, 21, 32, 40]. The EEG signal contains a set of distinct oscillations, which are termed rhythms. Different rhythms are associated with distinct mental tasks [6, 27, 29, 37, 38]. Hence, the frequency information embedded in the signal can be used as a feature to characterize the signal more accurately. Power spectral density analysis is a powerful technique in the BCI research community for extracting frequency content features with high frequency resolution [36]. The phase synchronization between EEG signals at the same frequency has also been used as a feature descriptor [2]. However, the neuro-physiological signals used in BCI have distinct characteristics in both the temporal and frequency domains. The variation in the frequency content of the EEG signal with respect to time demonstrates that the EEG signal is non-stationary. The short-time Fourier transform and the wavelet transform are widely used to extract features based on both frequency and temporal information from non-stationary signals. Such methods can detect abrupt temporal variations in the EEG signal. The wavelet transform (WT) [9, 31] is a useful technique by which the time and frequency contents of a signal can be analyzed simultaneously. WT has been used in the analysis of EEG signals for motor imagery and epileptic seizures [4, 8, 22, 34], brain disorders [20], classification of human emotions [33] and non-motor imagery [5].
However, WT uses a fixed set of basis functions, which makes it non-adaptive [23] to the signal being processed. Another method for analyzing signals such as EEG is empirical mode decomposition (EMD) [23], which is a data-dependent approach. Unlike WT, this method adapts itself to the signal being processed. EMD decomposes a signal into a finite set of high- and low-frequency components, known as intrinsic mode functions (IMFs) or modes. EMD and its variants, such as multivariate EMD (MEMD), enable proper analysis of EEG signals contaminated by muscle artifacts [7]. The EMD method has been used to extract representative features for BCI [11, 18, 25] in order to classify mental tasks. The contributions of this paper are as follows:

  1. This is the first comprehensive study of three different types of EMD algorithm for binary as well as multi-class cognitive task recognition.

  2. A new parameter, the Hurst exponent, is investigated for the first time for cognitive task recognition.

The rest of the paper is organized as follows: Section 2 gives a brief overview of the EMD algorithm and its variants. Section 3 describes the proposed approach. The data, experimental setup and results are discussed in Section 4, and finally, Section 5 presents the conclusion.

2 Empirical mode decomposition and its variants

The variants of EMD are briefly described below.

2.1 Empirical mode decomposition (EMD)

Non-stationary and nonlinear signals can be properly investigated with empirical mode decomposition (EMD), which breaks the signal into a series of endogenous oscillatory components known as intrinsic mode functions (IMFs). A continuous function is an IMF if it satisfies the following conditions [23]:

  1. The number of extrema and the number of zero crossings of the function must either be equal or differ by at most one.

  2. At any point, the mean value of the envelopes defined by the local maxima and the local minima is zero.
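The first condition can be checked directly on a sampled signal. The following is a minimal sketch, not the procedure of [23]: the function names are hypothetical, extrema are counted as strict local maxima or minima at interior samples, and the second condition is not checked because it requires envelope interpolation.

```python
import math

def count_zero_crossings(x):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)

def count_extrema(x):
    """Count strict local maxima and minima at interior samples."""
    return sum(
        1
        for prev, cur, nxt in zip(x, x[1:], x[2:])
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt)
    )

def satisfies_first_imf_condition(x):
    """Condition 1: extrema and zero crossings differ in number by at most one."""
    return abs(count_extrema(x) - count_zero_crossings(x)) <= 1

# Illustrative input (an assumption): a sine sampled over 125 points with a
# small phase offset, so that no peak falls exactly between two samples.
sine = [math.sin(2 * math.pi * n / 50 + 0.3) for n in range(125)]

# The same sine riding on a large offset never crosses zero, so it fails.
shifted = [s + 2.0 for s in sine]

print(satisfies_first_imf_condition(sine))
print(satisfies_first_imf_condition(shifted))
```

The shifted sine illustrates why a signal with a non-zero local mean is not an IMF: it still has extrema but no zero crossings.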

Fig. 1 IMF plot obtained for a given EEG signal

The first condition is analogous to the narrow-band requirement for a stationary Gaussian process. The second condition prevents the instantaneous frequency from having unwanted fluctuations induced by asymmetric waveforms. The first four IMFs of an EEG segment, obtained using the EMD algorithm, are shown in Fig. 1. From Fig. 1, it can be noted that these IMFs characterize the signal well. Thus, a signal x(m) can be expressed as:

$$\begin{aligned} x(m)=\sum ^{k}_{j=1}IMF_{j} +r_{k} \end{aligned}$$
(1)

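where \(IMF_{j}\) is the \(j^{th}\) IMF and \(r_{k}\) is the residue after extracting k IMFs. According to [23], the stopping criterion for producing further IMFs is based on a standard deviation, SD, computed over T time steps and defined as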

$$\begin{aligned} SD_{i}=\sum _{t=0}^{T}\frac{\left| IMF_{i+1}(t)-IMF_{i}(t)\right| ^{2}}{IMF_{i}(t)^{2}} \end{aligned}$$
(2)

The decomposition stops when the value of SD falls below a predefined threshold.
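As an illustration of Equation 2, the sketch below computes SD for two consecutive results and compares it with a threshold. The function names, the small `eps` term that guards against division by zero, and the threshold value of 0.3 are assumptions made for this example; the source does not state the threshold used.

```python
def sifting_sd(previous, current, eps=1e-12):
    """Equation 2: sum over t of |current(t) - previous(t)|^2 / previous(t)^2.

    eps is an assumption added to avoid division by zero at samples where
    the previous result is exactly zero.
    """
    return sum(
        (cur - prev) ** 2 / (prev ** 2 + eps)
        for prev, cur in zip(previous, current)
    )

def should_stop(previous, current, threshold=0.3):
    """Stop once SD drops below the predefined threshold (assumed value)."""
    return sifting_sd(previous, current) < threshold

# Two nearly identical results give a small SD, so the decomposition stops.
previous = [1.0, -2.0, 3.0, -4.0]
current = [1.01, -2.02, 2.97, -4.04]
print(sifting_sd(previous, current))
print(should_stop(previous, current))
```

Each sample in this example deviates by 1% of its previous value, so every term contributes about 1e-4 to the sum.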

2.2 Ensemble empirical mode decomposition (EEMD)

The major drawback of the EMD algorithm is frequent mode mixing. This problem arises when a single IMF contains signals of widely different scales, or when a signal of a single scale is spread across distinct IMFs. To alleviate the problem of scale separation, [42] suggested a noise-assisted data analysis method called ensemble empirical mode decomposition (EEMD). EEMD defines the true IMF components as the mean of an ensemble of trials, each consisting of the signal plus white noise of finite amplitude [42]. Thus, the signal x(m) in the \(i^{th}\) trial after adding white noise can be represented as

$$\begin{aligned} x^{i}(m)=x(m)+a_{0}w^{i}(m), \ \text {for}\ i=1, \ldots , l \end{aligned}$$
(3)

where \(w^{i}(m)\) is the white noise in the \(i^{th}\) trial with unit variance and \(a_{0}\) is its amplitude. For each \(i = 1, 2, \dots , l\), \(IMF_{k}^{i}\) is calculated by applying EMD to the noisy signal of Equation 3, using a different realization of white noise each time. The \(k^{th}\) average IMF, \(\overline{IMF}_{k}\), can be defined as

$$\begin{aligned} \overline{IMF}_{k}=\frac{1}{l}\sum _{i=1}^{l}IMF_{k}^{i} \end{aligned}$$
(4)

where l is the number of white noise realizations.
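Equations 3 and 4 amount to a simple averaging loop around any EMD routine. The sketch below shows only that loop. The `emd` argument is a placeholder for a real EMD implementation, and `toy_decomposer` is a crude two-component stand-in (a moving average and its remainder) used solely to make the example runnable; it is not EMD. The parameter values are illustrative assumptions.

```python
import math
import random

def eemd(signal, emd, trials=50, noise_amplitude=0.2, n_modes=4, seed=0):
    """Average the k-th mode over noisy copies of the signal (Eqs. 3 and 4).

    `emd` is any callable mapping a list of samples to a list of at least
    `n_modes` modes, each a list of the same length as the input.
    """
    rng = random.Random(seed)
    length = len(signal)
    mean_modes = [[0.0] * length for _ in range(n_modes)]
    for _ in range(trials):
        # Equation 3: x^i(m) = x(m) + a_0 * w^i(m), unit-variance white noise.
        noisy = [s + noise_amplitude * rng.gauss(0.0, 1.0) for s in signal]
        modes = emd(noisy)
        # Equation 4: accumulate the ensemble mean of each mode.
        for k in range(n_modes):
            for m in range(length):
                mean_modes[k][m] += modes[k][m] / trials
    return mean_modes

def toy_decomposer(x):
    """Stand-in for EMD: a fast part plus a 3-point moving average."""
    n = len(x)
    smooth = [
        (x[max(i - 1, 0)] + x[i] + x[min(i + 1, n - 1)]) / 3.0 for i in range(n)
    ]
    detail = [xi - si for xi, si in zip(x, smooth)]
    return [detail, smooth]

signal = [math.sin(2 * math.pi * n / 25) for n in range(100)]
modes = eemd(signal, toy_decomposer, trials=200, noise_amplitude=0.2, n_modes=2)
reconstruction = [a + b for a, b in zip(*modes)]
max_error = max(abs(r - s) for r, s in zip(reconstruction, signal))
print(max_error)
```

With this stand-in, the sum of the averaged modes differs from the original signal only by the ensemble mean of the added noise, which shrinks as the number of trials grows.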

In this work, the values of the parameters \(a_{0}\) and l have been determined empirically. The distinctive concepts of EEMD are as follows:

  1. The ensemble mean cancels out the effect of the added white noise series; thus, the mean of the mixtures of signal and white noise contains only the signal.

  2. To explore all feasible solutions, it is essential to add an ensemble of white noise of finite amplitude to the signal.

  3. To obtain true and physically meaningful IMFs from EMD, it is necessary to add noise to the signal.

2.3 Complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN)

EEMD successfully alleviates the mode mixing problem of the original EMD algorithm by adding white noise to the signal, but this introduces another problem: the noise is not fully separated from the signal, so the resulting IMFs may contain a mixture of noise and signal. To resolve this problem, [39] proposed the CEEMDAN algorithm, which provides good spectral separation of the modes. It also gives an exact reconstruction of the original signal at a lower computational cost.

The first residue can be calculated as:

$$\begin{aligned} r_{1}(m)=x(m)-\overline{IMF_{1}} \end{aligned}$$
(5)

where \(\overline{IMF_{1}}\) is the first average IMF obtained by EEMD. The second average IMF can be found as:

$$\begin{aligned} \overline{IMF}_{2}=\frac{1}{l}\sum ^{l}_{i=1}E_{1}\left( r_{1}(m)+a_{0}E_{1}\left( w^{i}(m)\right) \right) \end{aligned}$$
(6)

After finding the \(k^{th}\) residue, \(r_{k}(m)=r_{k-1}(m)-\overline{IMF}_{k}\), for \(k=2,\ldots,K\), the \((k+1)^{th}\) average IMF can be defined as:

$$\begin{aligned} \overline{IMF}_{k+1}=\frac{1}{l}\sum _{i=1}^{l}E_{1}\left( r_{k}(m)+a_{k}E_{k}\left( w^{i}(m)\right) \right) \end{aligned}$$
(7)

where \(E_{k}(\cdot)\) is an operator that extracts the \(k^{th}\) IMF of a given signal using the EMD algorithm, and the amplitude \(a_k\) allows the SNR to be selected at each stage. A detailed description can be found in [39].

3 Proposed approach

In the first stage of the proposed method, each of the EMD algorithms is applied separately to the raw EEG data to decompose it. In the second stage, the decomposed EEG signals are encoded compactly by a parametric feature transformation using information features. This encoding reduces the dimensionality of the features, since every signal can be distinguished by a set of statistical parameters associated with it [19]. Two signals may share the same value for one or more statistical parameters, so a set of several parameters is used. In this work, eight parameters are selected empirically, as discussed in the following subsection.

3.1 Information feature

Extracting useful information from data is a crucial task. Statistical and uncertainty parameters can provide a good amount of information, along with an interpretation, from raw data obtained from instruments [43]. We have extracted the following parameters:

  • Root mean square

  • Variance

  • Skewness

  • Kurtosis

  • Shannon entropy

  • Central frequency

  • Maximum frequency

  • Hurst exponent [24]
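Several of the parameters listed above can be sketched in a few lines. The exact definitions used in this work are not given in the text, so the following choices are assumptions: population (biased) moments, non-excess kurtosis, Shannon entropy of a 16-bin amplitude histogram, and a rescaled-range (R/S) estimate of the Hurst exponent, which may differ from the estimator of [24]. Central and maximum frequency are omitted because they require a spectral estimate. All function names are hypothetical.

```python
import math
import random

def rms(x):
    """Root mean square (quadratic mean) of the samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def central_moment(x, order):
    mean = sum(x) / len(x)
    return sum((v - mean) ** order for v in x) / len(x)

def variance(x):
    return central_moment(x, 2)

def skewness(x):
    return central_moment(x, 3) / central_moment(x, 2) ** 1.5

def kurtosis(x):
    # Non-excess kurtosis: equals 3 for a Gaussian distribution.
    return central_moment(x, 4) / central_moment(x, 2) ** 2

def shannon_entropy(x, bins=16):
    """Entropy in bits of the amplitude histogram (bin count is an assumption)."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for v in x:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return -sum(c / len(x) * math.log2(c / len(x)) for c in counts if c)

def rescaled_range(chunk):
    """Range of cumulative deviations from the mean, divided by the std."""
    mean = sum(chunk) / len(chunk)
    cumulative, running = [], 0.0
    for v in chunk:
        running += v - mean
        cumulative.append(running)
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / len(chunk))
    return (max(cumulative) - min(cumulative)) / std if std > 0 else None

def hurst_exponent(x, min_window=8):
    """Slope of log(mean R/S) against log(window length), windows doubling."""
    points = []
    window = min_window
    while window <= len(x):
        values = [rescaled_range(x[i:i + window])
                  for i in range(0, len(x) - window + 1, window)]
        values = [v for v in values if v]
        if values:
            points.append((math.log(window),
                           math.log(sum(values) / len(values))))
        window *= 2
    mean_x = sum(p[0] for p in points) / len(points)
    mean_y = sum(p[1] for p in points) / len(points)
    return (sum((px - mean_x) * (py - mean_y) for px, py in points)
            / sum((px - mean_x) ** 2 for px, _ in points))

# Synthetic inputs (not EEG): white noise and its running sum (a random walk).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(1024)]
walk, level = [], 0.0
for step in noise:
    level += step
    walk.append(level)
h_noise, h_walk = hurst_exponent(noise), hurst_exponent(walk)
```

The random walk is more persistent than white noise, so its estimated Hurst exponent should come out larger; this is the kind of contrast the parameter is meant to capture.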

In this study, we have formulated multi-mental task classification as well as binary mental task classification problems using EEG signals. These EEG signals can be classified only if an efficient feature representation is obtained for them. The resulting feature vectors are then used to classify the signals. To develop a classification model for the multi-class problem, a support vector machine (SVM) with the one-versus-rest approach is used as the multi-mental task classifier to build the decision model. The overall flowchart of the proposed model is shown in Fig. 2.

Fig. 2 Schematic flowchart of the proposed model for mental task classification

4 Experimental setup and result

4.1 Dataset

For mental task classification, experiments were performed on a publicly available EEG dataset. This dataset consists of EEG recordings from seven electrode channels (namely C3, C4, P3, P4, O1, O2 and EOG) covering different regions (central, parietal, occipital and near the eyes) from seven subjects, with the recording protocol described below. Each subject was asked to perform five different mental tasks: Baseline task (relax: B); mental Letter Composing task (L); non-trivial Mathematical task (M); Visualizing Counting (C) of numbers written on a blackboard; and Geometric Figure Rotation (R) task. Each recording session consists of five trials of each of the five mental tasks. Each trial is 10 seconds long and was recorded at a sampling frequency of 250 Hz, which resulted in 2500 sample points per trial. We have used the data of all subjects except Subject 4, whose data have some missing and incomplete information [15]. A detailed explanation can be found in the work of [26] (Footnote 1) and in Table 1. The six electrodes placed on the scalp at C3, C4, P3, P4, O1 and O2 are used to extract the features for mental task classification, as the EOG channel captures only artifacts.

Table 1 Data description

4.2 Constructing feature vector

For feature construction, the data of each task of each subject are divided into half-second segments, yielding 20 segments (signals) per trial for each subject, as some researchers have done [36]. The complete pipeline for constructing the feature vector for each subject, using all trials corresponding to each mental task label (B, L, M, C and R), is described below:

  1. The EEG signal corresponding to each task of a given subject is divided into half-second segments, yielding 20 segments (signals) per trial per channel.

  2. For each channel, each of the 20 segments is decomposed into IMFs using the EMD algorithms. The first four IMFs are used for further analysis, as they capture enough information to represent the signal.

  3. To represent each of these IMFs per segment per channel compactly, eight statistical or uncertainty parameters (root mean square, variance, skewness, kurtosis, Hurst exponent [24], Shannon entropy, central frequency and maximum frequency) are calculated for a given subject. Some of these parameters represent linear characteristics of the EEG signal, and others represent nonlinear properties of EEG [12, 17, 18]. In this work, the parameters are selected empirically, as every signal has distinguishing properties with respect to a particular set of statistical parameters associated with it, as shown in Fig. 3.

  4. Hence, the final feature vector, obtained by concatenating the features from the six channels, contains 192 parameters (4 IMFs per segment \(\times\) 8 parameters per IMF \(\times\) 6 channels) for each task label of a given subject.
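The segment and feature counts in the steps above follow from simple arithmetic, which the sketch below reproduces. It assumes consecutive, non-overlapping segments, which is consistent with 20 half-second segments per 10-second trial; the constant and function names are hypothetical, and the trial is filled with placeholder zeros rather than EEG data.

```python
SAMPLING_RATE_HZ = 250
TRIAL_SECONDS = 10
SEGMENT_SECONDS = 0.5
N_IMFS, N_PARAMS, N_CHANNELS = 4, 8, 6

def split_into_segments(trial, segment_length):
    """Cut one channel of one trial into consecutive, non-overlapping segments."""
    return [
        trial[start:start + segment_length]
        for start in range(0, len(trial) - segment_length + 1, segment_length)
    ]

segment_length = int(SAMPLING_RATE_HZ * SEGMENT_SECONDS)  # samples per segment
trial = [0.0] * (SAMPLING_RATE_HZ * TRIAL_SECONDS)         # placeholder samples
segments = split_into_segments(trial, segment_length)

# One feature vector per segment: IMFs x parameters x channels.
feature_vector_length = N_IMFS * N_PARAMS * N_CHANNELS

print(len(segments), segment_length, feature_vector_length)
```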

Fig. 3 Eight features obtained corresponding to all five mental tasks for channel 1 from IMF 1 using EEMD method for Subject 1

4.3 Result

The performance of EMD and its variants has been evaluated in terms of several classification performance measures, namely accuracy, sensitivity, specificity and the kappa statistic, achieved by the SVM classifier with the one-versus-all approach. Grid search is used to find the optimal regularization parameters. To check the efficacy of the proposed method, we have formulated three types of multi-mental task classification problems, viz. three-class, four-class and five-class, in addition to binary mental task classification.
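The four measures named above can all be derived from a confusion matrix. The sketch below is one common formulation and rests on assumptions the text does not confirm: sensitivity and specificity are computed per class in a one-versus-rest manner and then macro-averaged, and kappa is Cohen's kappa. The function name and the example matrix are hypothetical.

```python
def classification_measures(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n_classes))
                for j in range(n_classes)]

    sensitivities, specificities = [], []
    for c in range(n_classes):
        tp = confusion[c][c]
        fn = row_sums[c] - tp
        fp = col_sums[c] - tp
        tn = total - tp - fn - fp
        sensitivities.append(tp / (tp + fn))
        specificities.append(tn / (tn + fp))

    accuracy = sum(confusion[c][c] for c in range(n_classes)) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    return {
        "accuracy": accuracy,
        "sensitivity": sum(sensitivities) / n_classes,
        "specificity": sum(specificities) / n_classes,
        "kappa": (accuracy - expected) / (1 - expected),
    }

# Made-up binary confusion matrix, for illustration only.
measures = classification_measures([[18, 2], [4, 16]])
print(measures)
```

For the made-up matrix, 34 of 40 samples are correct and chance agreement is 0.5, so kappa is noticeably lower than accuracy, which is why the two measures can rank methods differently.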

Binary class problem

We have used the following binary combinations of these tasks in this work: BC, BL, BM, BR, CL, CM, CR, LM, LR and MR.

Three-class problem

In this problem, we have formed three-class mental task problems by choosing three different mental tasks at a time from the given five mental tasks. There are ten different triplet combinations for forming a three-class problem: BCL, BCM, BCR, BLM, BLR, BMR, CLM, CLR, CMR and LMR.

Four-class problem

A four-class mental task classification problem is constructed by choosing four tasks at a time from the given five tasks. There are five different four-class problems, namely BCLM, BCLR, BCMR, BLMR and CLMR.

Five-class problem

The five-class mental task classification problem is formed by taking all five mental tasks at a time. Thus, we have a single five-class mental task classification problem, BCLMR.
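The binary, three-class, four-class and five-class problems listed above are the combinations of the five task labels taken two, three, four and five at a time. A short sketch that enumerates them follows; the function name `task_problems` is hypothetical.

```python
from itertools import combinations

TASKS = "BCLMR"  # Baseline, Counting, Letter, Math, Rotation

def task_problems(n_classes):
    """All classification problems formed by choosing n_classes of the five tasks."""
    return ["".join(combo) for combo in combinations(TASKS, n_classes)]

for n in (2, 3, 4, 5):
    problems = task_problems(n)
    print(n, len(problems), problems)
```

This gives 10 binary, 10 three-class, 5 four-class and 1 five-class problem, 26 in total.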

Tables 2, 3 and 4 show the classification performance for the binary mental task problem for all subjects. Similarly, Tables 5, 6 and 7 show the performance for the three-class problem, Tables 8 and 9 for the four-class problem, and Table 10 for the five-class problem. From these tables, we can observe the following:

  • For the binary mental task classification problem, in terms of accuracy over the majority of task combinations, CEEMDAN performs best among the three algorithms for Subject 1, Subject 3 and Subject 6, EEMD performs best for Subject 7, and EEMD and CEEMDAN perform equally for Subject 2 and Subject 5.

  • In terms of sensitivity for the binary mental task classification problem, EEMD performs best among the three algorithms for Subject 3, Subject 5 and Subject 7, CEEMDAN performs best for Subject 1 and Subject 2, and EEMD and CEEMDAN perform equally for Subject 5.

  • In terms of specificity for the binary mental task classification problem, CEEMDAN performs best for Subject 2, Subject 3 and Subject 6, EEMD is best for Subject 7, EMD and CEEMDAN perform equally for Subject 1, and EEMD and CEEMDAN perform equally for Subject 5.

  • Analysis based on the kappa measure shows that, for the binary mental task classification problem, CEEMDAN gives the best performance for Subject 1, Subject 3 and Subject 6, EEMD gives the best performance for Subject 7, and EEMD and CEEMDAN perform equally for Subject 2 and Subject 5.

  • As far as accuracy, sensitivity and specificity are concerned, EEMD performs best among the three algorithms for Subject 1, Subject 2, Subject 5 and Subject 7 for ternary mental task classification. The performance of CEEMDAN is superior or equal to that of the other two algorithms for Subject 3 and Subject 6.

  • For the kappa measure, EEMD is the best choice among the three algorithms for Subjects 1, 2, 3, 5 and 7.

  • For the quaternary mental task classification problem, EEMD achieves the highest classification performance among the three algorithms for Subjects 1, 2, 5 and 7 in terms of accuracy, sensitivity and specificity. It is also observed that, for these measures, EMD and CEEMDAN perform equally for Subject 3.

  • For the kappa measure, the three algorithms are competitive with each other.

  • For the mental task classification problem with five tasks, EEMD is the best for all performance measures except the kappa measure, for which EMD is the best choice.

Table 2 Comparison table of classification performance for binary mental task classification for Subject 1 and Subject 2
Table 3 Comparison table of classification performance for binary mental task classification for Subject 3 and Subject 5
Table 4 Comparison table of classification for binary mental task classification for Subject 6 and Subject 7
Table 5 Comparison table of classification for three class mental task classification for Subject 1 and Subject 2
Table 6 Comparison table of classification for three class mental task classification for Subject 3 and Subject 5
Table 7 Comparison table of classification for three class mental task classification for Subject 6 and Subject 7
Table 8 Comparison table of classification for four class mental task classification for Subject 1, Subject 2 and Subject 3
Table 9 Comparison table of classification performance for four class mental task classification for Subject 5, Subject 6 and Subject 7
Table 10 Comparison table of classification performance for five-class mental task classification for all subjects

Further, the average value of each performance measure is obtained over all the subjects. Tables 11, 12, 13 and 14 compare the classification performance for binary, three-class, four-class and five-class mental task classification over all the subjects, respectively. It can be noted from these tables that the EEMD method performs well in comparison with the EMD and CEEMDAN methods in terms of the aforementioned performance measures, except the kappa index. CEEMDAN performs better than the other two methods in terms of the kappa index for some task combinations across the different class settings.

Table 11 Comparison table of classification performance for binary mental task classification over all the subjects
Table 12 Comparison table of classification performance for three-class mental task classification over all the subjects
Table 13 Comparison table of classification performance for four class mental task classification over all the subjects
Table 14 Comparison table of classification performance for five-class mental task classification over all the subjects

4.4 Comparison of the proposed model for multi-mental task classification problem

In this subsection, we compare the proposed approach with the work of [45] for multi-mental task classification; the results are given in Table 15.

In Table 15, methods A, B and C are the schemes used by [45], based on the asymmetry ratio computed for different numbers of frequency band powers, using 75-dimensional, 90-dimensional and 42-dimensional feature vectors, respectively. From this table, it is clear that our approach for creating feature vectors achieves higher average classification accuracy for all three subjects on all the multi-mental task classification problems.

Table 15 Comparison table of classification accuracy achieved for multi-mental task classification of the work of [45] with proposed approach

4.5 Discussion

Since the EEG signal is nonlinear and non-stationary, there is a need for an algorithm that can capture these properties. EMD is such an algorithm, as it can capture the tempo-spectral information of the signal. After decomposing the signal into high- and low-frequency components, it is important to extract statistical and uncertainty parameters from the decomposed signal to obtain a compact feature representation that can help differentiate one mental state from another. In addition, there are two improved versions of the EMD algorithm, EEMD and CEEMDAN, which can capture tempo-spectral information even from a noise-assisted signal.

4.6 Statistical test

We have used a two-way, nonparametric statistical test known as the Friedman test [10, 16] to determine whether there is a significant difference among the three EMD methods for EEG signals. Table 16 shows the average Friedman ranking of the methods for the different combinations of mental task classification problems; it shows that the EEMD method ranks best among the three methods for all the mental task classification problems considered.

The performance of the different EMD methods in this work is compared against a control method, i.e., the best performer in the Friedman ranking (which is EEMD). The test statistic z for comparing the \(m^{th}\) method with the \(n^{th}\) method is given as

$$\begin{aligned} z=\frac{R_{m}-R_{n}}{\sqrt{\frac{k(k+1)}{6N}}} \end{aligned}$$
(8)

where \(R_{m}\) and \(R_{n}\) are the average rankings of the methods, and k and N are the number of methods (algorithms) and experiments, respectively. The p value corresponding to z is obtained from the standard normal distribution [10]. However, the p values so obtained are not suitable for multiple comparisons with the control method. Instead, adjusted p values [10] are computed, which take into account the accumulated error and provide the appropriate correction. For this, a set of post hoc procedures is defined and adjusted p values are computed. For pair-wise comparisons, the widely used post hoc methods for obtaining adjusted p values are [10]: the Bonferroni–Dunn, Holm, Hochberg and Hommel procedures. Table 17 shows the adjusted p values obtained with the methods mentioned above. From this table, it is clear that there is a statistically significant difference between EEMD and the other two methods.
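Equation 8 and two of the post hoc adjustments can be sketched as follows. The function names are hypothetical, the two-sided p value is taken from the standard normal distribution, and only the Bonferroni–Dunn and Holm adjustments are shown; the Hochberg and Hommel procedures are omitted. The ranks in the example are made up and are not the values of Table 16.

```python
import math

def z_statistic(rank_m, rank_n, k, n_experiments):
    """Equation 8: difference of average ranks divided by its standard error."""
    return (rank_m - rank_n) / math.sqrt(k * (k + 1) / (6.0 * n_experiments))

def two_sided_p(z):
    """Two-sided p value of z under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def bonferroni_dunn(p_values):
    """Multiply each p value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]

def holm(p_values):
    """Step-down Holm adjustment; results are returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for position, index in enumerate(order):
        running_max = max(running_max, (m - position) * p_values[index])
        adjusted[index] = min(1.0, running_max)
    return adjusted

# Made-up average ranks for a control method and two competitors.
control_rank, other_ranks = 1.5, [2.1, 2.4]
k, n_experiments = 3, 26
p_values = [two_sided_p(z_statistic(control_rank, r, k, n_experiments))
            for r in other_ranks]
print(bonferroni_dunn(p_values))
print(holm(p_values))
```

Holm's procedure never yields larger adjusted p values than Bonferroni–Dunn, which makes it the less conservative of the two.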

Table 16 Average Rankings of the algorithms
Table 17 Adjusted p-values

5 Conclusion

Classification of electroencephalograph (EEG) signals for designing brain–computer interface systems requires detailed analysis of the signal, i.e., of its intrinsic properties such as its temporal-spectral characteristics. This work has presented a comprehensive study of three variants of the empirical mode decomposition (EMD) algorithm for finding intrinsic characteristics of the EEG signal for the mental task classification (MTC) problem. EMD and its variants are very useful for extracting the intrinsic properties of the EEG signal, as these algorithms can capture its non-stationary and nonlinear properties. After decomposing the signal with the EMD algorithms, eight parameters, based on statistical properties, uncertainty measures and structural relationships of the decomposed signal, were calculated from each segment of the decomposed signal to form the feature vector for classification. To develop the classification model from the extracted features, we used a support vector machine (SVM), and several performance measures were used to validate the model on test sets. Experimental results showed that the ensemble EMD (EEMD) algorithm performs best among the three EMD algorithms as the number of mental tasks grows. Further, a statistical analysis was conducted to investigate whether the three EMD algorithms were statistically different for MTC.

In future work, we will explore more advanced decomposition methods for extracting efficient features from the EEG signal to improve the classification accuracy for the mental task classification problem. Further, to reduce the dimensionality, feature selection approaches can be investigated to improve the time complexity of learning the MTC model. It would also be interesting to examine new sets of parameters associated with the signals, which could help distinguish different mental states more accurately, and to see whether deep learning architectures are suitable for MTC.