Common Methodology for Cardiac and Ocular Artifact Suppression from EEG Recordings by Combining Ensemble Empirical Mode Decomposition with Regression Approach

Electroencephalography (EEG) is a non-invasive way of recording brain activity, making it useful for diagnosing various neurological disorders. However, artifact signals associated with eye blinks or the heart spread across the scalp, contaminating EEG recordings and making EEG data analysis difficult. To solve this problem, we implement a common methodology to suppress both cardiac and ocular artifact signals by correlating the measured contaminated EEG signals with clean reference electro-oculography (EOG) and electrocardiography (EKG) data and subtracting the scaled EOG and EKG from the contaminated EEG recording. In the proposed methodology, the clean EOG and EKG signals are extracted by subjecting the raw reference time-series data to ensemble empirical mode decomposition to obtain the intrinsic mode functions. Then, an unsupervised technique is used to capture the artifact components. We compare the distortion introduced into the brain signal after artifact suppression using the proposed method with that obtained using conventional regression alone and with a wavelet-based approach. The results show that the proposed method outperforms the other techniques, with the additional advantage of being a common methodology for the suppression of two types of artifact.


Introduction
Electroencephalography (EEG) is an electrophysiological monitoring method for recording the electrical activity of the brain, giving insight into the brain and its abnormalities. EEG signals are recorded from the scalp either with electrodes in the standard 10-20 arrangement or an extended 10-20 system (64 electrodes), following a standard devised by the International Federation of Societies of EEG [1]. EEG signals are often contaminated by potentials of non-cerebral origin, such as those from the heart and eyeblinks; these signals pose a serious problem in analyzing the recorded EEG activity [2,3]. A number of techniques, such as principal component analysis (PCA) [4], independent component analysis (ICA) [5,6], wavelet transform [7][8][9][10], and regression [11], have been suggested for eliminating the relatively common ocular artifacts and the somewhat rarer artifacts due to cardiac activity [12]. Artifact suppression techniques based on statistical methods such as ICA have been used for artifact rejection [13,14]. The main and inevitable problem of the ICA methodology is that the independent components attributed to the artifacts may also contain neural activity of interest. Omitting the components corresponding to the artifacts and reconstructing the signal from the remaining components may therefore distort the underlying cerebral activity [15]. Wavelet-based approaches are a relatively simple technique for artifact suppression and can also be used for single-channel EEG data. However, wavelet-based methods require a set of pre-defined basis functions to decompose the signal, and the selection of a particular basis function is governed by the morphological features of the signal to be decomposed. Hence, a common basis function may not be optimal for the suppression of artifacts associated with different sources (eyeblinks and the heart) using a wavelet-based approach.
Recently, some researchers [16][17][18] have used data-driven techniques such as empirical mode decomposition (EMD) and ensemble empirical mode decomposition (EEMD) for ocular artifact suppression; however, these works did not address the development of a single or common methodology for the suppression of artifacts associated with both ocular and cardiac activities.
Regression-based methodology is one of the simplest techniques for removing ocular artifacts, where the measured EOG signal is projected onto the EEG channels. However, this method might produce unexpected distortion in the clean EEG data due to bidirectional contamination [2]. To overcome the bidirectional contamination problem inherent in the regression approach, we propose performing EEMD with an unsupervised technique on the raw reference data to obtain the clean reference data. Here, we propose a common methodology for the suppression of two kinds of biological artifact (EOG and EKG) from the contaminated EEG by using a regression technique between the measured contaminated EEG and the clean EKG and EOG reference data.
The rest of this paper is organized as follows. Section 2 presents a brief introduction to regression and EEMD. Section 3 introduces the experimental protocol as well as an overview of existing approaches for artifact suppression and their comparative evaluation, followed by artifact suppression using the proposed technique. Section 4 presents the results and the metrics used to evaluate and compare the performance of the proposed methodology with other techniques. Finally, a discussion and the conclusion are presented in Sect. 5.

Regression and EEMD
In the conventional regression technique, ocular and cardiac artifacts are suppressed by projecting the measured raw reference signals related to EOG and EKG onto the EEG data. However, the regression technique does not take into account bidirectional contamination: the EOG recording is itself corrupted by cerebral activity, so this cerebral activity is also removed from the EEG recordings when ocular artifacts are suppressed using the conventional regression-based technique. Low-pass filtering has been suggested for reducing the suppression of high-frequency cerebral components from EEG data [19]; however, the criterion for the selection of the cut-off frequency was not discussed.
The EMD technique adaptively decomposes non-stationary signals into a set of intrinsic oscillatory modes, called intrinsic mode functions (IMFs) [20]. These IMFs represent signal components with progressively decreasing frequency content. The EMD algorithm applied to given data $x(t)$ to derive the IMFs is defined as follows:
1. All the local extrema in the data are identified; the maxima and the minima are each connected by a cubic spline to form the upper and lower envelopes, respectively.
2. The mean of the upper and lower envelopes, $m_1(t)$, is obtained and subtracted from the original series to get $h_1(t)$:
$$h_1(t) = x(t) - m_1(t)$$
3. If the difference $h_1(t)$ does not satisfy the IMF conditions, steps (1) and (2) are repeated until the envelopes are symmetric with respect to zero under standard stopping criteria, described elsewhere [20,21].
4. When the desired stopping criterion is achieved, the first IMF, $c_1(t)$, is obtained and the residue is calculated as:
$$r_1(t) = x(t) - c_1(t)$$
5. The residue $r_1(t)$ is then treated as the signal to be further decomposed, and the above steps are repeated to obtain $r_2(t), r_3(t), \ldots, r_m(t)$.
The original signal can be represented as
$$x(t) = \sum_{i=1}^{m} c_i(t) + r_m(t)$$
where $c_i(t)$ is the $i$th-order IMF and $r_m(t)$ is the residue, or last IMF. Although the adaptive EMD method is useful for nonlinear and non-stationary signals, it suffers from mode mixing, which arises due to the intermittency present in the signal. The EEMD method was suggested to prevent mode mixing in IMFs [22]. In EEMD, white noise of finite amplitude, identically distributed across trials, is added to the input signal before the EMD is performed, and the ensemble average of the IMFs is taken over a number of trials. The noise added in each trial tends to cancel when the ensemble average is taken, as there is no correlation between the noise introduced in different trials. In our case, the amplitude of the noise added in EEMD is 0.3 times the standard deviation and the number of ensemble trials is fixed at 100.
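The sifting steps and the ensemble averaging described above can be sketched in Python as follows. This is a minimal illustration, not the implementation used in this work. The function names (`mean_envelope`, `emd`, `eemd`), the fixed number of sifting iterations used as a stopping rule, the pinning of the envelopes to the end samples, and the zero-padding of trials that yield fewer IMFs are all simplifying assumptions. The noise amplitude is assumed to be 0.3 times the standard deviation of the input signal.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def mean_envelope(x):
    """Mean of the cubic-spline upper and lower envelopes (None if too few extrema)."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    if len(maxima) < 2 or len(minima) < 2:
        return None
    t = np.arange(len(x))
    last = len(x) - 1
    # Assumption: pin both envelopes to the end samples to limit spline overshoot.
    up = np.concatenate(([0], maxima, [last]))
    lo = np.concatenate(([0], minima, [last]))
    return 0.5 * (CubicSpline(up, x[up])(t) + CubicSpline(lo, x[lo])(t))


def emd(x, max_imfs=8, n_sift=10):
    """Return (imfs, residue) such that x == imfs.sum(axis=0) + residue."""
    residue = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        if mean_envelope(residue) is None:  # too few extrema left to sift
            break
        h = residue.copy()
        for _ in range(n_sift):  # assumption: fixed-count stopping rule
            m = mean_envelope(h)
            if m is None:
                break
            h = h - m  # h(t) = x(t) - m(t)
        imfs.append(h)
        residue = residue - h  # r(t) = x(t) - c(t)
    return np.array(imfs).reshape(len(imfs), len(residue)), residue


def eemd(x, noise_scale=0.3, n_trials=100, max_imfs=8, seed=0):
    """Ensemble-averaged IMFs over n_trials noise-perturbed copies of x."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    acc = np.zeros((max_imfs, len(x)))
    for _ in range(n_trials):
        noisy = x + noise_scale * x.std() * rng.standard_normal(len(x))
        imfs, _residue = emd(noisy, max_imfs=max_imfs)
        acc[: len(imfs)] += imfs  # trials with fewer IMFs contribute zeros
    return acc / n_trials
```

Calling `eemd(raw_reference)` on a raw EOG or EKG segment returns an array with one ensemble-averaged IMF per row, with the highest-frequency content in the first rows.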

Data Acquisition
A total of four male subjects (28-45 years old) took part in the EEG experiments. Written informed consent for participation was obtained from all the subjects. A 64-channel EEG system (Compumedics Neuroscan Synamps 64 Channel EEG System, Australia) was used for acquiring EEG signals from the brain. The system is equipped with a data acquisition card with 24-bit resolution. EEG recordings were carried out with a sampling rate of 1 kHz and the allowable bandwidth set from 0 to 200 Hz. The EEG data were recorded for 2 min per subject. Separate electrodes (EOG and EKG) were also used to acquire eye blinks and heart activity simultaneously with the EEG. EOG artifacts were observed in most of the EEG recordings during the pattern reversal checkerboard EEG experiments, but EKG artifacts were observed for only one subject. For data analysis, five EEG data segments of 2.5 s contaminated by eyeblinks were taken from every subject, and five EEG data segments of 1 s contaminated by cardiac activity were selected from the one subject who showed EKG artifacts.

Artifact Removal Evaluation
The aim of the artifact suppression algorithm is to selectively suppress artifacts while leaving the signal from the brain largely unaffected. To compare the performance of the artifact suppression techniques, we calculated the change in power spectral density (ΔPSD) before and after performing artifact suppression [23]. PSD is a measure of signal power in the selected frequency domain. In our case, we calculated the change in signal power (ΔPSD) over a contaminated EEG data segment before and after artifact suppression using Welch's method across the frequency bands corresponding to different brain rhythms, i.e., delta (0.5-4 Hz), theta (4-8 Hz), alpha (8-12 Hz), and beta (13-30 Hz); the respective changes in PSD are denoted as $\Delta PSD_{\delta}$, $\Delta PSD_{\theta}$, $\Delta PSD_{\alpha}$, and $\Delta PSD_{\beta}$. These metrics were used for the quantitative evaluation of the distortion introduced in the EEG signals after artifact suppression. Lower values of ΔPSD correspond to lower distortion introduced in the underlying cerebral activity by the algorithm used for artifact suppression.
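A band-wise ΔPSD computation of this kind might look like the sketch below, which uses `scipy.signal.welch`. The exact definition of ΔPSD is not spelled out here, so the sketch assumes it is the absolute difference in integrated band power before and after suppression. The function names, the choice of segment length, and the alpha band edges (taken as 8-12 Hz) are likewise assumptions.

```python
import numpy as np
from scipy.signal import welch

# Band edges in Hz, following the rhythms listed in the text (alpha assumed 8-12 Hz).
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 12), "beta": (13, 30)}


def band_power(x, fs, lo, hi):
    """Integrated Welch PSD of x between lo and hi (Hz)."""
    nperseg = min(len(x), int(2 * fs))  # assumption: segments of up to 2 s
    f, pxx = welch(x, fs=fs, nperseg=nperseg)
    mask = (f >= lo) & (f <= hi)
    return pxx[mask].sum() * (f[1] - f[0])  # rectangle-rule integration


def delta_psd(before, after, fs=1000.0):
    """Absolute change in band power per rhythm (assumed definition of dPSD)."""
    return {
        name: abs(band_power(before, fs, lo, hi) - band_power(after, fs, lo, hi))
        for name, (lo, hi) in BANDS.items()
    }
```

Here `before` would be the contaminated EEG segment and `after` the same segment following artifact suppression, both sampled at `fs` (1 kHz in these recordings).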

Data Analysis
EOG interference was dominant in the frontal lobe electrodes FP2, FP1, and FZ, whereas EKG interference was dominant in the central lobe electrodes CP2, CP1, C1, and C2. Samples belonging to the EOG and EKG recordings and the corresponding contaminated part of the EEG from the frontal FP2 electrode and the central CP2 electrode were extracted from the raw EEG recording for analysis, as shown in Figs. 1 and 2. EEG data sets were converted from the continuous format (.cnt) to ASCII format (.dat) to make use of Python for further analysis [24][25][26]. The applied methodology is summarized in the following steps:
1. EEMD was performed on the raw EOG reference electrode data, as shown in Figs. 3 and 4, to get the IMFs. Similarly, the raw EKG data were also subjected to EEMD to get the IMFs, as shown in Figs. 5 and 6. Then, an unsupervised technique, PCA, was used to capture the principal components for extracting eyeblinks and cardiac activity.
2. The number of principal components retained for the reconstruction of the clean EOG and EKG reference data is based on a threshold value derived from the scree plot [27]. Figures 7 and 8 show the clean EOG and EKG reference data, respectively, after applying the above approach to the raw reference data.
3. After extracting the clean reference data, correlations were calculated between the contaminated EEG and the clean reference data (EOG and EKG), and then the scaled EOG and EKG data were subtracted from the corresponding contaminated EEG recording in order to suppress ocular and cardiac artifacts.
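Steps 1 and 2 (PCA on the IMFs followed by reconstruction from the retained components) can be illustrated with the NumPy sketch below. The details are assumptions rather than the procedure used here: each IMF is treated as one variable, an explained-variance cutoff (`var_threshold`, a hypothetical parameter) stands in for reading the threshold off a scree plot, and the clean reference is taken as the sum of the rank-reduced IMFs.

```python
import numpy as np


def pca_clean_reference(imfs, var_threshold=0.1):
    """Rebuild a reference signal from the dominant principal components of its IMFs.

    imfs: array of shape (n_imfs, n_samples).
    Returns (clean_reference, number_of_components_kept).
    """
    imfs = np.asarray(imfs, dtype=float)
    mean = imfs.mean(axis=1, keepdims=True)
    centered = imfs - mean
    # PCA via singular value decomposition of the centered IMF matrix.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # the values a scree plot would display
    # Assumption: keep components whose explained-variance ratio exceeds the cutoff.
    k = max(1, int(np.sum(explained > var_threshold)))
    low_rank = (u[:, :k] * s[:k]) @ vt[:k]
    clean = (low_rank + mean).sum(axis=0)  # sum the rank-reduced IMFs
    return clean, k
```

In practice `imfs` would be the EEMD output for the raw EOG or EKG segment, and the cutoff would be chosen by inspecting the scree plot for that recording.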
Like eyeblinks, cardiac activity has a source that is physiologically distinct from the brain. The measured EEG is thus a superposition of the true EEG signal and a scaled part of the clean EOG or EKG signal. The contaminated recorded EEG data can be represented as:
$$EEG1_{mes} = EEG1_{true} + k_{EOG} \cdot EOG_{clean} \quad (3)$$
$$EEG2_{mes} = EEG2_{true} + k_{EKG} \cdot EKG_{clean} \quad (4)$$
where $EEG1_{mes}$ is the measured contaminated EEG at the frontal lobe (FP2), $EEG2_{mes}$ is the measured contaminated EEG at the central lobe (CP2), $EEG1_{true}$ and $EEG2_{true}$ are the EEG due to cortical (brain) activity alone at the measurement electrode, $k_{EOG} \cdot EOG_{clean}$ is the EOG artifact propagated from the eye to the recording site, and $k_{EKG} \cdot EKG_{clean}$ is the EKG artifact propagated from the heart to the recording site. The $EEG1_{true}$ and $EEG2_{true}$ signals were estimated from $EEG1_{mes}$ and $EEG2_{mes}$ by removing $k_{EOG} \cdot EOG_{clean}$ and $k_{EKG} \cdot EKG_{clean}$ while retaining the integrity of the $EEG1_{true}$ and $EEG2_{true}$ signals. Extracting $EEG1_{true}$ and $EEG2_{true}$ requires the coefficients $k_{EOG}$ and $k_{EKG}$. To determine them, the covariance between $EEG1_{mes}$ and $EOG_{clean}$ and between $EEG2_{mes}$ and $EKG_{clean}$, and the variances of $EOG_{clean}$ and $EKG_{clean}$, were calculated.
The scale factors $k_{EOG}$ and $k_{EKG}$ were respectively estimated as:
$$k_{EOG} = \frac{\mathrm{cov}(EEG1_{mes}, EOG_{clean})}{\mathrm{var}(EOG_{clean})}$$
$$k_{EKG} = \frac{\mathrm{cov}(EEG2_{mes}, EKG_{clean})}{\mathrm{var}(EKG_{clean})}$$
Finally, the $EEG1_{true}$ and $EEG2_{true}$ signals were estimated using the coefficients $k_{EOG}$ and $k_{EKG}$ in Eqs. (3) and (4).
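The regression step described above, a covariance-over-variance scale factor followed by subtraction of the scaled clean reference, can be sketched as follows. The function name `regress_out` and its argument names are illustrative, not taken from the original analysis code.

```python
import numpy as np


def regress_out(eeg_mes, ref_clean):
    """Estimate k = cov(EEG_mes, REF_clean) / var(REF_clean) and subtract k * REF_clean.

    Returns (estimated_true_eeg, k).
    """
    eeg_mes = np.asarray(eeg_mes, dtype=float)
    ref_clean = np.asarray(ref_clean, dtype=float)
    c = np.cov(eeg_mes, ref_clean)  # 2x2 covariance matrix
    k = c[0, 1] / c[1, 1]           # covariance divided by reference variance
    return eeg_mes - k * ref_clean, k
```

The same function applies to both artifact types: pass the FP2 segment with the clean EOG reference, or the CP2 segment with the clean EKG reference. By construction, the returned estimate has zero sample covariance with the reference.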

Results
Figures 9 and 10 show the clean and contaminated EEG data obtained using the proposed technique. From their visual appearance, it is evident that the proposed methodology worked well for the effective suppression of EOG and EKG artifacts. In order to compare the performance of the proposed method with a conventional wavelet-based technique, we extracted the signal corresponding to eyeblinks and cardiac activity from the contaminated EEG data segment using Biorthogonal 3.3 [28] and Symlet (sym8) [29] wavelets, respectively. In our case, since the measurement bandwidth allowed is 0-200 Hz, decomposition up to the fourth level was selected, as the frequency range of ocular artifacts is 0-10 Hz [30]. For cardiac artifact suppression, decomposition up to the second level was performed. To compare the results obtained using the wavelet-based approach for the suppression of artifacts, we reconstructed clean EEG data from the contaminated EEG data by setting the noisy coefficients of the wavelet-decomposed data to zero. Tables 1 and 2 indicate that significantly lower distortion in the brain signal results when the proposed technique is

Conclusion
We proposed a common methodology for the suppression of artifacts associated with heart activity and eyeblinks from measured EEG data by calculating and subtracting the relative changes in the EEG data with respect to the clean reference EKG and EOG data. The results show that the proposed methodology is able to eliminate EOG and EKG artifacts from the EEG data while causing little distortion in the underlying cerebral activity. Compared to other artifact suppression techniques, the proposed method has several advantages. It can be used for the suppression of artifacts associated with different sources (eyeblinks and the heart) using single-channel contaminated EEG data along with measured reference electrode EOG/EKG data, whereas PCA and ICA methods require multi-channel EEG data. The proposed method overcomes the problem of bidirectional contamination between the EEG and an EOG channel, which is difficult to resolve properly using the conventional regression-based approach. The significance of the proposed method lies in the fact that it is simple and easy to implement, unlike wavelet- and ICA-based artifact suppression approaches, which require careful selection of the optimal mother wavelet function and of the noisy components, respectively.