1 Introduction

Fig. 1 Illustration of the proposed method

Electroencephalography (EEG) signal analysis [1, 2] has been used intensively in many applications, such as the diagnosis of brain-related diseases and the rehabilitation of neuromuscular disorders [3,4,5], emotion recognition [6,7,8], and brain–computer interfaces (BCIs) [9,10,11,12], among many others. BCIs have played an essential role in helping disabled people communicate with the assistive technologies and machines that help them cope with everyday tasks. Deep learning (DL) techniques [13,14,15], especially convolutional neural networks (CNNs) [16, 17], have been used extensively for BCI motor imagery (MI) signal analysis owing to their high classification accuracy and simple construction procedure. One successful approach proposed in the literature is the fusion CNN, which enhances classification accuracy by concatenating (i.e., fusing) the outputs of different CNNs. The following paragraph reviews some of the recent work on fusion CNNs.

Yang et al. [18] proposed a combination of a long short-term memory (LSTM) network and a spatial CNN, where the spatial and temporal features are learned simultaneously from the EEG signals; a discrete wavelet transform was used to represent the EEG signals. Amin et al. [19] showed the benefits of fusing multilevel convolutional features from various CNN layers. Their model learns spectral and temporal features from the EEG data and outperforms models where no fusion is used, achieving 74.5% accuracy on the BCI Competition IV-2a dataset. In another work, Amin et al. [20] proposed two improved fusion CNN methods, MCNN and CCNN, that fuse different characteristics and architectures. Their methods capture spatial and temporal features from raw EEG data using different convolutional features. The MCNN and CCNN methods achieved 75.7% and 73.8%, respectively, on the BCI Competition IV-2a dataset, and 95.4% and 93.2%, respectively, on the High Gamma dataset. Roots et al. [21] proposed a multibranch CNN, namely EEGNet_fusion, that uses different hyperparameter values for each branch; it achieved an accuracy of 84.1% for two classes on the Physionet dataset. Their model achieved higher accuracy than EEGNet [22], ShallowConvNet, and DeepConvNet [23]; however, it incurs roughly four times the computational cost. Li et al. [24] proposed a multiscale fusion CNN based on an attention mechanism, where spatiotemporal features from different brain areas are extracted and fed to a fusion network to increase classification accuracy. The added attention mechanism improved the sensitivity of the network, and the method achieved 79.9% on the BCI Competition IV-2a dataset. Li et al. [25] proposed a feature fusion algorithm that combines CNN and LSTM networks connected in parallel, with the spatial features extracted by the CNN and the temporal features by the LSTM; it achieved 87.68% accuracy on the BCI Competition IV-2a dataset. Musallam et al. [26] proposed TCNet-Fusion, a fixed hyperparameter-based CNN model. In addition to fusion layers, this model uses techniques such as separable convolution, depth-wise convolution, and temporal convolutional networks (TCNs). The EEG signal undergoes two successive 1D convolutions, along the time domain and channel-wise, and the resulting image-like representation is fed to the main TCN. Their model achieved 83.73% and 94.41% on the BCI Competition IV-2a and High Gamma datasets, respectively.
The rest of this paper is organized as follows: Sect. 2 describes the proposed method. Section 3 presents the second release of the Coleeg software and describes the datasets used. The evaluation setup and the experimental results are presented in Sect. 4. Finally, the paper is concluded in Sect. 5.

2 The proposed method

The proposed method is named multifrequency band fusion (MFBF). Its main idea is to divide the signal spectrum into multiple frequency bands and feed each band into a duplicate of a selected model; the outputs of all the model duplicates are then concatenated to produce the required classification. The proposed method, illustrated in Fig. 1, has the flexibility of choosing any neural network model and any number of frequency bands. This paper uses the Coleeg software [27] for the evaluation process, and the CNN1D model was selected as the basis for the CNN1D_MFBF model. A block diagram describing the CNN1D model is shown in Fig. 2. The spectrum of the signal is divided into the following frequency bands: 0.5–8.0 Hz, 8.0–13.0 Hz, and 13.0–40.0 Hz. These bands were chosen to coincide with the distinctive brainwave frequency ranges so that the signal properties within each band are homogeneous. The band 0.5–8.0 Hz covers the delta (\(\delta \)) and theta (\(\theta \)) waves combined, the band 8.0–13.0 Hz contains the alpha (\(\alpha \)) rhythm, and the band 13.0–40.0 Hz covers the beta (\(\beta \)) wave and the lower part of the gamma (\(\gamma \)) wave. The function raw.filter() provided by the mne library [28] was used to filter the signal with its default parameters, i.e., a firwin filter design with a Hamming window. With the firwin option, a linear-phase finite impulse response (FIR) filter is used for filtering.
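
To make the construction concrete, the following is a minimal sketch of the band splitting and branch duplication using mne and Keras. It is not the Coleeg implementation: the helper names (split_into_bands, build_mfbf, base_model_fn) and the assumption that each branch outputs a flat feature vector are illustrative.

```python
import mne
from tensorflow.keras import layers, models

# Frequency bands matching the brainwave ranges described above.
BANDS = [(0.5, 8.0), (8.0, 13.0), (13.0, 40.0)]

def split_into_bands(raw):
    """Band-pass filter a copy of an mne.io.Raw object for each band.

    raw.filter() defaults to a linear-phase firwin FIR design with a
    Hamming window, as noted in the text.
    """
    return [raw.copy().filter(l_freq=lo, h_freq=hi).get_data()
            for lo, hi in BANDS]

def build_mfbf(base_model_fn, input_shape, num_classes, num_bands=len(BANDS)):
    """Duplicate a base CNN per band and fuse the branches by concatenation.

    base_model_fn(input_shape) is assumed to return an uncompiled Keras
    model mapping one band's input to a flat feature vector (e.g., the
    CNN1D feature extractor without its classification layer).
    """
    inputs, branches = [], []
    for _ in range(num_bands):
        inp = layers.Input(shape=input_shape)
        inputs.append(inp)
        branches.append(base_model_fn(input_shape)(inp))  # fresh duplicate, independent weights
    fused = layers.Concatenate()(branches)                # the fusion step
    out = layers.Dense(num_classes, activation="softmax")(fused)
    return models.Model(inputs=inputs, outputs=out)
```

When training such a model, the per-band arrays are passed as a list of inputs, e.g., model.fit([x_delta_theta, x_alpha, x_beta_gamma], y).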

Fig. 2 A detailed block diagram for the CNN1D model

Fig. 3 Mean accuracies for the Physionet dataset

3 Introduction to the second release of Coleeg software

Coleeg [27, 29] is an open-source initiative for EEG MI signal classification. Its goal is to handle different neural network models and datasets in a unified and simplified manner. Coleeg runs on Google\(^{TM}\) Colaboratory (Colab), so there is no need to install any software on the local computer. Coleeg consists of an application file (Coleeg2_App.ipynb) and a Python module file (coleeg.py); the application file calls the utility functions in the module file, which facilitates the evaluation process. In the second release of Coleeg, two models were added: CNN1D_MFBF and EEGNet_fusion [21]. A new dataset from the Hungarian Academy of Sciences Research Centre for Natural Sciences (MTA-TTK) was also added to the two datasets Coleeg already supported, namely Physionet [30] and BCI Competition IV-2a [31]. New utility functions were also added: (1) divide_time divides the time of a trial event into multiple portions, which can be used for data augmentation; (2) evaluate_model performs model evaluation, which in the first release was done in the application file; (3) get_data_ttk loads the data from the MTA-TTK dataset; (4) plot_results plots the evaluation accuracy versus the training epochs; and (5) save_results saves the evaluation results to Google Drive. A hypothetical usage sketch is given below.
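
For orientation, a session with these functions might look as follows. The function names come from coleeg.py as listed above, but the argument lists and return values shown here are assumptions; the actual signatures are defined in the module.

```python
# Hypothetical Coleeg 2 workflow (run inside Coleeg2_App.ipynb on Colab).
# Function names are from coleeg.py; arguments and returns are assumed.
from coleeg import get_data_ttk, evaluate_model, plot_results, save_results

x, y = get_data_ttk()             # load the MTA-TTK dataset
results = evaluate_model(x, y)    # cross-subject evaluation of a model
plot_results(results)             # accuracy versus training epochs
save_results(results)             # persist the results to Google Drive
```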

The first dataset is Physionet, which has 109 subjects, 64 EEG channels, and a 160 Hz sampling frequency. Five classes were considered: rest and the imagined opening and closing of the left fist, right fist, both fists, and both feet. Subjects 88, 89, 92, 100, and 104 were excluded due to data annotation errors. The second dataset is BCI Competition IV-2a, which has 9 subjects (18 recording sessions), 22 EEG channels, and a 250 Hz sampling frequency. Four classes were considered: the imagined movements of the left hand, right hand, both feet, and tongue. The third dataset is MTA-TTK, with 25 subjects, 63 EEG channels, and a 500 Hz sampling frequency. Five classes were considered: rest and the imagined movements of the left hand, right hand, left leg, and right leg. No filtering was applied to the original raw signals; however, in this paper a high-pass filter with a 0.5 Hz cutoff is applied to remove the DC component from the signal and improve classification accuracy.
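
The DC removal can be done with the same mne filtering call used for the band splitting; a minimal sketch, with a dummy Raw object standing in for one MTA-TTK recording:

```python
import numpy as np
import mne

# Dummy recording standing in for one MTA-TTK file (63 channels @ 500 Hz).
info = mne.create_info(ch_names=63, sfreq=500.0, ch_types="eeg")
raw = mne.io.RawArray(np.random.randn(63, 5000), info)

# High-pass at 0.5 Hz (h_freq=None disables the low-pass side),
# removing the DC offset from every channel.
raw = raw.filter(l_freq=0.5, h_freq=None)
```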

Fig. 4 Mean accuracies for the BCI Competition IV-2a dataset

Fig. 5 Mean accuracies for the MTA-TTK dataset

Table 1 Evaluation time for the considered models and datasets
Table 2 The average of the values in Figs. 3, 4, and 5 from epoch 200 to 300

4 Evaluation setup and experimental results

The signals from the three datasets were resampled at 100 Hz, and the loaded data were balanced so that each subject had the same number of trial events for each class. The data were normalized to zero mean and unit standard deviation; consequently, no batch normalization stage is used in the CNN1D and CNN1D_MFBF models. The evaluation was carried out under two scenarios. In the first, no multiband filtering is used; this scenario was applied to the CNN1D and EEGNet_fusion models. In the second, the signal is filtered into the frequency bands 0.5–8.0 Hz, 8.0–13.0 Hz, and 13.0–40.0 Hz; this scenario was applied to the CNN1D and CNN1D_MFBF models. It should be noted that the CNN1D model can accept either a single frequency band or multiple frequency bands at its input. Cross-subject validation was carried out for each dataset by dividing the subjects into five groups and evaluating each group for 500 epochs. Figures 3, 4, and 5 show the mean accuracy versus epochs for both scenarios for the Physionet, BCI Competition IV-2a, and MTA-TTK datasets, respectively.
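
A minimal sketch of this setup is shown below, assuming trial data of shape (trials, channels, samples) with a subject label per trial; the per-channel normalization axis is our assumption.

```python
import numpy as np
from scipy.signal import resample_poly
from sklearn.model_selection import GroupKFold

def preprocess(x, fs_orig, fs_new=100):
    """Resample along time and normalize to zero mean, unit std."""
    x = resample_poly(x, fs_new, fs_orig, axis=-1)
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / std

# Dummy data: 100 trials, 64 channels, 3 s at 160 Hz (Physionet-like).
rng = np.random.default_rng(0)
x = preprocess(rng.standard_normal((100, 64, 480)), fs_orig=160)
y = rng.integers(0, 5, size=100)          # class labels
subjects = rng.integers(0, 10, size=100)  # subject ID per trial

# Five-group cross-subject validation: a subject never appears in both sides.
for train_idx, test_idx in GroupKFold(n_splits=5).split(x, y, groups=subjects):
    pass  # train for 500 epochs on x[train_idx], evaluate on x[test_idx]
```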

The evaluation time is shown in Table 1. Figures 3, 4, and 5 show that, compared to the EEGNet_fusion model, the CNN1D_MFBF model gives comparable accuracy on the BCI Competition IV-2a dataset and better accuracy on the Physionet and MTA-TTK datasets. Its significant advantage becomes apparent when the evaluation time is considered: the CNN1D_MFBF model takes less than one-fifth of the time required by the EEGNet_fusion model. The experimental results also show that feeding multiple bands to the CNN1D model increases the accuracy, even though no fusion is used.

To get a numerical insight into the accuracy, Table 2 reports the average of the values in Figs. 3, 4, and 5 from epoch 200 to 300, where the curves have settled around a constant value. Table 2 shows that EEGNet_fusion exceeds the CNN1D_MFBF model by only 0.7% on the BCI Competition IV-2a dataset, while it falls behind by 1.7% and 2.5% on the Physionet and MTA-TTK datasets, respectively.
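
The Table 2 statistic is simply the mean of each accuracy curve over that epoch window; a short sketch, assuming acc holds the per-epoch mean accuracy of one curve:

```python
import numpy as np

acc = np.random.rand(500)        # placeholder for one curve from Figs. 3, 4, or 5
settled = acc[200:301].mean()    # average over epochs 200 to 300 inclusive
print(f"settled accuracy: {settled:.1%}")
```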

It is worth mentioning that the accuracy results are affected by the choices made in the evaluation process. Several of the choices made in this paper, such as cross-subject rather than within-subject validation, data balancing, the use of unprocessed raw signals, and a higher number of classes, generally lead to lower accuracy values.

5 Conclusion

In this paper, a novel fusion method for CNNs was proposed. The method, named multifrequency band fusion (MFBF), divides the input signal spectrum into multiple frequency bands and feeds each band into a duplicate of a CNN model; the outputs of the model duplicates are concatenated to form the fusion model. The method shows comparable or better results than the EEGNet_fusion model, with less than one-fifth of the training time. The evaluation was conducted using the second release of the Coleeg software, an open-source initiative for dealing with different models and datasets for EEG MI signal classification in a unified and simplified manner.