
1 Introduction

Recognition of human emotional states in real time plays an important role in machine emotional intelligence and human-machine interaction. The ability to identify a person's emotional state from relatively easily acquired scalp electroencephalographic (EEG) data could be of clinical importance for anger management, for reducing depression, anxiety or stress, and for relating to persons with communication disabilities. Human emotion refers to a complex psychological state comprising three components: the subjective experience, the physiological response, and the behavioral or expressive reaction [1]. Categories of emotion include disgust, pride, satisfaction, anger, happiness, etc. [2]. Various studies have investigated how EEG signals correlate with human emotions, and the Emotiv headset has also been used for this purpose. Pham et al. used Emotiv to record EEG data while subjects watched movies that induced different emotions; oscillatory brain rhythms filtered into different frequency bands from the recorded signals were then used as input to several machine learning classifiers [3].

Currently, most real-time emotion recognition systems use stimuli with low ecological validity, for example pictures, images or sounds, to invoke and recognize emotions. In a few studies, short video clips are shown to the subjects to elicit emotions. Furthermore, most emotion studies have used a single method to elicit emotions [4,5,6,7]. Table 1 lists some studies along with the stimuli used for emotion elicitation, the type of emotion, and the apparatus used to capture and analyze the data.

Table 1. List of studies for emotion recognition

Most BCI applications and studies are conducted using medical grade EEG sensors with a large number of electrodes [8, 9]. To enhance the portability and flexibility of these systems, various low-cost commercial EEG devices are available. Although comparative studies have reported poorer fidelity compared with medical grade EEG recording systems such as ANT and Neuroscan, these devices can still provide valuable EEG information.

To overcome the aforementioned issues and limitations, this paper proposes a multimodal emotion elicitation paradigm to investigate whether the same signatures of emotions exist when they are induced by different methods of elicitation. The study records and analyzes EEG data of human subjects for differences between target positive, negative and neutral emotions elicited using different kinds of stimuli. Both medical and commercial grade EEG sensors will be used to analyze whether low-cost devices can help in understanding the brain patterns associated with emotional states. In the present study, along with the neutral brain state, we explore the EEG correlates of joy and fear as positive and negative emotions respectively.

For EEG recording, the following two sensors will be used:

  i. BrainAmp (medical grade)

  ii. Emotiv EPOC (low-cost commercial grade)

To address the issue of low ecological validity of stimuli in emotion recognition, this research conducts experiments based on three different ways of presenting stimuli. Each method is explained below.

1.1 Memory Recall/Emotional Imagery

As mentioned earlier, most EEG based emotion recognition studies have used external stimulus presentations, such as images, sounds and video clips, to evoke emotions in the subjects. In the memory recall/emotional imagery method, we deal primarily with inwardly visualized, imagined or felt emotions evoked by the subject's own imagination or recall of emotionally loaded memories. The participants are asked to immerse themselves in prolonged, self-paced recall or imagination of the emotion, usually with closed eyes.

In this work we attempt to identify the brain dynamic correlates of inwardly imagined emotional feelings by asking healthy male and female participants to recall or imagine scenarios in which they have felt, or would be likely to feel, the series of emotions under consideration.

1.2 Virtual Reality

During the last two decades, virtual reality (VR) has proved its significance for mainstream psychological research [14, 15], and a large number of studies have used VR sets in emotion recognition systems. This advanced technology, with its unique ability to simulate complicated yet realistic scenarios and contexts, gives researchers unprecedented options for investigating human emotions under well-controlled laboratory designs.

Being virtual by nature while simulating reality as closely as possible, VR relies mainly on a suitable choice of specific perceptual cues to evoke emotions. VR systems provide safe, motivating and controlled scenarios that help improve BCI learning. Moreover, VR not only enables better BCI experimental designs but also makes the analysis of the underlying brain and neural processes comparatively convenient [16, 17]. In this research work, virtual reality based simulations and videos will be shown to the subjects to elicit emotions.

1.3 Audio-Visual Film Clips

In this scenario, the participants will view short audio-visual film clips to elicit each category of emotion mentioned above.

2 Research Objectives

The objectives of this work are:

  1. To quantify the accuracy with which the three emotional states (positive, negative and neutral) can be differentiated from each other by classification of brain signals.

  2. To examine whether specific patterns of brain activity exist for these emotional states.

  3. To find out, by introducing different ways to elicit emotions, whether the same signatures of these feelings appear.

  4. To assess whether these patterns are to some extent common across individuals.

  5. To identify which oscillatory processes in the brain are mainly modulated by such feelings, and in what way.

3 Experimental Setup

To achieve these objectives, the proposed study comprises multiple phases that are completed systematically. First of all, the participants are briefed about the purpose of the study and informed that their brain activity (EEG) will be recorded while they experience three different emotional states: joy and fear as the positive and negative emotions respectively, along with a neutral state. Each participant is requested to fill in a questionnaire. The first part of the questionnaire asks them to write down any two events from their life or imagination that could induce feelings of joy and fear. Participants are free to choose any scenario and may write as little or as much detail as they like.

The second part of the questionnaire asks whether the participant feels scared while watching horror movies and to describe those experiences. The third part asks the participant to mention any phobias experienced in daily life, e.g. fear of heights or of reptiles. The last part asks the participants for their approval to be shown horror movie clips and videos related to their phobias.

Once the questionnaire is filled in, only those participants who have given their consent for a sufficiently large number of items eliciting negative emotions are shortlisted for the experiments. The EEG brain activity of the participants will be recorded during experiments based on three different stimuli on three different days: the BrainCap will be used for EEG recording on the first two days and the Emotiv EPOC on the third and last day. The experiments are based on the following three stimuli:

3.1 Memory Recall/Emotional Imagery

The first method of inducing emotions is the recall of a joyful, frightening or neutral event or imagined scenario from memory.

In the first phase of this research, experiments are performed with the commercial grade Emotiv EPOC headset. The experiments are conducted at Bahria University, Karachi, Pakistan, with eight subjects participating in the study. The recorded EEG data for self-induced emotions based on memory recall is analyzed for classification performance. A total of eight emotional imagery sessions were conducted with each participant, four for fear-evoking memories and four for the neutral state. Complete details of the experiment are given in our earlier work [18]. A block diagram of the experiment is shown in Fig. 1.

Fig. 1. Block diagram of the experiment [18]

The subject, seated on a chair, is asked to close the eyes and recall the joyful event mentioned in the questionnaire. The researcher signals the participant to start and stop the activity. Each session lasts 15 s. After the session, the participant is given a break of 30 s and asked to report the level of arousal (on a scale from 0 to 5) on the feedback form. Sessions for the fearful and neutral states are conducted in the same manner.

The whole activity is repeated four times on three different days.
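As an illustration of the timing of this protocol, the following minimal sketch scripts the cue timing and arousal self-reports; the console prompts, function names and data layout are our own assumptions rather than the experimental software actually used.

```python
import time

SESSION_S = 15   # recall period per session (Sect. 3.1)
BREAK_S = 30     # rest between sessions
CONDITIONS = ["joy", "fear", "neutral"]
SESSIONS_PER_CONDITION = 4

def run_block(condition):
    """Run all sessions for one condition and collect arousal self-reports."""
    ratings = []
    for i in range(SESSIONS_PER_CONDITION):
        input(f"[{condition} #{i + 1}] Press Enter, then close eyes and recall...")
        time.sleep(SESSION_S)                           # 15 s recall period
        print("Stop. Please relax.")
        ratings.append(int(input("Arousal (0-5): ")))   # feedback form entry
        time.sleep(BREAK_S)                             # 30 s break
    return ratings

if __name__ == "__main__":
    all_ratings = {c: run_block(c) for c in CONDITIONS}
    print(all_ratings)
```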

3.2 Viewing of Immersive Videos on VR Set

The second method for eliciting emotions is the viewing of immersive videos on a VR set. Each video is 1 to 5 min long. After each session, the participants are given a break of 40 s, asked to report the level of arousal (on a scale from 0 to 5) on the feedback form, and asked to specify the moments in the video when they specifically felt fear or joy. A total of ten videos will be shown to the subjects on each day.

The videos were selected in two steps. First, three students who are frequent movie viewers were selected from Bahria University and requested to propose movies and videos that could provide representative clips (at least two) for each emotion. In the next step, the recommended videos were shown to another group of eight undergraduate students, who were asked to rate the level of arousal and the category of emotion elicited while watching each one. Some interesting results were obtained during this activity. Immersive videos of a roller coaster ride or skydiving were categorized as joyful by three students, while the rest placed these videos in the fear category because they have a phobia of heights. Similarly, a video of a car driving simulation was placed in all three categories of joy, fear and neutral, depending on the personality traits of the viewer.
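As an illustration of how such ratings could be aggregated, the following hedged sketch (the rater data shown is hypothetical, loosely mirroring the observations above) assigns each candidate video the most frequently reported emotion category together with its mean arousal.

```python
from collections import Counter

# Hypothetical ratings: video -> list of (emotion_category, arousal 0-5)
ratings = {
    "roller_coaster": [("joy", 4), ("joy", 5), ("joy", 3), ("fear", 5), ("fear", 4)],
    "car_simulation": [("joy", 2), ("fear", 3), ("neutral", 1)],
}

for video, entries in ratings.items():
    categories = Counter(label for label, _ in entries)
    majority, votes = categories.most_common(1)[0]   # most frequent category
    mean_arousal = sum(a for _, a in entries) / len(entries)
    print(f"{video}: {majority} ({votes}/{len(entries)} raters), "
          f"mean arousal {mean_arousal:.1f}")
```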

3.3 Viewing of Audio-Video Movie Clips

Instead of immersive videos, ordinary audio-visual film clips are shown to the subjects to elicit emotions. Each video is 1 to 5 min long. After each session, the participants are given a break of 40 s, asked to report the level of arousal (on a scale from 0 to 5) on the feedback form, and asked to specify the moments in the video when they felt fear or joy. A total of ten videos will be shown to the subjects on each day.

4 Results and Discussion

As mentioned earlier, the initial experiments, based on the emotional imagery scenario, were performed with the commercial grade Emotiv EPOC headset at Bahria University, Karachi, Pakistan, with eight participating subjects. The recorded EEG data for self-induced emotions based on memory recall is analyzed for classification performance; each participant completed eight emotional imagery sessions, four for fear-evoking memories and four for the neutral state.

The EEG data is segmented with a 1 s time window and band-pass filtered in the frequency range of 1–100 Hz. The Common Spatial Pattern (CSP) algorithm with two pairs of spatial filters is then applied, and the resulting band-power features are fed to a Linear Discriminant Analysis (LDA) classifier. The classification accuracies are reported in Table 2; a maximum accuracy of 96.3% and a mean accuracy of 66.58% are achieved.

Table 2. Classification accuracies in the spectral band of 1–100 Hz for each subject
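One way to realize the band-pass + CSP + LDA pipeline described above is sketched below using MNE-Python and scikit-learn. It is a minimal illustration under our own assumptions (epochs already segmented into an array X of shape (n_trials, n_channels, n_samples) with labels y, and a 128 Hz sampling rate), not the exact code used in the study.

```python
import numpy as np
from mne.decoding import CSP                      # CSP spatial filtering
from mne.filter import filter_data
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

SFREQ = 128.0  # assumed Emotiv EPOC output sampling rate

def band_accuracy(X, y, l_freq, h_freq, n_folds=5):
    """Band-pass filter epochs, then cross-validate a CSP+LDA pipeline.

    X : ndarray (n_trials, n_channels, n_samples); y : labels (fear/neutral).
    """
    Xf = filter_data(X.astype(np.float64), SFREQ, l_freq, h_freq, verbose=False)
    clf = make_pipeline(
        CSP(n_components=4, log=True),            # two pairs of spatial filters
        LinearDiscriminantAnalysis(),
    )
    return cross_val_score(clf, Xf, y, cv=n_folds).mean()

# Example (X, y loaded elsewhere): accuracy in the low gamma band
# acc = band_accuracy(X, y, l_freq=31.0, h_freq=50.0)
```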

To explore the most relevant spectral band for emotion recognition in the case of emotional imagery, we band-pass filtered the data into the frequency bands in which the neurophysiological signals reside: the delta band (1–3 Hz), theta band (4–7 Hz), alpha band (8–13 Hz), beta band (14–30 Hz), low gamma band (31–50 Hz), high gamma band (30–100 Hz) and the whole band of 1–100 Hz. This yields seven filtered datasets for each of the eight subjects, and each dataset is separately fed to the LDA classifier to evaluate classification performance. Figure 2 compares, for each subject, the accuracy achieved in the whole band of 1–100 Hz with the highest accuracy achieved in any of the other spectral bands. The graph shows that seven of the eight subjects achieve higher accuracy in another band, while for SUB-7 the two accuracies are the same.
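Reusing the hypothetical band_accuracy helper above, the seven-band comparison could then be scripted as follows; the band edges are copied from the text, and per_subject is an assumed container holding each subject's epochs and labels.

```python
# Band edges (Hz) as listed in the text; the upper edge must stay below
# the Nyquist frequency of the recording being analyzed.
BANDS = {
    "delta": (1, 3), "theta": (4, 7), "alpha": (8, 13), "beta": (14, 30),
    "low_gamma": (31, 50), "high_gamma": (30, 100), "whole": (1, 100),
}

# per_subject = {subject_id: (X, y), ...}  # hypothetical, loaded elsewhere
# for name, (lo, hi) in BANDS.items():
#     accs = [band_accuracy(X, y, lo, hi) for X, y in per_subject.values()]
#     print(f"{name:10s} mean accuracy: {sum(accs) / len(accs):.3f}")
```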

Fig. 2. Comparison between the classification accuracy achieved in the whole band of 1–100 Hz and the highest accuracy achieved by each subject in any other spectral band

Next, we attempted to find the most relevant frequency band for classification. Figure 3 compares the mean classification accuracies achieved in each of the considered spectral bands: delta, theta, alpha, beta, low and high gamma, and the whole band. The findings reveal that the gamma band achieves the highest classification performance, which is consistent with previous emotion recognition studies. Li and Lu [19], who used images as stimuli for invoking emotions, concluded that the high frequency bands of the EEG are highly relevant for classification. Dan Nie et al. also found that the higher frequency bands contribute more significantly to the emotional response than the lower frequency ranges [20]. In our case, the high gamma band yields an accuracy of 70%, while low gamma reaches almost 68%.

Fig. 3. Mean accuracies achieved in each of the seven spectral bands considered

5 Conclusion

In this work, we have proposed a multimodal emotion elicitation paradigm for an EEG based emotion recognition system. Most studies use a single way of invoking emotions; to address the resulting problem of low ecological validity, we propose a paradigm with multimodal stimuli for invoking emotions. The outcome of the proposed work is a prototype of an EEG based BCI system for emotion recognition. Initially, EEG data recorded for self-induced emotions based on memory recall is analyzed with the Emotiv EPOC. The initial findings reveal that the gamma band produces the highest accuracy. Furthermore, these findings, obtained with the Emotiv EPOC headset, suggest that commercial grade EEG sensors are promising for emotion recognition in the case of emotional imagery. In the next phase of this study, experiments based on watching videos and immersive reality will be conducted.