Wink based facial expression classification using machine learning approach

Abstract

Facial expressions can provide a communication channel between physically disabled people and assistive devices. Several facial expressions, including eye wink, smile, eye blink, looking up and looking down, can be detected from brain signals. In this study, the possibility of controlling assistive devices using an individual’s wink has been investigated. Brain signals from five subjects have been captured to recognize the left wink, right wink, and no wink. The signals have been captured using the Emotiv Insight, a five-channel EEG headset. The fast Fourier transform and the sample range have been computed as features. The extracted features have been classified with three machine learning algorithms: support vector machine (SVM), linear discriminant analysis (LDA) and K-nearest neighbor (K-NN). Classifier performance has been evaluated in terms of accuracy, confusion matrix, true positive and false negative rates, and the area under the curve (AUC) of the receiver operating characteristic (ROC). With the sample range feature, the highest training and testing accuracies are 98.9% and 96.7% respectively, both achieved by SVM and K-NN. The results indicate that a person’s wink can be utilized in controlling assistive devices.

Introduction

Many neurological conditions, including brainstem stroke, spinal cord injury, multiple sclerosis, cerebral palsy, muscular dystrophies and amyotrophic lateral sclerosis (ALS), may impair a person’s regular communication pathways [1]. Individuals who are severely affected by these disorders may partially or completely lose voluntary muscle control. In such situations, they are unable to interact with their surroundings through conventional forms of communication. To address this issue, researchers are developing a variety of assistive technologies. Among these, the brain-computer interface (BCI) is being widely investigated.

In a BCI system, specific patterns of brain activity are translated into control commands to operate particular devices [2]. Mind-controlled wheelchairs [3], home appliances [4], prosthetic arm control [5], spelling systems [6], emotion detection systems [7] and biometrics [8] are popular BCI applications [9]. BCI applications have recently widened from medical to non-medical fields, for example BCI-based games and virtual reality [9]. Both non-invasive and invasive recording modalities contribute to neuroscience research as well as to brain-computer interfacing [10]. Two invasive strategies are employed in BCI research: electrocorticography (ECoG) and intracortical neuron recording. The widely used non-invasive modalities are electroencephalography (EEG), magnetoencephalography (MEG), near-infrared spectroscopy (NIRS) and functional magnetic resonance imaging (fMRI). The majority of BCI studies have used EEG recordings. EEG has several advantages, including low-cost recording devices, ease of mobility and non-invasive data acquisition. However, the signal-to-noise ratio (SNR) of EEG is not always satisfactory. Moreover, the algorithm used for EEG analysis can in certain cases lessen the classification accuracy as well as the data transfer rate.

The brain activity produced by changes in facial expression can be used either on its own or combined with other EEG signals for BCI applications. Raheel et al. [11] classified five facial expressions from brain activity, among them smile, wink, looking up and looking down. The brain activity was captured with a 14-channel Emotiv EPOC EEG headset. The authors extracted thirteen statistical features and classified them using K-NN, Naive Bayes, SVM and a multi-layer perceptron. The highest classification accuracy, 81.60%, was achieved with K-NN, which is relatively low. Ma et al. [12] proposed a hybrid approach consisting of electrooculogram (EOG) and event-related potential (ERP) signals for robot control. Wink, eye blink, frown and gaze were detected from the EOG signals, whereas P300, N170 and VPP components were evoked as ERPs. Double and triple blinks were used to stop the robot and to move it forward respectively, whereas looking left and looking right were used to move the robot to the left and right respectively. Finally, the frown was used to stop the robot and enter ERP mode. Reyes et al. [13] investigated the possibility of using brain activity generated by facial gestures and eye movements to control assistive devices.

In this study, the brain activity from three facial expression classes, namely left wink, right wink, and no wink, has been classified. The brain waves associated with the winks have been captured using the five-channel Emotiv Insight EEG headset, which has also been employed in other studies [14,15,16,17,18,19]. The remainder of this paper is organized as follows: Sect. 2 describes the methodology, Sect. 3 presents the results and discussion, and Sect. 4 concludes the paper.

Methodology

The human brain is the control unit of the whole body. Hence, a change in facial expression is reflected in the recorded brain signal. The brain waves due to the left wink, right wink, and no wink have been analyzed in the current study. They have been captured from five different subjects using the Emotiv Insight. Two different features have been extracted from the collected dataset and classified using machine learning techniques. The complete methodology of the proposed approach is discussed in this section.

Experimental design for data acquisition

The data collection experiment has been organized into seven steps, shown in Fig. 1. The first step was to select the subjects. In this study, brain activity has been recorded from five subjects, three male and two female, aged 21–28 years. All the subjects are students who are familiar with BCI technology. They were asked to sleep for 6 h on the night before data collection and to refrain from coffee, alcohol and smoking during the 12 h before the session. All subjects signed a consent form before data collection.

Fig. 1 Complete steps of experimental design

The full experiment was conducted at the Applied Electronics and Communication Engineering (AppECE) lab, Faculty of Electrical and Electronic Engineering, Universiti Malaysia Pahang, Malaysia. The temperature of the room was 16–20 ℃ and the background noise level was 25–30 dB. The data have been collected on two different days. On each day, four trials of each class (left wink, right wink, and no wink) have been recorded from each of the five subjects. All the subjects were informed of their 30-min time slots before the experiment day. Table 1 gives the details of subjects, dates, time slots and trials.

Table 1 Experiment details of data acquisition

The total number of observations from the five subjects is 120 (5 subjects × 2 days × 4 trials × 3 classes), of which 90 have been used for training and the remaining 30 for testing the model. Hence, the training and testing portions are 75% and 25% respectively. A five-channel Emotiv Insight EEG headset has been used to capture brain activity. Its five electrodes, AF3, AF4, T7, T8, and Pz, are placed according to the 10–20 electrode positioning system. This device is capable of capturing EEG data containing mental commands, attention, stress, meditation, and focus. Moreover, some facial expressions including blink, wink, surprise, clench, and smile can also be detected accurately [20]. The sampling rate of this device is 128 Hz. It transfers data wirelessly to a computer or other embedded system, which gives its users freedom of movement. Figure 2 shows the experimental set-up of the Emotiv Insight. The Emotiv Pro software has been used to record and store the brain activity. This software can display real-time raw EEG, data packet acquisition and loss, motion data, performance metrics (0.1 Hz) and contact quality [21]. Moreover, it displays the FFT and band power graphs in real time. The Emotiv Insight was first paired with the computer through Bluetooth and then connected through the Emotiv Pro software. The contact between the skin and the electrodes can be improved using conductive gel.

Fig. 2 Data acquisition device and software set-up

A PowerPoint slide sequence has been prepared and the data have been captured according to it. Each sequence lasted 10 s, as shown in Fig. 3. The first 4 s showed a blank white screen to let the mental activity stabilize. A blue screen was displayed from 4 to 5 s. Finally, from 5 to 10 s, the slide showed the cue: left wink, right wink, or no wink. During this interval, the subjects were asked to wink according to the cue. Hence, the duration of each trial was 5 s (5 to 10 s). Before the final dataset was collected, all the subjects practiced with the slide sequence. Figure 4 shows the raw data of trials for the left wink, right wink, and no wink.

Fig. 3 Data acquisition protocol

Fig. 4 Raw data of five channels for left wink, right wink and no wink

Data analysis framework

Once collected, the data have been analyzed as follows. Figure 5 shows the complete flow chart of the data analysis. The raw data have been stored as .csv files, each containing 10 s of data. In the pre-processing phase, the first 5 s of each file have been removed and the data of the five electrodes have been separated. After pre-processing, each trial contains 640 samples per channel (5 s at 128 Hz). Two features, the fast Fourier transform (FFT) and the sample range, have then been extracted. Finally, the extracted features have been fed to three machine learning algorithms, namely LDA, SVM, and K-NN.
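
The pre-processing step can be sketched as follows. This is a minimal Python illustration, not the MATLAB code used in the study. The row layout (one value per channel per sample, in the order AF3, AF4, T7, T8, Pz) and the function name `split_channels` are assumptions made for the example, and synthetic numbers stand in for a real recording.

```python
SAMPLING_RATE_HZ = 128   # Emotiv Insight sampling rate stated in the text
RECORDING_SECONDS = 10   # each stored file covers one 10 s slide sequence
DISCARD_SECONDS = 5      # blank and colored screens, removed in pre-processing
CHANNELS = ["AF3", "AF4", "T7", "T8", "Pz"]  # assumed column order


def split_channels(rows):
    """Drop the first 5 s and return one list of samples per channel.

    `rows` holds one row per sample, each with one value per channel
    in the order of CHANNELS (a hypothetical layout for this sketch).
    """
    start = DISCARD_SECONDS * SAMPLING_RATE_HZ
    kept = rows[start:]
    return {name: [row[i] for row in kept] for i, name in enumerate(CHANNELS)}


# Synthetic stand-in for one recording: 1280 rows of 5 values each.
rows = [
    [float(t + c) for c in range(len(CHANNELS))]
    for t in range(RECORDING_SECONDS * SAMPLING_RATE_HZ)
]
trial = split_channels(rows)
print(len(trial["AF3"]))  # number of samples kept per channel
```

With a 128 Hz sampling rate, the 5 s that remain correspond to the 640 samples per channel mentioned above.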

Fig. 5 Complete flow chart of data analysis

Fast Fourier transform

Fourier analysis converts a signal from the time domain to the frequency domain. The fast Fourier transform (FFT) is an algorithm that computes the discrete Fourier transform (DFT) of a sequence. It does so by factorizing the DFT matrix into a product of sparse factors, so it produces the same result as a direct evaluation of the DFT but much faster. The N-point DFT of the sequence \( \{ x_{n} ,n = 0,1,2, \ldots ,N - 1\} \) is defined by Eq. (1) [22].

$$ X_{k} = \mathop \sum \limits_{n = 0}^{N - 1} x_{n} e^{{ - \frac{i2\pi kn}{N}}} $$
(1)

where \( X_{k} \) are the DFT coefficients computed by the FFT, N is the total number of input EEG samples, n is the time-domain sample index and k = 0, 1, 2, …, N − 1 is the frequency index. In this study, the mean absolute value of the FFT coefficients has been taken as the feature.
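
As an illustration of this feature, the sketch below evaluates Eq. (1) directly in Python and averages the magnitudes of the coefficients. It is a plain DFT rather than an FFT, which gives the same coefficients but more slowly. Averaging over all N coefficients of one channel is an assumption, since the text does not state which coefficients enter the mean, and the function names are hypothetical.

```python
import cmath
import math


def dft(samples):
    """Direct evaluation of Eq. (1); an FFT yields the same values faster."""
    n_total = len(samples)
    return [
        sum(
            x * cmath.exp(-2j * math.pi * k * n / n_total)
            for n, x in enumerate(samples)
        )
        for k in range(n_total)
    ]


def mean_abs_spectrum(samples):
    """Mean absolute value of the DFT coefficients of one channel."""
    coefficients = dft(samples)
    return sum(abs(c) for c in coefficients) / len(coefficients)


# Tiny synthetic signal: one cycle of a cosine sampled at four points.
feature = mean_abs_spectrum([1.0, 0.0, -1.0, 0.0])
print(round(feature, 6))
```

For the four-point cosine the coefficients are 0, 2, 0 and 2, so the feature equals 1. Applied to one trial, the function would be called once per channel, giving five values per observation.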

Sample range

The second feature employed in this study is the sample range, which is the difference between the maximum and minimum sample of each channel. It is given by Eq. (2):

$$ {\text{Sample range}}\, = \, X_{max} \left( k \right) - X_{min} \left( k \right) $$
(2)

where \( X_{max} \) and \( X_{min} \) are the maximum and minimum samples of the raw data respectively and k denotes the channel (electrode).
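
A minimal Python sketch of Eq. (2) is given below. The dictionary layout of a trial and the sample values are invented for illustration and are not taken from the recorded dataset.

```python
CHANNELS = ["AF3", "AF4", "T7", "T8", "Pz"]


def sample_range(samples):
    """Eq. (2): maximum sample minus minimum sample of one channel."""
    return max(samples) - min(samples)


def range_features(trial):
    """One sample-range value per channel, in the order of CHANNELS."""
    return [sample_range(trial[name]) for name in CHANNELS]


# Hypothetical three-sample trial, for illustration only.
trial = {
    "AF3": [4.0, 9.5, 1.5],
    "AF4": [2.0, 2.0, 2.0],
    "T7": [-3.0, 0.0, 5.0],
    "T8": [10.0, 7.0, 8.0],
    "Pz": [0.5, 0.25, 1.0],
}
print(range_features(trial))
```

Each observation is thus reduced to five numbers, one per electrode, which keeps the feature vector small compared with the 640 raw samples per channel.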

Linear discriminant analysis

LDA seeks linear combinations of the features that best separate the classes. It uses hyperplanes to separate two or more classes. The separating hyperplane is obtained by searching for the projection that maximizes the distance between the class means and minimizes the within-class variance [23, 24]. This technique has a very low computational requirement and is simple to use. LDA has been successfully applied in a variety of BCI systems.

K-nearest neighbor

The K-NN algorithm relies on the principle that the features of different classes form separate clusters in the feature space, and that features close to each other are neighbors. To classify a test feature vector, the classifier computes its distance to the training samples and assigns the class that is most common among the k nearest ones. The number of neighbors and the type of distance metric are the key design factors [24]. In this study, the Euclidean distance metric has been used to build the K-NN model. Different values of K have been tried, and K = 2 provided the best accuracy.
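
The decision rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MATLAB model used in the study: the training points are invented, and a tie between the two neighbors is resolved in favor of the closer one, a rule the text does not specify.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_x, train_y, query, k=2):
    """Majority vote among the k nearest training samples.

    Ties are broken in favor of the label that appears first among the
    nearest neighbors (an assumption made for this sketch).
    """
    ranked = sorted(zip(train_x, train_y), key=lambda p: euclidean(p[0], query))
    nearest = [label for _, label in ranked[:k]]
    votes = Counter(nearest)
    best = max(votes.values())
    for label in nearest:
        if votes[label] == best:
            return label


# Hypothetical 2-D features; labels follow the text (1 = left wink,
# 2 = right wink, 3 = no wink).
train_x = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]]
train_y = [1, 1, 2, 2, 3, 3]
print(knn_predict(train_x, train_y, [5.1, 5.4]))
```

With an even K such as 2, ties can occur whenever the two neighbors disagree, so the tie-breaking rule affects the result more than it would with an odd K.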

Support vector machine

The idea of SVM is to find the hyperplane in the feature space that separates the classes most distinctly. The hyperplane is chosen so that the margin between it and the nearest data points of each class is maximal [25]. Maximizing this margin contributes to better classification accuracy. The kernel function and the regularization parameter C play a crucial role in the structure of an SVM. The widely employed kernel functions are the radial basis function (RBF), linear, polynomial and sigmoid kernels [26]. In this study, the Gaussian RBF kernel has been employed. The C and gamma parameters have been left at the default values of the Classification Learner app.

To validate the trained models, k-fold cross-validation has been employed with k = 5. The complete data analysis has been carried out in the MATLAB environment. To classify the extracted features, we have used the Classification Learner app provided with MATLAB.
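
For readers unfamiliar with the procedure, the Python sketch below shows how five-fold cross-validation partitions the 90 training observations. It illustrates the general technique only and does not reproduce how the MATLAB app forms its folds; the shuffling seed and the function name are assumptions.

```python
import random


def k_fold_indices(n_observations, n_folds=5, seed=0):
    """Yield (training, validation) index lists for each of the folds."""
    indices = list(range(n_observations))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for held_out in range(n_folds):
        validation = folds[held_out]
        training = [
            index
            for fold_id, fold in enumerate(folds)
            if fold_id != held_out
            for index in fold
        ]
        yield training, validation


splits = list(k_fold_indices(90))
for training, validation in splits:
    print(len(training), len(validation))
```

Each observation is used for validation exactly once and for training in the other four folds, so every fold trains on 72 observations and validates on 18.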

Performance evaluation

The performance of the proposed method has been analyzed in terms of the confusion matrix, classification accuracy, true positive rate (TPR), false negative rate (FNR) and the area under the curve (AUC) of the receiver operating characteristic (ROC). The classification accuracy (CA) is calculated by Eq. (3) [27].

$$ CA = \frac{TP + TN}{TP + FN + TN + FP} \times 100\% $$
(3)

where TP = true positive, FN = false negative, TN = true negative and FP = false positive.
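
The following Python sketch shows how these quantities follow from a confusion matrix. The 3 × 3 matrix is hypothetical and is not one of the matrices reported in this paper. Rows are assumed to hold the true classes and columns the predicted classes, and the multi-class accuracy is taken as the share of observations on the diagonal.

```python
def classification_accuracy(tp, tn, fp, fn):
    """Eq. (3): percentage of correctly classified observations."""
    return 100.0 * (tp + tn) / (tp + fn + tn + fp)


def overall_accuracy(confusion):
    """Multi-class accuracy: diagonal count over total count, in percent."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return 100.0 * correct / total


def per_class_rates(confusion):
    """Return a (TPR, FNR) pair for every class (row) of the matrix."""
    rates = []
    for i, row in enumerate(confusion):
        total = sum(row)
        rates.append((row[i] / total, (total - row[i]) / total))
    return rates


# Hypothetical confusion matrix, NOT taken from the study.
example = [
    [9, 1, 0],
    [0, 10, 0],
    [2, 0, 8],
]
print(overall_accuracy(example))
print(per_class_rates(example))
# One-vs-rest counts for class 1 of the example: TP=9, FN=1, FP=2, TN=18.
print(classification_accuracy(tp=9, tn=18, fp=2, fn=1))
```

TPR and FNR always sum to one for a given class, which is why the two are reported together.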

Results and discussion

In this section, the performance of the classifiers is evaluated. Before training the classification models, the selected features have been labeled: the left wink, right wink, and no wink have been labeled 1, 2 and 3 respectively.

Figure 6 presents the confusion matrix (Fig. 6a) and the TPR and FNR (Fig. 6b) of LDA, SVM and K-NN when the FFT feature is used. From the confusion matrix, 72 observations (out of 90) have been recognized accurately by the LDA and K-NN whereas 70 observations have been recognized by the SVM. For LDA, the TPR for classes 1, 2 and 3 is 90%, 90% and 70% respectively, which is the best performance among the three classifiers. According to Table 2, the classification accuracies of LDA, SVM, and K-NN are 83.3%, 75.6% and 80% respectively. Hence, LDA shows the highest accuracy with the FFT feature.

Fig. 6 All classifiers with FFT: (a) confusion matrix, (b) true positive rate (TPR) and false negative rate (FNR)

Table 2 Classification accuracy

Figure 7 presents the confusion matrix, TPR and FNR of the classifiers when the sample range feature is used. This feature performs very well with all classifiers. A total of 89 observations (out of 90) have been recognized correctly by SVM and K-NN, whereas 88 have been recognized correctly by LDA. For SVM and K-NN, the TPR for classes 1, 2 and 3 is 100%, 100%, and 97% respectively.

Fig. 7 All classifiers with sample range: (a) confusion matrix, (b) true positive rate (TPR) and false negative rate (FNR)

As shown in Table 2, the training accuracy of SVM and K-NN with the sample range is 98.9%, whereas LDA achieves 97.8%. Once the models were trained, the held-out test data have been used to test them. The testing accuracies of LDA, SVM, and K-NN for the FFT and sample range features are also given in Table 2. The maximum testing accuracy with FFT has been achieved by LDA, whereas SVM and K-NN have achieved 96.7% with the sample range. The FFT and sample range features have been used separately. From Table 2, the training and testing accuracies with the sample range are considerably higher than those with the FFT.

In machine learning, classification performance can also be evaluated by the area under the curve (AUC) of the receiver operating characteristic (ROC). The ROC curve plots the true positive rate against the false positive rate as the decision threshold varies, whereas the AUC measures the degree of separability. In short, this metric expresses the model’s capability of differentiating the classes. The AUC ranges from 0 to 1, and a larger value indicates that the model recognizes the classes more accurately. Figure 8 presents the classifier performance in terms of AUC-ROC for the sample range feature. Among the three classifiers, the AUC of SVM (Fig. 8b) is 1 for all classes, which indicates that SVM performs better than LDA and K-NN.
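
To make the link between AUC and separability concrete, the Python sketch below computes the AUC of one class against the rest. It uses the fact that the AUC equals the probability that a randomly chosen positive observation receives a higher score than a randomly chosen negative one. The scores are invented for illustration, and this is not the routine used to produce Fig. 8.

```python
def auc(scores, is_positive):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    A pair with equal scores counts as half a correct ranking.
    """
    positives = [s for s, flag in zip(scores, is_positive) if flag]
    negatives = [s for s, flag in zip(scores, is_positive) if not flag]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores for one class versus the rest.
separable = auc([0.9, 0.8, 0.2, 0.1], [True, True, False, False])
overlapping = auc([0.9, 0.8, 0.3, 0.1], [True, False, True, False])
print(separable, overlapping)
```

An AUC of 1 therefore means that every observation of the class scores higher than every observation outside it, while overlapping score distributions pull the value toward 0.5.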

Fig. 8 Area under the curve (AUC) of the receiver operating characteristic (ROC): (a) LDA, (b) SVM and (c) K-NN

Conclusion

In this study, facial expressions in terms of left wink, right wink, and no wink have been classified. The wink data have been recorded as brain activity using a five-channel Emotiv Insight EEG headset. Two features, namely FFT and sample range, have been extracted and classified using SVM, K-NN, and LDA. The performance of the classifiers has been evaluated using classification accuracy, the confusion matrix, AUC-ROC, TPR, and FNR. The sample range feature has achieved better accuracy than the FFT with all classifiers. The obtained classification accuracy indicates that wink-based facial expressions can be utilized in the operation of assistive technology. However, some issues remain to be addressed. As the key objective of BCI technology is to assist physically challenged people, the data should be captured from the targeted users. Moreover, data should be collected from more subjects, and the number of trials from each subject should be increased. Finally, the classifier results should be transformed into device commands, and the complete experiment should be implemented in real time.

References

1. Bamdad M, Zarshenas H, Auais MA (2015) Application of BCI systems in neurorehabilitation: a scoping review. Disabil Rehabil Assist Technol 10:355–364. https://doi.org/10.3109/17483107.2014.961569

2. Wolpaw JR, Birbaumer N, Mcfarland DJ, Pfurtscheller G, Vaughan TM (2002) Brain-computer interfaces for communication and control. Clin Neurophysiol 113:767–791. https://doi.org/10.1016/S1388-2457(02)00057-3

3. Zhang R, Li Y, Yan Y, Zhang H, Wu S, Yu T, Gu Z (2016) Control of a wheelchair in an indoor environment based on a brain-computer interface and automated navigation. In: IEEE Transaction NEURAL systems rehabilitation engineering vol 24. https://doi.org/10.1109/TNSRE.2015.2439298

4. Kosmyna N, Tarpin-Bernard F, Bonnefond N, Rivet B (2016) Feasibility of BCI control in a realistic smart home environment. Front Hum Neurosci 10:1–10. https://doi.org/10.3389/fnhum.2016.00416

5. Bright D, Nair A, Salvekar D, Bhisikar S (2016) EEG-based brain controlled prosthetic arm. In: Conference on advances in signal processing CASP 2016 pp 479–483. https://doi.org/10.1109/CASP.2016.7746219

6. Nguyen T-H, Chung W-Y (2019) A single-channel SSVEP-based BCI speller using deep learning. IEEE Access 7:1752–1763. https://doi.org/10.1109/ACCESS.2018.2886759

7. Chakladar DD, Chakraborty S (2018) EEG based emotion classification using correlation based subset selection. Biol Inspired Cogn Archit 24:98–106. https://doi.org/10.1016/J.BICA.2018.04.012

8. Pham T, Ma W, Tran D, Tran DS, Phung D (2015) A study on the stability of EEG signals for user authentication. In: 2015 7th International IEEE/EMBS conference on neural engineering (NER). pp 122–125. IEEE. https://doi.org/10.1109/NER.2015.7146575

9. Rashid M, Sulaiman N, Mustafa M, Khatun S, Bari BS (2019) The Classification of eeg signal using different machine Learning techniques for BCI application. In: Jong-Hwan, KimHyung Myung, SML (ed) Robot intelligence technology and applications. RiTA 2018. communications in computer and information science, vol 1015. pp 207–221. Springer, Singapore. https://doi.org/10.1007/978-981-13-7780-8_17

10. Ball T, Kern M, Mutschler I, Aertsen A, Schulze-Bonhage A (2009) Signal quality of simultaneously recorded invasive and non-invasive EEG. Neuroimage 46:708–716. https://doi.org/10.1016/j.neuroimage.2009.02.028

11. Raheel A, Majid M, Anwar SM (2019) Facial expression recognition based on electroencephalography. In: 2019 2nd International conference computer mathematics and engineering technologies iCoMET 2019 pp 1–5. https://doi.org/10.1109/ICOMET.2019.8673408

12. Ma J, Zhang Y, Nam Y, Cichocki A, Matsuno F (2013) EOG/ERP Hybrid human-machine interface for robot control. In: IEEE International conference intelligent robots System pp 859–864. https://doi.org/10.1109/IROS.2013.6696451

13. Reyes CE, Rugayan JLC, Jason C, Rullan G, Oppus CM, Tangonan GL (2012) A study on ocular and facial muscle artifacts in EEG signals for BCI applications. In: IEEE Region 10 annual international conference proceedings/TENCON. pp 1–6. https://doi.org/10.1109/TENCON.2012.6412241

14. Rosca S, Leba M, Ionica A, Gamulescu O (2018) Quadcopter control using a BCI. In: IOP Conference series materials science engineering, vol 294. https://doi.org/10.1088/1757-899X/294/1/012048

15. Pathirana S, Asirvatham D, Md Johar MG (2018) An agent based approach for electroencephalographic data classification. In: 13th International conference computer science education ICCSE 2018. pp 248–252. https://doi.org/10.1109/ICCSE.2018.8468709

16. Ahad R, Rahman KAA, Mustaffa MZ, Fuad N, Ahmad MKI (2019) Body motion control via brain signal response. In: 2018 IEEE EMBS Conference biomedical engineering science IECBES 2018-Proceedings pp 696–700. https://doi.org/10.1109/IECBES.2018.08626738

17. Queiroz RL, Bichara De Azeredo Coutinho I, Xexeo GB, Machado Vieira Lima P, Sampaio FF (2019) Playing with robots using your brain. In: Brazilian symposium games digital entertainment. SBGAMES. 2018–November pp 197–204. https://doi.org/10.1109/SBGAMES.2018.00031

18. Mamani MA, Yanyachi PR (2017) Design of computer brain interface for flight control of unmanned air vehicle using cerebral signals through headset electroencephalograph. In: Proceedings 2017 IEEE international conference aerospace signals, INCAS 2017. 2017–November, pp 1–4. https://doi.org/10.1109/INCAS.2017.8123499

19. Orenda MP, Garg L, Garg G (2017) Exploring the feasibility to authenticate users of web and cloud services using a brain-computer interface (BCI). In: Battiato S, Farinella G, Leo M, Gallo G (eds) New trends in image analysis and processing – ICIAP 2017. ICIAP 2017. Lecture Notes in Computer Science, vol 10590. Springer, Cham. https://doi.org/10.1007/978-3-319-70742-6_33

20. Insight Brainwear® 5 channel wireless EEG headset| EMOTIV, https://www.emotiv.com/insight/, last accessed 2019/10/03

21. EMOTIVPRO| The most advanced EEG software| EMOTIV, https://www.emotiv.com/emotivpro/, last accessed 2019/10/09

22. Biswas PC, Wahed S, Nath D, Rana MM, Ahmad M (2015) Determination of most effective rhythm, domain and stimulation technique for noninvasive brain computer interface through SVM algorithm. In: 2nd International conference on electrical engineering and information and communication technology, iCEEiCT 2015. https://doi.org/10.1109/ICEEICT.2015.7307368

23. Abdulkader SN, Atia A, Mostafa MSM (2015) Brain computer interfacing: applications and challenges. Egypt Inform J 16:213–230. https://doi.org/10.1016/j.eij.2015.06.002

24. Nicolas-Alonso LF, Gomez-Gil J (2012) Brain computer interfaces, a review. Sensors. https://doi.org/10.3390/s120201211

25. Burges CJC (1998) A tutorial on support vector machines for pattern recognition. Data Min Knowl Discov 2:121–167. https://doi.org/10.1023/A:1009715923555

26. Rakotomamonjy A, Guigue V, Mallet G, Alvarado V (2005) Ensemble of SVMs for improving brain computer interface P300 speller performances. In: Duch W, Kacprzyk J, Oja E, Zadrożny S (eds) Artificial neural networks: biological inspirations – ICANN 2005. ICANN 2005. Lecture Notes in Computer Science, vol 3696. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11550822_8

27. Kumar S, Sharma A, Tsunoda T (2019) Brain wave classification using long short-term memory network based OPTICAL predictor. Sci Rep 9:9153. https://doi.org/10.1038/s41598-019-45605-1

Acknowledgements

This research was supported by the Faculty of Electrical and Electronic Engineering, Universiti Malaysia Pahang, Malaysia through the Grant RDU180396.

Author information

Correspondence to Mamunur Rashid.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Cite this article

Rashid, M., Sulaiman, N., Mustafa, M. et al. Wink based facial expression classification using machine learning approach. SN Appl. Sci. 2, 183 (2020). https://doi.org/10.1007/s42452-020-1963-5

Keywords

  • Wink
  • Facial expression
  • Electroencephalography (EEG)
  • Brain-computer interface (BCI)
  • Machine learning