Speech Emotion Recognition in Neurological Disorders Using Convolutional Neural Network

Zisad, Sharif Noor; Hossain, Mohammad Shahadat; Andersson, Karl

doi:10.1007/978-3-030-59277-6_26

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12241)

Included in the following conference series:

International Conference on Brain Informatics

Abstract

Detecting emotions from speech is an emerging research field in the area of human information processing. Expressing emotion is a very difficult task for a person with a neurological disorder. Hence, a Speech Emotion Recognition (SER) system may help by enabling barrier-free communication. Various research has been carried out in the area of SER. The main objective of this research is to develop a system that can recognize emotion from the speech of a person with a neurological disorder. Since the convolutional neural network (CNN) is an effective method, it has been used to develop the system. The system uses tonal properties such as mel-frequency cepstral coefficients (MFCCs). The RAVDESS audio speech and song databases are used for training and testing. In addition, a custom local dataset was developed to support further training and testing. The performance of the proposed system is compared with traditional machine learning models as well as with pre-trained CNN models, including VGG16 and VGG19. The results demonstrate that the CNN model proposed in this research performed better than the mentioned machine learning techniques. The system can classify eight emotions of a person with a neurological disorder: calm, angry, fearful, disgust, happy, surprise, neutral and sad.
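The abstract names MFCCs as the input features but does not give the extraction settings. The following is a minimal NumPy sketch of a standard MFCC pipeline (framing, windowing, power spectrum, mel filterbank, log, DCT-II). It is not the authors' implementation: the 16 kHz sample rate, 25 ms frames, 10 ms hop, 26 filters and 13 coefficients are common defaults assumed here for illustration, and all function names are hypothetical.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising slope
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling slope
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def dct_ii(x, n_out):
    """Unnormalized DCT-II along the last axis, keeping the first n_out terms."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return x @ basis.T

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return an array of shape (n_frames, n_coeffs). All defaults are assumptions."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)          # windowed frames
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return dct_ii(np.log(energies + 1e-10), n_coeffs)     # log, then decorrelate

# Synthetic one-second 440 Hz tone standing in for a speech clip.
t = np.arange(16000) / 16000.0
signal = np.sin(2 * np.pi * 440.0 * t)
features = mfcc(signal)
print(features.shape)
```

With these assumed settings, a one-second clip yields 98 frames of 13 coefficients each. A two-dimensional array of this kind is the sort of input a CNN classifier could consume, although the paper's exact input shape is not stated in the abstract.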

Author information

Authors and Affiliations

Department of Computer Science and Engineering, University of Chittagong, Chittagong, Bangladesh
Sharif Noor Zisad & Mohammad Shahadat Hossain
Department of Computer Science, Electrical and Space Engineering, Luleå University of Technology, Skellefteå, Sweden
Karl Andersson

Corresponding authors

Correspondence to Sharif Noor Zisad or Mohammad Shahadat Hossain.

Editor information

Editors and Affiliations

Nottingham Trent University, Nottingham, UK
Mufti Mahmud
University of Padua, Padua, Italy
Stefano Vassanelli
Jahangirnagar University, Dhaka, Bangladesh
M. Shamim Kaiser
Maebashi Institute of Technology, Maebashi, Japan
Ning Zhong

Cite this paper

Zisad, S.N., Hossain, M.S., Andersson, K. (2020). Speech Emotion Recognition in Neurological Disorders Using Convolutional Neural Network. In: Mahmud, M., Vassanelli, S., Kaiser, M.S., Zhong, N. (eds) Brain Informatics. BI 2020. Lecture Notes in Computer Science, vol 12241. Springer, Cham. https://doi.org/10.1007/978-3-030-59277-6_26

DOI: https://doi.org/10.1007/978-3-030-59277-6_26
Published: 15 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59276-9
Online ISBN: 978-3-030-59277-6
eBook Packages: Computer Science; Computer Science (R0)
