Abstract
Detecting emotion from speech is an emerging research field within human information processing. Expressing emotion is very difficult for a person with a neurological disorder, so a Speech Emotion Recognition (SER) system may help by enabling barrier-free communication. Various research has been carried out in the area of SER; the main objective of this research is to develop a system that can recognize emotion from the speech of a person with a neurological disorder. Because the convolutional neural network (CNN) is an effective method, it was chosen to develop the system. The system uses tonal properties such as MFCCs. The RAVDESS audio speech and song databases are used for training and testing. In addition, a custom local dataset was developed to support further training and testing. The performance of the proposed system is compared with traditional machine learning models as well as with pre-trained CNN models, including VGG16 and VGG19. The results demonstrate that the CNN model proposed in this research performed better than the mentioned machine learning techniques. The system classifies eight emotions of a person with a neurological disorder: calm, angry, fearful, disgust, happy, surprise, neutral and sad.
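As an illustration of how the eight emotion classes can be read from RAVDESS recordings, the sketch below parses the emotion code that the dataset encodes as the third hyphen-separated field of each file name. This is a minimal sketch and not the authors' pipeline: the function name `emotion_from_filename` and the example file name are hypothetical, and the label spellings follow the abstract (for example "surprise", where the dataset documentation uses "surprised").

```python
from pathlib import Path

# Emotion codes from the RAVDESS file-naming convention (third field).
# Label spellings follow the abstract; this mapping is illustrative only.
RAVDESS_EMOTIONS = {
    "01": "neutral",
    "02": "calm",
    "03": "happy",
    "04": "sad",
    "05": "angry",
    "06": "fearful",
    "07": "disgust",
    "08": "surprise",
}


def emotion_from_filename(path):
    """Return the emotion label encoded in a RAVDESS-style file name.

    RAVDESS names have seven two-digit fields separated by hyphens:
    modality, vocal channel, emotion, intensity, statement,
    repetition, actor.
    """
    fields = Path(path).stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"not a RAVDESS-style file name: {path}")
    code = fields[2]
    if code not in RAVDESS_EMOTIONS:
        raise ValueError(f"unknown emotion code {code!r} in {path}")
    return RAVDESS_EMOTIONS[code]


if __name__ == "__main__":
    # Hypothetical example file name in the RAVDESS format.
    print(emotion_from_filename("03-01-06-01-02-01-12.wav"))
```

Labels obtained this way could be paired with MFCC features extracted from each recording before training a classifier; the feature extraction and the CNN itself are outside the scope of this sketch.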
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Zisad, S.N., Hossain, M.S., Andersson, K. (2020). Speech Emotion Recognition in Neurological Disorders Using Convolutional Neural Network. In: Mahmud, M., Vassanelli, S., Kaiser, M.S., Zhong, N. (eds) Brain Informatics. BI 2020. Lecture Notes in Computer Science, vol 12241. Springer, Cham. https://doi.org/10.1007/978-3-030-59277-6_26
DOI: https://doi.org/10.1007/978-3-030-59277-6_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59276-9
Online ISBN: 978-3-030-59277-6
eBook Packages: Computer Science (R0)