Abstract
Emotion recognition is the identification of emotions such as happiness, anger, and sadness, usually from verbal communication and facial expressions. Beyond classifying a broad spectrum of moods, recognizing specific emotions makes it possible to track the mental health of as many people as possible, for the benefit of societal well-being. Within the positive category, for example, a system can detect specific emotions such as happiness, satisfaction, or excitement, depending on how it is configured. Our sentiment recognition system identifies emotions such as anger, happiness, depression, and neutrality, and it rests on two principles: analysis of the audio content and identification of the emotion associated with it. The application takes audio input, applies the Mel-Frequency Cepstral Coefficients (MFCC) algorithm to it, compares the resulting coefficients with those of an existing database of audio files depicting various human sentiments, and outputs as text the emotion expressed by the user. Audio gathered during testing was processed to extract meaningful spectral coefficients, which were stored in a database for comparison with future audio samples. The application extracts the coefficients of an external audio sample and matches them against those in the database. The MFCC algorithm is used because the spectral coefficients it extracts are well suited to feature matching and help discard static and background noise, if present. We carried out a comparative analysis of our models to evaluate their performance using four classification metrics, and we present the confusion matrix for better understanding.
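The extract-and-match idea described in the abstract can be illustrated with a minimal sketch. This is not the chapter's implementation: the frame sizes, the 0.97 pre-emphasis factor, the 26 mel filters, the 13 retained coefficients, and the 16 kHz sample rate are common textbook defaults assumed here, and the nearest-neighbour comparison on clip-averaged coefficients is a simple stand-in for the matching step. The function names and the synthetic test tones are hypothetical, and the deep learning models named in the chapter title are not reproduced.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):       # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):      # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return an (n_frames, n_coeffs) array of MFCCs (assumed default parameters)."""
    # Pre-emphasis boosts high frequencies before framing.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # Power spectrum of each windowed frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Log mel filterbank energies (small constant avoids log(0)).
    fbank = mel_filterbank(n_filters, n_fft, sample_rate)
    energies = np.log(power @ fbank.T + 1e-10)
    # DCT-II over the filter axis, keeping the first n_coeffs coefficients.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n + 1) / (2 * n_filters))
    return energies @ basis.T

def clip_features(signal, sample_rate=16000):
    """Summarise a clip by its mean MFCC vector (a simplifying assumption)."""
    return mfcc(signal, sample_rate).mean(axis=0)

def nearest_emotion(query, database):
    """Return the label whose stored vector is closest (Euclidean) to the query."""
    best_label, best_dist = None, np.inf
    for label, vectors in database.items():
        for vec in vectors:
            dist = np.linalg.norm(query - vec)
            if dist < best_dist:
                best_label, best_dist = label, dist
    return best_label

def tone(freq, sample_rate=16000, seconds=1.0):
    """Synthetic sine tone, used only as placeholder audio for the demo."""
    t = np.arange(int(sample_rate * seconds)) / sample_rate
    return np.sin(2 * np.pi * freq * t)

# Placeholder database: synthetic tones stand in for labelled recordings.
database = {
    "label_a": [clip_features(tone(200.0))],
    "label_b": [clip_features(tone(2000.0))],
}
predicted = nearest_emotion(clip_features(tone(2050.0)), database)
```

In practice the database would hold coefficients extracted from real labelled speech, and a trained classifier would replace the distance comparison; the sketch only shows how a clip is reduced to coefficients and compared with stored ones.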
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Mitra, A., Biswas, A., Ghosh, A., Ghosh, A., Majumdar, S.K., Dastidar, J.G. (2023). Using Deep Learning to Recognize Emotions Through Speech Analysis. In: Biswas, A., Semwal, V.B., Singh, D. (eds) Artificial Intelligence for Societal Issues. Intelligent Systems Reference Library, vol 231. Springer, Cham. https://doi.org/10.1007/978-3-031-12419-8_9
DOI: https://doi.org/10.1007/978-3-031-12419-8_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-12418-1
Online ISBN: 978-3-031-12419-8
eBook Packages: Intelligent Technologies and Robotics (R0)