Speech Emotion Recognition Systems: A Comprehensive Review on Different Methodologies

Anthony, Audre Arlene; Patil, Chandreshekar Mohan

doi:10.1007/s11277-023-10296-5

Published: 15 March 2023

Wireless Personal Communications, Volume 130, pages 515–525 (2023)

Abstract

For humans, speech is a common and natural way of expressing ourselves. Speech Emotion Recognition (SER) systems can be defined as a collection of methods that process and classify speech signals to detect the emotions they carry. Automatic emotion recognition is the identification of human emotions from signals such as speech, facial expressions and text. Collecting and labelling such signals is often tedious and requires expert knowledge. This paper covers the open-source speech emotion datasets available in various languages and surveys recent literature on speech emotion recognition that employs a range of machine learning approaches with the objective of improving classification accuracy. The paper aims to identify and synthesize contemporary literature on SER systems built with different methodologies or design components, giving researchers an up-to-date understanding of the field.

Data Availability

The data used to support the findings of this study are included within the article.

Funding

This work is not funded by any governmental or non-governmental funding agencies.

Author information

Authors and Affiliations

Department of Electronics and Communication Engineering, Vidyavardhaka College Of Engineering, Mysuru, India
Audre Arlene Anthony & Chandreshekar Mohan Patil

Corresponding author

Correspondence to Audre Arlene Anthony.

Ethics declarations

Conflict of Interest

The authors declare that they have no affiliations with or involvement in any organization or entity with any financial interest (such as honoraria; educational grants; participation in speakers’ bureaus; membership, employment, consultancies, stock ownership, or other equity interest; and expert testimony or patent-licensing arrangements), or non-financial interest (such as personal or professional relationships, affiliations, knowledge or beliefs) in the subject matter or materials discussed in this manuscript.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Anthony, A.A., Patil, C.M. Speech Emotion Recognition Systems: A Comprehensive Review on Different Methodologies. Wireless Pers Commun 130, 515–525 (2023). https://doi.org/10.1007/s11277-023-10296-5

Accepted: 24 February 2023
Published: 15 March 2023
Issue Date: May 2023
DOI: https://doi.org/10.1007/s11277-023-10296-5
