Abstract
Automatic speaker recognition models are built on a foundation of speaker characterization, pattern analysis, and feature engineering. This work focuses on the effect of classification and feature selection methods on speech emotion recognition. Selecting the right parameters in combination with the classifier is an important part of reducing the computational complexity of the system, and this becomes essential for models deployed in real-time scenarios. In this paper, a new deep learning based speech recognition model is presented that automatically recognizes spoken words. The quality of the input source, i.e. the speech sound, has a direct impact on the accuracy the classifier can attain. The Berlin database consists of around 500 utterances from both male and female speakers. On the applied dataset, the presented model achieves maximum accuracies of 94.21%, 83.54%, 83.65% and 78.13% using MFCC, prosodic, LSP and LPC features, respectively. The presented model offers better recognition performance than the other methods.
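Of the four feature sets compared above, MFCCs score highest. As a point of reference for how such features are derived, the following is a minimal NumPy-only sketch of the standard MFCC pipeline (framing, windowing, power spectrum, mel filterbank, log, DCT-II). It is not the paper's implementation; the frame size, hop, filter count, and coefficient count are illustrative assumptions.

```python
import numpy as np

def mfcc(signal, sr=16000, n_fft=512, hop=256, n_mels=26, n_ceps=13):
    """Sketch of the textbook MFCC pipeline; parameter values are assumptions."""
    # 1. Frame the signal and apply a Hann window to each frame
    frames = np.array([signal[s:s + n_fft] * np.hanning(n_fft)
                       for s in range(0, len(signal) - n_fft + 1, hop)])
    # 2. Per-frame power spectrum
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft

    # 3. Triangular mel filterbank spanning 0 .. sr/2
    def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fb[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)  # rising edge
        fb[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)  # falling edge

    # 4. Log mel energies, then DCT-II to decorrelate -> cepstral coefficients
    logmel = np.log(power @ fb.T + 1e-10)
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_mels))
    return logmel @ dct.T  # shape: (num_frames, n_ceps)

# Usage: MFCCs of a synthetic one-second 440 Hz tone
sig = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
feats = mfcc(sig)
```

The resulting per-frame coefficient matrix is what a downstream classifier (deep or otherwise) would consume as its input features.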
Jermsittiparsert, K., Abdurrahman, A., Siriattakul, P. et al. Pattern recognition and features selection for speech emotion recognition model using deep learning. Int J Speech Technol 23, 799–806 (2020). https://doi.org/10.1007/s10772-020-09690-2