A robust feature selection method based on meta-heuristic optimization for speech emotion recognition

  • Research Paper
  • Evolutionary Intelligence

Abstract

Most traditional feature selection methods do not perform effectively in speech emotion recognition (SER) systems. One recent advance in feature selection is the use of meta-heuristic optimization algorithms. Individually, each such algorithm plays a key role in many speech-processing applications. However, hybrid meta-heuristics are among the most effective techniques for feature selection and optimization problems. Hence, this paper proposes a new robust hybrid meta-heuristic feature selection model, named the CSEO FS model, to improve the accuracy of the SER task while reducing the computational burden. We investigate the performance of the proposed approach on speech-based emotional datasets such as EMoDB and RAVDESS, which are primarily used in the development of human-computer interaction systems. The experimental results confirm the superiority of the proposed feature selection method in terms of classification accuracy, precision, F1-score, and number of selected features. Compared with state-of-the-art feature selection methods for SER systems, the proposed method achieves emotion recognition rates of 94.35% and 96.78%, respectively.
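To make the idea of wrapper-style feature selection concrete, the following is a minimal sketch, not the CSEO FS algorithm itself, whose details are not given in this abstract. It assumes a fitness that trades off classification error against the fraction of features kept, and it uses a plain random bit-flip search as a stand-in for a meta-heuristic. The function names, the weight `alpha`, the 1-nearest-neighbour classifier, and the toy data are all illustrative assumptions.

```python
import random


def make_toy_data(n_per_class=20, n_noise=3, seed=1):
    """Toy two-class data: feature 0 is informative, the rest are noise (assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            informative = rng.gauss(3.0 * label, 0.5)
            noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
            X.append([informative] + noise)
            y.append(label)
    return X, y


def loo_1nn_error(X, y, mask):
    """Leave-one-out error of a 1-nearest-neighbour classifier on the kept features."""
    kept = [j for j, keep in enumerate(mask) if keep]
    errors = 0
    for i, xi in enumerate(X):
        best_dist, best_label = float("inf"), None
        for k, xk in enumerate(X):
            if k == i:
                continue
            dist = sum((xi[j] - xk[j]) ** 2 for j in kept)
            if dist < best_dist:
                best_dist, best_label = dist, y[k]
        if best_label != y[i]:
            errors += 1
    return errors / len(X)


def fitness(X, y, mask, alpha=0.9):
    """Lower is better: weighted sum of error rate and fraction of features kept."""
    if not any(mask):
        return float("inf")  # an empty feature subset is not a valid solution
    return alpha * loo_1nn_error(X, y, mask) + (1 - alpha) * sum(mask) / len(mask)


def random_bitflip_search(X, y, n_iter=200, seed=0):
    """Stand-in search: flip one bit at a time, keep the candidate if it is not worse."""
    rng = random.Random(seed)
    n_features = len(X[0])
    best = [1] * n_features
    best_fit = fitness(X, y, best)
    for _ in range(n_iter):
        candidate = best[:]
        j = rng.randrange(n_features)
        candidate[j] = 1 - candidate[j]
        candidate_fit = fitness(X, y, candidate)
        if candidate_fit <= best_fit:
            best, best_fit = candidate, candidate_fit
    return best, best_fit


if __name__ == "__main__":
    X, y = make_toy_data()
    mask, fit = random_bitflip_search(X, y)
    print("selected mask:", mask, "fitness:", fit)
```

A real meta-heuristic would replace `random_bitflip_search` with a population-based update rule. The fitness function is the part that reflects the two criteria named in the abstract, classification accuracy and number of selected features.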

Funding

This work was not funded by any agency.

Author information

Corresponding author

Correspondence to Kesava Rao Bagadi.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Human and animal rights

The authors declare that the work described has not involved experimentation on humans or animals.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bagadi, K.R., Sivappagari, C.M.R. A robust feature selection method based on meta-heuristic optimization for speech emotion recognition. Evol. Intel. 17, 993–1004 (2024). https://doi.org/10.1007/s12065-022-00772-5

  • DOI: https://doi.org/10.1007/s12065-022-00772-5
