
Glowworm swarm based fuzzy classifier with dual features for speech emotion recognition

Special Issue · Published in Evolutionary Intelligence

Abstract

The emotional content of speech signals has attracted considerable attention, since speech is one of the quickest and most natural ways for humans to communicate; accordingly, many systems have been proposed to recognize the emotion conveyed by a spoken utterance. This paper presents two contributions: gender recognition and emotion recognition. The first contribution comprises two phases, feature extraction and classification: the pitch feature is extracted and passed to a k-Nearest Neighbor classifier. In the second contribution, Non-Negative Matrix Factorization (NMF) and pitch features are extracted and given as input to an adaptive fuzzy classifier that recognizes the respective emotions. In addition, the limits of the membership functions are chosen optimally using a well-known optimization algorithm, glowworm swarm optimization (GSO); the proposed adaptive fuzzy classifier using GSO is therefore termed GSO-FC. The performance of the proposed model is compared with that of conventional algorithms such as Grey Wolf Optimization, Firefly, Particle Swarm Optimization, Artificial Bee Colony, and the Genetic Algorithm across varied performance measures: accuracy, sensitivity, specificity, precision, false positive rate, false negative rate, negative predictive value, false discovery rate, F1 score, and Matthews correlation coefficient.
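The GSO algorithm named in the abstract maintains a swarm of "glowworms" whose luciferin level tracks the objective value at their position; each worm moves probabilistically toward a brighter neighbor within an adaptive decision range. The minimal sketch below illustrates that standard GSO scheme on a toy 2-D objective; all parameter values, the objective, and the function name are illustrative assumptions, not the paper's configuration for tuning fuzzy membership limits.

```python
import math
import random

def gso_maximize(objective, dim, n_glowworms=30, iters=100,
                 rho=0.4, gamma=0.6, beta=0.08, r0=2.0, rs=3.0,
                 step=0.03, nt=5, bounds=(-3.0, 3.0), seed=0):
    """Minimal glowworm swarm optimization sketch (maximization)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_glowworms)]
    luc = [5.0] * n_glowworms   # initial luciferin level of each worm
    rd = [r0] * n_glowworms     # local decision range of each worm
    for _ in range(iters):
        # Luciferin update: decay plus reward proportional to fitness.
        for i in range(n_glowworms):
            luc[i] = (1 - rho) * luc[i] + gamma * objective(pos[i])
        new_pos = [p[:] for p in pos]
        for i in range(n_glowworms):
            # Neighbors: worms inside the decision range that glow brighter.
            nbrs = [j for j in range(n_glowworms)
                    if j != i and luc[j] > luc[i]
                    and math.dist(pos[i], pos[j]) < rd[i]]
            if nbrs:
                # Move a fixed step toward one brighter neighbor,
                # chosen with probability proportional to luciferin gap.
                weights = [luc[j] - luc[i] for j in nbrs]
                j = rng.choices(nbrs, weights=weights)[0]
                d = math.dist(pos[i], pos[j])
                if d > 0:
                    new_pos[i] = [pos[i][k] + step * (pos[j][k] - pos[i][k]) / d
                                  for k in range(dim)]
            # Shrink/grow the decision range toward nt desired neighbors.
            rd[i] = min(rs, max(0.0, rd[i] + beta * (nt - len(nbrs))))
        pos = new_pos
    best = max(range(n_glowworms), key=lambda i: luc[i])
    return pos[best]

# Toy objective with a single peak at (1, 1); the swarm clusters near it.
best = gso_maximize(lambda x: -((x[0] - 1) ** 2 + (x[1] - 1) ** 2), dim=2)
print(best)
```

In the paper's setting, the decision variables would be the membership-function limits of the fuzzy classifier and the objective a classification-quality measure; the swarm dynamics above stay the same.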


Figs. 1–5 (see full article)




Author information

Correspondence to B. Rajasekhar.


Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Cite this article

Rajasekhar, B., Kamaraju, M. & Sumalatha, V. Glowworm swarm based fuzzy classifier with dual features for speech emotion recognition. Evol. Intel. 15, 939–953 (2022). https://doi.org/10.1007/s12065-019-00262-1

