Speech Emotion Recognition Using Neural Network and Wavelet Features

  • Conference paper
Book: Recent Trends in Wave Mechanics and Vibrations

Part of the book series: Lecture Notes in Mechanical Engineering (LNME)

Abstract

Human speech, which is generated through the vibration of the vocal cords, is affected by the emotional state of the speaker. Accurate recognition of the emotions concealed in human speech is a significant factor in further improving the quality of Human–Computer Interaction (HCI). However, a satisfactory level of accuracy has not yet been achieved, mainly because there is no well-accepted standard feature set. Emotions are hard to distinguish from speech even for humans, which is why a standard feature set is difficult to extract. This paper presents a model that classifies emotions from speech signals with high accuracy relative to the current state of the art. The dataset used in this experiment consists of speech recordings that are labeled with the emotions of the speakers. A novel wavelet-based feature set is extracted from the speech signals, and a Neural Network (NN) with a single hidden layer is then trained on this feature set to classify the different emotions. The feature set is newly introduced and is tested here with an NN architecture for the first time; the classification results are also compared with those of other prominent classification techniques.
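The pipeline outlined in the abstract (wavelet features followed by a single-hidden-layer network) can be illustrated with a minimal sketch. The abstract does not specify the wavelet family, the exact features, or the network size, so everything here is an assumption chosen for illustration: a Haar wavelet, relative sub-band energies as features, a tanh hidden layer of 8 units, 4 output classes, and random untrained weights. It is not the authors' feature set or model.

```python
import math
import random


def haar_step(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def wavelet_energy_features(signal, levels=3):
    """Relative energy of each sub-band: one detail band per level, then the
    final approximation band. The features sum to 1 for a non-zero signal."""
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    total = sum(energies) or 1.0  # avoid division by zero on silent input
    return [e / total for e in energies]


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(features, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a network with one tanh hidden layer and softmax output."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    logits = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_out, b_out)
    ]
    return softmax(logits)


def random_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these by training."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Hypothetical input: a synthetic sine wave standing in for a speech frame.
rng = random.Random(0)
signal = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
features = wavelet_energy_features(signal, levels=3)

n_hidden, n_classes = 8, 4  # assumed sizes, not taken from the paper
probs = forward(
    features,
    random_matrix(n_hidden, len(features), rng),
    [0.0] * n_hidden,
    random_matrix(n_classes, n_hidden, rng),
    [0.0] * n_classes,
)
print(features)
print(probs)
```

With three decomposition levels the sketch yields four features per signal, and the network returns one probability per assumed emotion class. Because the weights are untrained, the probabilities carry no meaning beyond showing the data flow.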

Corresponding author

Correspondence to Tanmoy Roy.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

Cite this paper

Roy, T., Marwala, T., Chakraverty, S. (2020). Speech Emotion Recognition Using Neural Network and Wavelet Features. In: Chakraverty, S., Biswas, P. (eds) Recent Trends in Wave Mechanics and Vibrations. Lecture Notes in Mechanical Engineering. Springer, Singapore. https://doi.org/10.1007/978-981-15-0287-3_30

  • DOI: https://doi.org/10.1007/978-981-15-0287-3_30

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-0286-6

  • Online ISBN: 978-981-15-0287-3

  • eBook Packages: Engineering, Engineering (R0)
