Human Emotion Classification Based on Speech Enhancement Using Neural Networks

  • Conference paper
  • Part of the book: Advanced Machine Intelligence and Signal Processing

Abstract

Speech signal quality is a central concern in speech processing. Impulsive noise can deteriorate the overall performance and reliability of a speech signal. This paper presents a speech enhancement approach that removes impulsive noise by modifying the noisy magnitude and phase spectra in the short-time Fourier transform (STFT) domain. The processing pipeline involves signal segmentation, short-time Fourier transforms, spectrum modification, and signal concatenation. First, the noisy signal samples are divided into a finite number of frames, each of which is weighted with Hamming window coefficients. The spectrum of each noisy frame is obtained via the fast Fourier transform, and the noise components are removed by spectral subtraction. After this restoration step, the filtered magnitude is recombined with the noisy phase, the inverse transform is applied, and the frames are concatenated to reconstruct the signal. The paper further proposes feature extraction using Mel-frequency cepstral coefficients (MFCCs): the enhanced, noise-reduced speech signal is taken as input, and the emotion class is then obtained from the extracted features using a trained neural network. Simulation results show that the proposed filtering model achieves higher accuracy.
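
The enhancement-then-classification pipeline outlined above can be sketched compactly in Python. The sketch below assumes a noise floor estimated by averaging the magnitude spectra of the first few (presumed speech-free) frames and a small multilayer perceptron classifier; the frame sizes, the noise-estimation rule, the network layout, and the helper names spectral_subtraction and mfcc_features are illustrative assumptions, not the authors' exact configuration.

import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

def spectral_subtraction(noisy, fs, frame_len=0.025, hop=0.010, noise_frames=6):
    """Frame the signal with a Hamming window, subtract an estimated noise
    magnitude spectrum, keep the noisy phase, and overlap-add the result."""
    n, h = int(frame_len * fs), int(hop * fs)
    win = np.hamming(n)
    # Segmentation: overlapping frames weighted by Hamming coefficients.
    frames = np.array([noisy[i:i + n] * win for i in range(0, len(noisy) - n, h)])
    spec = np.fft.rfft(frames, axis=1)                 # short-time spectra (FFT)
    mag, phase = np.abs(spec), np.angle(spec)
    # Assumption: the first few frames contain no speech; use them as the noise floor.
    noise_mag = mag[:noise_frames].mean(axis=0)
    clean_mag = np.maximum(mag - noise_mag, 0.0)       # spectral subtraction
    # Recombine the filtered magnitude with the original (noisy) phase and invert.
    clean = np.fft.irfft(clean_mag * np.exp(1j * phase), n=n, axis=1)
    # Concatenation: overlap-add the processed frames into a single waveform.
    out = np.zeros(len(noisy))
    for k, frame in enumerate(clean):
        out[k * h:k * h + n] += frame
    return out

def mfcc_features(signal, fs, n_mfcc=13):
    # One fixed-length vector per utterance: mean of the per-frame MFCCs.
    return librosa.feature.mfcc(y=np.asarray(signal, dtype=float), sr=fs,
                                n_mfcc=n_mfcc).mean(axis=1)

# Emotion classification on the enhanced speech (X: MFCC vectors, y: emotion labels).
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500)
# clf.fit(X_train, y_train)
# predicted = clf.predict(mfcc_features(spectral_subtraction(noisy, fs), fs).reshape(1, -1))

In this sketch the half-wave rectified subtraction max(|X| - |N|, 0) stands in for the spectrum-modification step; a practical system would typically add spectral flooring or oversubtraction to limit musical noise.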



Author information

Corresponding author

Correspondence to Ch V. V. S. Srinivas.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Srinivas, C.V.V.S., Gubbala, S., Lakshmi, N.D.N. (2022). Human Emotion Classification Based on Speech Enhancement Using Neural Networks. In: Gupta, D., Sambyo, K., Prasad, M., Agarwal, S. (eds) Advanced Machine Intelligence and Signal Processing. Lecture Notes in Electrical Engineering, vol 858. Springer, Singapore. https://doi.org/10.1007/978-981-19-0840-8_3
