Abstract
The quality of the speech signal is a central concern in speech processing. Impulsive noise can deteriorate the overall performance and reliability of systems that operate on speech. This paper presents a speech enhancement approach that removes impulsive disruption (noise) by modifying the dual spectrum (the noisy magnitude and phase spectra) in the short-time Fourier transform (STFT) domain. The processing steps are signal segmentation, short-time Fourier transformation, spectrum modification, and signal concatenation. First, the samples of the noisy signal are divided into a finite number of frames, and each frame is weighted with Hamming-window coefficients. The noisy signal spectrum is obtained through the fast Fourier transform, and the noise components are removed using spectral subtraction. After restoration, the filtered magnitude is combined with the original signal phase; an inverse transform followed by frame concatenation then reconstructs the signal. This paper further proposes feature extraction using Mel-frequency cepstral coefficients (MFCCs): the enhanced, noise-reduced speech signal serves as input, and emotion is classified from the resulting features using a trained neural network. Simulation results show that the proposed filtering model is more accurate.
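The enhancement pipeline described in the abstract (framing with a Hamming window, FFT, magnitude-spectrum subtraction, recombination with the noisy phase, and overlap reconstruction) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the frame length, hop size, and the assumption that the first few frames are noise-only are all hypothetical choices made here for the example.

```python
import numpy as np

def spectral_subtraction(noisy, frame_len=256, hop=128, noise_frames=5):
    """Enhance a noisy signal by subtracting an estimated noise magnitude
    spectrum per frame while keeping the noisy phase (the 'dual spectrum'
    idea: modified magnitude + original phase)."""
    window = np.hamming(frame_len)
    # Segmentation: overlapping Hamming-windowed frames
    n_frames = 1 + (len(noisy) - frame_len) // hop
    frames = np.stack([noisy[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    spec = np.fft.rfft(frames, axis=1)            # noisy spectrum per frame
    mag, phase = np.abs(spec), np.angle(spec)
    # Noise magnitude estimated from the first few (assumed speech-free) frames
    noise_mag = mag[:noise_frames].mean(axis=0)
    clean_mag = np.maximum(mag - noise_mag, 0.0)  # half-wave rectification
    # Recombine the filtered magnitude with the original noisy phase
    clean_spec = clean_mag * np.exp(1j * phase)
    clean_frames = np.fft.irfft(clean_spec, n=frame_len, axis=1)
    # Overlap-add concatenation to reconstruct the time-domain signal
    out = np.zeros(len(noisy))
    norm = np.zeros(len(noisy))
    for i, frame in enumerate(clean_frames):
        out[i * hop : i * hop + frame_len] += frame
        norm[i * hop : i * hop + frame_len] += window
    return out / np.maximum(norm, 1e-8)
```

In a full system matching the paper, the enhanced output would then be passed to an MFCC extractor and a neural-network classifier for emotion recognition.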
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Srinivas, C.V.V.S., Gubbala, S., Lakshmi, N.D.N. (2022). Human Emotion Classification Based on Speech Enhancement Using Neural Networks. In: Gupta, D., Sambyo, K., Prasad, M., Agarwal, S. (eds) Advanced Machine Intelligence and Signal Processing. Lecture Notes in Electrical Engineering, vol 858. Springer, Singapore. https://doi.org/10.1007/978-981-19-0840-8_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-0839-2
Online ISBN: 978-981-19-0840-8
eBook Packages: Intelligent Technologies and Robotics (R0)