Abstract
Recognizing emotions from speech is a complex task, further complicated by the fact that there is no unambiguous answer to what the “correct” emotion is for a given speech sample. In this paper, we discuss emotion classification on a well-known German database covering six basic emotions (sadness, boredom, neutral, fear, happiness, and anger) using Mel Frequency Cepstral Coefficients (MFCCs). A concern with MFCCs is the large number of features they produce. We discuss the use of the Linde-Buzo-Gray vector quantization (LBG-VQ) algorithm to reduce the amount of data to be handled. Finally, emotion classification is performed using the Euclidean, Manhattan, and Chebyshev distances between the codebook of the neutral state and the codebooks of the other emotional states for the same sample.
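The pipeline summarized in the abstract (compress frame-level features into a small codebook, then compare codebooks with three distance measures) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: the function names `lbg_codebook` and `codebook_distances`, the codebook size, the additive splitting perturbation, the stopping rule, and the element-wise comparison of two equally sized codebooks. The abstract does not say how codewords are paired between codebooks. MFCC extraction is not shown; the input is assumed to be an array of shape (frames, coefficients).

```python
import numpy as np


def lbg_codebook(features, size=4, eps=0.01, tol=1e-6, max_iter=100):
    """Train a VQ codebook by LBG splitting (size assumed to be a power of two)."""
    features = np.asarray(features, dtype=float)
    # Start from a single codeword: the centroid of all feature vectors.
    codebook = features.mean(axis=0, keepdims=True)
    # Assumption: split with a small additive offset scaled by the feature spread.
    delta = eps * (features.std(axis=0) + 1e-12)
    while codebook.shape[0] < size:
        # Split every codeword into two slightly perturbed copies.
        codebook = np.vstack([codebook + delta, codebook - delta])
        prev = np.inf
        for _ in range(max_iter):
            # Squared Euclidean distance from every vector to every codeword.
            d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
            nearest = d.argmin(axis=1)
            distortion = d[np.arange(len(features)), nearest].mean()
            # Move each codeword to the centroid of the vectors assigned to it.
            for k in range(codebook.shape[0]):
                members = features[nearest == k]
                if len(members):
                    codebook[k] = members.mean(axis=0)
            # Stop once the average distortion no longer improves noticeably.
            if prev - distortion <= tol * max(distortion, 1e-12):
                break
            prev = distortion
    return codebook


def codebook_distances(a, b):
    """Euclidean, Manhattan and Chebyshev distances between two same-shape codebooks."""
    diff = np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float)).ravel()
    return {
        "euclidean": float(np.sqrt((diff ** 2).sum())),
        "manhattan": float(diff.sum()),
        "chebyshev": float(diff.max()),
    }


if __name__ == "__main__":
    # Synthetic stand-ins for MFCC frames of a neutral and an emotional utterance.
    rng = np.random.default_rng(0)
    neutral = rng.normal(0.0, 1.0, size=(200, 12))
    emotional = rng.normal(0.5, 1.5, size=(200, 12))
    print(codebook_distances(lbg_codebook(neutral), lbg_codebook(emotional)))
```

Because the order of codewords produced by the splitting procedure is arbitrary, an element-wise comparison like this one is sensitive to how codewords are aligned. A real system would need an explicit pairing rule, which this sketch leaves out.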
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Khanna, P., Sasi Kumar, M. (2011). Application of Vector Quantization in Emotion Recognition from Human Speech. In: Dua, S., Sahni, S., Goyal, D.P. (eds) Information Intelligence, Systems, Technology and Management. ICISTM 2011. Communications in Computer and Information Science, vol 141. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19423-8_13
DOI: https://doi.org/10.1007/978-3-642-19423-8_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19422-1
Online ISBN: 978-3-642-19423-8
eBook Packages: Computer Science (R0)