Abstract
This paper presents vowel recognition from speech using mel-frequency cepstral coefficients (MFCCs). Both microphone-recorded speech and telephonic speech are used in the vowel recognition studies. The vowels considered for recognition are from the Hindi alphabet, namely अ (a), इ (i), उ (u), ए (e), ऐ (ai), ओ (o) and औ (au). Gaussian mixture models (GMMs) are used to develop the vowel recognition models. Vowel recognition performance is 91.4% for microphone-recorded speech and 84.2% for telephonic speech.
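To make the classification scheme concrete, the following is a minimal sketch of per-vowel Gaussian mixture modelling with a maximum-likelihood decision rule. It is not the authors' implementation: the abstract does not state the MFCC dimensionality, the number of mixture components, or the covariance type, so the 13-dimensional features, two diagonal-covariance components, and the synthetic random vectors standing in for MFCC frames are all assumptions made for illustration. The helper names (fit_diag_gmm, avg_log_likelihood, classify) are hypothetical.

```python
import numpy as np


def _log_weighted_densities(X, weights, means, variances):
    """Return an (n_frames, n_components) array of log w_k + log N(x | mu_k, diag(var_k))."""
    diff = X[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances[None, :, :], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    )
    return np.log(weights)[None, :] + log_gauss


def fit_diag_gmm(X, n_components=2, n_iter=50, seed=0, reg=1e-3):
    """Fit a diagonal-covariance GMM to X (n_frames, n_dims) with plain EM."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    # Initialise means from randomly chosen frames, variances from the global variance.
    means = X[rng.choice(n, n_components, replace=False)].copy()
    variances = np.tile(X.var(axis=0) + reg, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each frame.
        log_p = _log_weighted_densities(X, weights, means, variances)
        log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)
        resp = np.exp(log_p - log_norm)
        # M-step: re-estimate weights, means and (floored) variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        second_moment = (resp.T @ (X ** 2)) / nk[:, None]
        variances = np.maximum(second_moment - means ** 2, 0.0) + reg
    return weights, means, variances


def avg_log_likelihood(X, gmm):
    """Average per-frame log-likelihood of the frames X under one GMM."""
    log_p = _log_weighted_densities(X, *gmm)
    return float(np.mean(np.logaddexp.reduce(log_p, axis=1)))


def classify(X, models):
    """Pick the vowel whose GMM gives the frames the highest average log-likelihood."""
    return max(models, key=lambda label: avg_log_likelihood(X, models[label]))


# Synthetic stand-ins for MFCC frames (assumption: 13 coefficients per frame).
rng = np.random.default_rng(42)
centers = {"a": -3.0, "i": 0.0, "u": 3.0}
train = {v: rng.normal(c, 1.0, size=(200, 13)) for v, c in centers.items()}

# One GMM per vowel class.
models = {v: fit_diag_gmm(X) for v, X in train.items()}

# Classify an unseen segment drawn from the "i" distribution.
test_segment = rng.normal(centers["i"], 1.0, size=(30, 13))
predicted = classify(test_segment, models)
print("predicted vowel:", predicted)
```

In a real system the synthetic arrays would be replaced by MFCC frames extracted from labelled vowel segments, and separate model sets could be trained for microphone-recorded and telephonic speech to compare the two conditions.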
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Koolagudi, S.G., Thakur, S.N., Barthwal, A., Singh, M.K., Rawat, R., Sreenivasa Rao, K. (2012). Vowel Recognition from Telephonic Speech Using MFCCs and Gaussian Mixture Models. In: Mathew, J., Patra, P., Pradhan, D.K., Kuttyamma, A.J. (eds) Eco-friendly Computing and Communication Systems. ICECCS 2012. Communications in Computer and Information Science, vol 305. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32112-2_21
DOI: https://doi.org/10.1007/978-3-642-32112-2_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32111-5
Online ISBN: 978-3-642-32112-2
eBook Packages: Computer Science (R0)