Vowel Recognition from Telephonic Speech Using MFCCs and Gaussian Mixture Models

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 305)

Abstract

This paper presents vowel recognition from speech using mel-frequency cepstral coefficients (MFCCs). Vowel recognition studies are conducted on both microphone-recorded speech and telephonic speech. The vowels considered for recognition are from the Hindi alphabet, namely अ (a), इ (i), उ (u), ए (e), ऐ (ai), ओ (o) and औ (au). Gaussian mixture models (GMMs) are used to build the vowel recognition models. Vowel recognition performance is 91.4% for microphone-recorded speech and 84.2% for telephonic speech.
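
The abstract does not state the model configuration, so the following is only a minimal sketch of the general decision rule usually paired with per-class GMMs: score the feature frames of an utterance under each vowel's mixture model and pick the vowel with the highest log-likelihood. Everything concrete here is an assumption made for illustration: the function names, the diagonal covariances, the two-component mixtures, and the 2-D toy vectors standing in for MFCC frames. Real models would be trained on MFCCs (for example with expectation-maximization) rather than written by hand.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(frames, gmm):
    """Total log-likelihood of a list of feature frames under one GMM.

    gmm is a list of (weight, mean, var) components (hypothetical layout).
    """
    total = 0.0
    for x in frames:
        # Log-sum-exp over components keeps the computation numerically stable.
        logs = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
        peak = max(logs)
        total += peak + math.log(sum(math.exp(value - peak) for value in logs))
    return total


def classify(frames, models):
    """Return the label whose GMM assigns the frames the highest log-likelihood."""
    return max(models, key=lambda label: gmm_log_likelihood(frames, models[label]))


# Hypothetical hand-written models; 2-D vectors stand in for MFCC frames.
TOY_MODELS = {
    "a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "i": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
}

if __name__ == "__main__":
    print(classify([[0.2, 0.1], [0.9, 1.1]], TOY_MODELS))
```

Summing per-frame log-likelihoods treats the frames as independent, which is the usual simplification when a GMM is used without any temporal model.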


Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Koolagudi, S.G., Thakur, S.N., Barthwal, A., Singh, M.K., Rawat, R., Sreenivasa Rao, K. (2012). Vowel Recognition from Telephonic Speech Using MFCCs and Gaussian Mixture Models. In: Mathew, J., Patra, P., Pradhan, D.K., Kuttyamma, A.J. (eds) Eco-friendly Computing and Communication Systems. ICECCS 2012. Communications in Computer and Information Science, vol 305. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32112-2_21

  • DOI: https://doi.org/10.1007/978-3-642-32112-2_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32111-5

  • Online ISBN: 978-3-642-32112-2

  • eBook Packages: Computer Science, Computer Science (R0)
