Spotting Multilingual Consonant-Vowel Units of Speech Using Neural Network Models

  • Conference paper
Nonlinear Analyses and Algorithms for Speech Processing (NOLISP 2005)

Abstract

A multilingual speech recognition system is required for tasks that use several languages in a single speech recognition application. In this paper, we propose an approach to multilingual speech recognition based on spotting consonant-vowel (CV) units. The important features of the spotting approach are that it needs no automatic segmentation of speech and that it does not require models of higher-level units to recognise the CV units. The main issues in spotting multilingual CV units are locating the anchor points and labeling the regions around these anchor points using suitable classifiers. Vowel onset points (VOPs) are used as the anchor points. The distribution-capturing ability of autoassociative neural network (AANN) models is explored for the detection of VOPs in continuous speech. For labeling, we explore classification models such as support vector machines (SVMs), which are capable of discriminating between confusable classes of CV units and of generalising from a limited amount of training data. The data for similar CV units across languages are shared to train the classifiers for recognition of CV units of speech in multiple languages. We study the spotting approach for recognition of a large number of CV units in a broadcast news corpus of three Indian languages.
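The two-stage structure described in the abstract (locate anchor points, then label the region around each one) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that a VOP detector has already produced one evidence score per frame, and it stands in for the AANN and SVM models with a plain score sequence and an arbitrary `classify` callable. The function names, the window sizes, the threshold, and the minimum gap are all hypothetical values chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Frame = Sequence[float]


def pick_anchor_points(evidence: Sequence[float], threshold: float, min_gap: int) -> List[int]:
    """Return frame indices of local maxima in `evidence` that reach `threshold`.

    A peak closer than `min_gap` frames to a stronger accepted peak is dropped.
    """
    candidates = [
        i for i in range(1, len(evidence) - 1)
        if evidence[i] >= threshold
        and evidence[i] > evidence[i - 1]
        and evidence[i] >= evidence[i + 1]
    ]
    # Accept the strongest candidates first, then enforce the minimum spacing.
    accepted: List[int] = []
    for i in sorted(candidates, key=lambda k: evidence[k], reverse=True):
        if all(abs(i - j) >= min_gap for j in accepted):
            accepted.append(i)
    return sorted(accepted)


def spot_cv_units(
    frames: Sequence[Frame],
    evidence: Sequence[float],
    classify: Callable[[List[float]], str],
    left: int = 2,
    right: int = 2,
    threshold: float = 0.5,
    min_gap: int = 3,
) -> List[Tuple[int, str]]:
    """Label a fixed window of frames around each anchor point.

    `classify` is a placeholder for a trained classifier (assumption: it maps
    a flattened window of feature values to a CV label).
    """
    spotted = []
    for anchor in pick_anchor_points(evidence, threshold, min_gap):
        start, end = anchor - left, anchor + right + 1
        if start < 0 or end > len(frames):
            continue  # Skip anchors too close to the utterance boundary.
        window = [value for frame in frames[start:end] for value in frame]
        spotted.append((anchor, classify(window)))
    return spotted


# Toy data: made-up evidence scores and one-dimensional "feature" frames.
evidence = [0.0, 0.1, 0.9, 0.2, 0.1, 0.0, 0.3, 0.8, 0.7, 0.1, 0.0, 0.0]
frames = [[float(i)] for i in range(len(evidence))]


def toy_classifier(window: List[float]) -> str:
    # Hypothetical stand-in for a trained classifier.
    return "ka" if sum(window) < 20 else "ti"


print(spot_cv_units(frames, evidence, toy_classifier))
```

Note that nothing in this sketch segments the utterance or models words: each anchor is labeled independently from a fixed-size window, which is the property the abstract highlights for the spotting approach.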

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gangashetty, S.V., Sekhar, C.C., Yegnanarayana, B. (2006). Spotting Multilingual Consonant-Vowel Units of Speech Using Neural Network Models. In: Faundez-Zanuy, M., Janer, L., Esposito, A., Satue-Villar, A., Roure, J., Espinosa-Duro, V. (eds) Nonlinear Analyses and Algorithms for Speech Processing. NOLISP 2005. Lecture Notes in Computer Science, vol 3817. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11613107_27

  • DOI: https://doi.org/10.1007/11613107_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-31257-4

  • Online ISBN: 978-3-540-32586-4

  • eBook Packages: Computer Science, Computer Science (R0)
