
Statistical and Discriminative Methods for Speech Recognition


Part of the book series: The Kluwer International Series in Engineering and Computer Science ((SECS,volume 355))

Abstract

A critical component of the pattern-matching approach to speech recognition is the training algorithm, which aims to produce typical (reference) patterns or models for accurate pattern comparison. In this chapter, we discuss speech recognizer training from a broad perspective rooted in classical Bayes decision theory. We distinguish classifier design by distribution estimation from discriminative training, motivated by the fact that in many realistic applications, such as speech recognition, the true form of the signal distribution is rarely known precisely. We argue that traditional methods relying on distribution estimation are suboptimal when the assumed distribution form is not the true one, and that “optimality” in distribution estimation does not automatically translate into “optimality” in classifier design. We compare the two methods in the context of hidden Markov modeling for speech recognition, and we report the results of several key speech recognition experiments that show the superiority of the discriminative method over the distribution estimation method.
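The central argument, that a well-estimated but misspecified distribution need not yield a good classifier, can be illustrated with a toy problem. The sketch below is not taken from the chapter and does not involve hidden Markov models. It assumes a hypothetical one-dimensional, two-class task in which class B is actually bimodal, while the designer assumes each class is Gaussian with a shared variance and equal priors. Under that assumption the plug-in Bayes boundary is the midpoint of the maximum-likelihood class means. The discriminative alternative adjusts the same single threshold by gradient descent on a sigmoid-smoothed error count, in the spirit of minimum classification error training. All names, distributions, and parameter values (`alpha`, `lr`, `steps`, the sample sizes) are illustrative assumptions.

```python
import math
import random


def sample_a(rng):
    # Class A really is Gaussian: N(0, 1).
    return rng.gauss(0.0, 1.0)


def sample_b(rng):
    # Class B is bimodal (assumed for illustration): the Gaussian model is wrong here.
    mean = 3.0 if rng.random() < 0.8 else 12.0
    return rng.gauss(mean, 1.0)


def error_rate(t, xs_a, xs_b):
    # Decision rule: choose B if x > t, otherwise choose A.
    wrong = sum(x > t for x in xs_a) + sum(x <= t for x in xs_b)
    return wrong / (len(xs_a) + len(xs_b))


def ml_threshold(xs_a, xs_b):
    # Distribution estimation: ML means of two Gaussians assumed to share a
    # variance and have equal priors, so the plug-in boundary is the midpoint.
    mean_a = sum(xs_a) / len(xs_a)
    mean_b = sum(xs_b) / len(xs_b)
    return 0.5 * (mean_a + mean_b)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def discriminative_threshold(xs_a, xs_b, t0, alpha=2.0, lr=1.0, steps=200):
    # Batch gradient descent on the mean of sigmoid(alpha * d), where d > 0
    # means the sample is misclassified: d = x - t for A, d = t - x for B.
    t = t0
    n = len(xs_a) + len(xs_b)
    for _ in range(steps):
        grad = 0.0
        for x in xs_a:
            s = sigmoid(alpha * (x - t))
            grad -= alpha * s * (1.0 - s)  # d(d)/dt = -1 for class A
        for x in xs_b:
            s = sigmoid(alpha * (t - x))
            grad += alpha * s * (1.0 - s)  # d(d)/dt = +1 for class B
        t -= lr * grad / n
    return t


rng = random.Random(0)
train_a = [sample_a(rng) for _ in range(1000)]
train_b = [sample_b(rng) for _ in range(1000)]
test_a = [sample_a(rng) for _ in range(5000)]
test_b = [sample_b(rng) for _ in range(5000)]

t_ml = ml_threshold(train_a, train_b)
t_disc = discriminative_threshold(train_a, train_b, t0=t_ml)
err_ml = error_rate(t_ml, test_a, test_b)
err_disc = error_rate(t_disc, test_a, test_b)

print(f"ML threshold:             {t_ml:.3f}  test error: {err_ml:.4f}")
print(f"discriminative threshold: {t_disc:.3f}  test error: {err_disc:.4f}")
```

In this setup the minority mode of class B drags the estimated class mean, and with it the midpoint boundary, away from the region where the two classes actually overlap. The discriminative update is driven almost entirely by samples near the boundary, so the far mode has little influence on it. The example shows only the mechanism; it says nothing about the size of the effect in real speech recognition systems.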




© 1996 Kluwer Academic Publishers

About this chapter

Cite this chapter

Juang, BH., Chou, W., Lee, CH. (1996). Statistical and Discriminative Methods for Speech Recognition. In: Lee, CH., Soong, F.K., Paliwal, K.K. (eds) Automatic Speech and Speaker Recognition. The Kluwer International Series in Engineering and Computer Science, vol 355. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-1367-0_5


  • DOI: https://doi.org/10.1007/978-1-4613-1367-0_5

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4612-8590-8

  • Online ISBN: 978-1-4613-1367-0

  • eBook Packages: Springer Book Archive
