Analyzing the Statistical Behavior of Smoothing Method

  • Conference paper

Keywords

  • Language Model
  • Smoothing Method
  • Training Corpus
  • Word Sequence
  • Escape Probability

Copyright information

© 2007 Springer

About this paper

Cite this paper

Huang, FL., Yu, MS. (2007). Analyzing the Statistical Behavior of Smoothing Method. In: Sobh, T. (eds) Innovations and Advanced Techniques in Computer and Information Sciences and Engineering. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6268-1_35

  • DOI: https://doi.org/10.1007/978-1-4020-6268-1_35

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-1-4020-6267-4

  • Online ISBN: 978-1-4020-6268-1

  • eBook Packages: Engineering, Engineering (R0)
