Abstract
This chapter provides an overview of the maximum entropy framework and its application to a problem in natural language processing. The framework offers a principled way to combine many pieces of evidence from an annotated training set into a single probability model, and it has been applied to a wide range of tasks in natural language processing, including part-of-speech tagging. This chapter covers the maximum entropy formulation, its relationship to maximum likelihood, a parameter estimation method, and the details of the part-of-speech tagging application.
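To make the formulation concrete, here is a brief sketch in the notation of Berger et al. (1996), listed under Recommended Reading; the entry itself develops these definitions in full. A conditional maximum entropy model over tags y given contexts x is log-linear:

\[
p_\lambda(y \mid x) = \frac{1}{Z_\lambda(x)} \exp\Big( \sum_i \lambda_i f_i(x, y) \Big),
\qquad
Z_\lambda(x) = \sum_{y'} \exp\Big( \sum_i \lambda_i f_i(x, y') \Big),
\]

where each \(f_i\) is typically a binary feature, e.g., \(f_i(x, y) = 1\) if the current word ends in "ing" and \(y\) is VBG, else 0. Among all distributions whose feature expectations match the empirical expectations from the annotated training set, the maximum entropy solution is unique and coincides with the maximum likelihood solution within this log-linear family. Generalized iterative scaling (Darroch and Ratcliff 1972) finds it with the update

\[
\lambda_i \leftarrow \lambda_i + \frac{1}{C} \log \frac{\tilde{E}[f_i]}{E_{p_\lambda}[f_i]},
\]

where \(\tilde{E}[f_i]\) is the empirical expectation, \(E_{p_\lambda}[f_i]\) is the expectation under the current model, and \(C\) is a constant such that \(\sum_i f_i(x, y) = C\) for every pair \((x, y)\).

The following minimal sketch implements this update on a toy tagging problem. The feature set, tags, and data are invented for illustration and are not the entry's implementation:

    import math
    from collections import defaultdict

    # Toy training set: (context feature set, tag). Purely illustrative.
    data = [({"suffix=ing"}, "VBG"),
            ({"suffix=ed"}, "VBD"),
            ({"suffix=ing"}, "VBG")]
    labels = sorted({y for _, y in data})

    def features(ctx, y):
        # Indicator features keyed by (context predicate, tag).
        return [(c, y) for c in ctx]

    # GIS requires sum_i f_i(x, y) = C for all (x, y); here every pair
    # activates exactly len(ctx) features, which is constant (1) in this set.
    C = max(len(features(ctx, y)) for ctx, _ in data for y in labels)

    # Empirical feature expectations (raw counts over the training set).
    emp = defaultdict(float)
    for ctx, y in data:
        for f in features(ctx, y):
            emp[f] += 1.0

    lam = defaultdict(float)  # feature weights, initialized to zero

    def prob(ctx):
        # p(y | x) under the current weights.
        scores = {y: math.exp(sum(lam[f] for f in features(ctx, y)))
                  for y in labels}
        z = sum(scores.values())
        return {y: s / z for y, s in scores.items()}

    for _ in range(50):
        # Model feature expectations under the current weights.
        model = defaultdict(float)
        for ctx, _ in data:
            p = prob(ctx)
            for y in labels:
                for f in features(ctx, y):
                    model[f] += p[y]
        # GIS update: only features observed in training are instantiated,
        # as is standard practice; others keep weight zero.
        for f in emp:
            lam[f] += (1.0 / C) * math.log(emp[f] / model[f])

After training, prob({"suffix=ing"}) places nearly all probability mass on VBG, since the constraint forces the model expectation of the ("suffix=ing", "VBG") feature to match its empirical count.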
Recommended Reading
Berger AL, Della Pietra SA, Della Pietra VJ (1996) A maximum entropy approach to natural language processing. Comput Linguist 22(1):39–71
Borthwick A (1999) A maximum entropy approach to named entity recognition. PhD thesis, New York University
Chen S, Rosenfeld R (1999) A Gaussian prior for smoothing maximum entropy models. Technical report CMU-CS-99-108, Carnegie Mellon University
Church KW, Mercer RL (1993) Introduction to the special issue on computational linguistics using large corpora. Comput Linguist 19(1):1–24
Collobert R, Weston J, Bottou L, Karlen M, Kavukcuoglu K, Kuksa P (2011) Natural language processing (almost) from scratch. J Mach Learn Res 12:2493–2537
Curran JR, Clark S (2003) Investigating GIS and smoothing for maximum entropy taggers. In: Proceedings of the tenth conference of the European Chapter of the Association for Computational Linguistics – Volume 1. Association for Computational Linguistics, pp 91–98
Darroch J, Ratcliff D (1972) Generalized iterative scaling for log-linear models. Ann Math Stat 43(5):1470–1480
Goodman J (2002) Sequential conditional generalized iterative scaling. In: Proceedings of the Association for Computational Linguistics
Ittycheriah A, Franz M, Zhu W, Ratnaparkhi A (2001) Question answering using maximum-entropy components. In: Proceedings of NAACL
Jaynes ET (1957) Information theory and statistical mechanics. Phys Rev 106(4):620–630
Lafferty J, McCallum A, Pereira F (2001) Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Proceedings of the 18th international conference on machine learning. Morgan Kaufmann, San Francisco, pp 282–289
Lau R, Rosenfeld R, Roukos S (1993) Adaptive language modeling using the maximum entropy principle. In: Proceedings of the ARPA human language technology workshop. Morgan Kaufmann, San Francisco, pp 108–113
Malouf R (2002) A comparison of algorithms for maximum entropy parameter estimation. In: Sixth conference on natural language learning, pp 49–55
Marcus MP, Santorini B, Marcinkiewicz MA (1993) Building a large annotated corpus of English: the Penn Treebank. Comput Linguist 19(2):313–330
Ratnaparkhi A (1996) A maximum entropy model for part-of-speech tagging. In: Brill E, Church K (eds) Proceedings of the conference on empirical methods in natural language processing. Association for Computational Linguistics, Somerset, pp 133–142
Ratnaparkhi A (1999) Learning to parse natural language with maximum entropy models. Mach Learn 34(1–3):151–175
Sha F, Pereira F (2003) Shallow parsing with conditional random fields. In: Proceedings of HLT-NAACL, pp 213–220