
Using the Maximum Entropy Method for Natural Language Processing: Category Estimation, Feature Extraction, and Error Correction


Abstract

The maximum entropy (ME) method is a powerful supervised machine learning technique that is useful for various tasks. In this paper, we introduce new studies that successfully employ ME for natural language processing (NLP) problems, including machine translation and information extraction. Specifically, we demonstrate, using simulation results, three applications of ME in NLP: estimation of categories, extraction of important features, and correction of erroneous data items. We also compare the performance of the proposed ME methods with that of other state-of-the-art approaches.


Notes

  1. There are many studies on categorization using the ME method [6, 7, 8, 9, 10, 11].
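
A maximum entropy classifier is equivalent to multinomial logistic regression, so the kind of category estimation described here can be illustrated in a few lines of Python. This is a hypothetical sketch, not the authors' implementation (they used the ME modeling package in [4]), and all feature names and categories are invented:

```python
# A minimal, hypothetical sketch of ME category estimation; the authors
# used the ME modeling package in [4], not this code. Maximum entropy
# classification is equivalent to multinomial logistic regression.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# Invented training items: binary context features -> output category.
train_feats = [{"prev=the": 1, "suffix=ing": 1},
               {"prev=will": 1, "suffix=base": 1},
               {"prev=the": 1, "suffix=ed": 1}]
train_cats = ["VBG", "VB", "VBN"]

vec = DictVectorizer()
X = vec.fit_transform(train_feats)

# ME model: p(c | x) is proportional to exp(sum_i lambda_i * f_i(x, c)).
clf = LogisticRegression(max_iter=1000)
clf.fit(X, train_cats)

x = vec.transform([{"prev=the": 1, "suffix=ing": 1}])
print(clf.predict(x))        # most probable category
print(clf.predict_proba(x))  # conditional distribution over all categories
```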

  2. We used anonymized labels (A to F) for the six translation systems because they are commercially available, and we did not want to influence the market.

  3. There are many studies on feature selection [2, 14]. However, their main purpose is to reduce the number of features for better learning, whereas our purpose in extracting features is to examine the experimental results. In addition to the study that used the ME method to extract features and examine experimental results, we conducted another study that estimated the referential properties of noun phrases [15]. In that study, we classified features into two types: (i) strong features, on which an output category was strongly dependent and by which it was necessarily determined (the normalized alpha values of these features were almost 1.0, the maximum value, since a normalized alpha value is a kind of probability), and (ii) weak features, which indicated the category likely to be output, although another category could be output when stronger features appeared. The classified results were highly useful for examining the experimental results.
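
The strong/weak split can be sketched in code under one assumption that is ours, not the paper's: that the "normalized alpha value" of feature f for category c is alpha_{c,f} / sum over c' of alpha_{c',f}, with alpha = exp(lambda). The 0.95 threshold is likewise illustrative:

```python
import numpy as np

def split_features(lambdas, feature_names, categories, strong_threshold=0.95):
    """Split ME features into strong and weak ones.

    lambdas: (n_categories, n_features) matrix of learned ME weights;
    alpha = exp(lambda), normalized here over categories so that each
    feature gets a probability-like score per category in [0, 1].
    """
    alphas = np.exp(lambdas)
    norm = alphas / alphas.sum(axis=0, keepdims=True)
    strong, weak = [], []
    for j, name in enumerate(feature_names):
        c = int(norm[:, j].argmax())
        score = float(norm[c, j])
        # Scores near 1.0: the feature (almost) determines the category.
        bucket = strong if score >= strong_threshold else weak
        bucket.append((name, categories[c], score))
    return strong, weak
```

For a multiclass scikit-learn LogisticRegression such as the one sketched under note 1, clf.coef_ (one row of weights per category) could serve as lambdas; this wiring is our assumption, not a step from the paper.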

  4. This correction is equivalent to changing an original category to the category estimated by the method in Sect. “Method of Categorization”. The technique for error correction is thus closely related to that for category estimation.
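
A hedged sketch of this correction step, assuming any trained classifier with scikit-learn's predict_proba interface (for instance the ME sketch under note 1): replace an annotated category when the model disagrees with high confidence. The 0.9 threshold is our assumption; the actual procedure is described in [12].

```python
def correct_corpus(clf, X, annotated, threshold=0.9):
    """Replace annotated categories that a trained ME classifier
    contradicts with high confidence; returns the corrected list."""
    proba = clf.predict_proba(X)                  # p(c | x) per corpus item
    predicted = clf.classes_[proba.argmax(axis=1)]
    confidence = proba.max(axis=1)
    corrected = list(annotated)
    for i in range(len(annotated)):
        if predicted[i] != annotated[i] and confidence[i] >= threshold:
            corrected[i] = predicted[i]           # adopt the ME estimate
    return corrected
```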

  5. Although the simple Bayes method sometimes shows high performance in restricted tasks such as text categorization [21] and word sense disambiguation [22], it generally shows low performance in categorization tasks.
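
Such comparisons can be reproduced on a small scale with the following self-contained toy harness, which fits a naive ("simple") Bayes baseline and an ME classifier on the same invented features; it illustrates the experimental setup only, not the paper's results:

```python
# Self-contained toy comparison of a simple (naive) Bayes baseline with an
# ME classifier; data, features, and categories are all invented.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import BernoulliNB

items = [({"f=a": 1, "f=b": 1}, "X"),
         ({"f=b": 1, "f=c": 1}, "Y"),
         ({"f=a": 1, "f=c": 1}, "X")]
vec = DictVectorizer()
X = vec.fit_transform([feats for feats, _ in items])
y = [cat for _, cat in items]

for model in (BernoulliNB(), LogisticRegression(max_iter=1000)):
    model.fit(X, y)
    print(type(model).__name__, model.predict(vec.transform([{"f=a": 1}])))
```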

References

  1. Berger AL, Pietra SAD, Pietra VJD. A maximum entropy approach to natural language processing. Comput Linguist. 1996;22(1):39–71.

  2. Ristad ES. Maximum entropy modeling for natural language. Madrid: ACL/EACL Tutorial Program; 1997.

  3. Pietra SD, Pietra VD, Lafferty J. Inducing features of random fields. Technical report, Carnegie Mellon University CMU-CS-95-144. 1995.

  4. Utiyama M. Maximum entropy modeling package. 2006. http://www.nict.go.jp/x/x161/members/mutiyama/software.html#maxent.

  5. Murata M, Ma Q, Uchimoto K, Kanamaru T, Isahara H. Japanese-to-English translations of tense, aspect, and modality using machine-learning methods and comparison with machine-translation systems on market. Lang Resour Eval. 2007;40:233–242.

  6. Ratnaparkhi A. A maximum entropy model for part-of-speech tagging. In: Proceedings of empirical methods for natural language processing. 1996. p. 133–142.

  7. Borthwick A, Sterling J, Agichtein E, Grishman R. Exploiting diverse knowledge sources via maximum entropy in named entity recognition. In: Proceedings of the sixth workshop on very large corpora. 1998. p. 152–160.

  8. Ratnaparkhi A. A linear observed time statistical parser based on maximum entropy models. In: Proceedings of empirical methods for natural language processing. 1997.

  9. Nigam K, Lafferty J, McCallum A. Using maximum entropy for text classification. In: Proceedings of the IJCAI-99 workshop on machine learning for information filtering. 1999. p. 61–67.

  10. Uchimoto K, Murata M, Ozaku H, Ma Q, Isahara H. Named entity extraction based on maximum entropy model and transformation rules. In: Proceedings of the 38th annual meeting of the Association for Computational Linguistics. 2000.

  11. Ittycheriah A, Franz M, Zhu WJ, Ratnaparkhi A. Question answering using maximum entropy components. In: Proceedings of NAACL-2001. 2001.

  12. Murata M, Utiyama M, Uchimoto K, Ma Q, Isahara H. Correction of errors in a verb modality corpus used for machine translation with a machine-learning method. ACM Trans Asian Lang Inf Process. 2005;4(1):18–37.

  13. Murata M, Nishimura R, Doi K, Kanamaru T, Torisawa K. Analysis of the degree of importance of information using newspapers and questionnaires. In: Proceedings of 2008 IEEE international conference on natural language processing and knowledge engineering (IEEE NLP-KE 2008). 2008. p. 137–144.

  14. Jebara T, Jaakkola T. Feature selection and dualities in maximum entropy discrimination. In: Uncertainty in artificial intelligence. 2000. p. 291–300.

  15. Murata M, Uchimoto K, Ma Q, Isahara H. A machine-learning approach to estimating the referential properties of Japanese noun phrases. In: Computational linguistics and intelligent text processing: second international conference (CICLing 2001), Mexico City, February 2001, proceedings. 2001. p. 142–154.

  16. Cristianini N, Shawe-Taylor J. An introduction to support vector machines and other kernel-based learning methods. Cambridge: Cambridge University Press; 2000.

  17. Taira H, Haruno M. Feature selection in SVM text categorization. In: Proceedings of AAAI2001. 2001. p. 480–486.

  18. Nakagawa T, Kudoh T, Matsumoto Y. Unknown word guessing and part-of-speech tagging using support vector machines. In: NLPRS 2001. 2001.

  19. Suzuki J, Sasaki Y, Maeda E. SVM answer selection for open-domain question answering. In: Proceedings of the 19th international conference on computational linguistics (COLING-2002). 2002. p. 974–980.

  20. Murata M, Ma Q, Isahara H. Comparison of three machine-learning methods for Thai part-of-speech tagging. ACM Trans Asian Lang Inf Process. 2002;1(2):145–158.

  21. Yang Y, Liu X. A re-examination of text categorization methods. In: Proceedings of the 22nd annual international ACM SIGIR conference on research and development in information retrieval (SIGIR ’99). 1999. p. 42–49.

  22. Murata M, Utiyama M, Uchimoto K, Ma Q, Isahara H. Japanese word sense disambiguation using the simple Bayes and support vector machine methods. In: Proceedings of SENSEVAL-2. 2001.

Author information

Corresponding author

Correspondence to Masaki Murata.

Cite this article

Murata, M., Uchimoto, K., Utiyama, M. et al. Using the Maximum Entropy Method for Natural Language Processing: Category Estimation, Feature Extraction, and Error Correction. Cogn Comput 2, 272–279 (2010). https://doi.org/10.1007/s12559-010-9046-3
