Period Disambiguation with Maxent Model

  • Conference paper
Natural Language Processing – IJCNLP 2005 (IJCNLP 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3651)

Abstract

This paper presents our recent work on period disambiguation, the core problem in sentence boundary identification, using a maximum entropy (Maxent) model. We conduct a number of experiments on the PTB-II WSJ corpus to investigate how the context window, the feature space, and lexical information such as abbreviated words and sentence-initial words affect learning performance. Such lexical information can be acquired automatically from a training corpus by a learner. Our experimental results show that extending the feature space to integrate these two kinds of lexical information eliminates 93.52% of the errors remaining in the baseline Maxent model, achieving an F-score of 99.8227%.
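As a rough illustration of the ideas named in the abstract, the sketch below collects abbreviations and sentence-initial words from a sentence-segmented training corpus and builds context-window features for a candidate period. It is not the authors' implementation: the function names (`acquire_lexicons`, `period_features`), the feature names, the `<PAD>` symbol, the default window size and the tokenisation (abbreviations keep their period, the sentence-final period is its own token) are all assumptions made for this example.

```python
from collections import Counter


def acquire_lexicons(sentences, min_count=1):
    """Collect abbreviations and sentence-initial words from gold sentences.

    `sentences` is a list of token lists, one list per sentence.
    Assumption: a token ending in "." that is not the last token of its
    sentence is counted as an abbreviation.
    """
    abbrevs, initials = Counter(), Counter()
    for tokens in sentences:
        if not tokens:
            continue
        initials[tokens[0]] += 1
        for tok in tokens[:-1]:
            if tok.endswith(".") and len(tok) > 1:
                abbrevs[tok] += 1

    def keep(counter):
        return {w for w, n in counter.items() if n >= min_count}

    return keep(abbrevs), keep(initials)


def period_features(tokens, i, abbrevs, initials, window=2):
    """Binary features for the period that ends tokens[i] in a flat stream."""
    feats = {}
    # Context-window features: the tokens at offsets -window..+window.
    for off in range(-window, window + 1):
        j = i + off
        word = tokens[j] if 0 <= j < len(tokens) else "<PAD>"
        feats[f"w[{off}]={word}"] = 1
    # Lexical-information features drawn from the acquired lexicons.
    nxt = tokens[i + 1] if i + 1 < len(tokens) else "<PAD>"
    feats["cur_is_abbrev"] = int(tokens[i] in abbrevs)
    feats["next_is_sent_initial"] = int(nxt in initials)
    feats["next_is_capitalized"] = int(nxt[:1].isupper())
    return feats


# Hypothetical toy corpus, used only to exercise the two functions.
train = [
    ["Mr.", "Smith", "arrived", "."],
    ["He", "left", "at", "5", "p.m.", "today", "."],
]
abbrevs, initials = acquire_lexicons(train)
stream = [tok for sent in train for tok in sent]
features_for_mr = period_features(stream, 0, abbrevs, initials)
features_for_stop = period_features(stream, 3, abbrevs, initials)
```

On this toy corpus the period in "Mr." is flagged as belonging to a known abbreviation, while the standalone period after "arrived" is followed by a known sentence-initial word ("He"). Feature dictionaries of this kind could then be passed to any Maxent (multinomial logistic regression) trainer; the choice of trainer is left open here.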

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kit, C., Liu, X. (2005). Period Disambiguation with Maxent Model. In: Dale, R., Wong, KF., Su, J., Kwong, O.Y. (eds) Natural Language Processing – IJCNLP 2005. IJCNLP 2005. Lecture Notes in Computer Science, vol 3651. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11562214_20

  • DOI: https://doi.org/10.1007/11562214_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29172-5

  • Online ISBN: 978-3-540-31724-1

  • eBook Packages: Computer Science, Computer Science (R0)
