A Maximum Entropy chunking model with N-fold template correction

Journal of Electronics (China)

Abstract

This letter presents a new chunking method that combines a Maximum Entropy (ME) model with an N-fold template correction model. First, two types of machine learning models are described. Based on an analysis of the two, a chunking model is proposed that combines the advantages of the conditional probability model and the rule-based model. The selection of features and rule templates for the chunking model is discussed. Experimental results on the CoNLL-2000 corpus show that this approach achieves an F-score of 92.93%. The new chunking model outperforms both the ME model and the ME Markov model.
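The F-score reported above is the harmonic mean of chunk-level precision and recall. As a rough illustration of how such a score can be computed, the following minimal sketch assumes chunks are encoded with BIO tags (for example B-NP, I-NP, O) over a single tagged sequence; the function names `extract_chunks` and `f_score` are hypothetical and are not taken from the article or from the official evaluation script.

```python
def extract_chunks(tags):
    """Return the set of (start, end, type) spans found in a BIO tag sequence."""
    chunks = set()
    start, ctype = None, None
    # A trailing "O" sentinel closes any chunk still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        prefix, _, label = tag.partition("-")
        # An open chunk ends at O, at any B- tag, or at an I- tag of another type.
        if start is not None and (prefix != "I" or label != ctype):
            chunks.add((start, i, ctype))
            start, ctype = None, None
        # A chunk opens at B-, or at an I- tag with no chunk currently open.
        if prefix == "B" or (prefix == "I" and start is None):
            start, ctype = i, label
    return chunks


def f_score(gold_tags, pred_tags):
    """Chunk-level F-score: a chunk counts only if span and type both match."""
    gold = extract_chunks(gold_tags)
    pred = extract_chunks(pred_tags)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example (made-up tags, not CoNLL-2000 data).
gold = ["B-NP", "I-NP", "B-VP", "B-NP", "O"]
pred = ["B-NP", "I-NP", "B-VP", "O", "O"]
score = f_score(gold, pred)
```

In this toy example the prediction recovers two of the three gold chunks and proposes no spurious ones, so precision is 1, recall is 2/3, and the F-score is 0.8.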

Author information

Corresponding author

Correspondence to Sun Guanglu.

Additional information

Supported by the National Natural Science Foundation of China (No. 60504021).

Corresponding author: Sun Guanglu, born in 1979, male, Ph.D. candidate. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China.

About this article

Cite this article

Sun, G., Guan, Y. & Wang, X. A Maximum Entropy chunking model with N-fold template correction. J. of Electron.(China) 24, 690–695 (2007). https://doi.org/10.1007/s11767-006-0215-1

  • DOI: https://doi.org/10.1007/s11767-006-0215-1
