Abstract
This letter presents a new chunking method that combines a Maximum Entropy (ME) model with an N-fold template correction model. Two types of machine learning models are first described. Based on an analysis of the two, a chunking model is proposed that combines the advantages of the conditional probability model and the rule-based model. The selection of features and rule templates for the chunking model is then discussed. Experimental results on the CoNLL-2000 corpus show that the approach achieves an F-score of 92.93%, outperforming both the ME model and the ME Markov model.
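To make the reported metric concrete, the following is a minimal sketch of a chunk-level F-score, assuming chunks are encoded as BIO tags (for example `B-NP`, `I-NP`, `O`) and that a predicted chunk counts as correct only when its type and both boundaries match the gold chunk. The function names `extract_chunks` and `chunk_f_score` are hypothetical and chosen for illustration; this is not the authors' evaluation code.

```python
def extract_chunks(tags):
    """Return the set of (type, start, end) chunks encoded by a BIO tag sequence."""
    chunks = set()
    start, ctype = None, None
    # A trailing "O" sentinel closes a chunk that runs to the end of the sentence.
    for i, tag in enumerate(list(tags) + ["O"]):
        # An open chunk ends at "O", at any B- tag, or at an I- tag of another type.
        if ctype is not None and (
            tag == "O" or tag.startswith("B-") or tag[2:] != ctype
        ):
            chunks.add((ctype, start, i))
            start, ctype = None, None
        # Open a new chunk (assumption: an I- tag with no preceding B- also opens one).
        if tag != "O" and ctype is None:
            start, ctype = i, tag[2:]
    return chunks


def chunk_f_score(gold_tags, pred_tags):
    """Harmonic mean of chunk-level precision and recall."""
    gold = extract_chunks(gold_tags)
    pred = extract_chunks(pred_tags)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = ["B-NP", "I-NP", "B-VP", "B-NP", "O"]
    pred = ["B-NP", "I-NP", "B-VP", "B-NP", "I-NP"]
    print(chunk_f_score(gold, pred))
```

In this toy sentence the last predicted noun phrase has the wrong right boundary, so only two of three chunks match in each direction and the score falls well below 1 even though four of five tags are correct.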
Additional information
Supported by National Natural Science Foundation of China (No. 60504021).
Corresponding author: Sun Guanglu, born in 1979, male, Ph.D. candidate. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China.
About this article
Cite this article
Sun, G., Guan, Y. & Wang, X. A Maximum Entropy chunking model with N-fold template correction. J. of Electron. (China) 24, 690–695 (2007). https://doi.org/10.1007/s11767-006-0215-1