Abstract
This paper presents an approach to partial parsing of natural language sentences that performs global inference on top of the output of hierarchically learned local classifiers. The best decomposition of a sentence into clauses is chosen by a dynamic-programming-based scheme that takes previously identified partial solutions into account. This inference scheme applies learning at several levels: when identifying potential clauses and when scoring partial solutions. The classifiers are trained hierarchically, each building on previous classifications. The method presented significantly outperforms the best methods known so far for clause identification.
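The dynamic programming idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes each candidate clause span already has an additive score from some hypothetical classifier, that a span is worth keeping only when its score is positive, and that a valid decomposition is any set of spans that nest or are disjoint but never cross. The function name `best_clause_split` and the `scores` dictionary are illustrative assumptions.

```python
from functools import lru_cache
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) word indices


def best_clause_split(n_words: int, scores: Dict[Span, float]) -> Tuple[float, List[Span]]:
    """Return the highest-scoring set of non-crossing clause spans.

    `scores` maps candidate spans to hypothetical classifier scores
    (an assumption of this sketch, not taken from the paper).
    """

    @lru_cache(maxsize=None)
    def best(start: int, end: int) -> Tuple[float, Tuple[Span, ...]]:
        if start > end:
            return 0.0, ()
        # Option 1: combine the best partial solutions of two adjacent sub-ranges.
        top_score: float = 0.0
        top_spans: Tuple[Span, ...] = ()
        for mid in range(start, end):
            left_score, left_spans = best(start, mid)
            right_score, right_spans = best(mid + 1, end)
            if left_score + right_score > top_score:
                top_score = left_score + right_score
                top_spans = left_spans + right_spans
        # Option 2: additionally wrap the whole range in a clause of its own.
        # A wrapping span never crosses anything inside it, so it is added
        # whenever its score is positive.
        own = scores.get((start, end), 0.0)
        if own > 0:
            top_score += own
            top_spans = ((start, end),) + top_spans
        return top_score, top_spans

    total, spans = best(0, n_words - 1)
    return total, sorted(spans)
```

For a six-word sentence with candidates `{(0, 5): 1.0, (1, 3): 0.8, (2, 4): 0.9, (4, 5): 0.5}`, the spans `(1, 3)` and `(2, 4)` cross, so at most one of them can be kept; the sketch keeps `(1, 3)` because it can be combined with `(4, 5)` for a higher total. Memoizing on `(start, end)` is what lets larger ranges reuse previously solved partial solutions.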
Supported by a grant from the Catalan Research Department.
This research is partially funded by the Spanish Research Department (TIC2000-0335-C03-02, TIC2000-1735-C02-02) and the EC (NAMIC IST-1999-12392).
Supported by NSF grants IIS-99-84168, ITR-IIS-00-85836 and an ONR MURI award.
© 2002 Springer-Verlag Berlin Heidelberg
Carreras, X., Màrquez, L., Punyakanok, V., Roth, D. (2002). Learning and Inference for Clause Identification. In: Elomaa, T., Mannila, H., Toivonen, H. (eds) Machine Learning: ECML 2002. ECML 2002. Lecture Notes in Computer Science, vol 2430. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36755-1_4
DOI: https://doi.org/10.1007/3-540-36755-1_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44036-9
Online ISBN: 978-3-540-36755-0
eBook Packages: Springer Book Archive