Abstract
This chapter presents the application of Entropy Guided Transformation Learning (ETL) to language-independent phrase chunking (PCK). The PCK task consists of dividing a text into non-overlapping phrases. We apply ETL and ETL committee to four corpora in three languages: Portuguese, English and Hindi. On all four corpora, the ETL system achieves very competitive results, and on two of them it achieves state-of-the-art results. ETL committee significantly improves on the ETL results for all four corpora. This chapter is organized as follows. In Sect. 6.1, we describe the task and the selected corpora. In Sect. 6.2, we detail some modeling configurations used in our PCK system. In Sect. 6.3, we show some configurations used in the machine learning algorithms. Section 6.4 presents the application of ETL to the SNR-CLIC Corpus. In Sect. 6.5, we detail the application of ETL to the Ramshaw and Marcus Corpus. Section 6.6 presents the application of ETL to the CoNLL-2000 Corpus. In Sect. 6.7, we present the application of ETL to the SPSAL-2007 Corpus. Finally, Sect. 6.8 presents some concluding remarks.
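To make "dividing a text into non-overlapping phrases" concrete, the sketch below decodes per-token chunk tags into phrases. It assumes the IOB2 tag encoding (B-X opens a chunk of type X, I-X continues it, O is outside any chunk), which this abstract does not specify; the function name `iob2_to_chunks` and the example sentence are illustrative only and are not taken from the chapter.

```python
from typing import List, Tuple


def iob2_to_chunks(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group tokens into non-overlapping (chunk_type, phrase) pairs.

    Assumes IOB2 tags; an I-X tag that does not continue an open
    chunk of type X is treated leniently as the start of a new chunk.
    """
    chunks: List[Tuple[str, str]] = []
    current_type = None
    current_words: List[str] = []

    for token, tag in zip(tokens, tags):
        prefix, _, chunk_type = tag.partition("-")

        # An I- tag of the same type extends the chunk that is open.
        if prefix == "I" and chunk_type == current_type:
            current_words.append(token)
            continue

        # Any other tag closes the open chunk, if there is one.
        if current_words:
            chunks.append((current_type, " ".join(current_words)))

        if prefix == "O":
            current_type, current_words = None, []
        else:
            current_type, current_words = chunk_type, [token]

    # Flush the chunk still open at the end of the sentence.
    if current_words:
        chunks.append((current_type, " ".join(current_words)))
    return chunks


tokens = ["He", "reckons", "the", "current", "account", "deficit", "will", "narrow"]
tags = ["B-NP", "B-VP", "B-NP", "I-NP", "I-NP", "I-NP", "B-VP", "I-VP"]
for chunk_type, phrase in iob2_to_chunks(tokens, tags):
    print(chunk_type, phrase)
```

Because every token carries exactly one tag, the decoded phrases cannot overlap, which is what makes chunking expressible as a token classification problem.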
References
Avinesh, P.V.S., Karthik, G.: Part-of-speech tagging and chunking using conditional random fields and transformation based learning. In: Proceedings of the IJCAI and the Workshop on Shallow Parsing for South Asian Languages, pp. 21–24 (2007)
Bharati, A., Mannem, P.R.: Introduction to shallow parsing contest on South Asian languages. In: Proceedings of the IJCAI and the Workshop on Shallow Parsing for South Asian Languages, pp. 1–8 (2007)
dos Santos, C.N., Oliveira, C.: Constrained atomic term: Widening the reach of rule templates in transformation based learning. In: Portuguese Conference on Artificial Intelligence—EPIA, pp. 622–633 (2005)
Freitas, M.C., Garrao, M., Oliveira, C., dos Santos, C.N., Silveira, M.: A anotação de um corpus para o aprendizado supervisionado de um modelo de sn. In: Proceedings of the III TIL / XXV Congresso da SBC. São Leopoldo (2005)
Kudo, T., Matsumoto, Y.: Chunking with support vector machines. In: Proceedings of the NAACL-2001 (2001)
Ramshaw, L., Marcus, M.: Text chunking using transformation-based learning. In: Yarowsky, D., Church, K. (eds.) Proceedings of the Third Workshop on Very Large Corpora, pp. 82–94. Association for Computational Linguistics, Somerset (1995)
Sang, E.F.T.K., Buchholz, S.: Introduction to the CoNLL-2000 shared task: chunking. In: Proceedings of the 2nd Workshop on Learning Language in Logic and the 4th CoNLL, pp. 127–132. Association for Computational Linguistics, Morristown (2000). doi: 10.3115/1117601.1117631
Wu, Y.C., Chang, C.H., Lee, Y.S.: A general and multi-lingual phrase chunking model based on masking method. In: Proceedings of 7th International Conference on Intelligent Text Processing and Computational Linguistics, pp. 144–155 (2006)
Copyright information
© 2012 The Author(s)
About this chapter
Cite this chapter
dos Santos, C.N., Milidiú, R.L. (2012). Phrase Chunking. In: Entropy Guided Transformation Learning: Algorithms and Applications. SpringerBriefs in Computer Science. Springer, London. https://doi.org/10.1007/978-1-4471-2978-3_6
DOI: https://doi.org/10.1007/978-1-4471-2978-3_6
Publisher Name: Springer, London
Print ISBN: 978-1-4471-2977-6
Online ISBN: 978-1-4471-2978-3
eBook Packages: Computer Science, Computer Science (R0)