Abstract
This chapter presents the application of ETL to language-independent named entity recognition (NER). The NER task consists of finding all proper nouns in a text and classifying them into a given set of categories of interest. We apply ETL and ETL Committee to three corpora in three different languages: Portuguese, Spanish and Dutch. The ETL system achieves results competitive with the state of the art on all three corpora. Moreover, ETL Committee significantly improves on the ETL results for all three corpora. This chapter is organized as follows. In Sect. 7.1, we describe the NER task and the selected corpora. In Sect. 7.2, we detail some modeling configurations used in our NER system. In Sect. 7.3, we show some configurations used in the machine learning algorithms. Section 7.4 presents the application of ETL to the HAREM Corpus. In Sect. 7.5, we present the application of ETL to the SPA CoNLL-2002 corpus. In Sect. 7.6, we detail the application of ETL to the DUT CoNLL-2002 corpus. Finally, Sect. 7.7 presents some concluding remarks.
Copyright information
© 2012 The Author(s)
About this chapter
Cite this chapter
dos Santos, C.N., Milidiú, R.L. (2012). Named Entity Recognition. In: Entropy Guided Transformation Learning: Algorithms and Applications. SpringerBriefs in Computer Science. Springer, London. https://doi.org/10.1007/978-1-4471-2978-3_7
DOI: https://doi.org/10.1007/978-1-4471-2978-3_7
Publisher Name: Springer, London
Print ISBN: 978-1-4471-2977-6
Online ISBN: 978-1-4471-2978-3
eBook Packages: Computer Science, Computer Science (R0)