Abstract
Named Entity Recognition (NER) and baseNP chunking are two examples of NLP tasks that annotate chunks, i.e. non-embedding, non-overlapping token sequences. Many approaches to these tasks can be greatly simplified by viewing chunking as a tagging problem (Ramshaw and Marcus 1995). This perspective, however, raises the question of data representation, i.e. how chunk structures can be mapped onto tags, and whether the choice of chunk-tag mapping affects system performance (cf. Tjong Kim Sang and Veenstra 1999).
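To make the idea of a chunk-tag mapping concrete, the following is a minimal sketch of two mappings commonly discussed in the chunking literature, usually called IOB1 and IOB2. It is an illustration only and is not taken from the chapter itself: the function names, the `(start, end, label)` chunk format, and the example sentence are assumptions made for this sketch. Both functions encode the same chunk structure but disagree on when a chunk-initial token receives a `B-` tag.

```python
from typing import List, Tuple

# Hypothetical chunk format for this sketch: (start, end_exclusive, label).
Chunk = Tuple[int, int, str]


def chunks_to_iob2(n_tokens: int, chunks: List[Chunk]) -> List[str]:
    """IOB2: every chunk-initial token gets B-, the rest of the chunk I-."""
    tags = ["O"] * n_tokens
    for start, end, label in chunks:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def chunks_to_iob1(n_tokens: int, chunks: List[Chunk]) -> List[str]:
    """IOB1: B- is used only when a chunk directly follows a chunk of the same type."""
    tags = ["O"] * n_tokens
    prev_end, prev_label = None, None
    for start, end, label in sorted(chunks):
        adjacent_same_type = prev_end == start and prev_label == label
        tags[start] = f"{'B' if adjacent_same_type else 'I'}-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
        prev_end, prev_label = end, label
    return tags


def iob2_to_chunks(tags: List[str]) -> List[Chunk]:
    """Inverse of chunks_to_iob2; assumes a well-formed IOB2 tag sequence."""
    chunks: List[Chunk] = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing chunk
        if start is not None and not tag.startswith("I-"):
            chunks.append((start, i, label))
            start = None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return chunks


# Invented example: two adjacent person names.
tokens = ["Yesterday", "Kim", "Lee", "Park", "called"]
chunks = [(1, 2, "PER"), (2, 4, "PER")]

print(chunks_to_iob2(len(tokens), chunks))
print(chunks_to_iob1(len(tokens), chunks))
```

The two tag sequences differ only at the token "Kim": IOB2 marks it `B-PER`, IOB1 marks it `I-PER`. A tagger trained on one representation therefore sees a different label distribution than one trained on the other, which is why the choice of mapping can plausibly affect performance.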
References
Bikel, D. M., S. Miller, R. Schwartz and R. Weischedel. 1997. Nymble: a High-Performance Learning Name-finder. In: Proceedings of the 5th Conference on Applied Natural Language Processing (ANLP-97), Washington. 194–201. URL http://www.arxiv.org/abs/cmp-lg/9803003.
Borthwick, A. 1999. A Maximum Entropy Approach to Named Entity Recognition. Ph.D. thesis, New York University. URL http://cs.nyu.edu/cs/projects/proteus/publication/papers/borthwick_thesis.ps.
Chinchor, N. and P. Robinson. 1998. MUC-7 Named Entity Task Definition, Version 3.5. In: Proceedings of the 7th Message Understanding Conference (MUC-7). URL http://www.itl.nist.gov/iaui/894.02/related_projects/muc/proceedings/ne_task.html.
Daelemans, W., J. Zavrel, K. van der Sloot and A. van den Bosch. 2000. TiMBL: Tilburg Memory Based Learner Version 3.0 Reference Guide. Technical report ILK 00-01. Tilburg: ILK. URL http://ilk.kub.nl/~ilk/papers/ilk0001.ps.gz.
Feddes, H. 2001. Automatische Erkennung von Eigennamen in englischen Texten. Master’s thesis, Faculty of Philosophy, University of Münster. URL http://santana.uni-muenster.de/feddes/publications.
McDonald, D. D. 1996. Internal and External Evidence in the Identification and Semantic Categorization of Proper Names. In: B. Boguraev and J. Pustejovsky (eds.). Corpus Processing for Lexical Acquisition. Cambridge, Mass.: MIT Press. Ch. 2, 21–39.
Mikheev, A., C. Grover and M. Moens. 1998. Description of the LTG System Used for MUC-7. In: Proceedings of the 7th Message Understanding Conference (MUC-7). URL http://www.itl.nist.gov/iaui/894.02/related_projects/muc/proceedings/muc_7_proceedings/ltg_muc7.ps.
Mikheev, A., M. Moens and C. Grover. 1999. Named Entity Recognition without Gazetteers. In: Proceedings of the 9th Annual Conference of the European Chapter of the Association for Computational Linguistics (EACL-99), Bergen. 1–8. URL http://www.ltg.ed.ac.uk/~mikheev/papers_my/eacl99.ps.
Ramshaw, L. A. and M. P. Marcus. 1995. Text Chunking using Transformation-Based Learning. In: Proceedings of the Third ACL Workshop on Very Large Corpora. 82–94. URL http://www.arxiv.org/abs/cmp-lg/9505040.
Rottger, M. 2001. Disamb manpage. Arbeitsbereich Linguistik, University of Münster. URL http://xlex.uni-muenster.de/XlexPublic/disambMan.html.
Sparck Jones, K. 1972. A Statistical Interpretation of Term Specificity and its Application in Retrieval. Journal of Documentation 28 (1): 11–21.
Tjong Kim Sang, E. F. 2000. Noun Phrase Recognition by System Combination. In: Proceedings of the 1st Annual Conference of the North American Chapter of the Association for Computational Linguistics and of the 6th Conference on Applied Natural Language Processing (NAACL-ANLP-2000), Seattle. 50–55.
Tjong Kim Sang, E. F. 2002. Introduction to the CoNLL-2002 Shared Task: Language-Independent Named Entity Recognition. In: D. Roth and A. van den Bosch (eds.). Proceedings of the Sixth Conference on Natural Language Learning (CoNLL-2002), Taipei, Taiwan. 155–158. URL http://www.arxiv.org/abs/cs/0209010.
Tjong Kim Sang, E. F., W. Daelemans, H. Dejean, R. Koeling, Y. Krymolowski, V. Punyakanok and D. Roth. 2000. Applying System Combination to Base Noun Phrase Identification. In: Proceedings of the 18th International Conference on Computational Linguistics (COLING-2000), Luxembourg. 857–863. URL http://www.arxiv.org/abs/cs/0008012.
Tjong Kim Sang, E. F. and J. Veenstra. 1999. Representing Text Chunks. In: Proceedings of the 9th Annual Conference of the European Chapter of the Association for Computational Linguistics (EACL-99), Bergen. 173–179.
Ule, T. 1999. Tokenize manpage. Arbeitsbereich Linguistik, University of Münster. URL http://xlex.uni-muenster.de/XlexPublic/tokenizeMan.html.
Utsuro, T. and M. Sassano. 2000. Minimally Supervised Japanese Named Entity Recognition: Resources and Evaluation. In: Proceedings of the 2nd International Conference on Language Resources and Evaluation (LREC-2000), Athens.
Yu, S., S. Bai and P. Wu. 1998. Description of the Kent Ridge Digital Labs System Used for MUC-7. In: Proceedings of the 7th Message Understanding Conference (MUC-7). URL http://www.itl.nist.gov/iaui/894.02/related_projects/muc/proceedings/muc_7_proceedings/kent_ridge.ps.
Copyright information
© 2003 Deutscher Universitäts-Verlag GmbH, Wiesbaden
About this chapter
Cite this chapter
Feddes, H. (2003). Mapping structures onto tags. In: Cyrus, L., Feddes, H., Schumacher, F., Steiner, P. (eds) Sprache zwischen Theorie und Technologie / Language between Theory and Technology. Sprachwissenschaft. Deutscher Universitätsverlag, Wiesbaden. https://doi.org/10.1007/978-3-322-81289-6_6
DOI: https://doi.org/10.1007/978-3-322-81289-6_6
Publisher Name: Deutscher Universitätsverlag, Wiesbaden
Print ISBN: 978-3-8244-4513-4
Online ISBN: 978-3-322-81289-6
eBook Packages: Springer Book Archive