Abstract
Named Entity Recognition has become the basis of many Natural Language Processing applications. However, existing coarse-grained named entity recognizers are insufficient for complex applications such as Question Answering, Internet search engines, or ontology population. In this paper, we propose a domain distribution approach, according to which names that occur in the same domains belong to the same fine-grained category. For our study, we generate a relevant domain resource by mapping the words from the WordNet glosses to their WordNet Domains and ranking them. This approach allows us to capture the semantic information of the context around a named entity and thus to discover the corresponding fine-grained name category. The approach is evaluated on six different person names and reaches an F-score of 73%. The results are encouraging and significantly better than a majority baseline.
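The domain distribution idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy glosses and domain labels stand in for WordNet and WordNet Domains, the relative-frequency score stands in for whatever ranking the paper actually uses, and the function names (`build_relevance`, `domain_vector`, `categorize`) are hypothetical. A context is turned into a vector over domains, and a name is assigned to the category whose domain profile is most similar.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: (gloss text, domain label) pairs standing in for
# WordNet glosses annotated with WordNet Domains labels.
TOY_GLOSSES = [
    ("a person who sings songs on a stage", "music"),
    ("a musical group that records an album", "music"),
    ("a person elected to govern a country", "politics"),
    ("a formal vote to choose a government", "politics"),
    ("a person who plays a game in a team", "sport"),
    ("a contest in which a team scores a goal", "sport"),
]


def build_relevance(glosses):
    """Map each word to {domain: share of the word's occurrences in that domain}.

    Assumption: a plain relative frequency is used as the relevance score.
    """
    counts = defaultdict(Counter)
    for text, domain in glosses:
        for word in text.lower().split():
            counts[word][domain] += 1
    relevance = {}
    for word, per_domain in counts.items():
        total = sum(per_domain.values())
        relevance[word] = {d: c / total for d, c in per_domain.items()}
    return relevance


def domain_vector(context, relevance):
    """Sum the domain scores of all known words in a context."""
    vec = Counter()
    for word in context.lower().split():
        for domain, score in relevance.get(word, {}).items():
            vec[domain] += score
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse domain vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def categorize(context, profiles, relevance):
    """Return the category whose domain profile is closest to the context."""
    vec = domain_vector(context, relevance)
    return max(profiles, key=lambda cat: cosine(vec, profiles[cat]))


relevance = build_relevance(TOY_GLOSSES)
# Hypothetical category profiles built from a few seed words per category.
profiles = {
    "singer": domain_vector("sings songs album stage", relevance),
    "politician": domain_vector("elected govern vote government", relevance),
}
print(categorize("X sings on a stage and records an album", profiles, relevance))
```

In this toy setting, function words such as "a" are spread across all domains and contribute little, while domain-specific words dominate the vector. A real resource would need a ranking that handles such words more carefully.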
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kozareva, Z., Vazquez, S., Montoyo, A. (2008). Domain Information for Fine-Grained Person Name Categorization. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2008. Lecture Notes in Computer Science, vol 4919. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78135-6_26
DOI: https://doi.org/10.1007/978-3-540-78135-6_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78134-9
Online ISBN: 978-3-540-78135-6
eBook Packages: Computer Science (R0)