Domain Information for Fine-Grained Person Name Categorization

  • Zornitsa Kozareva
  • Sonia Vazquez
  • Andres Montoyo
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4919)

Abstract

Named Entity Recognition has become the basis of many Natural Language Processing applications. However, existing coarse-grained named entity recognizers are insufficient for complex applications such as question answering, Internet search engines, or ontology population. In this paper, we propose a domain distribution approach, according to which names that occur in the same domains belong to the same fine-grained category. For our study, we generate a relevant domain resource by mapping the words from the WordNet glosses to their WordNet Domains and ranking them. This approach allows us to capture the semantic information of the context around the named entity and thus to discover the corresponding fine-grained name category. The approach is evaluated with six different person names and reaches an F-score of 73%. The results are encouraging: the approach performs significantly better than a majority baseline.
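
The domain distribution idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the toy word-to-domain pairs stand in for the WordNet gloss resource, the relevance score is a simple relative frequency (the abstract does not say how words are ranked), cosine similarity is a swapped-in comparison measure, and the categories "politician" and "athlete" are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy resource: (gloss word, domain label) pairs.
# The paper derives this from WordNet glosses and WordNet Domains;
# these entries are invented for illustration only.
GLOSS_DOMAIN_PAIRS = [
    ("election", "politics"), ("government", "politics"), ("vote", "politics"),
    ("match", "sport"), ("team", "sport"), ("goal", "sport"),
    ("goal", "politics"),
]

def build_relevance(pairs):
    """Map each word to {domain: share of the word's occurrences in that domain}.
    Assumed scoring: plain relative frequency, not the paper's ranking."""
    counts = defaultdict(Counter)
    for word, domain in pairs:
        counts[word][domain] += 1
    return {
        word: {d: c / sum(ctr.values()) for d, c in ctr.items()}
        for word, ctr in counts.items()
    }

def domain_vector(context_words, relevance):
    """Sum the domain scores of all known words in a name's context."""
    vec = Counter()
    for word in context_words:
        for domain, score in relevance.get(word, {}).items():
            vec[domain] += score
    return vec

def cosine(a, b):
    """Cosine similarity between two sparse domain vectors."""
    dot = sum(a[d] * b[d] for d in a if d in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def categorize(context_words, profiles, relevance):
    """Assign the category whose domain profile is closest to the context."""
    vec = domain_vector(context_words, relevance)
    return max(profiles, key=lambda cat: cosine(vec, profiles[cat]))

RELEVANCE = build_relevance(GLOSS_DOMAIN_PAIRS)

# Hypothetical category profiles built from example contexts.
PROFILES = {
    "politician": domain_vector(["election", "government", "vote"], RELEVANCE),
    "athlete": domain_vector(["match", "team", "goal"], RELEVANCE),
}

if __name__ == "__main__":
    print(categorize(["team", "match", "vote"], PROFILES, RELEVANCE))
    print(categorize(["election", "government"], PROFILES, RELEVANCE))
```

In this sketch a name is never looked up directly; only the words around it contribute domain evidence, which is the sense in which names occurring in the same domains end up in the same category.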

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Zornitsa Kozareva (1)
  • Sonia Vazquez (1)
  • Andres Montoyo (1)
  1. Departamento de Lenguajes y Sistemas Informaticos, Universidad de Alicante
