Enhancing the Open-Domain Classification of Named Entity Using Linked Open Data

  • Yuan Ni
  • Lei Zhang
  • Zhaoming Qiu
  • Chen Wang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6496)


Many applications make use of named entity classification. Most named entity classification methods rely on machine learning, where the choice of features is critical to final performance. Existing approaches use only features derived from the characteristics of the named entity itself or its linguistic context. With the development of the Semantic Web, a large number of data sources have been published and connected across the Web as Linked Open Data (LOD). LOD provides rich a priori knowledge about entity types, which can be a valuable asset for named entity classification. In this paper, we explore the use of LOD to enhance named entity classification. Our method extracts information from LOD and builds a type knowledge base, which is used to score a (named entity string, type) pair. This score is then injected as one or more features into an existing classifier in order to improve its performance. We conducted a thorough experimental study and report the results, which confirm the effectiveness of our proposed method.
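The pipeline outlined in the abstract (build a type knowledge base, score a (named entity string, type) pair, inject the score as classifier features) can be illustrated with a minimal sketch. The abstract does not give the scoring function, so the relative-frequency score below is an assumption chosen for simplicity, not the authors' method. The sample pairs and the names `build_type_kb`, `type_score`, and `augment_features` are likewise hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (entity label, type) pairs standing in for type facts
# harvested from LOD sources. Real data would be far larger and noisier.
SAMPLE_PAIRS = [
    ("Paris", "City"),
    ("Paris", "City"),
    ("Paris", "Person"),
    ("IBM", "Company"),
]


def build_type_kb(pairs):
    """Count how often each lowercased entity label occurs with each type."""
    kb = defaultdict(Counter)
    for label, entity_type in pairs:
        kb[label.lower()][entity_type] += 1
    return kb


def type_score(kb, entity, entity_type):
    """Assumed score: the fraction of the label's type facts naming this type.

    Returns 0.0 when the label is absent from the knowledge base.
    """
    counts = kb.get(entity.lower())
    if not counts:
        return 0.0
    return counts[entity_type] / sum(counts.values())


def augment_features(base_features, kb, entity, candidate_types):
    """Copy the base features and add one LOD-derived score per candidate type."""
    features = dict(base_features)
    for candidate in candidate_types:
        features[f"lod_score_{candidate}"] = type_score(kb, entity, candidate)
    return features


kb = build_type_kb(SAMPLE_PAIRS)
features = augment_features(
    {"is_capitalized": 1.0}, kb, "Paris", ["City", "Person", "Company"]
)
```

The augmented feature dictionary can then be passed to whatever classifier already consumes the base features. Emitting one score per candidate type is only one possible reading of "one or more features".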


Named Entity Classification · Linked Open Data



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Yuan Ni (1)
  • Lei Zhang (1)
  • Zhaoming Qiu (1)
  • Chen Wang (1)
  1. IBM Research, China
