Entity Linking with Distributional Semantics

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9727)

Abstract

Entity Linking (EL) consists of linking name mentions in a given text to the entities they refer to in external knowledge bases such as DBpedia/Wikipedia. In this paper, we propose an EL approach whose main contribution is the use of a knowledge base built by means of distributional similarity. More precisely, Wikipedia is transformed into a manageable database structured by similarity relations between entities. Our EL method focuses on a specific task, namely the semantic annotation of documents by extracting the relevant terms that are linked to nodes in DBpedia/Wikipedia. The method currently works for four languages. The Portuguese and English versions have been evaluated and compared against other EL systems, showing competitive results, close to those of the best systems.
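
As a rough illustration of the general idea (not the authors' implementation), the following minimal sketch assumes that each entity is represented by a bag of context words, that similarity between entities is the cosine of those bags, and that an ambiguous mention is linked to the candidate most similar to the other entities found in the document. The toy data, the `ENTITY_CONTEXTS` and `CANDIDATES` tables, and the `link` function are all hypothetical names introduced for this example.

```python
from collections import Counter
from math import sqrt
from typing import Optional

# Hypothetical toy context profiles: entity -> counts of co-occurring words.
ENTITY_CONTEXTS = {
    "Java_(programming_language)": Counter({"code": 3, "compiler": 2, "class": 2}),
    "Java_(island)": Counter({"indonesia": 3, "volcano": 2, "island": 2}),
    "Python_(programming_language)": Counter({"code": 3, "interpreter": 2, "class": 1}),
}

# Hypothetical mention -> candidate entities table.
CANDIDATES = {"java": ["Java_(programming_language)", "Java_(island)"]}


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def link(mention: str, context_entities: list) -> Optional[str]:
    """Pick the candidate most similar to the entities already seen in the document."""
    candidates = CANDIDATES.get(mention.lower(), [])
    if not candidates:
        return None

    def score(candidate: str) -> float:
        # Sum of similarities to every context entity (an assumed scoring rule).
        return sum(
            cosine(ENTITY_CONTEXTS[candidate], ENTITY_CONTEXTS[e])
            for e in context_entities
        )

    return max(candidates, key=score)


best = link("Java", ["Python_(programming_language)"])
print(best)
```

In this toy setting the mention "Java" is resolved by the document's other entities alone; a real system would precompute the similarity relations over the whole knowledge base rather than on the fly.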

Keywords

Entity linking, Semantic annotation, Term extraction

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Centro Singular de Investigación en Tecnoloxías da Información (CiTIUS), Universidade de Santiago de Compostela, Galiza, Spain
  2. Grupo LyS, Dep. de Galego-Português, Francês e Linguística, Universidade da Coruña, Galiza, Spain
