Combining Word and Entity Embeddings for Entity Linking

  • Jose G. Moreno
  • Romaric Besançon
  • Romain Beaumont
  • Eva D’hondt
  • Anne-Laure Ligozat
  • Sophie Rosset
  • Xavier Tannier
  • Brigitte Grau
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10249)

Abstract

Correctly linking an entity mention in a text to a known entity in a large knowledge base is important in information retrieval and information extraction. The general approach to this task is to first generate, for a given mention, a set of candidate entities from the knowledge base and then, in a second step, determine which candidate is the best one. This paper proposes a novel method for the second step, based on the joint learning of embeddings for the words in the text and the entities in the knowledge base. By learning these embeddings in the same space, we arrive at a more conceptually grounded model that can be used for candidate selection based on the surrounding context. The relative improvement of this approach is experimentally validated on a recent benchmark corpus from the TAC-EDL 2015 evaluation campaign.
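
To make the candidate selection step more concrete, here is a minimal sketch in Python. It is not the paper's implementation. It only illustrates the general idea of ranking candidates when words and entities share one vector space. The toy vectors, the entity identifiers, the averaging of context words, and the helper names (`cosine`, `context_vector`, `rank_candidates`) are all assumptions made up for this example. Candidate generation is assumed to have already produced the candidate list.

```python
import math

# Toy 3-dimensional embeddings with made-up values (illustration only).
# Words and entities are stored in the same vector space.
EMBEDDINGS = {
    "river": [0.9, 0.1, 0.0],
    "water": [0.8, 0.2, 0.1],
    "loan": [0.0, 0.9, 0.2],
    "ENTITY/Bank_(geography)": [0.85, 0.15, 0.05],  # hypothetical entity id
    "ENTITY/Bank_(finance)": [0.05, 0.9, 0.3],      # hypothetical entity id
}


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def context_vector(words, embeddings):
    """Average the embeddings of the context words that have a vector."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rank_candidates(context_words, candidates, embeddings):
    """Sort candidate entities by similarity to the averaged context."""
    ctx = context_vector(context_words, embeddings)
    if ctx is None:
        # No usable context: every candidate gets the same neutral score.
        return [(c, 0.0) for c in candidates]
    scored = [
        (c, cosine(ctx, embeddings[c])) for c in candidates if c in embeddings
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Second step only: the candidates for the mention "bank" are given.
candidates = ["ENTITY/Bank_(finance)", "ENTITY/Bank_(geography)"]
ranking = rank_candidates(["river", "water"], candidates, EMBEDDINGS)
best_entity = ranking[0][0]
loan_ranking = rank_candidates(["loan"], candidates, EMBEDDINGS)
```

With these toy vectors, a context of "river" and "water" ranks the geographic sense first, while a context of "loan" ranks the financial sense first. A real system would learn the vectors from data and would likely combine this similarity with other features.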

Keywords

Entity Linking; Linked data; Natural language processing and information retrieval

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Jose G. Moreno (1)
  • Romaric Besançon (2)
  • Romain Beaumont (3)
  • Eva D’hondt (3)
  • Anne-Laure Ligozat (3, 4)
  • Sophie Rosset (3)
  • Xavier Tannier (3, 5)
  • Brigitte Grau (3, 4)

  1. Université Paul Sabatier, IRIT, Toulouse, France
  2. CEA, LIST, Vision and Content Engineering Laboratory, Gif-sur-Yvette, France
  3. LIMSI, CNRS, Université Paris-Saclay, Orsay, France
  4. ENSIIE, Évry, France
  5. Univ. Paris-Sud, Orsay, France
