Data Mining and Knowledge Discovery, Volume 29, Issue 5, pp 1486–1504

Knowledge base completion by learning pairwise-interaction differentiated embeddings

  • Yu Zhao
  • Sheng Gao
  • Patrick Gallinari
  • Jun Guo
Article

Abstract

A knowledge base of triples of the form (subject entity, predicate relation, object entity) is an important resource for knowledge management. It is useful for human-like reasoning, query expansion, question answering (e.g., Siri) and other related AI tasks. However, such a knowledge base often suffers from incompleteness, due to the large and growing volume of knowledge in the real world and a lack of reasoning capability. In this paper, we propose a Pairwise-interaction Differentiated Embeddings model that embeds the entities and relations of the knowledge base into low-dimensional vector representations and then predicts the possible truth of additional facts in order to extend the knowledge base. In addition, we present a probability-based objective function to improve the model optimization. Finally, we evaluate the model on the task of knowledge base completion by computing how likely an additional triple is to be true. Experiments on WordNet and Freebase show the excellent performance of our model and algorithm.
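
The idea of scoring a triple from pairwise interactions between role-specific embeddings can be illustrated with a minimal sketch. It is an assumption-laden reading of the abstract, not the paper's exact formulation: the class name `PairwiseTripleScorer`, the choice of three dot-product interactions (subject-relation, relation-object, subject-object), the sigmoid mapping to a probability, and the random initialisation are all hypothetical.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def sigmoid(x):
    """Map a real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_vec(dim, rng):
    """Small random vector; the initialisation scheme is an assumption."""
    return [rng.uniform(-0.1, 0.1) for _ in range(dim)]


class PairwiseTripleScorer:
    """Hypothetical scorer: each entity and relation keeps a separate
    ("differentiated") vector for each pairwise interaction it takes part in."""

    def __init__(self, entities, relations, dim=8, seed=0):
        rng = random.Random(seed)
        # Entity acting as subject, facing the relation / facing the object.
        self.subj_rel = {e: init_vec(dim, rng) for e in entities}
        self.subj_obj = {e: init_vec(dim, rng) for e in entities}
        # Entity acting as object, facing the relation / facing the subject.
        self.obj_rel = {e: init_vec(dim, rng) for e in entities}
        self.obj_subj = {e: init_vec(dim, rng) for e in entities}
        # Relation facing the subject / facing the object.
        self.rel_subj = {r: init_vec(dim, rng) for r in relations}
        self.rel_obj = {r: init_vec(dim, rng) for r in relations}

    def score(self, s, r, o):
        """Sum of the three pairwise interactions of the triple (s, r, o)."""
        return (
            dot(self.subj_rel[s], self.rel_subj[r])
            + dot(self.obj_rel[o], self.rel_obj[r])
            + dot(self.subj_obj[s], self.obj_subj[o])
        )

    def probability(self, s, r, o):
        """Assumed probabilistic reading: sigmoid of the triple score."""
        return sigmoid(self.score(s, r, o))


model = PairwiseTripleScorer(["cat", "animal"], ["is_a"], dim=4, seed=1)
print(model.probability("cat", "is_a", "animal"))
```

With untrained random vectors the printed value only shows the interface. In a trained model of this kind, triples observed in the knowledge base would be pushed toward higher probabilities than corrupted ones.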

Keywords

Knowledge base · Embedding model · Knowledge base completion · Representation learning

Notes

Acknowledgments

This work was supported by the Natural Science Foundation of China under Grant Nos. 61300080 and 61273217, the 111 Project under Grant No. B08004, and the FP7 MobileCloud Project under Grant No. 612212. The authors are partially supported by the Key Project of the China Ministry of Education under Grant No. MCM20130310, Huawei's Innovation Research Program, and the Postgraduate Innovation Fund of SICE, BUPT, 2015. We are thankful to the anonymous reviewers of DMKD, whose comments helped us improve this work.

Copyright information

© The Author(s) 2015

Authors and Affiliations

  1. Beijing University of Posts and Telecommunications, Beijing, China
  2. LIP6, Université Pierre et Marie Curie, Paris, France
