Knowledge Base Completion by Variational Bayesian Neural Tensor Decomposition
Knowledge base completion is an important research problem, as knowledge bases play central roles in question answering, information retrieval, and other applications. A number of relational learning algorithms have been proposed to solve this problem. However, despite their success in modeling entity relations, they are not well founded in a Bayesian manner and thus cannot easily model prior information about the entity and relation factors. Furthermore, they under-represent the interaction between entity and relation factors. To avoid these disadvantages, we propose a neural-inspired approach, namely the Bayesian Neural Tensor Decomposition approach, for knowledge base completion based on the Stochastic Gradient Variational Bayes (SGVB) framework. We employ a multivariate Bernoulli likelihood function to represent the existence of facts in knowledge graphs, and a multi-layer perceptron to represent more complex interactions among the latent subject, predicate, and object factors. The SGVB framework enables efficient approximate variational inference for the proposed nonlinear probabilistic tensor decomposition via a novel local reparameterization trick. This avoids the need for expensive iterative inference schemes such as MCMC and, in contrast to common variational inference, does not make over-simplified assumptions about the posterior distributions. To evaluate the proposed model, we have conducted experiments on real-world knowledge bases, i.e., FreeBase and WordNet. Experimental results indicate the promising performance of the proposed method.
Keywords: Knowledge base completion · Variational Bayesian · Neural networks
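The abstract combines three ingredients: Gaussian variational posteriors over the latent factors sampled via the reparameterization trick, an MLP scoring the (subject, predicate, object) interaction, and a Bernoulli likelihood on the resulting logit. The following is a minimal NumPy sketch of that pipeline for a single triple; all dimensions, layer sizes, and variable names are illustrative assumptions, not the paper's actual architecture or hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # latent dimension (illustrative choice)

# Variational parameters for one (subject, predicate, object) triple:
# each factor has a mean and a log-variance, as in SGVB.
mu_s, mu_p, mu_o = (rng.normal(size=D) for _ in range(3))
logvar_s, logvar_p, logvar_o = (np.full(D, -2.0) for _ in range(3))

def reparameterize(mu, logvar, rng):
    # Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
    # so gradients can flow through mu and logvar during training.
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

# One-hidden-layer MLP scoring the interaction of the three factors
# (hypothetical sizes; the paper's network may differ).
W1 = rng.normal(scale=0.1, size=(16, 3 * D))
b1 = np.zeros(16)
w2 = rng.normal(scale=0.1, size=16)

def score(z_s, z_p, z_o):
    h = np.tanh(W1 @ np.concatenate([z_s, z_p, z_o]) + b1)
    return w2 @ h  # scalar logit

def fact_probability(rng):
    # Bernoulli likelihood: the sigmoid of the MLP score is the
    # probability that the triple is a true fact.
    z = [reparameterize(m, lv, rng)
         for m, lv in [(mu_s, logvar_s), (mu_p, logvar_p), (mu_o, logvar_o)]]
    return 1.0 / (1.0 + np.exp(-score(*z)))

p = fact_probability(rng)
```

In a full model, the expected log-likelihood of observed triples under these samples, minus the KL divergence between the Gaussian posteriors and their priors, would form the evidence lower bound optimized by stochastic gradients.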
All the authors except Yafang Wang were supported by the Natural Science Foundation of China (No. 61572111), 1000-Talent Startup Funds (Nos. G05QNQR004, A1098531023601041) and a Fundamental Research Fund for the Central Universities of China (No. ZYGX2016Z003).
Compliance with Ethical Standards
Conflict of Interest
The authors declare that they have no conflict of interest.
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed consent was not required as no humans or animals were involved.