Entity-Relation Distribution-Aware Negative Sampling for Knowledge Graph Embedding

Yao, Naimeng; Liu, Qing; Yang, Yi; Li, Weihua; Bai, Quan

doi:10.1007/978-3-031-47240-4_13

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14265)

Included in the following conference series:

International Semantic Web Conference

Abstract

Knowledge Graph Embedding (KGE) is a powerful technique for mining knowledge from knowledge graphs. Negative sampling plays a critical role in KGE training and significantly affects the performance of KGE models. Negative sampling methods typically keep one Entity-Relation (ER) pair from each positive triple and replace the remaining entity with negative entities drawn at random from the entity set to create a consistent number of negative samples. However, the distribution of ER pairs is often long-tailed, so assigning the same number of negative samples to each ER pair is problematic, an issue that most related work overlooks. This paper investigates the impact of assigning the same number of negative samples to every ER pair during training and demonstrates that doing so prevents training from reaching the optimal solution of the negative sampling loss function and undermines the objective of the trained model. To address this issue, we propose a novel ER distribution-aware negative sampling method that adaptively assigns a varying number of negative samples to each ER pair based on its distribution characteristics. The proposed method also mitigates the introduction of false negative samples, a problem common to many negative sampling methods. Our approach is grounded in theoretical analysis and practical considerations and can be applied to most KGE models. We validate its effectiveness on both conventional and neural network-based KGE models. In our experiments, the proposed method outperforms most state-of-the-art negative sampling methods.
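To make the baseline described in the abstract concrete, the following is a minimal Python sketch of uniform negative sampling with a fixed number of negatives per positive triple, together with a count of ER pair frequencies \(\#(e,r)\). It is not the authors' implementation: the function names, the toy triples, and the choice to corrupt only the tail entity are assumptions made for illustration, and the proposed distribution-aware allocation is not reproduced because its formula does not appear in this preview.

```python
import random
from collections import Counter, defaultdict


def er_frequencies(triples):
    """Return #(e, r): how often each (head, relation) pair occurs."""
    return Counter((h, r) for h, r, _ in triples)


def uniform_negatives(triples, entities, k, seed=0):
    """Baseline sketch: keep (h, r), replace the tail with k uniformly drawn entities.

    Tails already observed with (h, r) are filtered out of the candidates.
    The same k is used for every positive triple, however frequent its ER pair is.
    """
    rng = random.Random(seed)
    observed = defaultdict(set)
    for h, r, t in triples:
        observed[(h, r)].add(t)

    negatives = {}
    for h, r, t in triples:
        candidates = [e for e in entities if e not in observed[(h, r)]]
        negatives[(h, r, t)] = [(h, r, rng.choice(candidates)) for _ in range(k)]
    return negatives


# Hypothetical toy graph: ("a", "likes") is a frequent ER pair, ("b", "knows") is rare.
triples = [
    ("a", "likes", "b"),
    ("a", "likes", "c"),
    ("a", "likes", "d"),
    ("b", "knows", "a"),
]
entities = ["a", "b", "c", "d", "e"]

freq = er_frequencies(triples)
negs = uniform_negatives(triples, entities, k=2)

for triple, samples in negs.items():
    print(triple, "#(e,r) =", freq[triple[:2]], "negatives:", samples)
```

In this sketch every positive triple receives exactly `k` negatives even though the two ER pairs differ in frequency, which is the uniform allocation the paper argues against.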

Notes

1. In the implementation, observed positive entities are filtered out. However, since the total number of entities \(|\mathcal{E}|\) is significantly larger than the frequency \(\#(e,r)\) of ER pairs, we set \(p_n(q|(e,r))=\frac{1}{|\mathcal{E}|}\) to simplify the theoretical analysis.
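The size of this simplification can be checked numerically. The sketch below assumes that filtering removes exactly \(\#(e,r)\) distinct entities from the candidate set, so the filtered uniform probability is \(1/(|\mathcal{E}| - \#(e,r))\); that reading, the function names, and the example numbers are illustrative assumptions, not values from the paper.

```python
def filtered_uniform_prob(num_entities, er_frequency):
    """Uniform probability over candidates after removing observed positives (assumed form)."""
    return 1.0 / (num_entities - er_frequency)


def simplified_prob(num_entities):
    """Simplified noise distribution used in the analysis: 1 / |E|."""
    return 1.0 / num_entities


# Hypothetical sizes: 10,000 entities, an ER pair observed 50 times.
num_entities = 10_000
er_frequency = 50

exact = filtered_uniform_prob(num_entities, er_frequency)
approx = simplified_prob(num_entities)

# Under the stated assumption the relative error equals #(e,r) / |E|.
relative_error = (exact - approx) / exact
print(f"exact={exact:.8f} approx={approx:.8f} relative_error={relative_error:.4f}")
```

Under this assumption the relative error is \(\#(e,r)/|\mathcal{E}|\), which stays small whenever the entity set is much larger than the ER pair frequency.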

Author information

Authors and Affiliations

University of Tasmania, Hobart, Australia
Naimeng Yao & Quan Bai
Data61, CSIRO, Hobart, Australia
Qing Liu
Hefei University of Technology, Hefei, China
Yi Yang
Auckland University of Technology, Auckland, New Zealand
Weihua Li

Corresponding author

Correspondence to Naimeng Yao.

Editor information

Editors and Affiliations

University of Liverpool, Liverpool, UK
Terry R. Payne
University of Bologna, Bologna, Italy
Valentina Presutti
Southeast University, Nanjing, China
Guilin Qi
Universidad Politécnica de Madrid, Madrid, Spain
María Poveda-Villalón
Huawei Technologies R&D UK, Edinburgh, UK
Giorgos Stoilos
Centrum Wiskunde and Informatica, Amsterdam, The Netherlands
Laura Hollink
IT University of Copenhagen, Copenhagen, Denmark
Zoi Kaoudi
Nanjing University, Nanjing, China
Gong Cheng
Tsinghua University, Beijing, China
Juanzi Li

Cite this paper

Yao, N., Liu, Q., Yang, Y., Li, W., Bai, Q. (2023). Entity-Relation Distribution-Aware Negative Sampling for Knowledge Graph Embedding. In: Payne, T.R., et al. The Semantic Web – ISWC 2023. ISWC 2023. Lecture Notes in Computer Science, vol 14265. Springer, Cham. https://doi.org/10.1007/978-3-031-47240-4_13

DOI: https://doi.org/10.1007/978-3-031-47240-4_13
Published: 27 October 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47239-8
Online ISBN: 978-3-031-47240-4
eBook Packages: Computer Science, Computer Science (R0)
