
1 Introduction

Research on knowledge graphs has produced a variety of methods for aligning knowledge graph entities. Traditional entity alignment methods can only exploit the surface symbolic information of knowledge graph data, and therefore struggle to realize entity alignment between knowledge graphs both efficiently and accurately.

This paper proposes an entity alignment method based on joint knowledge representation and an improved neural tensor network (NTN). We regard entity alignment as a binary classification problem, improve the evaluation function of the NTN, and use aligned entity-pair vectors as the input of the alignment relationship model. If the "the Same As" relationship exists between an input entity pair, the evaluation function of the model returns a high score; otherwise it returns a low score. The entity alignment task is then completed based on the scores of the candidate entities.

2 Related Work

2.1 Joint Knowledge Representation Learning

The purpose of knowledge representation learning is to embed entities and relationships into a low-dimensional vector space while preserving as much of the original semantic and structural information as possible. TransE opened up a series of translation-based methods that learn vectorized representations of entities and relationships to support downstream applications such as entity alignment, relationship reasoning, and triple classification. However, TransE handles many-to-one and one-to-many relations poorly. To improve TransE on such multi-mapping relations, TransH, TransR, and TransD were proposed. These TransE variants embed entities differently for different relationships, improving knowledge representation learning for multi-mapping relations at the cost of increased model complexity. In addition, there are non-translation-based methods, including UM [1], SE, DistMult, and HolE [2], which do not model relations as translation embeddings.

2.2 Evaluation of the Similarity of the Neural Tensor Network

The goal of similarity evaluation is to measure the degree of similarity between entities. The BootEA model [3] addresses the problem of very limited training data during knowledge representation learning: it iteratively labels likely aligned entity pairs, adds them to the training of the knowledge embedding model, and constrains the alignment data generated in each iteration. The similarity evaluation methods of most existing models belong to traditional string or vector similarity calculations. For example, KL divergence [4] is used to measure the amount of information lost when one vector approximates another; Euclidean distance and Manhattan distance [5] are used as distance functions for entities mapped into vector space; and many models use cosine similarity [6] to compute entity similarity.

3 Entity Alignment Algorithm

3.1 Algorithm Framework

This paper proposes an entity alignment method based on a neural tensor network, which consists of two parts: joint knowledge representation and neural tensor network similarity evaluation. The overall framework of this method is illustrated in Fig. 1. We use \(\mathrm{G}\) to represent a knowledge graph and \({\mathrm{G}}^{2}\) to represent the set of unordered knowledge graph pairs. For a pair \(({G}_{1},{\mathrm{G}}_{2})\in {\mathrm{G}}^{2}\), \(\mathrm{E}\) is defined as the entity set of a knowledge graph \(\mathrm{G}\) and \(\mathrm{R}\) as its relation set. \(\mathrm{T }= (\mathrm{h},\mathrm{ r},\mathrm{ t})\) denotes a positive entity-relation triple in the knowledge graph \(\mathrm{G}\), where \(\mathrm{h},\mathrm{ t }\in \mathrm{ E}\) and \(\mathrm{r}\in \mathrm{R}\); the vectors \({e}_{h}\), \({e}_{r}\), and \({e}_{t}\) represent the embeddings of the head entity \(\mathrm{h}\), relation \(\mathrm{r}\), and tail entity \(\mathrm{t}\), respectively.

We regard the alignment relationship "the Same As" as a special relationship between entities, as shown in Fig. 2, and perform alignment-specific translation operations between aligned entities to constrain the joint training of the two knowledge graphs and learn a joint knowledge representation.

Formally, given two aligned entities \({e}_{1}\in {\mathrm{E}}_{1}\) and \({\mathrm{e}}_{2}\in {\mathrm{E}}_{2}\), we assume that an alignment relation \({r}^{Same}\) holds between them, so that \({e}_{1}+{r}^{Same}\cong {e}_{2}\). The energy function of the joint knowledge representation is defined as:

$$E\left({e}_{1},{r}^{Same},{e}_{2}\right)=\| {e}_{1}+{r}^{Same}-{e}_{2}\| $$
(1)
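As a minimal sketch of Eq. (1), the alignment energy is simply the norm of the translation residual between the two aligned embeddings. The function name and the toy vectors below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def translation_energy(e1, r_same, e2):
    """Eq. (1): E(e1, r^Same, e2) = || e1 + r^Same - e2 ||.

    Lower energy means the pair (e1, e2) better satisfies the
    alignment translation e1 + r^Same ~= e2.
    """
    return float(np.linalg.norm(e1 + r_same - e2))

# Toy vectors: a perfectly aligned counterpart has zero energy,
# while an unrelated entity has higher energy.
e1 = np.array([0.1, 0.2, 0.3])
r_same = np.array([0.05, 0.0, -0.05])
e2_aligned = e1 + r_same
e2_other = np.array([0.9, -0.4, 0.7])

assert translation_energy(e1, r_same, e2_aligned) < translation_energy(e1, r_same, e2_other)
```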
Fig. 1.

NtnEA method framework

Fig. 2.

Learning process of joint knowledge representation

The similarity evaluation models in Sect. 2.2 do not exploit the underlying semantic and structural information of the entity vectors. We therefore consider the neural tensor network, which has proved very effective in knowledge reasoning, i.e., in modeling the relationship between two vectors and inferring the relation that holds between entities, as shown in Fig. 3. Inspired by this, this paper uses the NTN as the alignment model to infer whether the "the Same As" alignment relationship holds between two entities to be aligned. The method uses the tensor function to treat entity alignment as a binary classification problem, and the evaluation function of the neural tensor network is:

$$S\left({e}_{1},{e}_{2}\right)={u}^{T}f({{e}_{1}}^{T}{W}^{\left[1:k\right]}{e}_{2}+V\left(\begin{array}{c}{e}_{1}\\ {e}_{2}\end{array}\right)+b)$$
(2)

where \(\mathrm{f }=\mathrm{ tanh}\) is a nonlinear function; \({\mathrm{W}}^{[1:\mathrm{k}]}\in {\mathrm{R}}^{\mathrm{d }\times \mathrm{ d }\times \mathrm{ k}}\) is a three-dimensional tensor; \(\mathrm{d}\) is the dimension of the entity embedding vectors and \(\mathrm{k}\) is the number of tensor slices; \(\mathrm{V }\in {\mathrm{R}}^{2\mathrm{d }\times \mathrm{ k}}\) and \(\mathrm{b }\in {\mathrm{R}}^{k}\) are the parameters of the linear part of the evaluation function; and \(\mathrm{u }\in {\mathrm{R}}^{k}\) is the output weight vector.
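The evaluation function of Eq. (2) can be sketched in NumPy as follows; the function name and the random shapes are hypothetical choices for illustration only:

```python
import numpy as np

def ntn_score(e1, e2, W, V, b, u):
    """Eq. (2): S(e1, e2) = u^T tanh( e1^T W^[1:k] e2 + V [e1; e2] + b ).

    e1, e2 : (d,)  entity embedding vectors
    W      : (k, d, d)  tensor with k bilinear slices
    V      : (k, 2d)    linear part of the evaluation function
    b      : (k,)       bias, u : (k,) output weights
    """
    # e1^T W[slice] e2 for every slice -> vector of length k
    bilinear = np.einsum('i,kij,j->k', e1, W, e2)
    linear = V @ np.concatenate([e1, e2]) + b
    return float(u @ np.tanh(bilinear + linear))

# Shape check with random parameters (values are meaningless here).
d, k = 4, 3
rng = np.random.default_rng(0)
s = ntn_score(rng.normal(size=d), rng.normal(size=d),
              rng.normal(size=(k, d, d)), rng.normal(size=(k, 2 * d)),
              rng.normal(size=k), rng.normal(size=k))
```

Because tanh is bounded, the score is always finite and its magnitude is controlled by u, which is convenient for the ranking loss introduced later.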

In ordinary triples, the relationship between the head entity and the tail entity is directional and irreversible. For alignment triples, however, the alignment relationship between entities is undirected; that is, both triples hold for an aligned entity pair \((\mathrm{A},\mathrm{ B})\): \(\left(\mathrm{A},\mathrm{theSameAs},\mathrm{B}\right)\) and \(\left(\mathrm{B},\mathrm{theSameAs},\mathrm{A}\right)\).

The triple embedding part of Fig. 1 illustrates this. We therefore optimize the evaluation function as:

$$S\left({e}_{1},{e}_{2}\right)={u}^{T}f\left(\mathrm{mean}\left({e_{1}}^{T}{W}^{\left[1:k\right]}{e}_{2}+V\left(\begin{array}{c}{e}_{1}\\ {e}_{2}\end{array}\right),\ {e_{2}}^{T}{W}^{\left[1:k\right]}{e}_{1}+V\left(\begin{array}{c}{e}_{2}\\ {e}_{1}\end{array}\right)\right)+b\right)$$
(3)
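The symmetrized evaluation function of Eq. (3) averages the pre-activations of the two directions, which makes the score order-independent. A self-contained sketch (function name and random parameters are illustrative assumptions):

```python
import numpy as np

def ntn_score_sym(e1, e2, W, V, b, u):
    """Eq. (3): average the two directions (e1, e2) and (e2, e1)
    before the bias and nonlinearity, so S(e1, e2) == S(e2, e1)."""
    def pre_act(a, c):
        bilinear = np.einsum('i,kij,j->k', a, W, c)
        return bilinear + V @ np.concatenate([a, c])
    mean = 0.5 * (pre_act(e1, e2) + pre_act(e2, e1))
    return float(u @ np.tanh(mean + b))

# Symmetry check: swapping the entities does not change the score.
d, k = 4, 3
rng = np.random.default_rng(1)
e1, e2 = rng.normal(size=d), rng.normal(size=d)
W, V = rng.normal(size=(k, d, d)), rng.normal(size=(k, 2 * d))
b, u = rng.normal(size=k), rng.normal(size=k)
assert np.isclose(ntn_score_sym(e1, e2, W, V, b, u),
                  ntn_score_sym(e2, e1, W, V, b, u))
```

Swapping e1 and e2 merely exchanges the two terms inside the mean, which is exactly the undirected behavior expected of the "the Same As" relation.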

The final loss function is as follows:

$$L(\mathit{\Omega})=\mathop{\sum}\nolimits_{i=1}^{N}\mathop{\sum}\nolimits_{c=1}^{C}\mathit{max}\left(0,\,1-S\left({T}^{i}\right)+S\left({T}_{c}^{i}\right)\right)+\lambda {\Vert \mathit{\Omega} \Vert }_{2}^{2}$$
(4)

where \(\Omega \) is the set of all parameters. \({T}_{c}^{i}\) is the \({\mathrm{c}}^{th}\) negative example of the \({\mathrm{i}}^{th}\) positive example.

Fig. 3.

Neural tensor network relational reasoning process

3.2 Algorithm Flow

The algorithm description of the specific NtnEA model is shown in Algorithm 1.


4 Experiment

4.1 Datasets

This experiment compares entity alignment methods based on knowledge representation learning. To facilitate horizontal comparison across methods and to evaluate NtnEA on cross-language entity alignment tasks, we use the widely adopted DBP15K [7] data set, which contains three cross-language subsets built from the multilingual versions of the DBpedia knowledge base: \({\mathrm{DBP}}_{\mathrm{ZH}-\mathrm{EN}}\) (Chinese-English), \({\mathrm{DBP}}_{\mathrm{JP}-\mathrm{EN}}\) (Japanese-English), and \({\mathrm{DBP}}_{\mathrm{FR}-\mathrm{EN}}\) (French-English). Each subset contains 15,000 aligned entity pairs.

4.2 Training and Evaluation

To verify the effectiveness of the proposed method on the knowledge graph alignment task, the following commonly used methods were selected as experimental baselines:

  • MTransE, the linear transformation between two vector spaces established by TransE;

  • IPTransE, which embeds entities from different knowledge graphs into a unified vector space, and iteratively uses predicted anchor points to improve performance;

  • AlignE [6] uses ε-truncated uniform negative sampling and parameter exchange to realize the embedded representation of the knowledge graph. It is a variant of BootEA method without bootstrapping;

  • AVR-GCN, which uses VR-GCN as the network embedding model to learn entity representations and relation representations simultaneously, and applies this network to the task of multi-relational network alignment.

To verify the algorithm experimentally, we first learn the vectorized representations of entities and relations in a low-dimensional embedding space on the DBP15K data set. During training, the dimension d of the vector space is selected from \(\{\mathrm{50, 80, 100, 150}\}\), the learning rate λ from \(\{{10}^{-2},{10}^{-3},{10}^{-4}\}\), and the number of negative samples n from \(\{1, 3, 5, 15, 30\}\). The three data sets are trained separately, and the optimal parameter configurations are: (1) ZH-EN: \(\mathrm{d}=100\), \(\uplambda =0.001\), \(\mathrm{n}=5\); (2) JP-EN: \(\mathrm{d}=100\), \(\uplambda =0.001\), \(\mathrm{n}=3\); (3) FR-EN: \(\mathrm{d}=100\), \(\uplambda =0.003\), \(\mathrm{n}=5\).

The aligned entity data of each cross-language data set is split in a 3:7 ratio. As shown in Fig. 4, as the number of tensor slices k increases, the model becomes more complex and its performance improves; however, the number of parameters also grows with k. Balancing the two, the optimal configuration of the neural tensor network model is \(\uplambda =0.0005\), \(\mathrm{k}=200\).

Fig. 4.

Hit@1 indicator curve for different values of k

4.3 Experimental Results and Analysis

Following the experimental settings of the previous section, entity alignment experiments were performed on the three cross-language data sets of DBP15K; the results are shown in Table 1. The results of MTransE, IPTransE, AlignE, and AVR-GCN are taken from the literature [8]. The table shows that the two NtnEA methods improve significantly over the baseline methods MTransE and IPTransE on the Hit@k and MRR indicators. For example, the Hit@10 values of NtnEA on the three cross-language data sets of DBP15K are 82.00, 78.07, and 77.10, respectively, an average increase of 10.7% over the AlignE model.
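For reference, the Hit@k and MRR indicators used throughout this section can be computed from the rank of each entity's true counterpart among the candidates. A minimal sketch with hypothetical ranks (the toy data below is not from the experiments):

```python
def hit_at_k(ranks, k):
    """Fraction of test entities whose true counterpart ranks in the top k."""
    return sum(r <= k for r in ranks) / len(ranks)

def mrr(ranks):
    """Mean reciprocal rank of the true counterpart (1 = ranked first)."""
    return sum(1.0 / r for r in ranks) / len(ranks)

# ranks[i] = rank of the true aligned entity for test entity i
ranks = [1, 3, 2, 15, 1]
assert hit_at_k(ranks, 1) == 0.4   # 2 of 5 counterparts ranked first
assert hit_at_k(ranks, 10) == 0.8  # 4 of 5 within the top 10
```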

This paper uses the semantic structural information of triple data and, through joint knowledge representation, integrates more alignment information; the results show that its alignment effect is significantly better than that of alignment methods based purely on knowledge representation learning, such as MTransE and IPTransE. Between the two NtnEA variants, the NtnEA model outperforms the NtnEA(Orig) model, which supports treating the head and tail entities of alignment triples as an undirected structure under the "the Same As" relation. On the three cross-language data sets, the Hit@10 and MRR indicators of both NtnEA(Orig) and NtnEA exceed those of MTransE and IPTransE. However, there is no obvious advantage over the more advanced AVR-GCN model on the Hit@1 indicator, which reflects alignment accuracy.

Table 2 shows that when training the similarity evaluation model, the more prior aligned seed pairs the training set contains, the better the model performs on the entity alignment task.

Table 1. Comparison of entity alignment results
Table 2. Comparison results under different seed set partition ratios Hit@k index

5 Conclusions

This paper introduced a cross-knowledge-graph entity alignment model based on a neural tensor network. The model consists of two parts: joint knowledge representation learning and neural tensor network similarity evaluation. The method was verified experimentally, and the results show that it achieves good entity alignment performance under the given experimental conditions. Compared with previous algorithms, the Hit@5 and Hit@10 indicators are improved, but the improvement on Hit@1 is not obvious, which means the method still falls short in alignment accuracy.