Abstract
As a crucial extension of entity alignment (EA), multi-modal entity alignment (MMEA) aims to identify identical entities across disparate knowledge graphs (KGs) by exploiting associated visual information. However, existing MMEA approaches concentrate primarily on the fusion paradigm for multi-modal entity features, while neglecting the challenges posed by pervasively missing visual images and their intrinsic ambiguity. In this paper, we present a further analysis of visual modality incompleteness, benchmarking the latest MMEA models on our proposed dataset MMEA-UMVM, which covers both bilingual and monolingual alignment KGs, and evaluating model performance under standard (non-iterative) and iterative training paradigms. Our research indicates that, in the face of modality incompleteness, models succumb to overfitting the modality noise and exhibit performance oscillations or declines at high rates of missing modality. This indicates that the inclusion of additional multi-modal data can sometimes adversely affect EA. To address these challenges, we introduce UMAEA, a robust multi-modal entity alignment approach designed to tackle uncertainly missing and ambiguous visual modalities. It consistently achieves state-of-the-art (SOTA) performance across all 97 benchmark splits, significantly surpassing existing baselines with limited parameters and time consumption, while effectively alleviating the identified limitations of other models. Our code and benchmark data are available at https://github.com/zjukg/UMAEA.
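To make the notion of a missing-modality rate concrete, the following is a minimal sketch of how visual features could be dropped for a fixed fraction of entities. It is an illustration under stated assumptions, not the procedure used to build MMEA-UMVM: the function name `mask_visual_modality`, the dictionary layout, and the use of `None` to mark a missing image are all hypothetical.

```python
import random


def mask_visual_modality(image_features, missing_ratio, seed=0):
    """Return a copy of image_features in which a fraction of entities
    has its visual feature replaced by None (hypothetical helper)."""
    if not 0.0 <= missing_ratio <= 1.0:
        raise ValueError("missing_ratio must be in [0, 1]")
    rng = random.Random(seed)                 # local RNG keeps the split reproducible
    entity_ids = sorted(image_features)       # fixed order so the seed fully determines the split
    n_missing = round(missing_ratio * len(entity_ids))
    dropped = set(rng.sample(entity_ids, n_missing))
    return {e: (None if e in dropped else image_features[e]) for e in entity_ids}


# Toy data: ten entities, each with a 4-dimensional placeholder "image" feature.
features = {f"e{i}": [float(i)] * 4 for i in range(10)}
masked = mask_visual_modality(features, missing_ratio=0.3, seed=42)
n_missing = sum(v is None for v in masked.values())
```

Sweeping `missing_ratio` over a grid of values would give one split per rate, which is the kind of setting in which the performance oscillations described above could be observed.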
Notes
1. The appendix is attached to the arXiv version of this paper.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (NSFCU19B2027/NSFC91846204), joint project DH-2022ZY0012 from Donghai Lab, and the EPSRC project ConCur (EP/V050869/1).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, Z. et al. (2023). Rethinking Uncertainly Missing and Ambiguous Visual Modality in Multi-Modal Entity Alignment. In: Payne, T.R., et al. The Semantic Web – ISWC 2023. ISWC 2023. Lecture Notes in Computer Science, vol 14265. Springer, Cham. https://doi.org/10.1007/978-3-031-47240-4_7
DOI: https://doi.org/10.1007/978-3-031-47240-4_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47239-8
Online ISBN: 978-3-031-47240-4
eBook Packages: Computer Science, Computer Science (R0)