Rethinking Uncertainly Missing and Ambiguous Visual Modality in Multi-Modal Entity Alignment

Chen, Zhuo; Guo, Lingbing; Fang, Yin; Zhang, Yichi; Chen, Jiaoyan; Pan, Jeff Z.; Li, Yangning; Chen, Huajun; Zhang, Wen

doi:10.1007/978-3-031-47240-4_7

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14265)

Included in the following conference series:

International Semantic Web Conference

Abstract

As a crucial extension of entity alignment (EA), multi-modal entity alignment (MMEA) aims to identify identical entities across disparate knowledge graphs (KGs) by exploiting associated visual information. However, existing MMEA approaches concentrate primarily on the fusion paradigm for multi-modal entity features, while neglecting the challenges posed by the pervasive absence and intrinsic ambiguity of visual images. In this paper, we present a further analysis of visual modality incompleteness, benchmarking the latest MMEA models on our proposed dataset MMEA-UMVM, which covers both bilingual and monolingual alignment KGs and evaluates model performance under standard (non-iterative) and iterative training paradigms. Our research indicates that, in the face of modality incompleteness, models succumb to overfitting the modality noise and exhibit performance oscillations or declines at high rates of missing modality. This shows that the inclusion of additional multi-modal data can sometimes adversely affect EA. To address these challenges, we introduce UMAEA, a robust multi-modal entity alignment approach designed to tackle uncertainly missing and ambiguous visual modalities. It consistently achieves SOTA performance across all 97 benchmark splits, significantly surpassing existing baselines with limited parameters and time consumption, while effectively alleviating the identified limitations of other models. Our code and benchmark data are available at https://github.com/zjukg/UMAEA.

Notes

1. The appendix is attached to the arXiv version of this paper.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (NSFC U19B2027 / NSFC 91846204), joint project DH-2022ZY0012 from Donghai Lab, and the EPSRC project ConCur (EP/V050869/1).

Author information

Authors and Affiliations

College of Computer Science, Zhejiang University, Hangzhou, China
Zhuo Chen, Lingbing Guo, Yin Fang, Yichi Zhang & Huajun Chen
Donghai Laboratory, Zhoushan, China
Huajun Chen
School of Software Technology, Zhejiang University, Hangzhou, China
Wen Zhang
The University of Manchester, Manchester, UK
Jiaoyan Chen
University of Oxford, Oxford, UK
Jiaoyan Chen
School of Informatics, The University of Edinburgh, Edinburgh, UK
Jeff Z. Pan
Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
Yangning Li

Corresponding author

Correspondence to Wen Zhang.

Editor information

Editors and Affiliations

University of Liverpool, Liverpool, UK
Terry R. Payne
University of Bologna, Bologna, Italy
Valentina Presutti
Southeast University, Nanjing, China
Guilin Qi
Universidad Politécnica de Madrid, Madrid, Spain
María Poveda-Villalón
Huawei Technologies R&D UK, Edinburgh, UK
Giorgos Stoilos
Centrum Wiskunde and Informatica, Amsterdam, The Netherlands
Laura Hollink
IT University of Copenhagen, Copenhagen, Denmark
Zoi Kaoudi
Nanjing University, Nanjing, China
Gong Cheng
Tsinghua University, Beijing, China
Juanzi Li

Cite this paper

Chen, Z. et al. (2023). Rethinking Uncertainly Missing and Ambiguous Visual Modality in Multi-Modal Entity Alignment. In: Payne, T.R., et al. The Semantic Web – ISWC 2023. ISWC 2023. Lecture Notes in Computer Science, vol 14265. Springer, Cham. https://doi.org/10.1007/978-3-031-47240-4_7

DOI: https://doi.org/10.1007/978-3-031-47240-4_7
Published: 27 October 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47239-8
Online ISBN: 978-3-031-47240-4
eBook Packages: Computer Science, Computer Science (R0)
