Rethinking Uncertainly Missing and Ambiguous Visual Modality in Multi-Modal Entity Alignment

  • Conference paper
The Semantic Web – ISWC 2023 (ISWC 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14265)

Abstract

As a crucial extension of entity alignment (EA), multi-modal entity alignment (MMEA) aims to identify identical entities across disparate knowledge graphs (KGs) by exploiting associated visual information. However, existing MMEA approaches concentrate primarily on the fusion paradigm of multi-modal entity features while neglecting the challenges posed by the pervasive absence of visual images and their intrinsic ambiguity. In this paper, we present a further analysis of visual modality incompleteness, benchmarking the latest MMEA models on our proposed dataset MMEA-UMVM, where the aligned KG pairs cover both bilingual and monolingual settings and models are evaluated under both standard (non-iterative) and iterative training paradigms. Our research indicates that, in the face of modality incompleteness, models overfit the modality noise and exhibit performance oscillations or declines at high rates of missing modality. This shows that incorporating additional multi-modal data can sometimes adversely affect EA. To address these challenges, we introduce UMAEA, a robust multi-modal entity alignment approach designed to tackle uncertainly missing and ambiguous visual modalities. It consistently achieves SOTA performance across all 97 benchmark splits, significantly surpassing existing baselines with limited parameters and time consumption, while effectively alleviating the identified limitations of other models. Our code and benchmark data are available at https://github.com/zjukg/UMAEA.
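
To make the benchmarking axis concrete, the sketch below illustrates one plausible way to simulate uncertainly missing visual modality: zeroing out the image embeddings of a random subset of entities at a given missing rate, then sweeping that rate as the abstract describes. This is a minimal sketch, not the authors' code; the function name mask_visual_features and the toy embeddings are hypothetical, and the actual MMEA-UMVM splits may be constructed differently.

    import numpy as np

    def mask_visual_features(img_feats, miss_rate, seed=0):
        """Zero out the visual embeddings of a random subset of entities.

        img_feats : (num_entities, dim) array of image features.
        miss_rate : fraction of entities treated as having no image.
        Returns (masked_feats, present_mask), where present_mask[i] is
        True if entity i keeps its image.
        """
        rng = np.random.default_rng(seed)
        present = rng.random(img_feats.shape[0]) >= miss_rate
        masked = img_feats * present[:, None]  # missing image -> zero vector
        return masked, present

    # Build evaluation settings at increasing missing-modality rates,
    # mirroring the kind of sweep described in the abstract.
    feats = np.random.randn(1000, 512).astype(np.float32)  # toy embeddings
    for rate in (0.0, 0.2, 0.4, 0.6, 0.8):
        masked, present = mask_visual_features(feats, rate, seed=42)
        print(f"miss_rate={rate:.1f}: {int(present.sum())}/1000 entities keep images")

A robust MMEA model should degrade gracefully as the rate grows; the performance oscillations reported in the paper occur when a model instead overfits the noise introduced at masked positions.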

Notes

  1. The appendix is included in the arXiv version of this paper.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (NSFC U19B2027/NSFC 91846204), the joint project DH-2022ZY0012 from Donghai Lab, and the EPSRC project ConCur (EP/V050869/1).

Author information

Corresponding author

Correspondence to Wen Zhang.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Chen, Z. et al. (2023). Rethinking Uncertainly Missing and Ambiguous Visual Modality in Multi-Modal Entity Alignment. In: Payne, T.R., et al. The Semantic Web – ISWC 2023. ISWC 2023. Lecture Notes in Computer Science, vol 14265. Springer, Cham. https://doi.org/10.1007/978-3-031-47240-4_7

  • DOI: https://doi.org/10.1007/978-3-031-47240-4_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-47239-8

  • Online ISBN: 978-3-031-47240-4

  • eBook Packages: Computer Science, Computer Science (R0)
