Abstract
Existing multi-modal knowledge graph construction techniques have matured for processing text-modal data, but lack effective methods for other modalities such as visual data. The focus of multi-modal knowledge graph construction therefore lies in image processing and image-text fusion. At present, multi-modal knowledge graph construction pipelines often do not filter images for quality, so the resulting image sets contain noise and near-duplicate images. To address this problem, this paper studies the quality control and screening of images during multi-modal knowledge graph construction and proposes an image-refining framework for multi-modal knowledge graphs, divided into three modules. Experiments show that the framework provides higher-quality images for multi-modal knowledge graphs, and on the benchmark task of multi-modal entity alignment, entity alignment based on the multi-modal knowledge graphs constructed in this paper improves over previous models.
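The near-duplicate problem the abstract describes can be illustrated with a minimal image-deduplication sketch. The average-hash approach, the bit threshold, and the function names below are illustrative assumptions, not the paper's three-module framework:

```python
# Illustrative sketch only: filtering near-duplicate images with a simple
# average hash. The paper's actual refining framework is not specified here;
# the hash scheme and threshold are assumptions for demonstration.

def average_hash(image):
    """Hash a small grayscale image (list of rows of 0-255 intensities):
    each bit records whether a pixel is above the mean intensity."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]

def hamming_distance(h1, h2):
    """Number of differing bits between two hashes."""
    return sum(a != b for a, b in zip(h1, h2))

def filter_near_duplicates(images, threshold=2):
    """Keep an image only if its hash differs from every already-kept
    image by more than `threshold` bits."""
    kept, hashes = [], []
    for img in images:
        h = average_hash(img)
        if all(hamming_distance(h, kh) > threshold for kh in hashes):
            kept.append(img)
            hashes.append(h)
    return kept

# Tiny 2x2 "images": the second is a near-duplicate of the first.
imgs = [
    [[10, 200], [10, 200]],
    [[12, 198], [11, 201]],   # near-duplicate of the first -> filtered out
    [[200, 10], [200, 10]],   # distinct pattern -> kept
]
print(len(filter_near_duplicates(imgs)))  # 2
```

In practice a perceptual hash over downsampled real images would replace this toy version, but the structure (hash, compare, keep-if-distinct) is the same.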
Copyright information
Ā© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Peng, H., Xu, H., Tang, J., Wu, J., Huang, H. (2023). Effectively Filtering Images forĀ Better Multi-modal Knowledge Graph. In: Yang, S., Islam, S. (eds) Web and Big Data. APWeb-WAIM 2022 International Workshops. APWeb-WAIM 2022. Communications in Computer and Information Science, vol 1784. Springer, Singapore. https://doi.org/10.1007/978-981-99-1354-1_2
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-1353-4
Online ISBN: 978-981-99-1354-1
eBook Packages: Computer Science, Computer Science (R0)