Abstract
Multi-modal data is becoming pervasive in the digital era, giving rise to compelling scenarios that require cross-modal linkage, such as linking image data with databases. We outline a critical matching/linking task within that space, which we call cross-modal common entity identification. The task involves linking images with structured databases with the aid of available unstructured information. We propose a framework and method, ICE, which embodies a structured approach to this task: information extraction from images, then person matching, then identification of a common entity that unites the people represented in the image. We curate data sources from the entertainment domain, upon which we illustrate the effectiveness of our method. We hope ICE will generate interest in other tasks within the realm of multi-modal data processing at the intersection of image processing, NLP and databases.
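The stages named in the abstract (person matching followed by common entity identification) can be illustrated with a minimal Python sketch. This is not the ICE method itself, whose details the abstract does not give: the toy database `CAST_DB`, the function names `match_people` and `common_entity`, the fuzzy string matching via `difflib`, and the overlap-count scoring are all assumptions made for illustration. The image information extraction stage is assumed to have already produced a list of name labels.

```python
import difflib

# Hypothetical toy database: entity (e.g. a film) -> people associated with it.
CAST_DB = {
    "Film A": {"Asha Rao", "Vikram Sen", "Meera Nair"},
    "Film B": {"Asha Rao", "Rohan Das"},
    "Film C": {"Vikram Sen", "Meera Nair", "Rohan Das"},
}


def match_people(extracted_labels, known_people, cutoff=0.8):
    """Map noisy labels extracted from an image to known person names.

    Uses case-insensitive fuzzy string matching; labels with no
    sufficiently close known name are dropped.
    """
    lookup = {name.lower(): name for name in known_people}
    matched = []
    for label in extracted_labels:
        hits = difflib.get_close_matches(
            label.lower(), list(lookup), n=1, cutoff=cutoff
        )
        if hits:
            matched.append(lookup[hits[0]])
    return matched


def common_entity(matched_people, db):
    """Return (entity, overlap) for the entity sharing the most matched people.

    Ties are broken alphabetically by entity name. Returns None when no
    entity shares any person with the input.
    """
    people = set(matched_people)
    scores = {entity: len(cast & people) for entity, cast in db.items()}
    if not scores or max(scores.values()) == 0:
        return None
    best = min(scores, key=lambda entity: (-scores[entity], entity))
    return best, scores[best]


if __name__ == "__main__":
    all_people = set().union(*CAST_DB.values())
    labels = ["asha rao", "Vikram Senn", "unknown face"]  # assumed extractor output
    people = match_people(labels, all_people)
    print(people, common_entity(people, CAST_DB))
```

Counting overlaps is the simplest possible scoring choice. A real system would likely weight matches by recognition confidence and draw on the unstructured information the abstract mentions.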
P. Prakash, J. Rawal and S. Gupta—Contributed equally to this research.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Prakash, P., Rawal, J., Gupta, S., P, D., Mohania, M. (2022). Cross-modal Data Linkage for Common Entity Identification. In: Advanced Data Mining and Applications. ADMA 2022. Lecture Notes in Computer Science, vol 13088. Springer, Cham. https://doi.org/10.1007/978-3-030-95408-6_23
DOI: https://doi.org/10.1007/978-3-030-95408-6_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-95407-9
Online ISBN: 978-3-030-95408-6
eBook Packages: Computer Science, Computer Science (R0)