Abstract
Earth Observation (EO) Big Data collections are acquired in large volumes and great variety, owing to their highly heterogeneous nature. The multimodal character of EO Big Data requires an effective combination of multiple modalities for similarity search. We propose a late fusion mechanism of multiple rankings to combine the results from several uni-modal searches in Sentinel 2 image collections. We first create a K-order tensor from the results of separate searches by visual features, concepts, spatial and temporal information. Visual concepts and features are based on a vector representation from Deep Convolutional Neural Networks. 2D-surfaces of the K-order tensor initially provide candidate retrieved results per ranking position and are then merged to obtain the final list of retrieved results. Satellite image patches are used as queries in order to retrieve the most relevant image patches in Sentinel 2 images. Quantitative and qualitative results show that the proposed method outperforms search by a single modality and other late fusion methods.
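The idea of merging several uni-modal rankings by ranking position can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the tensor surfaces are constructed or merged, so the sketch assumes a simplified reading in which an item retrieved by all K modalities occupies the tensor cell indexed by its K rank positions and becomes a candidate at the position equal to its worst (largest) rank. The function name `fuse_by_rank_position`, the tie-breaking rule, and the patch identifiers are all hypothetical.

```python
def fuse_by_rank_position(rankings, top_n=None):
    """Fuse K ranked lists (best item first) into one list.

    Assumption (not from the paper): an item with ranks (r1, ..., rK)
    becomes a candidate at position max(r1, ..., rK). Ties are broken
    by the sum of ranks, then by item id, to keep the output stable.
    """
    # 1-based rank position of every item in every uni-modal list
    positions = [
        {item: rank for rank, item in enumerate(ranked, start=1)}
        for ranked in rankings
    ]
    # Only items returned by every modality have a defined tensor cell
    common = set(positions[0]).intersection(*positions[1:])

    def sort_key(item):
        ranks = [pos[item] for pos in positions]
        return (max(ranks), sum(ranks), str(item))

    fused = sorted(common, key=sort_key)
    return fused if top_n is None else fused[:top_n]


# Hypothetical results of three uni-modal searches for one query patch
by_visual_features = ["p3", "p1", "p2", "p4"]
by_visual_concepts = ["p1", "p3", "p4", "p2"]
by_location = ["p1", "p2", "p3", "p4"]

fused = fuse_by_rank_position(
    [by_visual_features, by_visual_concepts, by_location]
)
print(fused)  # ['p1', 'p3', 'p2', 'p4']
```

In this toy example, `p1` is ranked first because no modality places it below position 2, whereas `p3` tops the visual-feature list but drops to position 3 in the location-based list.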
Acknowledgements
This work was supported by the EC-funded projects H2020-832876-aqua3S and H2020-776019-EOPEN.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Gialampoukidis, I., Moumtzidou, A., Bakratsas, M., Vrochidis, S., Kompatsiaris, I. (2021). A Multimodal Tensor-Based Late Fusion Approach for Satellite Image Search in Sentinel 2 Images. In: Lokoč, J., et al. MultiMedia Modeling. MMM 2021. Lecture Notes in Computer Science, vol 12573. Springer, Cham. https://doi.org/10.1007/978-3-030-67835-7_25
DOI: https://doi.org/10.1007/978-3-030-67835-7_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67834-0
Online ISBN: 978-3-030-67835-7
eBook Packages: Computer Science, Computer Science (R0)