Transductive Learning for Near-Duplicate Image Detection in Scanned Photo Collections

Net, Francesc; Folia, Marc; Casals, Pep; Gómez, Lluis

doi:10.1007/978-3-031-41734-4_1

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14191)

Included in the following conference series:

International Conference on Document Analysis and Recognition

Abstract

This paper presents a comparative study of near-duplicate image detection techniques in a real-world use case, in which a document management company is commissioned to manually annotate a collection of scanned photographs. Detecting duplicate and near-duplicate photographs can reduce the time archivists spend on manual annotation. This use case differs from laboratory settings in that the deployment dataset is available in advance, which allows the use of transductive learning. We propose a transductive learning approach that leverages state-of-the-art deep learning architectures such as convolutional neural networks (CNNs) and Vision Transformers (ViTs). Our approach involves pre-training a deep neural network on a large dataset and then fine-tuning the network on the unlabeled target collection with self-supervised learning. The results show that the proposed approach outperforms the baseline methods in the task of near-duplicate image detection on the UKBench dataset and on an in-house private dataset.
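
As an illustration of the detection step only, the following minimal sketch shows one common way embeddings from a network like the one described above could be used to flag near-duplicate pairs: compare feature vectors by cosine similarity and keep the pairs above a threshold. This is not taken from the paper. The function names, the toy 3-dimensional vectors, and the 0.95 threshold are all assumptions made for the example; in practice the vectors would come from the fine-tuned CNN or ViT and the threshold would need to be tuned on the collection.

```python
import math
from itertools import combinations


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


def find_near_duplicates(embeddings, threshold=0.95):
    """Return (id_a, id_b, similarity) for every pair at or above the threshold.

    `embeddings` maps an image id to its feature vector. The all-pairs loop is
    quadratic in the number of images, which is fine for a sketch but would
    need an index structure for a large collection.
    """
    unit = {key: l2_normalize(vec) for key, vec in embeddings.items()}
    pairs = []
    for a, b in combinations(sorted(unit), 2):
        sim = sum(x * y for x, y in zip(unit[a], unit[b]))
        if sim >= threshold:
            pairs.append((a, b, sim))
    # Most similar pairs first
    return sorted(pairs, key=lambda p: -p[2])


# Hypothetical toy embeddings, invented for illustration only.
toy = {
    "scan_001": [0.90, 0.10, 0.05],
    "scan_002": [0.88, 0.12, 0.06],
    "scan_003": [0.05, 0.20, 0.95],
}
duplicates = find_near_duplicates(toy, threshold=0.95)
for a, b, sim in duplicates:
    print(f"{a} ~ {b}: cosine similarity {sim:.3f}")
```

Because the whole target collection is known in advance in the transductive setting, all pairwise similarities can be computed offline once the network has been adapted to the collection.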

Author information

Authors and Affiliations

Computer Vision Center, Universitat Autònoma de Barcelona, Bellaterra, Catalunya, Spain
Francesc Net & Lluis Gómez
Nubilum, Gran Via de les Corts Catalanes 575, 1r 1a, 08011, Barcelona, Spain
Marc Folia & Pep Casals

Corresponding author

Correspondence to Lluis Gómez.

Editor information

Editors and Affiliations

TU Dortmund University, Dortmund, Germany
Gernot A. Fink
Adobe, College Park, MD, USA
Rajiv Jain
Osaka Metropolitan University, Osaka, Japan
Koichi Kise
Rochester Institute of Technology, Rochester, NY, USA
Richard Zanibbi

About this paper

Cite this paper

Net, F., Folia, M., Casals, P., Gómez, L. (2023). Transductive Learning for Near-Duplicate Image Detection in Scanned Photo Collections. In: Fink, G.A., Jain, R., Kise, K., Zanibbi, R. (eds) Document Analysis and Recognition - ICDAR 2023. ICDAR 2023. Lecture Notes in Computer Science, vol 14191. Springer, Cham. https://doi.org/10.1007/978-3-031-41734-4_1

DOI: https://doi.org/10.1007/978-3-031-41734-4_1
Published: 19 August 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-41733-7
Online ISBN: 978-3-031-41734-4
eBook Packages: Computer Science, Computer Science (R0)
