Transductive Learning for Near-Duplicate Image Detection in Scanned Photo Collections

  • Conference paper
Document Analysis and Recognition - ICDAR 2023 (ICDAR 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14191)

Abstract

This paper presents a comparative study of near-duplicate image detection techniques in a real-world use case, in which a document management company is commissioned to manually annotate a collection of scanned photographs. Detecting duplicate and near-duplicate photographs can reduce the time archivists spend on manual annotation. Unlike laboratory settings, this use case makes the deployment dataset available in advance, which allows the use of transductive learning. We propose a transductive learning approach that leverages state-of-the-art deep learning architectures such as convolutional neural networks (CNNs) and Vision Transformers (ViTs). The approach pre-trains a deep neural network on a large dataset and then fine-tunes it on the unlabeled target collection with self-supervised learning. The results show that the proposed approach outperforms the baseline methods at near-duplicate image detection on the UKBench dataset and on an in-house private dataset.
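
As an illustration of the detection step only, the following minimal sketch flags near-duplicate pairs by thresholding the cosine similarity between image embeddings. It is not the authors' implementation: the abstract does not specify how pairs are scored, so the function name `find_near_duplicates`, the `threshold` value, and the toy vectors are all assumptions. In practice the embeddings would come from a network that was pre-trained and then fine-tuned on the target collection.

```python
import numpy as np


def find_near_duplicates(embeddings, threshold=0.9):
    """Return index pairs (i, j), i < j, whose cosine similarity >= threshold.

    `embeddings` is an (n_images, dim) array. The threshold is a hypothetical
    value chosen for illustration, not one reported in the paper.
    """
    x = np.asarray(embeddings, dtype=np.float64)
    # L2-normalise each row so that the dot product equals cosine similarity.
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    x = x / np.clip(norms, 1e-12, None)
    sim = x @ x.T
    # Only inspect the strict upper triangle to skip self-pairs and repeats.
    i, j = np.triu_indices(len(x), k=1)
    mask = sim[i, j] >= threshold
    return [(int(a), int(b)) for a, b in zip(i[mask], j[mask])]


# Toy embeddings (assumed): images 0 and 1 point in almost the same direction.
toy = np.array([[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]])
pairs = find_near_duplicates(toy, threshold=0.9)
print(pairs)
```

The all-pairs similarity matrix costs quadratic time and memory in the number of images, which is acceptable for a sketch but would likely need an approximate nearest-neighbour index on a large collection.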

Notes

  1. https://digitaltmuseum.se/

  2. https://docs.opencv.org/3.4/d4/d93/group__img__hash.html

Author information

Corresponding author

Correspondence to Lluis Gómez.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Net, F., Folia, M., Casals, P., Gómez, L. (2023). Transductive Learning for Near-Duplicate Image Detection in Scanned Photo Collections. In: Fink, G.A., Jain, R., Kise, K., Zanibbi, R. (eds) Document Analysis and Recognition - ICDAR 2023. ICDAR 2023. Lecture Notes in Computer Science, vol 14191. Springer, Cham. https://doi.org/10.1007/978-3-031-41734-4_1

  • DOI: https://doi.org/10.1007/978-3-031-41734-4_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-41733-7

  • Online ISBN: 978-3-031-41734-4

  • eBook Packages: Computer Science (R0)
