A New Loss for Image Retrieval: Class Anchor Margin

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2024)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14646)

Abstract

The performance of neural networks in content-based image retrieval (CBIR) is highly influenced by the chosen loss (objective) function. Most objective functions for neural models fall into one of two families: metric learning and statistical learning. Metric learning approaches require a pair mining strategy that is often inefficient, while statistical learning approaches do not generate highly compact features because they optimize the features only indirectly. To this end, we propose a novel repeller-attractor loss that falls in the metric learning paradigm, yet directly optimizes for the \(L_{2}\) metric without the need to generate pairs. Our loss has three components. The leading objective ensures that the learned features are attracted to their designated learnable class anchors. The second component regulates the anchors and forces them to be separable by a margin, while the third ensures that the anchors do not collapse to zero. Furthermore, we develop a more efficient two-stage retrieval system that harnesses the learned class anchors during the first stage of the retrieval process, eliminating the need to compare the query with every image in the database. We evaluate the proposed objective on the CBIR task on three datasets (CIFAR-100, Food-101, and ImageNet-200), using both convolutional and transformer architectures. Compared to existing objective functions, our empirical evidence shows that the proposed objective generates superior and more consistent results.
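
The abstract describes the loss and the two-stage retrieval only in words, so the following is a minimal sketch of how the three components and the anchor-based first stage might fit together, not the formulation used in the chapter. The squared-hinge forms, the equal weighting of the three terms, the `2 * margin` separation threshold, and the names `class_anchor_loss`, `two_stage_retrieve`, `margin`, and `min_norm` are all assumptions made for illustration. In a real setting the anchors would be learnable parameters updated by gradient descent; here they are fixed lists so that the example runs with the standard library alone.

```python
import math


def l2(u, v):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def class_anchor_loss(embeddings, labels, anchors, margin=1.0, min_norm=1.0):
    """Toy repeller-attractor loss with three terms (assumed forms).

    attract: mean squared L2 distance from each embedding to its class anchor
    repel:   squared hinge on each anchor pair closer than 2 * margin
    norm:    squared hinge on each anchor whose norm is below min_norm
    """
    origin = [0.0] * len(anchors[0])
    attract = sum(l2(e, anchors[y]) ** 2 for e, y in zip(embeddings, labels))
    attract /= len(embeddings)
    repel = sum(
        max(0.0, 2 * margin - l2(anchors[i], anchors[j])) ** 2
        for i in range(len(anchors))
        for j in range(i + 1, len(anchors))
    )
    norm = sum(max(0.0, min_norm - l2(c, origin)) ** 2 for c in anchors)
    return attract + repel + norm


def two_stage_retrieve(query, anchors, database, top_k=3):
    """Stage 1: pick the nearest class anchor. Stage 2: rank only that class.

    `database` (hypothetical layout) maps a class index to a list of
    (item_id, embedding) pairs.
    """
    nearest = min(range(len(anchors)), key=lambda c: l2(query, anchors[c]))
    ranked = sorted(database[nearest], key=lambda item: l2(query, item[1]))
    return nearest, [item_id for item_id, _ in ranked[:top_k]]


# Two well-separated anchors: only the attractor term is non-zero.
anchors = [[2.0, 0.0], [-2.0, 0.0]]
embeddings = [[2.0, 1.0], [-2.0, -1.0]]
labels = [0, 1]
loss = class_anchor_loss(embeddings, labels, anchors)

# Made-up database entries, grouped by class index.
database = {
    0: [("img_a", [2.1, 0.2]), ("img_b", [1.5, -0.5])],
    1: [("img_c", [-2.0, 0.1])],
}
result = two_stage_retrieve([1.9, 0.1], anchors, database, top_k=2)
print(loss, result)
```

In this sketch the first stage compares the query against one anchor per class, and the second stage ranks only the images stored under the selected class, which is why the query never needs to be compared with every image in the database.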



Author information

Corresponding author

Correspondence to Radu Tudor Ionescu.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Ghiţă, A., Ionescu, R.T. (2024). A New Loss for Image Retrieval: Class Anchor Margin. In: Yang, DN., Xie, X., Tseng, V.S., Pei, J., Huang, JW., Lin, J.CW. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2024. Lecture Notes in Computer Science, vol 14646. Springer, Singapore. https://doi.org/10.1007/978-981-97-2253-2_4

  • DOI: https://doi.org/10.1007/978-981-97-2253-2_4

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-2252-5

  • Online ISBN: 978-981-97-2253-2

  • eBook Packages: Computer Science (R0)
