A framework to enhance generalization of deep metric learning methods using general discriminative feature learning and class adversarial neural networks

Applied Intelligence

Abstract

Deep Metric Learning (DML) methods automatically extract features from data and learn a non-linear transformation from the input space to a semantic embedding space. Many DML methods focus on enhancing the discriminative power of the learned metric by proposing novel sampling strategies or loss functions. This approach is very helpful when both the training and test examples are drawn from the same set of categories. However, it is less effective in many applications of DML, such as image retrieval and person re-identification, where the model should learn general semantic concepts from observed classes and use them to rank or identify objects from unseen categories. Neglecting the generalization ability of the learned representation and emphasizing only a more discriminative embedding on the observed classes may lead to overfitting. To address this limitation, we propose a framework that enhances the generalization power of existing DML methods in a Zero-Shot Learning (ZSL) setting through general yet discriminative representation learning and a class adversarial neural network. To learn a general representation, we employ feature maps of intermediate layers in a deep neural network and enhance their discriminative power through an attention mechanism. In addition, a class adversarial network forces the deep model to seek class-invariant features. We evaluate our work on widely used machine vision datasets in a ZSL setting. Extensive experimental results confirm that our framework improves the generalization of existing DML methods and consistently outperforms baseline DML algorithms on unseen classes.
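The zero-shot evaluation setting described above can be illustrated with a minimal sketch. It is not the authors' evaluation code. The helper names `split_classes` and `recall_at_1`, the toy embeddings, and the choice of Recall@1 with Euclidean distance as the retrieval metric are all assumptions made for illustration. The sketch only shows the key property of the setting: the classes used for evaluation are disjoint from those available for training.

```python
import math

def split_classes(labels, num_seen):
    """Hypothetical helper: the first `num_seen` sorted class ids are 'seen'
    (training) classes; all remaining classes are 'unseen' (test) classes."""
    classes = sorted(set(labels))
    seen = set(classes[:num_seen])
    seen_idx = [i for i, y in enumerate(labels) if y in seen]
    unseen_idx = [i for i, y in enumerate(labels) if y not in seen]
    return seen_idx, unseen_idx

def recall_at_1(embeddings, labels):
    """Fraction of queries whose nearest other point (Euclidean distance)
    carries the same label as the query."""
    hits = 0
    for i, query in enumerate(embeddings):
        best_j, best_d = None, math.inf
        for j, other in enumerate(embeddings):
            if j == i:
                continue  # a query never retrieves itself
            d = math.dist(query, other)
            if d < best_d:
                best_j, best_d = j, d
        hits += labels[best_j] == labels[i]
    return hits / len(embeddings)

# Toy data (assumed): 4 classes with 2 points each; classes 0-1 are seen, 2-3 unseen.
labels = [0, 0, 1, 1, 2, 2, 3, 3]
embeddings = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0),
              (5.0, 5.0), (5.1, 5.0), (5.05, 5.2), (9.0, 9.0)]

seen_idx, unseen_idx = split_classes(labels, num_seen=2)

# Retrieval quality is measured only on samples from unseen classes.
test_emb = [embeddings[i] for i in unseen_idx]
test_lab = [labels[i] for i in unseen_idx]
unseen_recall = recall_at_1(test_emb, test_lab)
print(f"Recall@1 on unseen classes: {unseen_recall:.2f}")
```

In the toy data, one point of class 3 lies close to the class 2 cluster, so its nearest neighbour has the wrong label. This mirrors how an embedding that separates the seen classes well can still confuse classes it was never trained on.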


Notes

  1. Content-Based Information Retrieval

  2. Zero-Shot Learning

  3. Few-Shot Learning

  4. Normalized Mutual Information

  5. Deep Adversarial Metric Learning

  6. Downloaded from: https://github.com/dichotomies/proxy-nca

  7. Adaptive Adversarial Deep Metric Learning

  8. Source code: https://github.com/KevinMusgrave/pytorch-metric-learning

  9. Source code: https://github.com/tomp11/metric_learning

  10. Source code: https://github.com/dichotomies/proxy-nca

  11. Normalized Mutual Information

  12. General Discriminative Feature Learning


Acknowledgments

We would like to acknowledge the Machine Learning Lab in the Engineering Faculty of FUM for their kind and technical support.

Availability of data

Datasets used in the experiments are publicly available and can be downloaded from the following links:

1. Oxford 102 Flowers: https://www.robots.ox.ac.uk/~vgg/data/flowers/102/

2. CUB-200-2011: https://github.com/cyizhuo/CUB-200-2011-dataset

3. CARS-196: https://ai.stanford.edu/~jkrause/cars/car_dataset.html

Author information

Corresponding author

Correspondence to Reza Monsefi.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Al-Kaabi, K., Monsefi, R. & Zabihzadeh, D. A framework to enhance generalization of deep metric learning methods using general discriminative feature learning and class adversarial neural networks. Appl Intell 53, 8693–8711 (2023). https://doi.org/10.1007/s10489-022-03959-6

  • DOI: https://doi.org/10.1007/s10489-022-03959-6
