ProxyNCA++: Revisiting and Revitalizing Proxy Neighborhood Component Analysis

  • Conference paper
Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12369)

Abstract

We consider the problem of distance metric learning (DML), where the task is to learn an effective similarity measure between images. We revisit ProxyNCA and incorporate several enhancements. We find that low temperature scaling is a performance-critical component and explain why it works. We also find that Global Max Pooling generally works better than Global Average Pooling. In addition, our proposed fast-moving proxies address the small-gradient issue of proxies, and this component synergizes well with low temperature scaling and Global Max Pooling. Our enhanced model, called ProxyNCA++, achieves an average improvement of 22.9 percentage points in Recall@1 across four different zero-shot retrieval datasets compared to the original ProxyNCA algorithm. Furthermore, we achieve state-of-the-art results on the CUB200, Cars196, SOP, and InShop datasets, with Recall@1 scores of 72.2, 90.1, 81.4, and 90.9, respectively.
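
To make two of the components named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a softmax over negative squared distances between an L2-normalized embedding and L2-normalized proxies, which the abstract itself does not spell out. The function names, the array shapes, and the temperature value 0.1 are illustrative assumptions. Fast-moving proxies are not covered here.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    # Scale vectors to unit length; eps guards against division by zero.
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)

def global_max_pool(feature_map):
    # feature_map: (channels, height, width) -> (channels,)
    return feature_map.max(axis=(1, 2))

def global_avg_pool(feature_map):
    # Same input layout as global_max_pool, averaged over spatial positions.
    return feature_map.mean(axis=(1, 2))

def proxy_probabilities(embedding, proxies, temperature=1.0):
    # Assumed formulation: softmax over negative squared distances to proxies,
    # with the logits divided by a temperature (lower = sharper distribution).
    x = l2_normalize(embedding)
    p = l2_normalize(proxies)
    sq_dist = np.sum((p - x) ** 2, axis=1)
    logits = -sq_dist / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

# Illustrative random data: one 8-channel 4x4 feature map and 5 class proxies.
rng = np.random.default_rng(0)
feature_map = rng.normal(size=(8, 4, 4))
proxies = rng.normal(size=(5, 8))

embedding = global_max_pool(feature_map)
probs_t1 = proxy_probabilities(embedding, proxies, temperature=1.0)
probs_low = proxy_probabilities(embedding, proxies, temperature=0.1)
```

With a lower temperature the same distances produce a more peaked distribution over proxies, so `probs_low` concentrates more mass on the nearest proxy than `probs_t1` does.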

Notes

  1. For additional experiments on different crop sizes, please refer to the corresponding supplementary materials.

Author information

Corresponding author

Correspondence to Eu Wern Teh.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 242 KB)

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Teh, E.W., DeVries, T., Taylor, G.W. (2020). ProxyNCA++: Revisiting and Revitalizing Proxy Neighborhood Component Analysis. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12369. Springer, Cham. https://doi.org/10.1007/978-3-030-58586-0_27

  • DOI: https://doi.org/10.1007/978-3-030-58586-0_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58585-3

  • Online ISBN: 978-3-030-58586-0

  • eBook Packages: Computer Science, Computer Science (R0)
