
Learning an Evolutionary Embedding via Massive Knowledge Distillation

International Journal of Computer Vision

Abstract

Knowledge distillation methods aim to transfer knowledge from a large, powerful teacher network to a small, compact student network. These methods often focus on closed-set classification problems and on matching features between the teacher and student networks for each individual sample. However, many real-world classification problems are open-set. This paper proposes an Evolutionary Embedding Learning (EEL) framework that learns a fast and accurate student network for open-set problems via massive knowledge distillation. First, we revisit the formulation of canonical knowledge distillation and adapt it to open-set problems with massive numbers of classes. Second, by introducing an angular constraint, we propose a novel correlated embedding loss (CEL) that matches the embedding spaces of the teacher and student networks from a global perspective. Lastly, we propose a simple yet effective paradigm for developing a fast and accurate student network via knowledge distillation. We show that an accelerated student network can be obtained without sacrificing accuracy relative to its teacher network. The experimental results are encouraging: EEL outperforms other state-of-the-art methods on various large-scale open-set problems, including face recognition, vehicle re-identification and person re-identification.
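
To make the global-matching idea concrete, the following is a minimal, illustrative PyTorch sketch of a correlation-style distillation term: batch embeddings from the teacher and student are L2-normalized, and the student is trained to reproduce the teacher's pairwise cosine (angular) structure over the batch rather than to copy individual feature vectors. The function name, the squared-error penalty and the weighting factor lam are assumptions for illustration only; the paper's exact CEL formulation may differ.

    import torch
    import torch.nn.functional as F

    def correlated_embedding_loss(student_emb: torch.Tensor,
                                  teacher_emb: torch.Tensor) -> torch.Tensor:
        # Illustrative sketch, not the paper's exact loss.
        # Both inputs are (batch, dim) embeddings. After L2 normalization,
        # pairwise inner products equal the cosines of inter-sample angles,
        # so each matrix below summarizes one network's embedding geometry
        # over the whole batch (a global view, not a per-sample match).
        s = F.normalize(student_emb, dim=1)
        t = F.normalize(teacher_emb, dim=1)
        s_corr = s @ s.t()   # student pairwise angular structure
        t_corr = t @ t.t()   # teacher pairwise angular structure
        return F.mse_loss(s_corr, t_corr)

    # Typical use (hypothetical weighting factor lam): add the term to the
    # student's own open-set recognition loss, with teacher features detached.
    # loss = recognition_loss(student_logits, labels) \
    #        + lam * correlated_embedding_loss(f_student, f_teacher.detach())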



Notes

  1. https://github.com/AlfredXiangWu/LightCNN.


Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 61622310, Grant 61721004, and in part by the Beijing Natural Science Foundation Grant JQ18017.

Author information

Correspondence to Ran He.

Additional information

Communicated by Li Liu, Matti Pietikäinen, Jie Qin, Jie Chen, Wanli Ouyang, Luc Van Gool.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Wu, X., He, R., Hu, Y. et al. Learning an Evolutionary Embedding via Massive Knowledge Distillation. Int J Comput Vis 128, 2089–2106 (2020). https://doi.org/10.1007/s11263-019-01286-x

