
Distilling Face Recognition Models Trained Using Margin-Based Softmax Function

  • THEMATIC ISSUE
  • Published in Automation and Remote Control

Abstract

Convolutional neural networks trained with a margin-based softmax function achieve the highest accuracy on the face recognition problem. The spread of embedded systems such as smart intercoms has increased interest in lightweight neural networks, and lightweight models trained with the margin-based softmax function have accordingly been proposed for face identification. In the present paper, we propose a distillation method that achieves higher accuracy than other distillation methods on the LFW, AgeDB-30, and MegaFace face recognition datasets. The main idea of our approach is to use the class centers of the teacher network to initialize the student network; the student is then trained to produce biometric vectors whose angles to the class centers are equal to the corresponding angles in the teacher network.
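The angle-matching idea admits a compact sketch. The following PyTorch code is a minimal illustration only, not the authors' implementation (their code is available at the repository cited in Ref. 27); the function name `angle_distillation_loss`, the tensor names, and the use of an MSE penalty on the cosines are assumptions made for this sketch.

```python
import torch
import torch.nn.functional as F

def angle_distillation_loss(student_emb: torch.Tensor,
                            teacher_emb: torch.Tensor,
                            teacher_centers: torch.Tensor) -> torch.Tensor:
    """Penalize the student when the angles from its biometric vectors to
    the teacher's class centers deviate from the teacher's own angles.

    student_emb, teacher_emb: (batch, d) embeddings of the same images.
    teacher_centers: (num_classes, d) class centers, i.e., the rows of the
        teacher's margin-based softmax classification layer.
    """
    centers = F.normalize(teacher_centers, dim=1)                  # (C, d)
    cos_student = F.normalize(student_emb, dim=1) @ centers.t()    # (B, C)
    cos_teacher = F.normalize(teacher_emb, dim=1) @ centers.t()    # (B, C)
    # Cosine is monotone on [0, pi], so matching cosines matches angles.
    return F.mse_loss(cos_student, cos_teacher)

# Hypothetical training step: the student's classification layer is first
# initialized with the teacher's class centers, and the angle-matching term
# is added to the usual margin-based softmax loss (lam is a weighting
# hyperparameter assumed for this sketch):
#   student_head.weight.data.copy_(teacher_head.weight.data)
#   loss = margin_softmax_loss(logits, labels) + lam * angle_distillation_loss(
#       s_emb, t_emb.detach(), teacher_head.weight.detach())
```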


REFERENCES

  1. Chen, S., Liu, Y., Gao, X., and Han, Z., Mobilefacenets: Efficient CNNs for accurate real-time face verification on mobile devices, in Chin. Conf. Biometric Recognit., Cham: Springer, 2018, pp. 428–438.

  2. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L.C., Mobilenetv2: Inverted residuals and linear bottlenecks, Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (2018), pp. 4510–4520.

  3. Deng, J., Guo, J., Xue, N., and Zafeiriou, S., Arcface: Additive angular margin loss for deep face recognition, Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (2019), pp. 4690–4699.

  4. Liu, W., Wen, Y., Yu, Z., Li, M., Raj, B., and Song, L., Sphereface: Deep hypersphere embedding for face recognition, Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (2017), pp. 212–220.

  5. Wang, H., Wang, Y., Zhou, Z., Ji, X., Gong, D., Zhou, J., Li, Z., and Liu, W., Cosface: Large margin cosine loss for deep face recognition, Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (2018), pp. 5265–5274.

  6. Huang, G.B., Mattar, M., Berg, T., and Learned-Miller, E., Labeled faces in the wild: A database for studying face recognition in unconstrained environments, Workshop Faces in ‘Real-Life’ Images: Detection, Alignment, and Recognition (2008).

  7. Moschoglou, S., Papaioannou, A., Sagonas, C., Deng, J., Kotsia, I., and Zafeiriou, S., Agedb: The first manually collected, in-the-wild age database, Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Workshops (2017), pp. 51–59.

  8. Kemelmacher-Shlizerman, I., Seitz, S.M., Miller, D., and Brossard, E., The megaface benchmark: 1 million faces for recognition at scale, Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (2016), pp. 4873–4882.

  9. Hinton, G., Vinyals, O., and Dean, J., Distilling the knowledge in a neural network, 2015. arXiv:1503.02531.

  10. Fukuda, T., Suzuki, M., Kurata, G., Thomas, S., Cui, J., and Ramabhadran, B., Efficient knowledge distillation from an ensemble of teachers, Interspeech, 2017, pp. 3697–3701.

  11. Sau, B.B. and Balasubramanian, V.N., Deep model compression: Distilling knowledge from noisy teachers, 2016. arXiv:1610.09650.

  12. Furlanello, T., Lipton, Z., Tschannen, M., Itti, L., and Anandkumar, A., Born again neural networks, Int. Conf. Mach. Learn. PMLR (2018), pp. 1607–1616.

  13. Huang, Z. and Wang, N., Like what you like: Knowledge distill via neuron selectivity transfer, 2017. arXiv:1707.01219.

  14. Romero, A., Ballas, N., Kahou, S.E., Chassang, A., Gatta, C., and Bengio, Y., Fitnets: Hints for thin deep nets, 2014. arXiv:1412.6550.

  15. Chen, H., Wang, Y., Xu, C., Xu, C., and Tao, D., Learning student networks via feature embedding, IEEE Trans. Neural Networks Learn. Syst., 2020, vol. 32, no. 1, pp. 25–35.


  16. Park, W., Kim, D., Lu, Y., and Cho, M., Relational knowledge distillation, Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (2019), pp. 3967–3976.

  17. Feng, Y., Wang, H., Hu, H.R., Yu, L., Wang, W., and Wang, S., Triplet distillation for deep face recognition, 2020 IEEE Int. Conf. Image Process. (ICIP) (2020), pp. 808–812.

  18. Duong, C.N., Luu, K., Quach, K.G., and Le, N., Shrinkteanet: Million-scale lightweight face recognition via shrinking teacher–student networks, 2019. arXiv:1905.10620.

  19. Nekhaev, D., Milyaev, S., and Laptev, I., Margin based knowledge distillation for mobile face recognition, in Twelfth Int. Conf. Mach. Vis. (ICMV 2019), Int. Soc. Opt. Photonics, 2020, vol. 11433, 114330O.

  20. He, K., Zhang, X., Ren, S., and Sun, J., Deep residual learning for image recognition, Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (2016), pp. 770–778.

  21. Zhang, K., Zhang, Z., Li, Z., and Qiao, Y., Joint face detection and alignment using multitask cascaded convolutional networks, IEEE Signal Process. Lett., 2016, vol. 23, no. 10, pp. 1499–1503.


  22. Guo, Y., Zhang, L., Hu, Y., He, X., and Gao, J., Ms-celeb-1m: A dataset and benchmark for large-scale face recognition, in Eur. Conf. Comput. Vis., Cham: Springer, 2016, pp. 87–102.

  23. Ng, H.W. and Winkler, S., A data-driven approach to cleaning large face datasets, 2014 IEEE Int. Conf. Image Process. (ICIP) (2014), pp. 343–347.

  24. Robbins, H. and Monro, S., A stochastic approximation method, Ann. Math. Stat., 1951, vol. 22, no. 3, pp. 400–407.

  25. Grabovoy, A.V. and Strijov, V.V., Bayesian distillation of deep learning models, Autom. Remote Control, 2021, vol. 82, no. 11, pp. 1846–1856.


  26. Grabovoy, A.V. and Strijov, V.V., Probabilistic interpretation of the distillation problem, Autom. Remote Control, 2022, vol. 83, no. 1, pp. 123–137.


  27. MarginDistillation: distillation for margin-based softmax. https://github.com/david-svitov/margindistillation. Accessed January 8, 2022.


Author information


Correspondence to D. V. Svitov or S. A. Alyamkin.

Additional information

Translated by V. Potapchouck


About this article


Cite this article

Svitov, D.V., Alyamkin, S.A. Distilling Face Recognition Models Trained Using Margin-Based Softmax Function. Autom Remote Control 83, 1517–1526 (2022). https://doi.org/10.1134/S00051179220100046

