Sub-center ArcFace: Boosting Face Recognition by Large-Scale Noisy Web Faces

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12356)


Margin-based deep face recognition methods (e.g. SphereFace, CosFace, and ArcFace) have achieved remarkable success in unconstrained face recognition. However, these methods are susceptible to the massive label noise in the training data and thus require laborious human effort to clean the datasets. In this paper, we relax the intra-class constraint of ArcFace to improve its robustness to label noise. More specifically, we design K sub-centers for each class, and a training sample only needs to be close to any one of the K positive sub-centers instead of a single positive center. The proposed sub-center ArcFace encourages one dominant sub-class that contains the majority of clean faces and non-dominant sub-classes that include hard or noisy faces. Extensive experiments confirm the robustness of sub-center ArcFace under massive real-world noise. After the model achieves enough discriminative power, we directly drop the non-dominant sub-centers and high-confidence noisy samples, which helps recapture intra-class compactness, decreases the influence of noise, and achieves performance comparable to ArcFace trained on the manually cleaned dataset. By taking advantage of large-scale raw web faces (Celeb500K), sub-center ArcFace achieves state-of-the-art performance on IJB-B, IJB-C, MegaFace, and FRVT.
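
The sub-center idea described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation. It assumes that each class holds K sub-center vectors, that cosine similarity is max-pooled over the K sub-centers, and that an additive angular margin is then applied to the target class as in ArcFace. The function names, the array layout, and the default values `scale=64.0` and `margin=0.5` are assumptions chosen for illustration and do not come from the text above.

```python
import numpy as np


def l2_normalize(x, axis):
    """Scale vectors along `axis` to unit length."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)


def subcenter_arcface_logits(embeddings, sub_centers, labels,
                             scale=64.0, margin=0.5):
    """Sketch of sub-center margin logits (scale and margin are assumed).

    embeddings:  (B, D) face features
    sub_centers: (N, K, D) K sub-centers for each of N classes
    labels:      (B,) integer class labels
    Returns the (B, N) logits and, per sample, the index of the nearest
    sub-center within its labelled class.
    """
    x = l2_normalize(embeddings, axis=1)
    w = l2_normalize(sub_centers, axis=2)
    # Cosine similarity between every sample and every sub-center: (B, N, K).
    cos_all = np.einsum("bd,nkd->bnk", x, w)
    # Max-pool over K: a sample only has to be close to ONE sub-center.
    cos = cos_all.max(axis=2)
    rows = np.arange(len(labels))
    theta = np.arccos(np.clip(cos[rows, labels], -1.0, 1.0))
    logits = cos.copy()
    # Additive angular margin on the target class only. Real implementations
    # usually also guard the case theta + margin > pi; omitted here.
    logits[rows, labels] = np.cos(theta + margin)
    nearest = cos_all.argmax(axis=2)[rows, labels]
    return scale * logits, nearest


def dominant_sub_centers(nearest, labels, num_classes, k):
    """Per class, pick the sub-center that attracts the most samples."""
    counts = np.zeros((num_classes, k), dtype=int)
    np.add.at(counts, (labels, nearest), 1)
    return counts.argmax(axis=1)


# Toy usage: 2 classes, 2 sub-centers each, 2-D features.
centers = np.array([[[1.0, 0.0], [0.0, 1.0]],
                    [[-1.0, 0.0], [0.0, -1.0]]])
feats = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]])
labels = np.array([0, 0, 0])
logits, nearest = subcenter_arcface_logits(feats, centers, labels)
dominant = dominant_sub_centers(nearest, labels, num_classes=2, k=2)
```

In this toy case two of the three samples fall on the first sub-center of class 0, so that sub-center is reported as dominant. Dropping the non-dominant sub-centers would then amount to keeping only `centers[n, dominant[n]]` for each class `n`. Samples that remain far from their dominant sub-center could be treated as noise candidates; any threshold used for that decision would have to be chosen separately.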


Keywords: Face recognition, Sub-class, Large-scale, Noisy data



Jiankang Deng acknowledges the Imperial President’s PhD Scholarship. Tongliang Liu acknowledges support from the Australian Research Council Project DE-190101473. Stefanos Zafeiriou acknowledges support from the Google Faculty Fellowship, EPSRC DEFORM (EP/S010203/1) and FACER2VM (EP/N007743/1). We are thankful to Nvidia for the GPU donations.




Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Imperial College, London, UK
  2. InsightFace, London, UK
  3. University of Sydney, Sydney, Australia
  4. University of Melbourne, Melbourne, Australia
