Semi-Siamese Training for Shallow Face Learning

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12349)

Abstract

Most existing public face datasets, such as MS-Celeb-1M and VGGFace2, provide abundant information in both breadth (a large number of IDs) and depth (a sufficient number of samples per ID) for training. However, in many real-world face recognition scenarios, the training dataset is limited in depth, i.e., only two face images are available for each ID. We define this situation as Shallow Face Learning and find that existing training methods handle it poorly. Unlike deep face data, shallow face data lacks intra-class diversity. This can lead to collapse of the feature dimension, and consequently the learned network can easily suffer from degeneration and over-fitting in the collapsed dimension. In this paper, we address the problem by introducing a novel training method named Semi-Siamese Training (SST). A pair of Semi-Siamese networks constitutes the forward propagation structure, and the training loss is computed with an updating gallery queue, which enables effective optimization on shallow training data. Our method is developed without extra dependencies and can thus be flexibly integrated with existing loss functions and network architectures. Extensive experiments on various face recognition benchmarks show that the proposed method significantly improves training, not only in shallow face learning but also on conventional deep face data.
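The abstract names two ingredients, a pair of Semi-Siamese networks and an updating gallery queue, without giving their details. The following is a minimal NumPy sketch of how such a setup could look, not the authors' implementation. It assumes that the gallery network tracks the probe network through a moving average, that the queue is a fixed-size FIFO of unit-length gallery features, and that the loss is a softmax cross-entropy in which the same-ID gallery feature is the positive and the queue entries are negatives. The names `momentum_update`, `GalleryQueue`, and `queue_softmax_loss`, and the values `m=0.999` and `scale=30.0`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def momentum_update(gallery_w, probe_w, m=0.999):
    """Assumed coupling: gallery weights follow probe weights as a moving average."""
    return m * gallery_w + (1.0 - m) * probe_w


class GalleryQueue:
    """Fixed-size FIFO of gallery features used as negatives (assumed design)."""

    def __init__(self, size, dim, rng):
        # Random unit vectors stand in for features from earlier iterations.
        self.feats = l2_normalize(rng.standard_normal((size, dim)))
        self.ptr = 0

    def enqueue(self, batch):
        """Overwrite the oldest entries; assumes len(batch) <= queue size."""
        idx = (self.ptr + np.arange(len(batch))) % len(self.feats)
        self.feats[idx] = batch
        self.ptr = int((self.ptr + len(batch)) % len(self.feats))


def queue_softmax_loss(probe, gallery, queue_feats, scale=30.0):
    """Cross-entropy where column 0 (same-ID gallery feature) is the target."""
    pos = np.sum(probe * gallery, axis=1, keepdims=True)  # shape (B, 1)
    neg = probe @ queue_feats.T                           # shape (B, K)
    logits = scale * np.concatenate([pos, neg], axis=1)
    logits -= logits.max(axis=1, keepdims=True)           # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[:, 0].mean())


# Toy usage: two IDs, one probe and one gallery feature each (synthetic data).
rng = np.random.default_rng(0)
queue = GalleryQueue(size=8, dim=4, rng=rng)
probe = l2_normalize(rng.standard_normal((2, 4)))
gallery = l2_normalize(probe + 0.1 * rng.standard_normal((2, 4)))
loss = queue_softmax_loss(probe, gallery, queue.feats)
queue.enqueue(gallery)  # the newest gallery features replace the oldest entries
```

In this sketch the queue stands in for a per-class weight matrix, which is one plausible reading of why the abstract describes the approach as compatible with existing loss functions: any margin-based softmax variant could replace the plain cross-entropy used here.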

Keywords

Face recognition, Shallow face learning

Notes

Acknowledgement

This work was supported in part by the National Key Research & Development Program (No. 2020YFC2003901), Chinese National Natural Science Foundation Projects #61872367 and #61572307, and Beijing Academy of Artificial Intelligence (BAAI).

Supplementary material

Supplementary material 1: 504439_1_En_3_MOESM1_ESM.pdf (PDF, 146 KB)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Shanghai University, Shanghai, China
  2. JD AI Research, Beijing, China
  3. NLPR, Institute of Automation, Chinese Academy of Sciences, Beijing, China