International Journal of Computer Vision, Volume 127, Issue 6–7, pp 684–700

Large-Scale Bisample Learning on ID Versus Spot Face Recognition

  • Xiangyu Zhu
  • Hao Liu
  • Zhen Lei (corresponding author)
  • Hailin Shi
  • Fan Yang
  • Dong Yi
  • Guojun Qi
  • Stan Z. Li
Article

Abstract

In real-world face recognition applications, there is a tremendous amount of data with only two images for each person: an ID photo used for face enrollment and a probe photo captured on the spot. Most existing methods are designed for training data with limited breadth (a relatively small number of classes) and sufficient depth (many samples for each class). They face great challenges on ID versus Spot (IvS) data, including under-represented intra-class variations and excessive demands on computing devices. In this paper, we propose a deep-learning-based large-scale bisample learning (LBL) method for IvS face recognition. To tackle the bisample problem, in which each class has only two samples, a classification–verification–classification training strategy is proposed to progressively enhance IvS performance. In addition, a dominant prototype softmax is incorporated to make deep learning scalable to a very large number of classes. We conduct LBL on an IvS face dataset with more than two million identities. Experimental results show that the proposed method achieves superior performance to previous ones, validating the effectiveness of LBL for IvS face recognition.

Keywords

Face recognition, ID versus spot, Large-scale bisample learning, Dominant prototype softmax

Notes

Acknowledgements

This work was supported by the Chinese National Natural Science Foundation Projects #61876178 and #61806196, the National Key Research and Development Plan (Grant No. 2016YFC0801002), and AuthenMetric R&D Funds. Zhen Lei is the corresponding author.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Xiangyu Zhu (1, 2)
  • Hao Liu (1, 2)
  • Zhen Lei (1, 2), corresponding author
  • Hailin Shi (1)
  • Fan Yang (3)
  • Dong Yi (4)
  • Guojun Qi (5)
  • Stan Z. Li (1, 2)

  1. Center for Biometrics and Security Research and National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
  2. University of Chinese Academy of Sciences, Beijing, China
  3. College of Software, Beihang University, Beijing, China
  4. DAMO Academy, Alibaba Group, Zhejiang, China
  5. HUAWEI Cloud, Boston, USA
