Abstract
Robust face representation is imperative to highly accurate face recognition. In this work, we propose an open-source face recognition method with deep representation, named VIPLFaceNet, which is a 10-layer deep convolutional neural network with seven convolutional layers and three fully-connected layers. Compared with the well-known AlexNet, VIPLFaceNet takes only 20% of the training time and 60% of the testing time, yet achieves a 40% drop in error rate on the real-world face recognition benchmark LFW. VIPLFaceNet achieves 98.60% mean accuracy on LFW using a single network. An open-source C++ SDK based on VIPLFaceNet is released under the BSD license. The SDK takes about 150 ms to process one face image in a single thread on an i7 desktop CPU. VIPLFaceNet provides a state-of-the-art starting point for both academic and industrial face recognition applications.
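The figures in the abstract can be related to one another with a little arithmetic. The sketch below assumes that the "40% drop in error rate" is a relative reduction with respect to AlexNet and that the 98.60% accuracy is the figure it applies to. Under that assumption it derives the baseline error rate the abstract implies, along with the single-thread throughput that follows from the stated 150 ms per image. The function names are hypothetical, and the derived baseline is an inference from the abstract, not a number it reports.

```python
def implied_baseline_error(final_accuracy_pct: float, relative_drop: float) -> float:
    """Baseline error rate (in %) implied by a relative error reduction.

    Assumes final_error = baseline_error * (1 - relative_drop).
    """
    final_error = 100.0 - final_accuracy_pct
    return final_error / (1.0 - relative_drop)


def throughput_per_second(latency_ms: float) -> float:
    """Images processed per second by one thread at the given per-image latency."""
    return 1000.0 / latency_ms


# Values taken from the abstract.
LFW_ACCURACY_PCT = 98.60   # mean accuracy on LFW with a single network
RELATIVE_ERROR_DROP = 0.40  # assumed to be relative to AlexNet
SDK_LATENCY_MS = 150.0      # approximate time per face image, single thread

final_error = 100.0 - LFW_ACCURACY_PCT
baseline_error = implied_baseline_error(LFW_ACCURACY_PCT, RELATIVE_ERROR_DROP)

print(f"VIPLFaceNet error rate: {final_error:.2f}%")
print(f"Implied baseline error rate: {baseline_error:.2f}%")
print(f"Implied baseline accuracy: {100.0 - baseline_error:.2f}%")
print(f"Single-thread throughput: {throughput_per_second(SDK_LATENCY_MS):.1f} images/s")
```

Read this way, a 1.40% error rate after a 40% relative reduction corresponds to a baseline error of roughly 2.3%, and 150 ms per image corresponds to between six and seven images per second on one thread.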
Acknowledgements
This work was partially supported by the National Basic Research Program of China (973 Program) (2015CB351802), and the National Natural Science Foundation of China (Grant Nos. 61402443, 61390511, 61379083, 61222211).
Additional information
Xin Liu received the BS degree from Chongqing University, China in 2011. Currently, he is a PhD candidate at the Institute of Computing Technology, Chinese Academy of Sciences, China. His research interests include face recognition, image retrieval, and deep learning.
Meina Kan is an associate professor with the Institute of Computing Technology, Chinese Academy of Sciences, China. She received the PhD degree from the University of Chinese Academy of Sciences, China. Her research mainly focuses on computer vision, especially face recognition, transfer learning, and deep learning.
Wanglong Wu received the BS degree from Beijing Jiaotong University, China in 2014. Currently, he is a PhD candidate at the Institute of Computing Technology, Chinese Academy of Sciences, China. His research interests include face recognition and deep learning.
Shiguang Shan received the MS degree in computer science from Harbin Institute of Technology, China in 1999, and the PhD degree in computer science from the Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS), China in 2004. He joined ICT, CAS in 2002 and has been a professor since 2010. He is now the deputy director of the Key Lab of Intelligent Information Processing of CAS. His research interests cover computer vision, pattern recognition, and machine learning, with a particular focus on face recognition. He has published more than 200 papers in refereed journals and proceedings.
Xilin Chen received the BS, MS, and PhD degrees in computer science from Harbin Institute of Technology, China in 1988, 1991, and 1994, respectively. He is now a professor with the Institute of Computing Technology, Chinese Academy of Sciences, China. He has authored one book and over 200 papers in refereed journals and proceedings in the areas of computer vision, pattern recognition, image processing, and multimodal interfaces.
Cite this article
Liu, X., Kan, M., Wu, W. et al. VIPLFaceNet: an open source deep face recognition SDK. Front. Comput. Sci. 11, 208–218 (2017). https://doi.org/10.1007/s11704-016-6076-3
DOI: https://doi.org/10.1007/s11704-016-6076-3