Unified convolutional neural network for direct facial keypoints detection

Park, Je-Kang; Kang, Dong-Joong

doi:10.1007/s00371-018-1561-3

Original Article
Published: 28 May 2018

Volume 35, pages 1615–1626 (2019)
The Visual Computer

Abstract

We propose a novel approach that directly estimates the positions of facial keypoints with a convolutional neural network (CNN). Our method estimates the global position and the local positions from a unified CNN and combines them through a simplified optimization process. The approach has two advantages. First, the global geometric position and the local detailed position of the facial keypoints complement each other, which helps avoid local minima caused by occlusions and pose variations. Second, unlike traditional methods such as a cascade of multiple CNNs, we propose a unified, deep and large network architecture consisting of a global position network and a local position network. Our design shares most of the facial-feature computation between the two networks, and the resulting high-level features greatly improve the precision of the facial keypoint estimates. We conduct comparative experiments against state-of-the-art research methods and commercial services. In these experiments, our approach shows remarkable performance.
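
The abstract does not state the objective used to combine the global and local estimates. As a minimal sketch only, assume each keypoint is fused by minimizing a weighted sum of squared distances to the two estimates, which has a closed-form solution (the weighted mean). The function name `fuse_keypoints` and the weights `w_global` and `w_local` are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def fuse_keypoints(
    global_pts: List[Point],
    local_pts: List[Point],
    w_global: float = 1.0,
    w_local: float = 1.0,
) -> List[Point]:
    """Fuse two keypoint estimates by weighted least squares.

    For each keypoint p this minimizes
        w_global * ||p - g||^2 + w_local * ||p - l||^2,
    whose minimizer is the weighted mean of g and l.
    ASSUMPTION: this objective and these weights are illustrative;
    they are not the optimization described in the paper.
    """
    if len(global_pts) != len(local_pts):
        raise ValueError("both estimates must cover the same keypoints")
    total = w_global + w_local
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    fused = []
    for (gx, gy), (lx, ly) in zip(global_pts, local_pts):
        # Closed-form minimizer of the quadratic objective above.
        fused.append(
            ((w_global * gx + w_local * lx) / total,
             (w_global * gy + w_local * ly) / total)
        )
    return fused
```

With equal weights the fused point is the midpoint of the two estimates; raising `w_local` pulls the result toward the local estimate, which is one way a locally precise prediction could refine a globally consistent one.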

Acknowledgements

This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2016R1A2B4007608), a National IT Industry Promotion Agency (NIPA) grant funded by the Korea government (MSIT) (No. S0602-17-1001), and a Technology & Information Promotion Agency for SMEs (TIPA) grant funded by the Korea government (MSIT) (No. C0507460).

Author information

Authors and Affiliations

Pusan National University, Busan, Korea
Je-Kang Park & Dong-Joong Kang

Corresponding author

Correspondence to Dong-Joong Kang.

About this article

Cite this article

Park, JK., Kang, DJ. Unified convolutional neural network for direct facial keypoints detection. Vis Comput 35, 1615–1626 (2019). https://doi.org/10.1007/s00371-018-1561-3

Published: 28 May 2018
Issue Date: November 2019
DOI: https://doi.org/10.1007/s00371-018-1561-3
