Unified convolutional neural network for direct facial keypoints detection

  • Je-Kang Park
  • Dong-Joong Kang
Original Article

Abstract

We propose a novel approach that directly estimates the positions of facial keypoints with a convolutional neural network (CNN). Our method estimates the global position and the local positions from a unified CNN and combines them through a simplified optimization process. The approach has two advantages. First, the global geometric position and the local detailed position of the facial keypoints complement each other when combined, which helps avoid local minima caused by occlusions and pose variations. Second, unlike traditional methods such as a cascade of multiple CNNs, we propose a single deep, large network architecture consisting of a global position network and a local position network. This design shares most of the facial-feature computation between the two networks, and the resulting high-level features substantially improve the precision of the keypoint estimates. We compare our approach experimentally with state-of-the-art methods and commercial services. In these experiments, our approach performs remarkably well.
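
The abstract does not spell out the optimization used to combine the global and local estimates. As a purely illustrative sketch, and not the authors' formulation, the idea of fusing the two estimates per keypoint can be written as a weighted least-squares problem with a closed-form solution. The function name `fuse_keypoints` and the per-keypoint confidence `local_conf` are assumptions introduced here for illustration only.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def fuse_keypoints(
    global_pts: List[Point],
    local_pts: List[Point],
    local_conf: List[float],
) -> List[Point]:
    """Fuse a global and a local estimate for each keypoint.

    Hypothetical objective (an assumption, not taken from the paper):
        minimize over p:  (1 - c) * ||p - g||^2 + c * ||p - l||^2
    whose closed-form minimizer is p = (1 - c) * g + c * l.
    """
    if not (len(global_pts) == len(local_pts) == len(local_conf)):
        raise ValueError("inputs must have one entry per keypoint")
    fused = []
    for (gx, gy), (lx, ly), c in zip(global_pts, local_pts, local_conf):
        c = min(max(c, 0.0), 1.0)  # clamp the confidence to [0, 1]
        fused.append(((1.0 - c) * gx + c * lx, (1.0 - c) * gy + c * ly))
    return fused


# Two keypoints: the first local estimate is fully trusted, the second
# (e.g. an occluded point) is not, so the global estimate is kept for it.
global_pts = [(30.0, 40.0), (70.0, 40.0)]
local_pts = [(32.0, 42.0), (90.0, 10.0)]
local_conf = [1.0, 0.0]
fused = fuse_keypoints(global_pts, local_pts, local_conf)
```

In this sketch a low local confidence pulls the result back toward the globally consistent position, which mirrors the complementary role of the two estimates described above.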

Keywords

Facial keypoints detection; Face alignment; Unified convolutional neural networks; Shared network

Notes

Acknowledgements

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2016R1A2B4007608), National IT Industry Promotion Agency (NIPA) grant funded by the Korea government (MSIT) (No. S0602-17-1001) and Technology & Information Promotion Agency for SMEs (TIPA) grant funded by the Korea government (MSIT) (No. C0507460).

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Pusan National University, Busan, Korea
