Robust 3D Face Alignment with Efficient Fully Convolutional Neural Networks

  • Lei Jiang
  • Xiao-Jun Wu (email author)
  • Josef Kittler
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11902)

Abstract

3D face alignment from monocular images is a crucial process in computer vision, with applications to face recognition, animation and other areas. However, most algorithms are designed for faces in small to medium poses (below 45\(^\circ\)) and cannot align faces in large poses of up to 90\(^\circ\). Many methods are also inefficient, the main challenge being that determining the model parameters accurately is time consuming. To address these issues, this paper proposes a novel and efficient end-to-end 3D face alignment framework. We build an efficient and stable network model, named MobDenseNet, from depthwise separable convolutions and densely connected convolutional layers. In addition, different loss functions are used to constrain the 3D parameters, based on the 3D Morphable Model (3DMM) and on 3D vertices. Experiments on the challenging AFLW and AFLW2000-3D databases show that our algorithm significantly improves the accuracy of 3D face alignment, while the number of model parameters and the complexity of the proposed method are also significantly reduced.
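To make the efficiency argument concrete, the following is a minimal sketch of why depthwise separable convolutions reduce the parameter count, and of how channel counts grow under dense connectivity. The layer sizes (64 input channels, 128 output channels, 3 x 3 kernels, growth rate 32) and all function names are illustrative assumptions; they are not the MobDenseNet configuration, which the abstract does not specify.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def dense_block_input_channels(c0, growth_rate, num_layers):
    """Input channels seen by each layer when every layer receives the
    concatenation of all preceding feature maps."""
    return [c0 + i * growth_rate for i in range(num_layers)]


if __name__ == "__main__":
    # Hypothetical layer sizes chosen only for illustration.
    c_in, c_out, k = 64, 128, 3
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(std, sep, round(sep / std, 3))           # 73728 8768 0.119
    print(dense_block_input_channels(64, 32, 4))   # [64, 96, 128, 160]
```

For these assumed sizes the separable layer needs roughly 12% of the weights of the standard layer, which is the general ratio 1/c_out + 1/k^2.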

Keywords

3D face alignment · 3D Morphable Model · Computer vision

Notes

Acknowledgments

The paper is supported by the National Natural Science Foundation of China (Grant Nos. 61672265, U1836218), the 111 Project of the Ministry of Education of China (Grant No. B12018), UK EPSRC Grant EP/N007743/1, and MURI/EPSRC/Dstl Grant EP/R018456/1.

Supplementary material

Supplementary material 1: 491548_1_En_23_MOESM1_ESM.pdf (PDF, 367 KB)

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. School of IoT Engineering, Jiangnan University, Wuxi, China
  2. Center for Vision, Speech and Signal Processing (CVSSP), University of Surrey, Guildford, UK
  3. Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi, China