Fast and Precise Face Alignment and 3D Shape Reconstruction from a Single 2D Image

  • Ruiqi Zhao
  • Yan Wang
  • C. Fabian Benitez-Quiroz
  • Yaojie Liu
  • Aleix M. Martinez
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9914)

Abstract

Many face recognition applications require a precise 3D reconstruction of the shape of the face, even when only a single 2D image is available. We present a novel regression approach that learns to detect facial landmark points and estimate their 3D shape rapidly and accurately from a single face image. The main idea is to regress a function f(·) that maps 2D images of faces to their corresponding 3D shapes, learning it from a large number of sample face images under varying pose, illumination, identity and expression. To model the non-linearity of this function, we use a deep neural network and demonstrate how it can be trained efficiently on a large number of samples. During testing, our algorithm runs at more than 30 frames/s on an i7 desktop. This algorithm was among the top two performers in the 3DFAW Challenge.
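As a rough illustration of the regression idea in the abstract, the sketch below fits a small neural network f(·) that maps a flattened 2D image to the 3D coordinates of a few landmarks. It is not the network described in the paper: the one-hidden-layer architecture, the 8×8 image size, the five landmarks, the synthetic training data, and all function names (`init_params`, `forward`, `train_step`, `predict_shape`) are assumptions made only for this example.

```python
import numpy as np

N_LANDMARKS = 5   # hypothetical landmark count, chosen for the demo
IMG_SIDE = 8      # hypothetical 8x8 grayscale "images"


def init_params(rng, n_in, n_hidden, n_out):
    """Small random weights for a one-hidden-layer regressor."""
    return {
        "W1": rng.normal(0.0, 0.1, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }


def forward(params, x):
    """f(x): flattened images -> flattened 3D landmark coordinates."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"], h


def train_step(params, x, y, lr):
    """One full-batch gradient-descent step on the mean squared error."""
    pred, h = forward(params, x)
    err = pred - y
    grad_pred = 2.0 * err / err.size                  # d(MSE)/d(pred)
    grad_h = (grad_pred @ params["W2"].T) * (1.0 - h ** 2)
    grads = {
        "W2": h.T @ grad_pred,
        "b2": grad_pred.sum(axis=0),
        "W1": x.T @ grad_h,
        "b1": grad_h.sum(axis=0),
    }
    for name in params:
        params[name] -= lr * grads[name]
    return float(np.mean(err ** 2))                   # loss before the update


def predict_shape(params, image):
    """Regress an (N_LANDMARKS, 3) array of x, y, z coordinates from one image."""
    pred, _ = forward(params, image.reshape(1, -1))
    return pred.reshape(N_LANDMARKS, 3)


# Synthetic stand-in data: random "images" and a smooth non-linear target.
rng = np.random.default_rng(0)
n_in, n_out = IMG_SIDE * IMG_SIDE, 3 * N_LANDMARKS
images = rng.normal(size=(200, n_in))
A = rng.normal(0.0, 0.1, (n_in, 16))
B = rng.normal(0.0, 0.5, (16, n_out))
shapes = np.tanh(images @ A) @ B

params = init_params(rng, n_in, 32, n_out)
losses = [train_step(params, images, shapes, lr=0.5) for _ in range(500)]
shape_3d = predict_shape(params, images[0].reshape(IMG_SIDE, IMG_SIDE))
print("first loss:", losses[0], "last loss:", losses[-1])
print("predicted 3D shape array:", shape_3d.shape)
```

The toy model only shows the shape of the problem: a single learned function takes pixels in and returns 3D landmark coordinates, so inference is one forward pass. A practical system would use a deeper network and real annotated face images.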

Keywords

  • 3D modeling and reconstruction of faces
  • Fine-grained detection
  • 3D shape from a single 2D image
  • Precise and detailed detections

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Ruiqi Zhao (1)
  • Yan Wang (1)
  • C. Fabian Benitez-Quiroz (1)
  • Yaojie Liu (1)
  • Aleix M. Martinez (1)

  1. The Ohio State University, Columbus, USA
