
Inequality-Constrained and Robust 3D Face Model Fitting

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12354)

Abstract

Fitting 3D morphable models (3DMMs) on faces is a well-studied problem, motivated by various industrial and research applications. 3DMMs express a 3D facial shape as a linear sum of basis functions. The resulting shape, however, is a plausible face only when the basis coefficients take values within limited intervals. Methods based on unconstrained optimization address this issue with a weighted \(\ell _2\) penalty on coefficients; however, determining the weight of this penalty is difficult, and the existence of a single weight that works universally is questionable. We propose a new formulation that does not require the tuning of any weight parameter. Specifically, we formulate 3DMM fitting as an inequality-constrained optimization problem, where the primary constraint is that basis coefficients should not exceed the interval that is learned when the 3DMM is constructed. We employ additional constraints to exploit sparse landmark detectors, by forcing the facial shape to be within the error bounds of a reliable detector. To enable operation “in-the-wild”, we use a robust objective function, namely Gradient Correlation. Our approach performs comparably with deep learning (DL) methods on “in-the-wild” data that have inexact ground truth, and better than DL methods on more controlled data with exact ground truth. Since our formulation does not require any learning, it enjoys a versatility that allows it to operate with multiple frames of arbitrary sizes. This study’s results encourage further research on 3DMM fitting with inequality-constrained optimization methods, which have been unexplored compared to unconstrained methods.
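The core idea of the abstract, keeping basis coefficients inside a learned interval through hard constraints rather than a weighted penalty, can be illustrated with a small toy sketch. This is not the authors' implementation: a plain least-squares objective stands in for Gradient Correlation, the landmark constraints are omitted, the solver is a simple projected gradient descent, and every name and number below (`fit_box_constrained`, `sigma`, the interval of two standard deviations, the random toy basis) is a hypothetical choice made only for illustration.

```python
import numpy as np


def project_to_box(alpha, lower, upper):
    """Clip each coefficient into its own interval [lower_k, upper_k]."""
    return np.clip(alpha, lower, upper)


def fit_box_constrained(mean_shape, basis, target, lower, upper, n_iters=500):
    """Minimise ||mean_shape + basis @ alpha - target||^2 s.t. lower <= alpha <= upper.

    Uses projected gradient descent, a deliberately simple stand-in solver
    (assumption: the paper's actual solver and objective differ).
    """
    alpha = np.zeros(basis.shape[1])
    # Step size 1/L, where L = 2 * sigma_max(basis)^2 is the Lipschitz
    # constant of the gradient of the least-squares objective.
    step = 1.0 / (2.0 * np.linalg.norm(basis, 2) ** 2)
    for _ in range(n_iters):
        residual = mean_shape + basis @ alpha - target
        grad = 2.0 * basis.T @ residual
        alpha = project_to_box(alpha - step * grad, lower, upper)
    return alpha


# Toy linear shape model: shape = mean_shape + basis @ alpha.
rng = np.random.default_rng(0)
n_coords, n_basis = 30, 4
mean_shape = rng.normal(size=n_coords)
# Orthonormal basis columns, as a PCA-style model would provide.
basis, _ = np.linalg.qr(rng.normal(size=(n_coords, n_basis)))

# Hypothetical per-coefficient spread and interval (+/- 2 standard deviations).
sigma = np.array([3.0, 2.0, 1.0, 0.5])
lower, upper = -2.0 * sigma, 2.0 * sigma

# Target generated by coefficients that partly violate the interval.
alpha_true = np.array([1.0, -5.0, 0.5, 2.0])
target = mean_shape + basis @ alpha_true

alpha_hat = fit_box_constrained(mean_shape, basis, target, lower, upper)
print("fitted coefficients:", alpha_hat)
print("within interval:", bool(np.all((alpha_hat >= lower) & (alpha_hat <= upper))))
```

In this toy setup, the two coefficients of `alpha_true` that fall outside the interval end up on its boundary, while the others are recovered unchanged. No penalty weight appears anywhere in the sketch; the interval itself is the only prior.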

Keywords

3D model fitting; 3D face reconstruction; 3D shape

Notes

Acknowledgements

This work is partially funded by the Office of the Director (OD), National Institutes of Health, and the National Institute of Mental Health (NIMH) of the US, under grants R01MH118327, R01MH122599, and R21HD102078.

Supplementary material

Supplementary material 1: 504446_1_En_25_MOESM1_ESM.pdf (PDF, 3751 KB)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Center for Autism Research, Children’s Hospital of Philadelphia, Philadelphia, USA
  2. University of Pennsylvania, Philadelphia, USA
