
AU-Aware 3D Face Reconstruction through Personalized AU-Specific Blendshape Learning

  • Conference paper
  • First Online:
Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13673)


Abstract

3D face reconstruction and facial action unit (AU) detection have emerged as interesting and challenging tasks in recent years, but they are rarely performed in tandem. Image-based 3D face reconstruction, which can represent a dense space of facial motions, is typically accomplished by separately estimating identity, expression, texture, head pose, and illumination via pre-constructed 3D morphable models (3DMMs). Recent 3D reconstruction models can recover high-quality geometric facial details such as wrinkles and pores, but they remain limited in their ability to recover the subtle 3D motions caused by the activation of AUs. We present a multi-stage learning framework that recovers AU-interpretable 3D facial details by learning personalized AU-specific blendshapes from images. Our model explicitly learns a 3D expression basis using AU labels and a generic AU relationship prior, and then constrains the basis coefficients so that they are semantically mapped to individual AUs. The resulting AU-aware 3D reconstruction model generates accurate 3D expressions composed of semantically meaningful AU motion components. Furthermore, the output of the model can be applied directly to generate 3D AU occurrence predictions, which prior 3D reconstruction models have not fully explored. We demonstrate the effectiveness of our approach via qualitative and quantitative evaluations.
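The core idea the abstract describes — a personalized neutral face plus AU-specific blendshape offsets weighted by coefficients that are semantically mapped to AUs, with AU occurrence read directly off those coefficients — can be sketched as a linear blendshape model. This is a minimal illustrative sketch, not the paper's actual model: the mesh resolution, number of AUs, random placeholder data, and the 0.5 occurrence threshold are all assumptions.

```python
import numpy as np

# Illustrative sizes only; the paper's actual mesh topology and AU set differ.
N_VERTS = 5023   # assumed mesh resolution
N_AUS = 12       # assumed number of modeled action units

rng = np.random.default_rng(0)
neutral = rng.standard_normal((N_VERTS, 3))                       # personalized neutral face
au_blendshapes = rng.standard_normal((N_AUS, N_VERTS, 3)) * 0.01  # per-AU vertex offsets

def compose_face(neutral, au_blendshapes, coeffs):
    """Linear blendshape model: S = S_neutral + sum_i alpha_i * B_i."""
    coeffs = np.clip(coeffs, 0.0, 1.0)  # AU activations kept in [0, 1]
    # Contract the AU axis: (N_AUS,) x (N_AUS, N_VERTS, 3) -> (N_VERTS, 3)
    return neutral + np.tensordot(coeffs, au_blendshapes, axes=1)

def predict_au_occurrence(coeffs, threshold=0.5):
    """Because coefficients are semantically mapped to AUs, occurrence
    predictions follow from simple thresholding (threshold is assumed)."""
    return (np.asarray(coeffs) >= threshold).astype(int)

coeffs = np.array([0.9, 0.0, 0.7] + [0.0] * (N_AUS - 3))
face = compose_face(neutral, au_blendshapes, coeffs)
print(face.shape)                      # (5023, 3)
print(predict_au_occurrence(coeffs))   # [1 0 1 0 0 0 0 0 0 0 0 0]
```

The sketch shows why constraining each coefficient to one AU is useful: reconstruction and AU detection share a single set of parameters, so no separate classifier is needed for occurrence prediction.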


Notes

  1. We select the set of eight AU pairs listed in Table 1 that appear in both the learned BN and the modeled AU indices: \(C_2 := \{(1,2),(4,7),(6,12),(15,17),(2,6),(2,7),(12,15),(12,17)\}\).


Acknowledgements

The work described in this paper is supported in part by the U.S. National Science Foundation award CNS 1629856.

Author information

Correspondence to Chenyi Kuang.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 4449 KB)


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Kuang, C., Cui, Z., Kephart, J.O., Ji, Q. (2022). AU-Aware 3D Face Reconstruction through Personalized AU-Specific Blendshape Learning. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13673. Springer, Cham. https://doi.org/10.1007/978-3-031-19778-9_1

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-19778-9_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19777-2

  • Online ISBN: 978-3-031-19778-9

  • eBook Packages: Computer Science, Computer Science (R0)
