User-Invariant Facial Animation with Convolutional Neural Network

  • Shuiquan Wang
  • Zhengxin Cheng
  • Liang Chang
  • Xuejun Qiao
  • Fuqing Duan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11301)


In this paper, we propose a robust approach to real-time, user-invariant, performance-based facial animation that uses a single ordinary RGB camera and a convolutional neural network (CNN); the facial expression coefficients are used to drive the avatar. Existing shape regression algorithms usually estimate facial expressions in two steps: first estimating the 3D positions of facial landmarks, and then computing the head pose and expression coefficients. The proposed method instead regresses the facial expression coefficients directly with a CNN. This single-shot regressor for facial expression coefficients is faster than the state-of-the-art face animation system based on a single web camera. Moreover, our method does not require user-specific 3D blendshapes, and is therefore user-invariant. We design three CNN architectures with different input sizes and combine them with Smoothed L1 and Gaussian loss functions to regress the expression coefficients. Experiments validate the proposed method.
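As an illustration of two ideas named in the abstract, the following minimal sketch shows a Smooth L1 loss on predicted expression coefficients and a linear blendshape combination that turns coefficients into avatar geometry. It is not the authors' implementation: the function names `smooth_l1` and `apply_blendshapes`, the threshold of 1.0 (the commonly used Fast R-CNN form of Smooth L1), and the delta-blendshape formulation are all assumptions made for this example. The Gaussian loss is not defined in the abstract and is therefore not sketched.

```python
from typing import List, Sequence


def smooth_l1(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean Smooth L1 loss between predicted and target coefficients.

    Assumed form: 0.5 * d^2 if |d| < 1, else |d| - 0.5 (per element).
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and equal length")
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        # Quadratic near zero, linear for large errors (less outlier-sensitive).
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total / len(pred)


def apply_blendshapes(
    neutral: Sequence[float],
    deltas: Sequence[Sequence[float]],
    coeffs: Sequence[float],
) -> List[float]:
    """Drive a generic avatar mesh with expression coefficients.

    neutral: flattened vertex coordinates of the neutral face (hypothetical).
    deltas:  one flattened offset vector per expression (blendshape - neutral).
    coeffs:  one weight per expression, e.g. the output of a regressor.
    """
    if len(deltas) != len(coeffs):
        raise ValueError("need exactly one coefficient per blendshape")
    out = list(neutral)
    for w, delta in zip(coeffs, deltas):
        for i, d in enumerate(delta):
            out[i] += w * d  # weighted sum of expression offsets
    return out


if __name__ == "__main__":
    print(smooth_l1([0.0, 2.0], [0.5, 0.0]))
    print(apply_blendshapes([0.0, 0.0], [[1.0, 0.0], [0.0, 2.0]], [0.5, 0.25]))
```

Because the blendshape offsets in this sketch belong to the avatar rather than to the performer, the same coefficient vector can drive any avatar that exposes the same set of expressions, which is one way to read the abstract's claim of user invariance.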


Facial animation, CNN, Face tracking, Expression regression



This work was supported by the National Natural Science Foundation of China under Grant No. 61572078.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Shuiquan Wang (1)
  • Zhengxin Cheng (1)
  • Liang Chang (1)
  • Xuejun Qiao (2)
  • Fuqing Duan (1, email author)
  1. College of Information Science and Technology, Beijing Normal University, Beijing, China
  2. School of Science, Xi’an University of Architecture and Technology, Xi’an, China
