Efficient deformable 3D face model tracking with limited hardware resources

  • Jon Goenetxea
  • Luis Unzueta
  • Fadi Dornaika
  • Oihana Otaegui


Face fitting methods align deformable models to faces in images using the information given by the image pixels. However, most algorithms are designed to run on desktop personal computers (PCs) or other hardware with significant computational power. These approaches are therefore too demanding for devices with limited computational power, such as the increasingly common ARM-based devices. Besides the hardware limitations, the particularities of each operating system pose additional challenges to the implementation of real-time face tracking solutions. To address the lack of methods designed for platforms with limited computational power, we present an efficient way to fit 3D human face models to monocular images. This approach estimates the head pose and gesture in a 3D environment based on a full perspective projection, using parametric non-linear optimisation. We compare the performance of this method by running it on similar ARM-based devices with different operating systems (Linux, Android, and iOS). In all cases, we measured both accuracy and performance. The efficiency of the method makes it possible to run it in real time (\(\sim\)30 fps) on devices with limited computational power, such as smartphones and embedded systems. Efficient methods of this kind are a vital component of human behaviour analysis applications, such as driver monitoring systems and human-machine interfaces for disabled people, among others.
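
As a rough illustration of fitting a 3D model to an image under a full perspective projection with parametric non-linear optimisation, the following minimal sketch estimates only the rigid head pose (three axis-angle rotation values and three translation values) from known 3D-to-2D point correspondences. It is not the method of the paper: the function names, the Levenberg-Marquardt damping scheme, the finite-difference Jacobian, and the pinhole parameters `focal` and `center` are all assumptions made for this example. Deformation (gesture) parameters could in principle be appended to the same parameter vector, but they are omitted here.

```python
import numpy as np


def rodrigues(rvec):
    """Convert an axis-angle vector into a 3x3 rotation matrix."""
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def project(points3d, params, focal, center):
    """Full perspective projection of model points.

    params = (rx, ry, rz, tx, ty, tz); focal and center describe a
    hypothetical pinhole camera (assumed, not taken from the paper).
    """
    params = np.asarray(params, dtype=float)
    cam = points3d @ rodrigues(params[:3]).T + params[3:6]
    return focal * cam[:, :2] / cam[:, 2:3] + np.asarray(center, dtype=float)


def residuals(params, points3d, observed2d, focal, center):
    """Stacked 2D reprojection errors in pixels."""
    return (project(points3d, params, focal, center) - observed2d).ravel()


def fit_pose(points3d, observed2d, focal, center, init, iters=100, damping=1e-3):
    """Levenberg-Marquardt pose fit with a forward-difference Jacobian."""
    p = np.asarray(init, dtype=float).copy()
    r = residuals(p, points3d, observed2d, focal, center)
    eps = 1e-6
    for _ in range(iters):
        # Numerical Jacobian of the residuals with respect to each parameter.
        J = np.empty((r.size, p.size))
        for j in range(p.size):
            shifted = p.copy()
            shifted[j] += eps
            J[:, j] = (residuals(shifted, points3d, observed2d, focal, center) - r) / eps
        # Damped normal equations.
        step = np.linalg.solve(J.T @ J + damping * np.eye(p.size), -J.T @ r)
        r_new = residuals(p + step, points3d, observed2d, focal, center)
        if r_new @ r_new < r @ r:
            # Accept the step and trust the linearisation more.
            p, r = p + step, r_new
            damping = max(damping * 0.1, 1e-12)
            if np.linalg.norm(step) < 1e-10:
                break
        else:
            # Reject the step and fall back towards gradient descent.
            damping *= 10.0
    return p
```

In a tracking loop, the pose estimated for one frame would typically serve as `init` for the next, so that only a few iterations are needed per frame; whether the paper does this is not stated in the abstract.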


Head pose estimation, Face tracking, Efficient computation, Computer vision



This work has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 690772, VI-DAS project).

Supplementary material

(MP4 13.0 MB)

(MP4 12.7 MB)

(MP4 13.7 MB)



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Authors and Affiliations

  1. Department of Intelligent Transport Systems and Engineering, Vicomtech, Donostia, Spain
  2. Computer Engineering Faculty, University of the Basque Country EHU/UPV, Donostia, Spain
  3. Ikerbasque, Basque Foundation for Science, Bilbao, Spain
