
Cascaded Continuous Regression for Real-Time Incremental Face Tracking

  • Enrique Sánchez-Lozano (email author)
  • Brais Martinez
  • Georgios Tzimiropoulos
  • Michel Valstar
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9912)

Abstract

This paper introduces a novel real-time algorithm for facial landmark tracking. Compared to detection, tracking has both additional challenges and opportunities. Arguably the most important aspect in this domain is updating a tracker’s models as tracking progresses, also known as incremental (face) tracking. While this should result in more accurate localisation, how to do this online and in real time without causing a tracker to drift is still an important open research question. We address this question in the cascaded regression framework, the state-of-the-art approach for facial landmark localisation. Because incremental learning for cascaded regression is costly, we propose a much more efficient yet equally accurate alternative using continuous regression. More specifically, we first propose cascaded continuous regression (CCR) and show its accuracy is equivalent to the Supervised Descent Method. We then derive the incremental learning updates for CCR (iCCR) and show that it is an order of magnitude faster than standard incremental learning for cascaded regression, bringing the time required for the update from seconds down to a fraction of a second, thus enabling real-time tracking. Finally, we evaluate iCCR and show the importance of incremental learning in achieving state-of-the-art performance. Code for our iCCR is available from http://www.cs.nott.ac.uk/~psxes1.

Keywords

Incremental Learning, Active Appearance Model, Face Tracking, Facial Landmark, Flexible Parameter
These keywords were added by machine and not by the authors.

Notes

Acknowledgments

The work of Sánchez-Lozano, Martinez and Valstar was supported by the European Union Horizon 2020 research and innovation programme under grant agreement No. 645378, ARIA-VALUSPA. The work of Sánchez-Lozano was also supported by the Vice-Chancellor’s Scholarship for Research Excellence provided by the University of Nottingham. The work of Tzimiropoulos was supported in part by the EPSRC project EP/M02153X/1 Facial Deformable Models of Animals. We are also grateful for access to the University of Nottingham High Performance Computing Facility, and we thank Jie Shen and Grigoris Chrysos for their insightful help with our tracking evaluation.

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Enrique Sánchez-Lozano (1), email author
  • Brais Martinez (1)
  • Georgios Tzimiropoulos (1)
  • Michel Valstar (1)
  1. Computer Vision Laboratory, University of Nottingham, Nottingham, UK
