Multimedia Tools and Applications, Volume 77, Issue 2, pp 2261–2283

Facial point localization via neural networks in a cascade regression framework

  • Anwar Saeed
  • Ayoub Al-Hamadi
  • Heiko Neumann


Facial point detection is gaining importance in computer vision because it plays a vital role in applications such as facial expression recognition and human behavior analysis. In this work, we propose an approach that locates 49 facial points via neural networks in a cascade regression fashion. The localization process starts by detecting the face, continues with a refinement of the face crop, and finally arrives at the facial point locations through five cascades of regressors. In particular, we perform a guided initialization using holistic features extracted from the entire face patch. The point locations are then refined in the next four cascades using local features extracted from patches enclosing the prior estimates of the points. Generalization capability was improved by performing feature selection at each cascade. Evaluating our approach on samples gathered from four challenging databases, we achieved an average location error for each point ranging between 0.72 % and 1.57 % of the face width. The proposed approach was further evaluated according to the 300-W challenge, where we achieved results competitive with those obtained by state-of-the-art approaches and commercial software packages. Moreover, our approach showed better generalization capability. Finally, we validated the proposed enhancements by studying the impact of several factors on the point localization accuracy.
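The control flow described above (one guided initialization from holistic features, then refinement stages driven by local features around the current estimates) and the reported error measure (point error as a percentage of face width) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the callable interfaces, and the assumption that each refinement stage predicts an additive correction are all hypothetical, and the regressors are left abstract rather than being the neural networks used in the paper.

```python
import numpy as np


def point_error_percent(pred, truth, face_width):
    """Per-point Euclidean error as a percentage of the face width.

    pred, truth: arrays of shape (n_points, 2); face_width: scalar in pixels.
    """
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return 100.0 * np.linalg.norm(pred - truth, axis=1) / face_width


def run_cascade(face_patch, init_regressor, refine_regressors,
                holistic_features, local_features):
    """Guided initialization followed by a sequence of refinement stages.

    All callables are hypothetical placeholders:
      holistic_features(face_patch)        -> feature vector for the whole patch
      local_features(face_patch, points)   -> features around current estimates
      init_regressor(features)             -> (n_points, 2) initial estimate
      each refine_regressors[i](features)  -> (n_points, 2) additive correction
                                              (assumed, not stated in the source)
    """
    # Stage 1: initialize all points from features of the entire face patch.
    points = init_regressor(holistic_features(face_patch))
    # Later stages: refine using features sampled near the prior estimates.
    for regressor in refine_regressors:
        points = points + regressor(local_features(face_patch, points))
    return points
```

With four entries in `refine_regressors`, the sketch matches the five-cascade layout of the abstract; averaging `point_error_percent` over test images for each point would give the kind of per-point figure quoted there.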


Facial point detection · Face detection · Histogram of gradients · Cascade regression



This work is part of a project carried out within the Transregional Collaborative Research Centre SFB/TRR 62 "Companion-Technology for Cognitive Technical Systems", funded by the German Research Foundation (DFG).



Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  1. Institute for Information Technology and Communications (IIKT), Otto-von-Guericke-University Magdeburg, Magdeburg, Germany
  2. Institute of Neural Information Processing, University of Ulm, Ulm, Germany