Consensus and stacking based fusion and survey of facial feature point detectors

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

Facial landmark detection is a crucial pre-processing step for many applications, including face tracking, face recognition and facial affect recognition. We therefore investigate and experimentally compare the most successful open-source facial feature point detection algorithms published in the last decade. We first present an overview of surveys on facial feature detection algorithms to provide insight into the challenges and innovations. We then propose a consensus-based selection and stacked-regression-based fusion of facial landmark methods, which combines their results to achieve superior accuracy. Five open-source algorithms from the literature are objectively compared on the same test data, and regression-based models are shown to be more successful. According to the extensive experimental results, the proposed consensus and stacking based fusion method gives the lowest facial landmark detection error compared with the five most successful algorithms in the literature. Consensus and stacking based fusion of an ensemble of methods thus boosts the performance of facial landmark detection. The proposed fusion method can also be applied to future methods as they emerge.
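The abstract does not spell out how the consensus selection or the stacked regression is computed, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes that consensus means keeping the detectors whose shapes lie closest to the per-landmark median shape, and that stacking means learning one scalar least-squares weight per detector from annotated training shapes. All function names (`consensus_select`, `fit_stacking_weights`, `fuse`, and the helpers) are hypothetical.

```python
from math import hypot
from statistics import median


def consensus_select(shapes, keep):
    """Indices of the `keep` detectors closest to the per-landmark median shape.

    shapes: one list of (x, y) landmarks per detector (assumed layout).
    """
    n_points = len(shapes[0])
    med = [(median(s[i][0] for s in shapes), median(s[i][1] for s in shapes))
           for i in range(n_points)]

    def mean_dist(shape):
        # Average Euclidean distance between a shape and the median shape.
        return sum(hypot(x - mx, y - my)
                   for (x, y), (mx, my) in zip(shape, med)) / n_points

    ranked = sorted(range(len(shapes)), key=lambda k: mean_dist(shapes[k]))
    return sorted(ranked[:keep])


def flatten(shape):
    """Turn [(x0, y0), (x1, y1), ...] into [x0, y0, x1, y1, ...]."""
    return [c for point in shape for c in point]


def solve(a, b):
    """Solve a small dense linear system with Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_stacking_weights(train_preds, train_truth):
    """Least-squares weights, one per detector (assumed stacking form).

    train_preds: one flattened coordinate list per detector.
    train_truth: flattened ground-truth coordinates of the same length.
    Solves the normal equations (P^T P) w = P^T g.
    """
    k = len(train_preds)
    a = [[sum(p * q for p, q in zip(train_preds[r], train_preds[c]))
          for c in range(k)] for r in range(k)]
    b = [sum(p * t for p, t in zip(train_preds[r], train_truth))
         for r in range(k)]
    return solve(a, b)


def fuse(shapes, weights):
    """Weighted combination of the selected detectors' shapes."""
    n_points = len(shapes[0])
    return [(sum(w * s[i][0] for w, s in zip(weights, shapes)),
             sum(w * s[i][1] for w, s in zip(weights, shapes)))
            for i in range(n_points)]


if __name__ == "__main__":
    # Toy data: two detectors that roughly agree and one outlier.
    truth = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0)]
    det_a = [(x + 1, y + 1) for x, y in truth]
    det_b = [(x - 1, y - 1) for x, y in truth]
    det_c = [(x + 20, y + 20) for x, y in truth]
    all_shapes = [det_a, det_b, det_c]
    kept = consensus_select(all_shapes, keep=2)
    selected = [all_shapes[k] for k in kept]
    weights = fit_stacking_weights([flatten(s) for s in selected], flatten(truth))
    print(kept, weights, fuse(selected, weights))
```

In this sketch the weights are fitted and applied on the same toy shape purely for brevity; a realistic setup would fit them on a separate annotated training set and would need a fallback when the detectors' predictions are linearly dependent, since the normal equations are then singular.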

Data Availability Statement

The datasets generated and/or analysed during the current study are available in the IBUG repository, https://ibug.doc.ic.ac.uk/resources/300-W/.

Acknowledgements

This work was supported by The Scientific and Technological Research Council of Turkey (TUBITAK) under project number 116E088.

Author information

Corresponding author

Correspondence to Sezer Ulukaya.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Deceased: Esra Nur Sandıkçı.

About this article

Cite this article

Ulukaya, S., Sandıkçı, E.N. & Eroğlu Erdem, Ç. Consensus and stacking based fusion and survey of facial feature point detectors. J Ambient Intell Human Comput 14, 9947–9957 (2023). https://doi.org/10.1007/s12652-021-03662-3

  • DOI: https://doi.org/10.1007/s12652-021-03662-3
