Abstract
We address the problem of robust face alignment in the presence of occlusions, which remains a lingering problem in facial analysis despite intensive long-term study. This paper proposes an adaptive attention-based graph convolutional network for face alignment. Unlike most existing methods, which ignore structural information, we combine local features and global structural relationships to construct a landmark-connection graph, and we optimize that graph to improve the robustness of the model under occlusion. Specifically, we introduce a novel graph convolutional network architecture consisting of three parts: GCN-global, GCN-local, and an adaptive channel attention module. GCN-global estimates the global transformation of the landmarks through 3D face fitting to obtain initial coordinates. Considering the interaction between vertices and edges in the graph, GCN-local jointly trains local edges and vertices to improve accuracy. The channel attention module adaptively selects essential features to enhance performance. In addition, to reduce the influence of occluded parts on the other landmarks and to improve efficiency, we apply a preprocessing module that selects which keypoints need to be connected. Our method achieves a 5.17% mean error with a 1.89% failure rate on the COFW dataset and a 4.16% mean error on the 300W-Full dataset. Extensive experiments demonstrate that our method outperforms most state-of-the-art models on three public datasets: WFLW, COFW, and 300W.
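The mean error and failure rate quoted above can be illustrated with a minimal sketch. The abstract does not state how the error is normalised or which failure threshold is used, so the inter-ocular normalising distance and the 10% threshold below are assumptions taken from common practice in the face alignment literature, and the function names and toy coordinates are hypothetical.

```python
import math


def normalized_mean_error(pred, gt, norm_dist):
    """Mean point-to-point Euclidean error divided by a normalising distance.

    pred, gt: sequences of (x, y) landmark coordinates for one face.
    norm_dist: normalising distance (ASSUMPTION: inter-ocular distance).
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    if norm_dist <= 0:
        raise ValueError("norm_dist must be positive")
    total = sum(math.dist(p, g) for p, g in zip(pred, gt))
    return total / (len(pred) * norm_dist)


def failure_rate(errors, threshold=0.10):
    """Fraction of images whose error exceeds the threshold.

    ASSUMPTION: 0.10 (10%) is a commonly used threshold; the abstract
    does not state the value used for the reported failure rate.
    """
    if not errors:
        raise ValueError("errors must be non-empty")
    return sum(e > threshold for e in errors) / len(errors)


# Toy data (hypothetical): three landmarks on a single face.
gt = [(30.0, 40.0), (70.0, 40.0), (50.0, 60.0)]
pred = [(33.0, 44.0), (70.0, 40.0), (50.0, 70.0)]

# Treat the first two landmarks as the eye centres for normalisation.
inter_ocular = math.dist(gt[0], gt[1])
nme = normalized_mean_error(pred, gt, inter_ocular)

# Per-image errors for a hypothetical four-image test set.
per_image_errors = [0.04, nme, 0.06, 0.09]
fr = failure_rate(per_image_errors)

print(f"NME: {nme:.2%}, failure rate: {fr:.2%}")
```

Under these assumptions, a reported mean error is the average of the per-image values over the test set, and the failure rate is the share of images above the threshold. Other normalisations, such as inter-pupil distance or bounding-box size, would change the numbers.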
Data availability
Data openly available in a public repository. The data that support the findings of this study are openly available at https://wywu.github.io/projects/LAB/WFLW.html, https://data.caltech.edu/records/20099, and https://ibug.doc.ic.ac.uk/resources/300-W/.
Acknowledgements
This study was supported by the Open Project of the Key Laboratory of the Ministry of Public Security for Road Traffic Safety (No. 2021ZDSYSKFKT04).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest regarding this work.
About this article
Cite this article
Fan, J., Liang, J., Liu, H. et al. Robust face alignment via adaptive attention-based graph convolutional network. Neural Comput & Applic 35, 15129–15142 (2023). https://doi.org/10.1007/s00521-023-08531-y
DOI: https://doi.org/10.1007/s00521-023-08531-y