A Robust Facial Landmark Detector with Mixed Loss

Zhang, Xian; Tong, Xinjie; Li, Ziyu; Yang, Wankou

doi:10.1007/978-3-030-36189-1_21

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11935)

Included in the following conference series:

International Conference on Intelligent Science and Big Data Engineering

Abstract

Facial landmark detection is one of the most important tasks in face image and video analysis. Existing algorithms based on deep convolutional neural networks have achieved good performance on public benchmarks and in practical applications such as face verification, expression analysis and beauty applications. However, the performance of a facial landmark detector degrades significantly on challenging facial images with extreme appearance variations in pose, expression, occlusion, etc. To mitigate these difficulties, we propose a robust facial landmark detection algorithm based on coordinate regression that is trained end to end. By using the soft-argmax function, the network weights can be optimised with a mixed loss function. An online pose-based data augmentation technique is used to address the data imbalance problem and improve the robustness of the proposed method. Experiments conducted on the 300-W and AFLW datasets demonstrate that the proposed algorithm is competitive with state-of-the-art heatmap regression algorithms in terms of accuracy. In addition, our method runs in real time on 300-W with 68 landmarks, at 85 FPS on a Tesla V100 GPU.
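The soft-argmax step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name soft_argmax_2d, the temperature parameter beta, and the use of plain Python lists are assumptions made for illustration only. The idea is that a heatmap is turned into a probability map with a softmax, and the landmark coordinate is the expected position under that map. Because this expectation is a smooth function of the heatmap, coordinate-level loss terms can be back-propagated through it. The abstract does not state which terms make up the mixed loss, so none are shown here.

```python
import math


def soft_argmax_2d(heatmap, beta=1.0):
    """Return the expected (x, y) position under a softmax over the heatmap.

    heatmap: list of rows (H x W) of raw scores for one landmark.
    beta: hypothetical temperature (an assumption, not from the paper);
          larger values move the result towards the hard argmax.
    """
    # Subtract the maximum score before exponentiating for numerical stability.
    peak = max(max(row) for row in heatmap)
    weights = [[math.exp(beta * (v - peak)) for v in row] for row in heatmap]
    total = sum(sum(row) for row in weights)

    # Expected column index (x) and row index (y) under the normalised weights.
    x = sum(w * j for row in weights for j, w in enumerate(row)) / total
    y = sum(w * i for i, row in enumerate(weights) for w in row) / total
    return x, y


if __name__ == "__main__":
    # Two equally strong neighbouring responses in a single-row heatmap.
    print(soft_argmax_2d([[0.0, 5.0, 5.0, 0.0]], beta=10.0))
```

In the usage example the two equal responses sit at columns 1 and 2, so the estimate falls between them at approximately (1.5, 0.0). A hard argmax would have to pick one of the two integer positions, and it would not provide a useful gradient for training.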

Acknowledgement

This work is partly supported by the National Natural Science Foundation of China (61773117, 61703096 and 61473086), the Jiangsu key R&D plan (BE2017157) and the Natural Science Foundation of Jiangsu Province (BK20170691).

Author information

Authors and Affiliations

School of Automation, Southeast University, Nanjing, 210096, China
Xian Zhang, Xinjie Tong, Ziyu Li & Wankou Yang
Key Lab of Measurement and Control of Complex Systems of Engineering, Ministry of Education, Southeast University, Nanjing, 210096, China
Xian Zhang, Ziyu Li & Wankou Yang

Corresponding author

Correspondence to Wankou Yang.

Editor information

Editors and Affiliations

Nanjing University of Science and Technology, Nanjing, China
Zhen Cui
Nanjing University of Science and Technology, Nanjing, China
Jinshan Pan
Nanjing University of Science and Technology, Nanjing, China
Shanshan Zhang
Nanjing University of Science and Technology, Nanjing, China
Liang Xiao
Nanjing University of Science and Technology, Nanjing, China
Jian Yang

About this paper

Cite this paper

Zhang, X., Tong, X., Li, Z., Yang, W. (2019). A Robust Facial Landmark Detector with Mixed Loss. In: Cui, Z., Pan, J., Zhang, S., Xiao, L., Yang, J. (eds) Intelligence Science and Big Data Engineering. Visual Data Engineering. IScIDE 2019. Lecture Notes in Computer Science, vol 11935. Springer, Cham. https://doi.org/10.1007/978-3-030-36189-1_21

DOI: https://doi.org/10.1007/978-3-030-36189-1_21
Published: 29 November 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36188-4
Online ISBN: 978-3-030-36189-1
eBook Packages: Computer Science, Computer Science (R0)
