A Robust Facial Landmark Detector with Mixed Loss

  • Xian Zhang
  • Xinjie Tong
  • Ziyu Li
  • Wankou Yang (email author)
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11935)

Abstract

Facial landmark detection is one of the most important tasks in face image and video analysis. Existing algorithms based on deep convolutional neural networks have achieved good performance on public benchmarks and in practical applications such as face verification, expression analysis and beauty applications. However, the performance of a facial landmark detector degrades significantly on challenging facial images with extreme appearance variations, such as large changes in pose or expression and occlusion. To mitigate these difficulties, we propose a robust facial landmark detection algorithm based on coordinate regression that is trained end to end. Because the soft-argmax function makes the step from heatmaps to coordinates differentiable, the network weights can be optimised with a mixed loss function. An online pose-based data augmentation technique is used to effectively address the data imbalance problem and improve the robustness of the proposed method. Experiments conducted on the 300-W and AFLW datasets demonstrate that the accuracy of the proposed algorithm is competitive with that of state-of-the-art heatmap regression algorithms. In addition, our method achieves real-time speed on 300-W with 68 landmarks, running at 85 FPS on a Tesla V100 GPU.
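As an illustration of the soft-argmax idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It takes a single 2D heatmap of raw scores, applies a softmax over all pixels, and returns the expected (x, y) position under that distribution. The function name `soft_argmax_2d` and the sharpness parameter `beta` are assumptions introduced here for illustration; the abstract does not specify either.

```python
import math


def soft_argmax_2d(heatmap, beta=1.0):
    """Return the expected (x, y) position under a softmax of the heatmap.

    heatmap: list of rows (H x W) of raw scores.
    beta: hypothetical sharpness factor (an assumption, not from the paper);
          larger values move the result closer to the hard argmax.
    """
    peak = max(max(row) for row in heatmap)
    # Subtract the peak before exponentiating for numerical stability.
    weights = [[math.exp(beta * (v - peak)) for v in row] for row in heatmap]
    total = sum(sum(row) for row in weights)
    # Expected column index (x) and row index (y) under the softmax weights.
    x = sum(w * col for row in weights for col, w in enumerate(row)) / total
    y = sum(w * r for r, row in enumerate(weights) for w in row) / total
    return x, y


# Usage: a heatmap with one strong response at column 5, row 3.
hm = [[0.0] * 10 for _ in range(8)]
hm[3][5] = 1.0
print(soft_argmax_2d(hm, beta=100.0))
```

Unlike a hard argmax, every step here is a smooth function of the heatmap values, which is what allows a loss defined on coordinates to be back-propagated through the heatmap. A uniform heatmap yields the geometric centre of the grid, so in practice the sharpness of the distribution matters.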

Keywords

Facial landmark detection, Mixed loss, Soft-argmax, Pose-based data augmentation

Notes

Acknowledgement

This work is partly supported by the National Natural Science Foundation of China (61773117, 61703096 and 61473086), the Jiangsu key R&D plan (BE2017157) and the Natural Science Foundation of Jiangsu Province (BK20170691).

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Xian Zhang (1, 2)
  • Xinjie Tong (1)
  • Ziyu Li (1, 2)
  • Wankou Yang (1, 2), email author

  1. School of Automation, Southeast University, Nanjing, China
  2. Key Lab of Measurement and Control of Complex Systems of Engineering, Ministry of Education, Southeast University, Nanjing, China
