Multi-view Consensus CNN for 3D Facial Landmark Placement

Part of the Lecture Notes in Computer Science book series (LNIP, volume 11361)

Abstract

The rapid increase in the availability of accurate 3D scanning devices has moved facial recognition and analysis into the 3D domain. 3D facial landmarks are often used as a simple measure of anatomy, so accurate algorithms for automatic landmark placement are crucial. Current state-of-the-art approaches have yet to benefit from the dramatic performance gains that deep convolutional neural networks (CNNs) have brought to human pose tracking and 2D facial landmark placement. The development of deep learning approaches for 3D meshes has given rise to a new subfield called geometric deep learning, one topic of which is adapting meshes for use with deep CNNs. In this work, we demonstrate how methods derived from geometric deep learning, namely multi-view CNNs, can be combined with recent advances in human pose tracking. The method finds 2D landmark estimates in multiple views and propagates this information to 3D space, where a consensus method determines the accurate 3D face landmark position. We apply the method to a standard 3D face dataset and show that it outperforms current methods by a large margin. Further, we demonstrate how models trained on 3D range scans can be used to accurately place anatomical landmarks in magnetic resonance images.
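
The consensus step can be illustrated with a minimal sketch. It is not the authors' implementation; it assumes that each rendered view turns its 2D landmark estimate into a 3D ray (a camera origin and a direction), and that the 3D landmark is taken as the least-squares point closest to all rays, followed by a single trim-and-refit pass. The function names and the `inlier_tol` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np

def closest_point_to_rays(origins, directions):
    """Least-squares point minimising the summed squared distance to 3D lines."""
    origins = np.asarray(origins, dtype=float)
    directions = np.asarray(directions, dtype=float)
    d = directions / np.linalg.norm(directions, axis=1, keepdims=True)
    # P_i = I - d_i d_i^T projects onto the plane orthogonal to line i.
    P = np.eye(3)[None, :, :] - d[:, :, None] * d[:, None, :]
    A = P.sum(axis=0)
    b = np.einsum("nij,nj->i", P, origins)
    # The pseudo-inverse keeps the solve defined even for near-parallel rays.
    return np.linalg.pinv(A) @ b

def point_to_ray_distances(point, origins, directions):
    """Perpendicular distance from a point to each line."""
    origins = np.asarray(origins, dtype=float)
    directions = np.asarray(directions, dtype=float)
    d = directions / np.linalg.norm(directions, axis=1, keepdims=True)
    v = np.asarray(point, dtype=float) - origins
    perp = v - (v * d).sum(axis=1, keepdims=True) * d
    return np.linalg.norm(perp, axis=1)

def consensus_landmark(origins, directions, inlier_tol=1.0):
    """Hypothetical consensus: fit, drop rays farther than inlier_tol, refit once."""
    origins = np.asarray(origins, dtype=float)
    directions = np.asarray(directions, dtype=float)
    point = closest_point_to_rays(origins, directions)
    inliers = point_to_ray_distances(point, origins, directions) <= inlier_tol
    if inliers.sum() >= 2:  # at least two rays are needed to fix a 3D point
        point = closest_point_to_rays(origins[inliers], directions[inliers])
    return point, inliers

# Usage: four assumed camera origins whose rays all pass through one target.
target = np.array([1.0, 2.0, 3.0])
cams = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0],
                 [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]])
landmark, inliers = consensus_landmark(cams, target - cams)
```

A more robust consensus scheme, such as random sampling over subsets of rays, would replace the single trim-and-refit pass when many views give poor 2D estimates.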

Keywords

  • 3D facial landmarks
  • Multi-view CNN
  • Geometric deep learning

Acknowledgements

We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research.

Author information

Corresponding author

Correspondence to Rasmus R. Paulsen.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Paulsen, R.R., Juhl, K.A., Haspang, T.M., Hansen, T., Ganz, M., Einarsson, G. (2019). Multi-view Consensus CNN for 3D Facial Landmark Placement. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science, vol 11361. Springer, Cham. https://doi.org/10.1007/978-3-030-20887-5_44

  • DOI: https://doi.org/10.1007/978-3-030-20887-5_44

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-20886-8

  • Online ISBN: 978-3-030-20887-5

  • eBook Packages: Computer Science, Computer Science (R0)