Towards Unbiased Label Distribution Learning for Facial Pose Estimation Using Anisotropic Spherical Gaussian

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13672)

Abstract

Facial pose estimation refers to the task of predicting face orientation from a single RGB image. It is an important research topic with a wide range of applications in computer vision. Label distribution learning (LDL)-based methods have recently been proposed for facial pose estimation and achieve promising results. However, existing LDL methods have two major issues. First, the expectations of the label distributions are biased, which leads to biased pose estimates. Second, fixed distribution parameters are applied to all learning samples, which severely limits model capability. In this paper, we propose an Anisotropic Spherical Gaussian (ASG)-based LDL approach for facial pose estimation. In particular, our approach adopts the spherical Gaussian distribution on a unit sphere, which consistently yields an unbiased expectation. In addition, we introduce a new loss function that allows the network to flexibly learn the distribution parameter for each learning sample. Extensive experimental results show that our method sets new state-of-the-art records on the AFLW2000 and BIWI datasets.
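
As a rough illustration of the idea summarized in the abstract, the sketch below builds a discrete label distribution from a Gaussian-like lobe on the unit sphere and checks that its expected direction lines up with the lobe's mean direction. It is not the authors' implementation: it uses a simpler isotropic lobe, exp(sharpness * (mu . v - 1)), instead of the anisotropic form, and the Fibonacci lattice used as the label set, the sharpness value, and all function names are assumptions made for this example.

```python
import numpy as np

def fibonacci_sphere(n):
    """Return n roughly evenly spaced unit vectors (a Fibonacci lattice)."""
    i = np.arange(n)
    z = 1.0 - (2.0 * i + 1.0) / n            # heights strictly inside (-1, 1)
    r = np.sqrt(1.0 - z * z)                 # circle radius at each height
    phi = i * np.pi * (3.0 - np.sqrt(5.0))   # golden-angle increments
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)

def spherical_gaussian_labels(points, mu, sharpness):
    """Discrete label distribution from an isotropic spherical Gaussian lobe."""
    mu = mu / np.linalg.norm(mu)
    weights = np.exp(sharpness * (points @ mu - 1.0))
    return weights / weights.sum()

def expected_direction(points, probs):
    """Probability-weighted mean of the label directions, renormalized."""
    mean_vec = probs @ points
    return mean_vec / np.linalg.norm(mean_vec)

def angle_deg(a, b):
    """Angle in degrees between two unit vectors."""
    return float(np.degrees(np.arccos(np.clip(a @ b, -1.0, 1.0))))

# Hypothetical setup: 5000 label directions and an arbitrary ground-truth direction.
points = fibonacci_sphere(5000)
mu = np.array([0.3, -0.5, 0.8])
mu = mu / np.linalg.norm(mu)

probs = spherical_gaussian_labels(points, mu, sharpness=50.0)
error = angle_deg(expected_direction(points, probs), mu)
print(f"angular error between expectation and mean: {error:.4f} degrees")
```

Because the lobe is symmetric around its mean direction and the label directions cover the sphere almost evenly, the expectation stays close to the mean wherever the mean points. A per-sample sharpness, as the abstract suggests for the distribution parameter, would correspond here to passing a different `sharpness` value for each training sample.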

Z. Cao and D. Liu—Equal contributions.

Q. Wang—The analysis and all work described in this paper was performed by the authors at Purdue and RIT. Qifan Wang served as an advisor to the project.

Author information

Corresponding author

Correspondence to Zhiwen Cao.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Cao, Z., Liu, D., Wang, Q., Chen, Y. (2022). Towards Unbiased Label Distribution Learning for Facial Pose Estimation Using Anisotropic Spherical Gaussian. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13672. Springer, Cham. https://doi.org/10.1007/978-3-031-19775-8_43

  • DOI: https://doi.org/10.1007/978-3-031-19775-8_43

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19774-1

  • Online ISBN: 978-3-031-19775-8

  • eBook Packages: Computer Science; Computer Science (R0)
