Abstract
Virtual facial avatars will play an increasingly important role in immersive communication, games and the metaverse, and it is therefore critical that they be inclusive. This requires accurate recovery of the albedo, regardless of age, sex, or ethnicity. While significant progress has been made on estimating 3D facial geometry, appearance estimation has received less attention. The task is fundamentally ambiguous because the observed color is a function of albedo and lighting, both of which are unknown. We find that current methods are biased towards light skin tones due to (1) strongly biased priors that prefer lighter pigmentation and (2) algorithmic solutions that disregard the light/albedo ambiguity. To address this, we propose a new evaluation dataset (FAIR) and an algorithm (TRUST) to improve albedo estimation and, hence, fairness. Specifically, we create the first facial albedo evaluation benchmark where subjects are balanced in terms of skin color, and measure accuracy using the Individual Typology Angle (ITA) metric. We then address the light/albedo ambiguity by building on a key observation: the image of the full scene –as opposed to a cropped image of the face– contains important information about lighting that can be used for disambiguation. TRUST regresses facial albedo by conditioning on both the face region and a global illumination signal obtained from the scene image. Our experimental results show significant improvement compared to state-of-the-art methods on albedo estimation, both in terms of accuracy and fairness. The evaluation benchmark and code are available for research purposes at https://trust.is.tue.mpg.de.
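The Individual Typology Angle mentioned above is conventionally defined in CIELAB space as ITA = arctan((L* − 50) / b*) × 180/π. The following is a minimal sketch of that formula with the skin-group thresholds commonly cited in the dermatology literature. The function names are hypothetical, and how the benchmark aggregates ITA over skin regions or computes its error scores is not described here, so this should not be read as the authors' evaluation code.

```python
import math


def ita_degrees(l_star: float, b_star: float) -> float:
    """Individual Typology Angle in degrees from CIELAB L* and b*.

    Assumes b* > 0, as is typical for skin; atan2 then equals
    arctan((L* - 50) / b*) and also stays defined when b* == 0.
    """
    return math.degrees(math.atan2(l_star - 50.0, b_star))


def skin_group(ita: float) -> str:
    """Map an ITA value to one of six commonly used skin-tone groups.

    Thresholds follow the widely cited ITA classification; they are an
    assumption here, not taken from the benchmark itself.
    """
    if ita > 55:
        return "very light"
    if ita > 41:
        return "light"
    if ita > 28:
        return "intermediate"
    if ita > 10:
        return "tan"
    if ita > -30:
        return "brown"
    return "dark"


if __name__ == "__main__":
    # Illustrative (L*, b*) pairs, not measurements from any dataset.
    for l_star, b_star in [(70.0, 20.0), (50.0, 15.0), (30.0, 20.0)]:
        ita = ita_degrees(l_star, b_star)
        print(f"L*={l_star}, b*={b_star} -> ITA={ita:.1f} deg, {skin_group(ita)}")
```

Because ITA depends only on L* and b*, a lighting change that shifts the apparent lightness of the skin changes the predicted group, which is why an albedo estimate with baked-in lighting scores poorly under this kind of metric.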
Notes
- 3. For GANFIT [25], the albedos contain a significant amount of baked-in lighting and were captured under lower lighting conditions, hence the tendency to do well on dark skin tones.
- 5. There are exceptions to this, such as scenes where some faces are in shadow or where the lighting is high-frequency.
- 7. Note that these scenes are completely different from those used in the evaluation benchmark.
References
Adamson, A.S., Smith, A.: Machine learning and health care disparities in dermatology. JAMA Dermatol. 154(11), 1247–1248 (2018)
Aldrian, O., Smith, W.A.: Inverse rendering of faces with a 3D morphable model. Trans. Pattern Anal. Mach. Intell. (PAMI) 35(5), 1080–1093 (2012)
Bai, Z., Cui, Z., Liu, X., Tan, P.: Riggable 3D face reconstruction via in-network optimization. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6216–6225 (2021)
Bas, A., Smith, W.A.P.: What does 2D geometric information really tell us about 3D face shape? Int. J. Comput. Vis. (IJCV) 127(10), 1455–1473 (2019)
Bianco, S., Schettini, R.: Adaptive color constancy using faces. Trans. Pattern Anal. Mach. Intell. (PAMI) 36(8), 1505–1518 (2014)
Blanz, V., Romdhani, S., Vetter, T.: Face identification across different poses and illuminations with a 3D morphable model. In: International Conference on Automatic Face & Gesture Recognition (FG), pp. 202–207 (2002)
Blanz, V., Vetter, T.: A morphable model for the synthesis of 3D faces. In: SIGGRAPH, pp. 187–194 (1999)
Buolamwini, J., Gebru, T.: Gender shades: intersectional accuracy disparities in commercial gender classification. In: Conference on Fairness, Accountability and Transparency, pp. 77–91. PMLR (2018)
Chardon, A., Cretois, I., Hourseau, C.: Skin colour typology and suntanning pathways. Int. J. Cosmet. Sci. 13(4), 191–208 (1991)
Chaudhuri, B., Vesdapunt, N., Shapiro, L., Wang, B.: Personalized face modeling for improved face reconstruction and motion retargeting. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12350, pp. 142–160. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58558-7_9
Chen, A., Chen, Z., Zhang, G., Mitchell, K., Yu, J.: Photo-realistic facial details synthesis from single image. In: International Conference on Computer Vision (ICCV), pp. 9429–9439 (2019)
Choi, H., Choi, K., Suk, H.: Performance of the 14 skin-colored patches in accurately estimating human skin color. In: Computational Imaging XV, pp. 62–65 (2017)
Dai, H., Pears, N., Smith, W., Duncan, C.: Statistical modeling of craniofacial shape and texture. Int. J. Comput. Vis. (IJCV) 128(2), 547–571 (2019)
Del Bino, S., Sok, J., Bessac, E., Bernerd, F.: Relationship between skin response to ultraviolet exposure and skin color type. Pigment Cell Res. 19(6), 606–614 (2006)
Del Bino, S., Bernerd, F.: Variations in skin colour and the biological consequences of ultraviolet radiation exposure. Br. J. Dermatol. 169, 33–40 (2013)
Deng, Y., Yang, J., Xu, S., Chen, D., Jia, Y., Tong, X.: Accurate 3D face reconstruction with weakly-supervised learning: from single image to image set. In: Conference on Computer Vision and Pattern Recognition Workshops (CVPR-W) (2019)
Dooley, S., et al.: Comparing human and machine bias in face recognition. arXiv preprint arXiv:2110.08396 (2021)
Drozdowski, P., Rathgeb, C., Dantcheva, A., Damer, N., Busch, C.: Demographic bias in biometrics: a survey on an emerging challenge. Trans. Technol. Soc. 1(2), 89–103 (2020)
Egger, B., Schönborn, S., Schneider, A., Kortylewski, A., Morel-Forster, A., Blumer, C., Vetter, T.: Occlusion-aware 3D morphable models and an illumination prior for face image analysis. Int. J. Comput. Vis. (IJCV) 126(12), 1269–1287 (2018)
Egger, B., et al.: 3D morphable face models - past, present, and future. Trans. Graph. (TOG) 39(5), 1–38 (2020)
Egger, B., Sutherland, S., Medin, S.C., Tenenbaum, J.: Identity-expression ambiguity in 3D morphable face models. arXiv preprint arXiv:2109.14203 (2021)
Feng, Y., Feng, H., Black, M.J., Bolkart, T.: Learning an animatable detailed 3D face model from in-the-wild images. Trans. Graph. (Proc. SIGGRAPH) 40(4), 1–13 (2021)
Fitzpatrick, T.B.: The validity and practicality of sun-reactive skin types I through VI. Arch. Dermatol. 124(6), 869–871 (1988)
Gecer, B., Deng, J., Zafeiriou, S.: OSTEC: one-shot texture completion. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7628–7638 (2021)
Gecer, B., Ploumpis, S., Kotsia, I., Zafeiriou, S.: GANFIT: generative adversarial network fitting for high fidelity 3D face reconstruction. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1155–1164 (2019)
Genova, K., Cole, F., Maschinot, A., Sarna, A., Vlasic, D., Freeman, W.T.: Unsupervised training for 3D morphable model regression. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8377–8386 (2018)
Gerig, T., et al.: Morphable face models - an open framework. In: International Conference on Automatic Face & Gesture Recognition (FG), pp. 75–82 (2018)
Hu, G., Mortazavian, P., Kittler, J., Christmas, W.: A facial symmetry prior for improved illumination fitting of 3D morphable model. In: 2013 International Conference on Biometrics (ICB), pp. 1–6. IEEE (2013)
Kim, H., Zollhöfer, M., Tewari, A., Thies, J., Richardt, C., Theobalt, C.: InverseFaceNet: deep monocular inverse face rendering. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4625–4634 (2018)
Kim, T., et al.: Countering racial bias in computer graphics research. arXiv preprint arXiv:2103.15163 (2021)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Kinyanjui, N.M., et al.: Fairness of classifiers across skin tones in dermatology. In: Martel, A.L., et al. (eds.) MICCAI 2020. LNCS, vol. 12266, pp. 320–329. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59725-2_31
Kips, R., Gori, P., Perrot, M., Bloch, I.: CA-GAN: weakly supervised color aware GAN for controllable makeup transfer. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12537, pp. 280–296. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-67070-2_17
Kips, R., Tran, L., Malherbe, E., Perrot, M.: Beyond color correction: skin color estimation in the wild through deep learning. Electronic Imaging 2020(5), 1–82 (2020)
Krasin, I., et al.: OpenImages: a public dataset for large-scale multi-label and multi-class image classification, 2(3), 18 (2017). Dataset available from https://github.com/openimages
Krishnapriya, K.S., Albiero, V., Vangara, K., King, M.C., Bowyer, K.W.: Issues related to face recognition accuracy varying based on race and skin tone. IEEE Trans. Technol. Soc. 1(1), 8–20 (2020). https://doi.org/10.1109/TTS.2020.2974996
Lattas, A., et al.: AvatarMe: realistically renderable 3D facial reconstruction. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 760–769 (2020)
Li, R., et al.: Learning formation of physically-based face attributes. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3410–3419 (2020)
Li, T., Bolkart, T., Black, M.J., Li, H., Romero, J.: Learning a model of facial shape and expression from 4D scans. Trans. Graph. (Proc. SIGGRAPH Asia) 36(6), 1–17 (2017). https://doi.org/10.1145/3130800.3130813
Lin, J., Yuan, Y., Shao, T., Zhou, K.: Towards high-fidelity 3D face reconstruction from in-the-wild images using graph convolutional networks. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5891–5900 (2020)
Locatello, F., et al.: Challenging common assumptions in the unsupervised learning of disentangled representations. arXiv preprint arXiv:1811.12359 (2018)
Locatello, F., et al.: Disentangling factors of variation using few labels. arXiv preprint arXiv:1905.01258 (2019)
Marguier, J., Bhatti, N., Baker, H., Harville, M., Süsstrunk, S.: Assessing human skin color from uncalibrated images. Int. J. Imaging Syst. Technol. 17(3), 143–151 (2007)
Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K., Galstyan, A.: A survey on bias and fairness in machine learning. ACM Comput. Surv. 54(6) (2021). https://doi.org/10.1145/3457607
Merler, M., Ratha, N., Feris, R.S., Smith, J.R.: Diversity in faces. arXiv preprint arXiv:1901.10436 (2019)
Osoba, O.A., Welser IV, W.: An intelligence in our image: the risks of bias and errors in artificial intelligence. Rand Corporation (2017)
Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. In: Advances in Neural Information Processing Systems (NeurIPS) (2019)
Paysan, P., Knothe, R., Amberg, B., Romdhani, S., Vetter, T.: A 3D face model for pose and illumination invariant face recognition. In: 2009 Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance, pp. 296–301. IEEE (2009)
Pichon, L.C., Landrine, H., Corral, I., Hao, Y., Mayer, J.A., Hoerster, K.D.: Measuring skin cancer risk in African Americans: is the Fitzpatrick skin type classification scale culturally sensitive. Ethn. Dis. 20(2), 174–179 (2010)
Rajkomar, A., Hardt, M., Howell, M.D., Corrado, G., Chin, M.H.: Ensuring fairness in machine learning to advance health equity. Ann. Intern. Med. 169(12), 866–872 (2018)
Ramamoorthi, R., Hanrahan, P.: A signal-processing framework for inverse rendering. In: Pocock, L. (ed.) SIGGRAPH, pp. 117–128 (2001)
Ravi, N., et al.: PyTorch3d. https://github.com/facebookresearch/pytorch3d (2020)
Robinson, J.P., Livitz, G., Henon, Y., Qin, C., Fu, Y., Timoner, S.: Face recognition: too bias, or not too bias? In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (2020)
Sahasrabudhe, M., Shu, Z., Bartrum, E., Güler, R.A., Samaras, D., Kokkinos, I.: Lifting autoencoders: unsupervised learning of a fully-disentangled 3D morphable model using deep non-rigid structure from motion. In: International Conference on Computer Vision Workshops (ICCV-W), pp. 4054–4064 (2019)
Saito, S., Wei, L., Hu, L., Nagano, K., Li, H.: Photorealistic facial texture inference using deep neural networks. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5144–5153 (2017)
Schönborn, S., Egger, B., Morel-Forster, A., Vetter, T.: Markov chain Monte Carlo for automated face image analysis. Int. J. Comput. Vis. (IJCV) 123(2), 160–183 (2017)
Sengupta, S., Kanazawa, A., Castillo, C.D., Jacobs, D.W.: SfSNet: learning shape, reflectance and illuminance of faces in the wild. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6296–6305 (2018)
Shang, J., et al.: Self-supervised monocular 3D face reconstruction by occlusion-aware multi-view geometry consistency. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12360, pp. 53–70. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58555-6_4
Shi, F., Wu, H.T., Tong, X., Chai, J.: Automatic acquisition of high-fidelity facial performances using monocular videos. ACM Trans. Graph. (TOG) 33(6), 1–13 (2014)
Shu, Z., Yumer, E., Hadap, S., Sunkavalli, K., Shechtman, E., Samaras, D.: Neural face editing with intrinsic image disentangling. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5541–5550 (2017)
Smith, W.A.P., Seck, A., Dee, H., Tiddeman, B., Tenenbaum, J., Egger, B.: A morphable face albedo model. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5010–5019 (2020)
Terhörst, P., Kolf, J.N., Huber, M., Kirchbuchner, F., Damer, N., Moreno, A.M., Fierrez, J., Kuijper, A.: A comprehensive study on face recognition biases beyond demographics. IEEE Trans. Technol. Soc. 3(1), 16–30 (2021)
Tewari, A., et al.: FML: face model learning from videos. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 10812–10822 (2019)
Tewari, A., et al.: Self-supervised multi-level face model learning for monocular reconstruction at over 250 Hz. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2549–2559 (2018)
Tewari, A., et al.: MoFA: model-based deep convolutional face autoencoder for unsupervised monocular reconstruction. In: International Conference on Computer Vision (ICCV) (2017)
Thies, J., Zollhöfer, M., Stamminger, M., Theobalt, C., Nießner, M.: Face2Face: real-time face capture and reenactment of RGB videos. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2387–2395 (2016)
Tran, L., Liu, F., Liu, X.: Towards high-fidelity nonlinear 3D face morphable model. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1126–1135 (2019)
Tran, L., Liu, X.: Nonlinear 3D face morphable model. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 7346–7355 (2018)
Wang, M., Deng, W., Hu, J., Tao, X., Huang, Y.: Racial faces in the wild: reducing racial bias by information maximization adaptation network. In: International Conference on Computer Vision (ICCV), pp. 692–702 (2019)
Wen, Y., Liu, W., Raj, B., Singh, R.: Self-supervised 3D face reconstruction via conditional estimation. In: International Conference on Computer Vision (ICCV), pp. 13289–13298 (2021)
Wilson, B., Hoffman, J., Morgenstern, J.: Predictive inequity in object detection. arXiv preprint arXiv:1902.11097 (2019)
Yamaguchi, S., et al.: High-fidelity facial reflectance and geometry inference from an unconstrained image. Trans. Graph. (TOG) 37(4), 1–14 (2018)
Youn, J., et al.: Relationship between skin phototype and MED in Korean, brown skin. Photodermatol. Photoimmunol. Photomed. 13(5–6), 208–211 (1997)
Acknowledgements
We thank S. Sanyal for the helpful suggestions, O. Ben-Dov, R. Danecek, Y. Wen for helping with the baselines, N. Athanasiou, Y. Feng, Y. Xiu for proof-reading, and B. Pellkofer for the technical support.
Disclosure: MJB has received research gift funds from Adobe, Intel, Nvidia, Meta/Facebook, and Amazon. MJB has financial interests in Amazon, Datagen Technologies, and Meshcapade GmbH. While MJB was a part-time employee of Amazon during a portion of this project, his research was performed solely at, and funded solely by, the Max Planck Society. While TB is a part-time employee of Amazon, his research was performed solely at, and funded solely by, MPI.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Feng, H., Bolkart, T., Tesch, J., Black, M.J., Abrevaya, V. (2022). Towards Racially Unbiased Skin Tone Estimation via Scene Disambiguation. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13673. Springer, Cham. https://doi.org/10.1007/978-3-031-19778-9_5
DOI: https://doi.org/10.1007/978-3-031-19778-9_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19777-2
Online ISBN: 978-3-031-19778-9
eBook Packages: Computer Science, Computer Science (R0)