Dense 3D facial reconstruction from a single depth image in unconstrained environment

Abstract

With the increasing demand for virtual reality applications such as 3D films, virtual human–machine interaction and virtual agents, analysis of the 3D human face has become an increasingly important fundamental step in these tasks. Because of the information provided by the additional dimension, 3D facial reconstruction allows these tasks to be achieved with higher accuracy than approaches based on 2D facial analysis. The denser the 3D facial model, the more information it can provide. However, most existing dense 3D facial reconstruction methods require complicated processing and incur high system cost. To address this, this paper presents a novel method that simplifies dense 3D facial reconstruction by using only one frame of depth data obtained with an off-the-shelf RGB-D sensor. The proposed method consists of two main stages: (a) acquisition of the initial 3D facial point cloud with automatic cropping of the 3D facial region, and (b) generation of the dense facial point cloud with RBF-based adaptive 3D point interpolation. Experiments reported in this paper demonstrate competitive results on real-world data.
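As a rough illustration of stage (b), the sketch below densifies a small point cloud with a plain radial basis function (RBF) interpolant. It is not the paper's algorithm: the abstract does not specify the kernel or the adaptive sampling scheme, so the Gaussian kernel, the shape parameter `eps`, the midpoint sampling rule and every function name here are assumptions made only for illustration. Depth z is treated as a function of (x, y).

```python
import math

def gaussian_rbf(r, eps=1.0):
    """Gaussian kernel phi(r) = exp(-(eps * r)^2); eps is an assumed shape parameter."""
    return math.exp(-(eps * r) ** 2)

def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_rbf(points, eps=1.0):
    """Find weights so the interpolant passes through every (x, y, z) sample."""
    centers = [(p[0], p[1]) for p in points]
    depths = [p[2] for p in points]
    kernel = [[gaussian_rbf(math.dist(ci, cj), eps) for cj in centers]
              for ci in centers]
    return centers, solve_linear(kernel, depths)

def evaluate_rbf(model, x, y, eps=1.0):
    """Evaluate the fitted interpolant at (x, y)."""
    centers, weights = model
    return sum(w * gaussian_rbf(math.dist((x, y), c), eps)
               for c, w in zip(centers, weights))

def densify(points, eps=1.0):
    """Append one interpolated point midway between each sample and its
    nearest neighbour (a simple stand-in for an adaptive sampling rule)."""
    model = fit_rbf(points, eps)
    dense = list(points)
    seen = set()
    for i, p in enumerate(points):
        j = min((k for k in range(len(points)) if k != i),
                key=lambda k: math.dist(p[:2], points[k][:2]))
        pair = (min(i, j), max(i, j))
        if pair in seen:
            continue  # avoid inserting the same midpoint twice
        seen.add(pair)
        mx = (p[0] + points[j][0]) / 2
        my = (p[1] + points[j][1]) / 2
        dense.append((mx, my, evaluate_rbf(model, mx, my, eps)))
    return dense

if __name__ == "__main__":
    # Synthetic 3x3 grid standing in for a cropped facial depth patch.
    sparse = [(float(x), float(y), 0.1 * x * x + 0.2 * y)
              for x in range(3) for y in range(3)]
    dense = densify(sparse)
    print(len(sparse), "input points ->", len(dense), "points after densification")
```

The dense linear solve costs O(n³) in the number of samples, so a practical system would more likely fit the interpolant over local neighbourhoods rather than over the whole cloud at once.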

Acknowledgements

This work was supported by the Engineering and Physical Sciences Research Council (EPSRC), UK (No. EP/N025849/1); the EU Seventh Framework Programme under Grant Agreement No. 611391; the National Natural Science Foundation of China (NSFC) (No. 41576011); and the International Science and Technology Cooperation Program of China (ISTCP) (No. 2014DFA10410).

Author information

Corresponding author

Correspondence to Junyu Dong.

About this article

Cite this article

Zhang, S., Yu, H., Wang, T. et al. Dense 3D facial reconstruction from a single depth image in unconstrained environment. Virtual Reality 22, 37–46 (2018). https://doi.org/10.1007/s10055-017-0311-6

Keywords

  • Virtual face
  • Three-dimensional image acquisition
  • Three-dimensional sensing
  • 3D interpolation