Depth Recovery with Face Priors

  • Chongyu Chen
  • Hai Xuan Pham
  • Vladimir Pavlovic
  • Jianfei Cai
  • Guangming Shi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9006)


Existing depth recovery methods for commodity RGB-D sensors rely primarily on low-level information to repair the measured depth estimates. However, as the distance between the scene and the camera increases, the recovered depth estimates become increasingly unreliable. In applications such as video conferencing, the human face is often the primary subject of the captured RGB-D data. In this paper, we propose to incorporate face priors extracted from a general sparse 3D face model into the depth recovery process. In particular, we propose a joint optimization framework that consists of two main steps: deforming the face model for better alignment, and applying face priors for improved depth recovery. The two steps are performed alternately and iteratively so that each benefits from the other. Evaluations on benchmark datasets demonstrate that the proposed method with face priors significantly outperforms the baseline method without face priors, with up to 15.1% improvement in depth recovery quality and up to 22.3% in registration accuracy.
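The alternating structure of the two steps can be illustrated with a deliberately simplified sketch. It is not the authors' formulation: the 1D "scanline", the scale-and-offset "deformation", the quadratic energy, the weight `lam`, and all function names are assumptions made only for illustration. The sketch alternates between fitting a template to the current depth estimate (alignment) and blending the measured depth with the fitted template (recovery), so that pixels with missing depth are filled from the prior.

```python
import numpy as np

def fit_model(z, template):
    # Alignment step (hypothetical): least-squares scale a and offset b
    # such that a * template + b is as close as possible to z.
    A = np.stack([template, np.ones_like(template)], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    return coeffs[0], coeffs[1]

def refine_depth(d, valid, prior, lam):
    # Recovery step (hypothetical): per-pixel minimizer of
    # valid * (z - d)^2 + lam * (z - prior)^2. Holes take the prior value.
    w = valid.astype(float)
    return (w * d + lam * prior) / (w + lam)

def energy(z, d, valid, prior, lam):
    # Toy joint energy: data term on valid pixels plus prior term everywhere.
    return float(np.sum(valid * (z - d) ** 2) + lam * np.sum((z - prior) ** 2))

def recover(d, valid, template, lam=2.0, iters=50):
    # Start by filling holes with the mean of the valid measurements.
    z = np.where(valid, d, d[valid].mean())
    energies = []
    for _ in range(iters):
        a, b = fit_model(z, template)              # step 1: align the model
        prior = a * template + b
        z = refine_depth(d, valid, prior, lam)     # step 2: apply the prior
        energies.append(energy(z, d, valid, prior, lam))
    return z, energies

def make_toy_scanline(n=200, noise=0.05, seed=0):
    # Synthetic data: a smooth profile, additive noise, and a block of holes.
    rng = np.random.default_rng(seed)
    template = np.sin(np.linspace(0.0, np.pi, n))
    truth = 0.8 * template + 1.5
    d = truth + noise * rng.standard_normal(n)
    valid = np.ones(n, dtype=bool)
    valid[80:120] = False   # simulated missing depth
    d[~valid] = 0.0         # sensors often report 0 for invalid depth
    return d, valid, template, truth

if __name__ == "__main__":
    d, valid, template, truth = make_toy_scanline()
    z, energies = recover(d, valid, template)
    print("energy, first and last iteration:", energies[0], energies[-1])
    print("max abs error in holes:", np.abs(z[~valid] - truth[~valid]).max())
```

Because each step exactly minimizes the same toy energy over one set of variables, the energy cannot increase from one iteration to the next. The method in the paper operates on full RGB-D frames with a sparse 3D face model, which this sketch does not attempt to reproduce.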



This research, carried out at the BeingThere Centre, is mainly supported by the Singapore National Research Foundation under its International Research Centre @ Singapore Funding Initiative and administered by the IDM Programme Office. It is also partially supported by the 111 Project (No. B07048), China.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Chongyu Chen (1, 2)
  • Hai Xuan Pham (3)
  • Vladimir Pavlovic (3)
  • Jianfei Cai (4)
  • Guangming Shi (1)

  1. School of Electronic Engineering, Xidian University, Xi’an, China
  2. Institute for Media Innovation, Nanyang Technological University, Singapore, Singapore
  3. Department of Computer Science, Rutgers University, New Brunswick, USA
  4. School of Computer Engineering, Nanyang Technological University, Singapore, Singapore
