The Visual Computer, Volume 30, Issue 10, pp 1157–1168

Depth map enhancement based on color and depth consistency

  • Yanke Wang
  • Fan Zhong
  • Qunsheng Peng
  • Xueying Qin
Original Article


Current low-cost depth sensing techniques, such as Microsoft Kinect, can still achieve only limited precision. The resulting depth maps are often noisy, misaligned with the color images, and may even contain many large holes. These limitations make the depth maps difficult to adopt in many graphics applications. In this paper, we propose a computational approach to address the problem. By fusing raw depth values with image color, edge, and smoothness priors in a Markov random field optimization framework, both misalignment and large holes can be eliminated effectively; our method can thus produce high-quality depth maps that are consistent with the color image. To achieve this, a confidence map is estimated for adaptive weighting of the different cues, an image inpainting technique is introduced to handle large holes, and contrasts in the color image are also considered for accurate alignment. Experimental results demonstrate the effectiveness of our method.
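To make the fusion idea concrete: one common formulation of this kind of problem (in the spirit of MRF-based depth enhancement, though not the authors' actual implementation) is a quadratic energy with a confidence-weighted data term and a color-weighted smoothness term, minimized by simple fixed-point iteration. The function name and all parameter values below are illustrative assumptions:

```python
import numpy as np

def enhance_depth(depth, color, confidence, smooth_weight=1.0,
                  sigma_c=10.0, iters=200):
    """Illustrative sketch: minimize by Jacobi iteration the energy

        E(D) = sum_p conf_p * (D_p - depth_p)^2
             + sum_{p~q} w_pq * (D_p - D_q)^2,

    where w_pq = smooth_weight * exp(-(color_p - color_q)^2 / (2*sigma_c^2)),
    so smoothing is suppressed across strong color edges. Pixels with
    confidence 0 (holes) are driven purely by their neighbors.
    """
    H, W = depth.shape
    conf = confidence.astype(np.float64)
    D = depth.astype(np.float64).copy()
    valid = conf > 0
    D[~valid] = D[valid].mean()          # rough initialization for holes

    for _ in range(iters):
        num = conf * depth               # data term pulls toward raw depth
        den = conf.copy()
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            Dn = np.roll(D, (dy, dx), axis=(0, 1))      # neighbor depth
            Cn = np.roll(color, (dy, dx), axis=(0, 1))  # neighbor color
            w = smooth_weight * np.exp(-(color - Cn) ** 2 / (2 * sigma_c ** 2))
            # mask out the wrap-around neighbors that np.roll creates
            # at the image border
            mask = np.ones((H, W))
            if dy == -1: mask[-1, :] = 0
            if dy == 1:  mask[0, :] = 0
            if dx == -1: mask[:, -1] = 0
            if dx == 1:  mask[:, 0] = 0
            num += mask * w * Dn
            den += mask * w
        D = num / den
    return D
```

With a uniform color image, holes are filled by smooth interpolation from valid neighbors; with a strong color edge, the exponential weight collapses and depth values on either side stay separated, which is the edge-preserving behavior the abstract alludes to.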


Keywords: Depth map · Enhancement · Markov random field · Kinect



The authors gratefully acknowledge the anonymous reviewers for their comments, which helped us improve this paper. This work is supported by the 973 Program of China (No. 2009CB320802), the NSF of China (Nos. U1035004, 61173070, 61202149), and the National Science and Technology Pillar Program (No. 2012BAF10B03-3).



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Yanke Wang (1)
  • Fan Zhong (1)
  • Qunsheng Peng (2)
  • Xueying Qin (1, 3)

  1. School of Computer Science and Technology, Shandong University, Jinan, People’s Republic of China
  2. State Key Lab of CAD&CG, Zhejiang University, Hangzhou, People’s Republic of China
  3. Shandong Provincial Key Laboratory of Network Based Intelligent Computing, Jinan, People’s Republic of China
