Deep Depth Super-Resolution: Learning Depth Super-Resolution Using Deep Convolutional Neural Network

  • Xibin Song
  • Yuchao Dai
  • Xueying Qin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10114)

Abstract

Depth image super-resolution is an extremely challenging task due to the information loss in sub-sampling. Deep convolutional neural networks have been widely applied to color image super-resolution. Quite surprisingly, this success has not been matched in depth super-resolution, mainly because of the inherent differences between color and depth images. In this paper, we bridge this gap and extend the success of deep convolutional neural networks to depth super-resolution. The proposed deep depth super-resolution method learns the mapping from a low-resolution depth image to a high-resolution one in an end-to-end fashion. Furthermore, to better regularize the learned depth map, we propose to exploit depth field statistics and the local correlation between the depth image and the color image. These priors are integrated in an energy minimization formulation, where the deep neural network learns the unary term, the depth field statistics serve as a global model constraint, and the color-depth correlation is used to enforce local structure in the depth image. Extensive experiments on various depth super-resolution benchmark datasets show that our method outperforms state-of-the-art depth image super-resolution methods by a margin.
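
The energy minimization idea can be illustrated with a small sketch. The abstract does not give the actual energy, weights, or solver, so everything below is an assumption made for illustration: the unary term is a squared difference to a CNN prediction `d_cnn`, the color-depth correlation is modeled with Gaussian weights on intensity differences between 4-connected neighbors, the depth field statistics term is omitted entirely, and a plain Jacobi iteration stands in for whatever optimizer the paper uses. The names `refine_depth`, `neighbours`, `lam`, and `sigma` are hypothetical.

```python
import numpy as np


def neighbours(a):
    """Return the up, down, left and right neighbour of every pixel (edge-replicated)."""
    p = np.pad(a, 1, mode="edge")
    return [p[:-2, 1:-1], p[2:, 1:-1], p[1:-1, :-2], p[1:-1, 2:]]


def refine_depth(d_cnn, gray, lam=5.0, sigma=0.1, iters=200):
    """Approximately minimise an assumed energy
        sum_i (D_i - d_cnn_i)^2 + lam * sum_(i,j) w_ij (D_i - D_j)^2
    with colour-guided weights w_ij = exp(-(g_i - g_j)^2 / (2 sigma^2)).

    d_cnn : 2-D array, depth predicted by a network (the unary term target).
    gray  : 2-D array of the same shape, guidance intensity image in [0, 1].
    """
    d_cnn = np.asarray(d_cnn, dtype=np.float64)
    gray = np.asarray(gray, dtype=np.float64)

    # Small weight across intensity edges, weight near 1 inside flat regions.
    weights = [np.exp(-(gray - g) ** 2 / (2.0 * sigma ** 2)) for g in neighbours(gray)]
    w_sum = sum(weights)

    depth = d_cnn.copy()
    for _ in range(iters):
        # Jacobi update: setting the gradient of the energy to zero per pixel
        # gives D_i = (d_i + lam * sum_j w_ij D_j) / (1 + lam * sum_j w_ij).
        nb_sum = sum(w * d for w, d in zip(weights, neighbours(depth)))
        depth = (d_cnn + lam * nb_sum) / (1.0 + lam * w_sum)
    return depth
```

Calling `refine_depth(d_cnn, gray)` on a noisy depth map whose discontinuities coincide with intensity edges smooths the depth inside each region while leaving the discontinuities largely intact, because the weights across those edges are close to zero. Larger `lam` trusts the color guidance more and the network prediction less.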

Keywords

Root Mean Square Error, Color Image, Depth Image, Convolutional Neural Network, Deep Neural Network
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgement

This work is supported by the 863 Program of China (No. 2015-AA016405), the NSF of China (Nos. 61672326, 61420106007), ARC Grants (Nos. DE140100180, LP100100588, DP120103896) and the China Scholarship Council.

Supplementary material

416263_1_En_22_MOESM1_ESM.pdf (641 kb)
Supplementary material 1 (pdf 640 KB)

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. School of Computer Science and Technology, Shandong University, Jinan, China
  2. Research School of Engineering, Australian National University, Canberra, Australia