Large scaling factor depth map super-resolution using progressive joint-multilateral filtering

Multimedia Tools and Applications

Abstract

Depth images captured by conventional RGB-D sensors such as ToF cameras have limited resolution. Despite recent advances in depth camera technology, there is still a significant gap between the resolution of depth images and that of color images. Depth map Super-Resolution (SR) techniques have therefore received considerable attention. In particular, developing an algorithm that performs well at large scaling factors is both important and challenging. Most existing methods first up-sample the low-resolution depth image to the desired size with a single interpolation step and then apply quality-improvement filters. Because depth images differ in nature from color images and are sparse, magnifying them in a single step introduces heavy artifacts, especially at large up-sampling factors (e.g., 16). To tackle this problem, we propose a progressive multi-step depth map SR method in which interpolation and modified enhancement processes are applied iteratively. This substantially improves the quality of the output depth image. Moreover, considering the importance of edges and discontinuities in depth images, an edge-directed kernel is applied instead of a conventional symmetric kernel, which effectively avoids blurring. In addition, texture copying and depth bleeding artifacts are reduced by employing a depth range filter. Quantitative and qualitative results of comprehensive experiments on Middlebury and real-world datasets demonstrate the effectiveness of our approach over prior depth SR works, especially for large scaling factors of 16, 32 and even 64.
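The progressive idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a power-of-two scaling factor, uses pixel replication as a stand-in for the interpolation step, and replaces the edge-directed kernel with an ordinary symmetric Gaussian window. The function names (`upsample2x_nearest`, `box_downsample`, `joint_multilateral_filter`, `progressive_sr`) and all parameter values are hypothetical and chosen only for illustration. Depth and guide images are assumed to be 2-D float arrays scaled to [0, 1].

```python
import numpy as np

def upsample2x_nearest(depth):
    """Double the resolution by pixel replication (stand-in for interpolation)."""
    return np.repeat(np.repeat(depth, 2, axis=0), 2, axis=1)

def box_downsample(img, factor):
    """Average non-overlapping factor x factor blocks of a 2-D image."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def joint_multilateral_filter(depth, guide, radius=2,
                              sigma_s=1.5, sigma_g=0.1, sigma_d=0.1):
    """Weighted average using spatial, guide-range and depth-range Gaussians.

    The depth-range term is an assumed, simplified analogue of the
    'depth range filter' mentioned in the abstract.
    """
    depth = np.asarray(depth, dtype=np.float64)
    guide = np.asarray(guide, dtype=np.float64)
    h, w = depth.shape
    d_pad = np.pad(depth, radius, mode="edge")
    g_pad = np.pad(guide, radius, mode="edge")
    num = np.zeros((h, w), dtype=np.float64)
    den = np.zeros((h, w), dtype=np.float64)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Shifted views give the neighbour at offset (dy, dx) for every pixel.
            rows = slice(radius + dy, radius + dy + h)
            cols = slice(radius + dx, radius + dx + w)
            d_n, g_n = d_pad[rows, cols], g_pad[rows, cols]
            w_s = np.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))
            w_g = np.exp(-((g_n - guide) ** 2) / (2 * sigma_g ** 2))
            w_d = np.exp(-((d_n - depth) ** 2) / (2 * sigma_d ** 2))
            wgt = w_s * w_g * w_d
            num += wgt * d_n
            den += wgt
    # The centre pixel always contributes weight 1, so den is never zero.
    return num / den

def progressive_sr(depth_lr, guide_hr, factor):
    """Upsample by 2 per step, filtering after each step with a matching guide."""
    steps = factor.bit_length() - 1
    if factor < 1 or 2 ** steps != factor:
        raise ValueError("factor must be a power of two")
    depth = np.asarray(depth_lr, dtype=np.float64)
    for s in range(1, steps + 1):
        depth = upsample2x_nearest(depth)
        # Bring the high-resolution guide down to the current depth resolution.
        guide_s = box_downsample(guide_hr, factor // 2 ** s)
        depth = joint_multilateral_filter(depth, guide_s)
    return depth

# Usage: a synthetic vertical step edge, upsampled by a factor of 4 in two steps.
guide_hr = np.zeros((32, 32))
guide_hr[:, 16:] = 1.0
depth_hr = np.where(guide_hr > 0.5, 0.8, 0.2)
depth_lr = depth_hr[::4, ::4]
depth_sr = progressive_sr(depth_lr, guide_hr, 4)
print(depth_sr.shape)  # (32, 32)
```

Filtering after every doubling, rather than once at the final size, keeps each enhancement pass close to the resolution at which the depth samples are still reliable, which is the intuition the abstract gives for the multi-step design.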

Author information

Corresponding author

Correspondence to Payman Moallem.

About this article

Cite this article

Khoddami, A.A., Moallem, P. & Kazemi, M. Large scaling factor depth map super-resolution using progressive joint-multilateral filtering. Multimed Tools Appl 81, 11461–11478 (2022). https://doi.org/10.1007/s11042-022-12253-z

  • DOI: https://doi.org/10.1007/s11042-022-12253-z
