Edge-preserving interpolation of depth data exploiting color information


The extraction of depth information associated with dynamic scenes is an intriguing topic because of its prospective role in many applications, including free-viewpoint and 3D video systems. Time-of-flight (ToF) range cameras allow for the acquisition of depth maps at video rate, but they are characterized by a limited resolution, especially when compared with standard color cameras. This paper presents a super-resolution method for depth maps that exploits the side information from a standard color camera: the proposed method uses a segmented version of the high-resolution color image acquired by the color camera to identify the main objects in the scene, and a novel surface prediction scheme to interpolate the depth samples provided by the ToF camera. Effective solutions are provided for critical issues such as the joint calibration of the two devices and the unreliability of the acquired data. Experimental results on both synthetic and real-world scenes show that the proposed method yields a more accurate interpolation than standard interpolation approaches and state-of-the-art joint depth and color interpolation schemes.


  1. In the following description, we will call samples the input depth values obtained by reprojecting the ToF data, and pixels the output pixels of the high-resolution depth map.

  2. Threshold values of 0.1 and 0.4 refer to a depth value range between 0 and 1.

  3. The errors reported in this section are measured in pixels on the high-resolution image of the color cameras.

  4. The acquired data for this setup is available online at the address http://lttm.dei.unipd.it/downloads/superres/.

  5. In both cases, we just warped the images using a 3D mesh built from the depth data; no ad hoc post-processing algorithms were used.




Corresponding author

Correspondence to Pietro Zanuttigh.

Appendix: Bilinear interpolation on nonregular grids

In the proposed approach, after the calibration step, the available samples are not regularly distributed over a lattice. This appendix shows how the well-known bilinear interpolation scheme can be extended to nonregular grids.

Fig. 22 Bilinear interpolation configuration

Referring to Fig. 22, the depth of the red point \(\mathbf{p}(x, y)\) is estimated from the depths \(D_{i} = D(\mathbf{p}_{i})\), \(i = 1, \ldots, 4\), of the four blue samples \(\mathbf{p}_{i}(x_{i}, y_{i})\). The procedure works in two steps: first, we estimate the depths of the two yellow points \(\mathbf{p}_{a}(x_{a}, y_{a}) = \mathbf{p}_{a}(x, y_{a})\) and \(\mathbf{p}_{b}(x_{b}, y_{b}) = \mathbf{p}_{b}(x, y_{b})\); then the depth of \(\mathbf{p}\) is computed by interpolating those of \(\mathbf{p}_{a}\) and \(\mathbf{p}_{b}\). Let \(\Delta x_{i} = |\mathbf{p}_{i} - \mathbf{p}|_{x} = |x_{i} - x|\) and \(\Delta y_{i} = |\mathbf{p}_{i} - \mathbf{p}|_{y} = |y_{i} - y|\), \(i = 1, \ldots, 4\), denote the absolute values of the differences between the x and y coordinates of the available low-resolution samples (in blue) and those of the point being estimated (in red), i.e., the absolute values of the x and y components of the vectors connecting the samples \(\mathbf{p}_{i}\) with \(\mathbf{p}\). First of all, the depth \(D_{a} \triangleq D(\mathbf{p}_{a})\) of point \(\mathbf{p}_{a}(x, y_{a})\) is estimated by linearly interpolating the depths of \(\mathbf{p}_{1}\) and \(\mathbf{p}_{2}\):

$$ \hat{D}_{a} = \frac{\Delta x_{2}}{\Delta x_{1} + \Delta x_{2}} D_{1} + \frac{\Delta x_{1}}{\Delta x_{1} + \Delta x_{2}} D_{2} \tag{12} $$
$$ = C_{1} D_{1} + C_{2} D_{2} \tag{13} $$

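where \(C_{1} = \Delta x_{2}/(\Delta x_{1} + \Delta x_{2})\) and \(C_{2} = \Delta x_{1}/(\Delta x_{1} + \Delta x_{2})\). The same procedure is applied to estimate the depth \(D_{b} \triangleq D(\mathbf{p}_{b})\) of \(\mathbf{p}_{b}(x, y_{b})\) from \(\mathbf{p}_{3}\) and \(\mathbf{p}_{4}\):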

$$ \hat{D}_{b} = \frac{\Delta x_{4}}{\Delta x_{3} + \Delta x_{4}} D_{3} + \frac{\Delta x_{3}}{\Delta x_{3} + \Delta x_{4}} D_{4} \tag{14} $$
$$ = C_{3} D_{3} + C_{4} D_{4} \tag{15} $$

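where \(C_{3} = \Delta x_{4}/(\Delta x_{3} + \Delta x_{4})\) and \(C_{4} = \Delta x_{3}/(\Delta x_{3} + \Delta x_{4})\). The vertical coordinates \(\Delta y_{a} = y_{a} - y\) and \(\Delta y_{b} = y - y_{b}\) of \(\mathbf{p}_{a}\) and \(\mathbf{p}_{b}\) with respect to \(\mathbf{p}\) can be computed as follows: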

$$ \Delta y_{a} = \frac{\Delta x_{2}}{\Delta x_{1} + \Delta x_{2}} \Delta y_{1} + \frac{\Delta x_{1}}{\Delta x_{1} + \Delta x_{2}} \Delta y_{2} \tag{16} $$
$$ = C_{1} \Delta y_{1} + C_{2} \Delta y_{2} \tag{17} $$
$$ \Delta y_{b} = \frac{\Delta x_{4}}{\Delta x_{3} + \Delta x_{4}} \Delta y_{3} + \frac{\Delta x_{3}}{\Delta x_{3} + \Delta x_{4}} \Delta y_{4} \tag{18} $$
$$ = C_{3} \Delta y_{3} + C_{4} \Delta y_{4} \tag{19} $$
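
As a quick numerical check of this first step, consider an illustrative configuration whose coordinates and depths are hypothetical values chosen for convenience, not data from the experiments: \(\mathbf{p} = (2, 2)\), \(\mathbf{p}_{1} = (1, 3.5)\) with \(D_{1} = 1.0\), and \(\mathbf{p}_{2} = (4, 2.5)\) with \(D_{2} = 1.3\).

$$ \Delta x_{1} = 1, \quad \Delta x_{2} = 2, \quad \Delta y_{1} = 1.5, \quad \Delta y_{2} = 0.5, \quad C_{1} = \tfrac{2}{3}, \quad C_{2} = \tfrac{1}{3} $$
$$ \hat{D}_{a} = \tfrac{2}{3} \cdot 1.0 + \tfrac{1}{3} \cdot 1.3 = 1.1, \qquad \Delta y_{a} = \tfrac{2}{3} \cdot 1.5 + \tfrac{1}{3} \cdot 0.5 = \tfrac{7}{6} \approx 1.167 $$

The segment joining \(\mathbf{p}_{1}\) and \(\mathbf{p}_{2}\) crosses the vertical line \(x = 2\) at \(y_{a} = 3.5 - \tfrac{1}{3} \approx 3.167\), so \(y_{a} - y \approx 1.167\), in agreement with the value of \(\Delta y_{a}\) obtained from the weighted average.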

In the second step, the depths \(\hat{D}_{a}\) and \(\hat{D}_{b}\) of \(\mathbf{p}_{a}\) and \(\mathbf{p}_{b}\) are linearly interpolated to get the depth of \(\mathbf{p}\):

$$ \hat{D}(\mathbf{p}) = \frac{\Delta y_{b}}{\Delta y_{a} + \Delta y_{b}} \hat{D}_{a} + \frac{\Delta y_{a}}{\Delta y_{a} + \Delta y_{b}} \hat{D}_{b} \tag{20} $$
$$ = C_{a} C_{1} D_{1} + C_{a} C_{2} D_{2} + C_{b} C_{3} D_{3} + C_{b} C_{4} D_{4} \tag{21} $$
$$ = \gamma_{1} D_{1} + \gamma_{2} D_{2} + \gamma_{3} D_{3} + \gamma_{4} D_{4} \tag{22} $$

where \(C_{a} = \Delta y_{b}/(\Delta y_{a} + \Delta y_{b})\), \(C_{b} = \Delta y_{a}/(\Delta y_{a} + \Delta y_{b})\), \(\gamma_{1} = C_{a} C_{1}\), \(\gamma_{2} = C_{a} C_{2}\), \(\gamma_{3} = C_{b} C_{3}\) and \(\gamma_{4} = C_{b} C_{4}\). Equation 21 has been obtained by replacing \(\hat{D}_{a}\) and \(\hat{D}_{b}\) in Eq. 20 with their expressions from Eqs. 13 and 15. Note that the final result is a weighted average of the four samples, where the weights depend on the positions of the samples, as in standard bilinear interpolation. This approach is applied directly to the low-resolution samples when the segmented region contains all four samples; in the other cases, the missing samples are first estimated by the methods of Section 3.2, and then Eq. 22 is applied.
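
The two-step procedure above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function name `bilinear_nonregular`, the `(x, y, depth)` tuple layout, and the error handling are all hypothetical, and the sketch assumes that the samples are passed in the order \(\mathbf{p}_{1}, \ldots, \mathbf{p}_{4}\), with \(\mathbf{p}_{1}, \mathbf{p}_{2}\) on one vertical side of \(\mathbf{p}\), \(\mathbf{p}_{3}, \mathbf{p}_{4}\) on the other, and each pair straddling \(\mathbf{p}\) horizontally.

```python
from typing import Sequence, Tuple

# Hypothetical sample layout: (x_i, y_i, D_i).
Sample = Tuple[float, float, float]


def bilinear_nonregular(
    p: Tuple[float, float], samples: Sequence[Sample]
) -> Tuple[float, Tuple[float, float, float, float]]:
    """Estimate the depth at p from four samples ordered p1, p2, p3, p4.

    Assumes p1, p2 lie on one vertical side of p and p3, p4 on the other,
    with each pair straddling p horizontally.
    Returns the estimated depth and the weights (gamma_1, ..., gamma_4).
    """
    if len(samples) != 4:
        raise ValueError("exactly four samples are required")

    x, y = p
    dx = [abs(xi - x) for xi, _, _ in samples]  # Delta x_i
    dy = [abs(yi - y) for _, yi, _ in samples]  # Delta y_i
    depths = [d for _, _, d in samples]         # D_i

    if dx[0] + dx[1] == 0 or dx[2] + dx[3] == 0:
        raise ValueError("degenerate horizontal configuration")

    # First step: horizontal weights for the pairs (p1, p2) and (p3, p4).
    c1 = dx[1] / (dx[0] + dx[1])
    c2 = dx[0] / (dx[0] + dx[1])
    c3 = dx[3] / (dx[2] + dx[3])
    c4 = dx[2] / (dx[2] + dx[3])

    # Vertical distances of the intermediate points p_a and p_b from p.
    dy_a = c1 * dy[0] + c2 * dy[1]
    dy_b = c3 * dy[2] + c4 * dy[3]
    if dy_a + dy_b == 0:
        raise ValueError("degenerate vertical configuration")

    # Second step: vertical weights, then the four final weights.
    c_a = dy_b / (dy_a + dy_b)
    c_b = dy_a / (dy_a + dy_b)
    gammas = (c_a * c1, c_a * c2, c_b * c3, c_b * c4)

    depth = sum(g * d for g, d in zip(gammas, depths))
    return depth, gammas


if __name__ == "__main__":
    # Illustrative samples on a unit square (a regular grid as special case).
    square = [(0.0, 1.0, 4.0), (1.0, 1.0, 6.0), (0.0, 0.0, 1.0), (1.0, 0.0, 3.0)]
    print(bilinear_nonregular((0.25, 0.25), square))
```

Under these assumptions the four weights sum to one, and for samples lying on a planar surface the estimate coincides with the depth of that plane at \(\mathbf{p}\), which offers a simple sanity check for any implementation.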

Cite this article

Garro, V., Dal Mutto, C., Zanuttigh, P. et al. Edge-preserving interpolation of depth data exploiting color information. Ann. Telecommun. 68, 597–613 (2013). https://doi.org/10.1007/s12243-013-0389-0


  • Depth map
  • Interpolation
  • Super resolution
  • Calibration
  • Time of flight