The Visual Computer, Volume 34, Issue 1, pp 67–81

Superpixel-based color–depth restoration and dynamic environment modeling for Kinect-assisted image-based rendering systems

  • Chong Wang
  • Shing-Chow Chan
  • Zhen-Yu Zhu
  • Li Zhang
  • Heung-Yeung Shum
Original Article


Depth information is an important ingredient in many multiview applications, including image-based rendering (IBR). With advances in electronics, low-cost, high-speed depth cameras such as the Microsoft Kinect are becoming increasingly popular. In this paper, we propose a superpixel-based joint color–depth restoration approach for the Kinect depth camera and study its application to view synthesis in IBR systems. First, an edge-based matching method is proposed to reduce color–depth registration errors. Then the Kinect depth map is restored based on probabilistic color–depth superpixels, probabilistic local polynomial regression and joint color–depth matting. The proposed restoration algorithm not only inpaints the missing data but also corrects and refines the depth map to provide better color–depth consistency. Finally, a dynamic background modeling scheme is proposed to address the disocclusion problem in view synthesis for dynamic environments. The experimental results show the effectiveness of the proposed algorithm and system.
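To make the restoration idea concrete, the following is a minimal sketch of color-guided local polynomial regression for filling holes in a depth map. It is not the authors' algorithm: it uses no superpixels, no probabilistic weighting and no matting, and simply fits a first-order (planar) model to the valid depth samples around each hole, weighting each sample by spatial distance and by gray-level similarity. The names `restore_depth`, `sigma_s` and `sigma_c`, and the convention that a depth value of 0 marks a missing sample, are assumptions made for illustration.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve_intercept(A, rhs, wsum):
    """Return the intercept of the weighted plane fit, or None if ill-posed."""
    det = det3(A)
    if abs(det) <= 1e-9 * wsum ** 3:
        return None  # too few or collinear neighbors for a plane fit
    # Cramer's rule: replace the first column of A by the right-hand side.
    A0 = [[rhs[i]] + A[i][1:] for i in range(3)]
    return det3(A0) / det


def restore_depth(depth, gray, radius=2, sigma_s=1.5, sigma_c=10.0):
    """Fill missing depth samples (assumed to be 0) using a local planar fit.

    depth and gray are equally sized 2D lists; gray is the registered
    intensity image that guides the weights. Valid samples are left unchanged.
    """
    h, w = len(depth), len(depth[0])
    out = [row[:] for row in depth]
    for y in range(h):
        for x in range(w):
            if depth[y][x] != 0:
                continue
            # Weighted normal equations for the model d ~ a + b*dx + c*dy.
            A = [[0.0] * 3 for _ in range(3)]
            rhs = [0.0] * 3
            wsum = 0.0
            wdsum = 0.0
            for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    d = depth[ny][nx]
                    if d == 0:
                        continue  # skip other holes
                    dx, dy = nx - x, ny - y
                    w_space = math.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))
                    dc = gray[ny][nx] - gray[y][x]
                    w_color = math.exp(-(dc * dc) / (2 * sigma_c ** 2))
                    wgt = w_space * w_color
                    basis = (1.0, float(dx), float(dy))
                    for i in range(3):
                        rhs[i] += wgt * basis[i] * d
                        for j in range(3):
                            A[i][j] += wgt * basis[i] * basis[j]
                    wsum += wgt
                    wdsum += wgt * d
            if wsum == 0.0:
                continue  # no usable neighbor: leave the hole as is
            a = solve_intercept(A, rhs, wsum)
            # Fall back to a zeroth-order fit (weighted mean) if needed.
            out[y][x] = a if a is not None else wdsum / wsum
    return out
```

Because the color weight suppresses neighbors whose intensity differs strongly from the hole's, the estimate is drawn mainly from samples on the same side of a color edge, which is the intuition behind enforcing color–depth consistency. On a planar depth patch with uniform intensity, the first-order fit reproduces the plane at the hole up to rounding error.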


Image-based rendering · Superpixel · Kinect · Background modeling · Local polynomial regression

Supplementary material

Supplementary material 1 (mp4 29564 KB)



Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  1. Ningbo University, Ningbo, People’s Republic of China
  2. The University of Hong Kong, Hong Kong, People’s Republic of China
  3. DJI Corporation, Shenzhen, People’s Republic of China
  4. Microsoft Corporation, Redmond, USA
