Proxy Clouds for Live RGB-D Stream Processing and Consolidation

  • Adrien Kaiser
  • Jose Alonso Ybanez Zepeda
  • Tamy Boubekeur
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11210)


We propose a new multiplanar superstructure for unified real-time processing of RGB-D data. Modern RGB-D sensors are widely used for indoor 3D capture, with applications ranging from modeling to robotics to augmented reality. Nevertheless, their use is limited by their low resolution, with frames often corrupted by noise, missing data and temporal inconsistencies. Our approach, named Proxy Clouds, consists in generating and updating through time a single set of compact local statistics parameterized over detected planar proxies, which are fed from raw RGB-D data. Proxy Clouds provide several processing primitives, which improve the quality of the RGB-D stream on-the-fly or lighten further operations. Experimental results confirm that our lightweight analysis framework is well suited to embedded execution, requiring only moderate memory and computational resources compared to state-of-the-art methods. Processing of RGB-D data with Proxy Clouds includes noise and temporal flickering removal, hole filling and resampling. As a substitute for the observed scene, our proxy cloud can additionally be applied to compression and scene reconstruction. We present experiments performed with our framework on indoor scenes of different natures within a recent open RGB-D dataset.
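The core idea of the abstract (per-plane proxies accumulating compact local statistics from the raw depth stream, later read back as consolidated geometry) can be pictured with a minimal Python sketch. All names here (`PlanarProxy`, `integrate`, `denoised_point`) and the exact statistics layout are our illustrative assumptions, not the paper's implementation:

```python
import math

def dot(a, b):   return sum(x * y for x, y in zip(a, b))
def sub(a, b):   return [x - y for x, y in zip(a, b)]
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

class PlanarProxy:
    """Hypothetical sketch of one planar proxy: a detected plane carrying a
    regular 2D grid of running depth statistics, updated from each frame."""

    def __init__(self, origin, normal, u_axis, size=1.0, res=16):
        self.origin = list(origin)                      # a point on the plane
        n = math.sqrt(dot(normal, normal))
        self.normal = [c / n for c in normal]           # unit plane normal
        n = math.sqrt(dot(u_axis, u_axis))
        self.u = [c / n for c in u_axis]                # first in-plane axis
        self.v = cross(self.normal, self.u)             # second in-plane axis
        self.size, self.res = size, res                 # footprint and grid size
        self.count = {}   # (iu, iv) -> number of integrated samples
        self.mean = {}    # (iu, iv) -> running mean signed offset to the plane

    def integrate(self, points):
        """Accumulate raw 3D samples into per-cell running statistics."""
        for p in points:
            d = sub(p, self.origin)
            iu = int(dot(d, self.u) / self.size * self.res)
            iv = int(dot(d, self.v) / self.size * self.res)
            if not (0 <= iu < self.res and 0 <= iv < self.res):
                continue  # sample falls outside this proxy's footprint
            key = (iu, iv)
            c = self.count.get(key, 0) + 1
            m = self.mean.get(key, 0.0)
            off = dot(d, self.normal)                   # signed offset to plane
            self.count[key] = c
            self.mean[key] = m + (off - m) / c          # incremental mean update

    def denoised_point(self, iu, iv):
        """Consolidated point for a cell: cell center displaced by the mean
        offset along the normal, averaging out per-frame depth noise."""
        s = self.size / self.res
        return [self.origin[k]
                + (iu + 0.5) * s * self.u[k]
                + (iv + 0.5) * s * self.v[k]
                + self.mean[(iu, iv)] * self.normal[k] for k in range(3)]
```

Because each cell stores only aggregate statistics, memory stays constant as frames stream in, which is consistent with the abstract's claim of moderate memory use; noisy samples of the same surface patch average toward a stable consolidated point.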


Keywords: RGB-D stream · 3D geometric primitives · Data reinforcement · Depth improvement · Online processing · Scene reconstruction



This work is partially supported by the French National Research Agency under grant ANR 16-LCV2-0009-01 ALLEGORI and by BPI France, under grant PAPAYA. We also wish to thank the authors of 3DLite [42], BundleFusion [39] and ScanNet [50] for providing the dataset we use.

Supplementary material

Supplementary material 1 (pdf, 23.2 MB)

Supplementary material 2 (avi, 65.6 MB)


References

  1. Endres, F., Hess, J., Sturm, J., Cremers, D., Burgard, W.: 3-D mapping with an RGB-D camera. IEEE Trans. Robot. 30(1), 177–187 (2014)
  2. Fischler, M.A., Bolles, R.C.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 24(6), 381–395 (1981)
  3. Umeyama, S.: Least-squares estimation of transformation parameters between two point patterns. IEEE Trans. Pattern Anal. Mach. Intell. 13(4), 376–380 (1991)
  4. Hulik, R., Spanel, M., Smrz, P., Materna, Z.: Continuous plane detection in point-cloud data based on 3D Hough transform. J. Vis. Commun. Image Represent. 25(1), 86–97 (2014)
  5. Schnabel, R., Wahl, R., Klein, R.: Efficient RANSAC for point-cloud shape detection. Comput. Graph. Forum 26(2), 214–226 (2007)
  6. Holz, D., Holzer, S., Rusu, R.B., Behnke, S.: Real-time plane segmentation using RGB-D cameras. In: Röfer, T., Mayer, N.M., Savage, J., Saranlı, U. (eds.) RoboCup 2011. LNCS (LNAI), vol. 7416, pp. 306–317. Springer, Heidelberg (2012)
  7. Feng, C., Taguchi, Y., Kamat, V.R.: Fast plane extraction in organized point clouds using agglomerative hierarchical clustering. In: IEEE International Conference on Robotics and Automation (ICRA), pp. 6218–6225. IEEE (2014)
  8. Kaiser, A., Ybanez Zepeda, J.A., Boubekeur, T.: A survey of simple geometric primitives detection methods for captured 3D data. In: Computer Graphics Forum (2018, to appear)
  9. Li, L.: Filtering for 3D time-of-flight sensors. Technical report SLOA230, Texas Instruments, January 2016
  10. Tomasi, C., Manduchi, R.: Bilateral filtering for gray and color images. In: Sixth International Conference on Computer Vision (ICCV), pp. 839–846. IEEE (1998)
  11. Shao, L., Han, J., Kohli, P., Zhang, Z. (eds.): Computer Vision and Machine Learning with RGB-D Sensors. Advances in Computer Vision and Pattern Recognition. Springer, Heidelberg (2014)
  12. Essmaeel, K., Gallo, L., Damiani, E., De Pietro, G., Dipandà, A.: Temporal denoising of Kinect depth data. In: Eighth International Conference on Signal Image Technology and Internet Based Systems (SITIS), pp. 47–52. IEEE (2012)
  13. Liu, S., Chen, C., Kehtarnavaz, N.: A computationally efficient denoising and hole-filling method for depth image enhancement. In: Kehtarnavaz, N., Carlsohn, M.F. (eds.) SPIE Conference on Real-Time Image and Video Processing. SPIE, April 2016
  14. Le, A.V., Jung, S.W., Won, C.S.: Directional joint bilateral filter for depth images. Sensors 14(7), 11362–11378 (2014)
  15. Newcombe, R.A., et al.: KinectFusion: real-time dense surface mapping and tracking. In: IEEE International Symposium on Mixed and Augmented Reality (ISMAR), pp. 127–136. IEEE, October 2011
  16. Bapat, A., Ravi, A., Raman, S.: An iterative, non-local approach for restoring depth maps in RGB-D images. In: Twenty First National Conference on Communications (NCC), pp. 1–6. IEEE (2015)
  17. Camplani, M., Salgado, L.: Adaptive spatio-temporal filter for low-cost camera depth maps. In: IEEE International Conference on Emerging Signal Processing Applications (ESPA), pp. 33–36. IEEE (2012)
  18. Schmeing, M., Jiang, X.: Color segmentation based depth image filtering. In: Jiang, X., Bellon, O.R.P., Goldgof, D., Oishi, T. (eds.) WDIA 2012. LNCS, vol. 7854, pp. 68–77. Springer, Heidelberg (2013)
  19. Chen, L., Lin, H., Li, S.: Depth image enhancement for Kinect using region growing and bilateral filter. In: 21st International Conference on Pattern Recognition (ICPR), pp. 3070–3073. IEEE (2012)
  20. Wu, C., Zollhöfer, M., Nießner, M., Stamminger, M., Izadi, S., Theobalt, C.: Real-time shading-based refinement for consumer depth cameras. ACM Trans. Graph. (TOG) 33(6), 200 (2014)
  21. Kopf, J., Cohen, M.F., Lischinski, D., Uyttendaele, M.: Joint bilateral upsampling. ACM Trans. Graph. (TOG) 26(3), 96 (2007)
  22. Min, D., Lu, J., Do, M.N.: Depth video enhancement based on weighted mode filtering. IEEE Trans. Image Process. 21(3), 1176–1190 (2012)
  23. Liu, R., et al.: Hole-filling based on disparity map and inpainting for depth-image-based rendering. Int. J. Hybrid Inf. Technol. 9(5), 145–164 (2016)
  24. Buyssens, P., Daisy, M., Tschumperlé, D., Lézoray, O.: Superpixel-based depth map inpainting for RGB-D view synthesis. In: IEEE International Conference on Image Processing (ICIP), pp. 4332–4336. IEEE (2015)
  25. Solh, M., AlRegib, G.: Hierarchical hole-filling for depth-based view synthesis in FTV and 3D video. IEEE J. Sel. Top. Sig. Process. 6(5), 495–504 (2012)
  26. Schnabel, R., Degener, P., Klein, R.: Completion and reconstruction with primitive shapes. Comput. Graph. Forum 28(2), 503–512 (2009)
  27. Biswas, J., Veloso, M.: Planar polygon extraction and merging from depth images. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 3859–3864. IEEE (2012)
  28. Labbé, M., Michaud, F.: Online global loop closure detection for large-scale multi-session graph-based SLAM. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2661–2666. IEEE, September 2014
  29. Hsiao, M., Westman, E., Zhang, G., Kaess, M.: Keyframe-based dense planar SLAM. In: IEEE International Conference on Robotics and Automation (ICRA), pp. 5110–5117. IEEE, May 2017
  30. Mur-Artal, R., Tardós, J.D.: ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Trans. Robot. 33(5), 1255–1262 (2017)
  31. Keller, M., Lefloch, D., Lambers, M., Izadi, S., Weyrich, T., Kolb, A.: Real-time 3D reconstruction in dynamic scenes using point-based fusion. In: International Conference on 3D Vision (3DV), pp. 1–8. IEEE, June 2013
  32. Salas-Moreno, R.F., Glocker, B., Kelly, P.H., Davison, A.J.: Dense planar SLAM. In: IEEE International Symposium on Mixed and Augmented Reality (ISMAR), pp. 157–164. IEEE, September 2014
  33. Elghor, H.E., Roussel, D., Ababsa, F., Bouyakhf, E.H.: Planes detection for robust localization and mapping in RGB-D SLAM systems. In: International Conference on 3D Vision (3DV), pp. 452–459. IEEE, October 2015
  34. Dou, M., Guan, L., Frahm, J.-M., Fuchs, H.: Exploring high-level plane primitives for indoor 3D reconstruction with a hand-held RGB-D camera. In: Park, J.-I., Kim, J. (eds.) ACCV 2012. LNCS, vol. 7729, pp. 94–108. Springer, Heidelberg (2013)
  35. Kaess, M.: Simultaneous localization and mapping with infinite planes. In: IEEE International Conference on Robotics and Automation (ICRA), pp. 4605–4611. IEEE, May 2015
  36. Gao, X., Zhang, T.: Robust RGB-D simultaneous localization and mapping using planar point features. Robot. Auton. Syst. 72, 1–14 (2015)
  37. Zhang, E., Cohen, M.F., Curless, B.: Emptying, refurnishing, and relighting indoor spaces. ACM Trans. Graph. (TOG) 35(6), 174 (2016)
  38. Nießner, M., Zollhöfer, M., Izadi, S., Stamminger, M.: Real-time 3D reconstruction at scale using voxel hashing. ACM Trans. Graph. (TOG) 32(6), 169 (2013)
  39. Dai, A., Nießner, M., Zollhöfer, M., Izadi, S., Theobalt, C.: BundleFusion: real-time globally consistent 3D reconstruction using on-the-fly surface reintegration. ACM Trans. Graph. (TOG) 36(3), 24 (2017)
  40. Zhang, Y., Xu, W., Tong, Y., Zhou, K.: Online structure analysis for real-time indoor scene reconstruction. ACM Trans. Graph. (TOG) 34(5), 159 (2015)
  41. Dzitsiuk, M., Sturm, J., Maier, R., Ma, L., Cremers, D.: De-noising, stabilizing and completing 3D reconstructions on-the-go using plane priors. In: IEEE International Conference on Robotics and Automation (ICRA), pp. 3976–3983. IEEE, May 2017
  42. Huang, J., Dai, A., Guibas, L., Nießner, M.: 3DLite: towards commodity 3D scanning for content creation. ACM Trans. Graph. (TOG) 36(6), 203 (2017)
  43. Kass, M., Solomon, J.: Smoothed local histogram filters. ACM Trans. Graph. (TOG) 29(4), 100 (2010)
  44. Serra, J.: Image Analysis and Mathematical Morphology. Academic Press, Cambridge (1983)
  45. Wallace, G.K.: The JPEG still picture compression standard. IEEE Trans. Consum. Electron. 38(1), xviii–xxxiv (1992)
  46. Nenci, F., Spinello, L., Stachniss, C.: Effective compression of range data streams for remote robot operations using H.264. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 3794–3799. IEEE, September 2014
  47. Cignoni, P., Rocchini, C., Scopigno, R.: Metro: measuring error on simplified surfaces. Comput. Graph. Forum 17(2), 167–174 (1998)
  48. Lefebvre, S., Hoppe, H.: Compressed random-access trees for spatially coherent data. In: Kautz, J., Pattanaik, S. (eds.) Proceedings of the 18th Eurographics Conference on Rendering Techniques, pp. 339–349. Eurographics Association (2007)
  49. Raposo, C., Lourenco, M., Goncalves Almeida Antunes, M., Barreto, J.P.: Plane-based odometry using an RGB-D camera. In: British Machine Vision Conference (BMVC), September 2013
  50. Dai, A., Chang, A.X., Savva, M., Halber, M., Funkhouser, T., Nießner, M.: ScanNet: richly-annotated 3D reconstructions of indoor scenes. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, July 2017

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Adrien Kaiser (1)
  • Jose Alonso Ybanez Zepeda (2)
  • Tamy Boubekeur (1)

  1. LTCI, Telecom ParisTech, Paris-Saclay University, Paris, France
  2. Ayotle, Le Kremlin-Bicêtre, France
