Proxy Clouds for Live RGB-D Stream Processing and Consolidation

Part of the Lecture Notes in Computer Science book series (LNIP,volume 11210)

Abstract

We propose a new multiplanar superstructure for unified real-time processing of RGB-D data. Modern RGB-D sensors are widely used for indoor 3D capture, with applications ranging from modeling to robotics and augmented reality. Nevertheless, their use is limited by their low resolution, and their frames are often corrupted by noise, missing data and temporal inconsistencies. Our approach, named Proxy Clouds, consists of generating and updating through time a single set of compact local statistics parameterized over detected planar proxies, which are fed from raw RGB-D data. Proxy Clouds provide several processing primitives, which improve the quality of the RGB-D stream on the fly or lighten further operations. Experimental results confirm that our lightweight analysis framework copes well with embedded execution and with moderate memory and computational capabilities, compared to state-of-the-art methods. Processing of RGB-D data with Proxy Clouds includes noise and temporal flickering removal, hole filling and resampling. As a substitute for the observed scene, our proxy cloud can additionally be applied to compression and scene reconstruction. We present experiments performed with our framework on indoor scenes of different natures from a recent open RGB-D dataset.

Keywords

  • RGB-D stream
  • 3D geometric primitives
  • Data reinforcement
  • Depth improvement
  • Online processing
  • Scene reconstruction


Notes

  1. The area of a pixel at a given depth \(Z\) is \(a(Z) = \tan\left(\frac{fov_H}{res_H}\right) \tan\left(\frac{fov_V}{res_V}\right) Z^2\). With \(fov = (60^{\circ}, 45^{\circ})\) and \(res = (320, 240)\), we have \(a(8\,\mathrm{m}) \approx 0.00068539\,\mathrm{m}^{2} \approx (2.6\,\mathrm{cm})^{2}\).

  2. 3DLite dataset: http://graphics.stanford.edu/projects/3dlite/#data.

  3. Structure sensor: http://structure.io.
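
The pixel-area formula in note 1 can be checked numerically. The following Python sketch evaluates it with the field of view and resolution stated in the note; the function name `pixel_area` and its parameter names are illustrative choices and do not come from the paper.

```python
import math


def pixel_area(depth_m, fov_deg=(60.0, 45.0), res=(320, 240)):
    """Approximate area in square meters covered by one pixel at depth_m.

    Implements a(Z) = tan(fov_H / res_H) * tan(fov_V / res_V) * Z^2,
    with the field of view given in degrees (hypothetical signature).
    """
    # Angular extent of a single pixel along each image axis, in radians.
    ang_h = math.radians(fov_deg[0] / res[0])
    ang_v = math.radians(fov_deg[1] / res[1])
    return math.tan(ang_h) * math.tan(ang_v) * depth_m ** 2


area = pixel_area(8.0)
# Side length of a square with the same area, in centimeters.
side_cm = math.sqrt(area) * 100.0
print(f"a(8 m) = {area:.8f} m^2, equivalent side = {side_cm:.1f} cm")
```

Because the area grows with \(Z^2\), halving the depth to 4 m divides the pixel footprint by four.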

References

  1. Endres, F., Hess, J., Sturm, J., Cremers, D., Burgard, W.: 3-D mapping with an RGB-D camera. IEEE Trans. Robot. 30(1), 177–187 (2014)

  2. Fischler, M.A., Bolles, R.C.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 24(6), 381–395 (1981)

  3. Umeyama, S.: Least-squares estimation of transformation parameters between two point patterns. IEEE Trans. Pattern Anal. Mach. Intell. 13(4), 376–380 (1991)

  4. Hulik, R., Spanel, M., Smrz, P., Materna, Z.: Continuous plane detection in point-cloud data based on 3D Hough transform. J. Vis. Commun. Image Representation 25(1), 86–97 (2014)

  5. Schnabel, R., Wahl, R., Klein, R.: Efficient RANSAC for point-cloud shape detection. Comput. Graph. Forum 26(2), 214–226 (2007)

  6. Holz, D., Holzer, S., Rusu, R.B., Behnke, S.: Real-time plane segmentation using RGB-D cameras. In: Röfer, T., Mayer, N.M., Savage, J., Saranlı, U. (eds.) RoboCup 2011. LNCS (LNAI), vol. 7416, pp. 306–317. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-32060-6_26

  7. Feng, C., Taguchi, Y., Kamat, V.R.: Fast plane extraction in organized point clouds using agglomerative hierarchical clustering. In: IEEE International Conference on Robotics and Automation (ICRA), pp. 6218–6225. IEEE (2014)

  8. Kaiser, A., Ybanez Zepeda, J.A., Boubekeur, T.: A survey of simple geometric primitives detection methods for captured 3D data. In: Computer Graphics Forum (2018, to appear)

  9. Li, L.: Filtering for 3D time-of-flight sensors. Technical report SLOA230, Texas Instruments, January 2016

  10. Tomasi, C., Manduchi, R.: Bilateral filtering for gray and color images. In: Sixth International Conference on Computer Vision, pp. 839–846. IEEE (1998)

  11. Shao, L., Han, J., Kohli, P., Zhang, Z. (eds.): Computer Vision and Machine Learning with RGB-D Sensors. Advances in Computer Vision and Pattern Recognition. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-319-08651-4

  12. Essmaeel, K., Gallo, L., Damiani, E., De Pietro, G., Dipandà, A.: Temporal denoising of kinect depth data. In: Eighth International Conference on Signal Image Technology and Internet Based Systems (SITIS), pp. 47–52. IEEE (2012)

  13. Liu, S., Chen, C., Kehtarnavaz, N.: A computationally efficient denoising and hole-filling method for depth image enhancement. In: Kehtarnavaz, N., Carlsohn, M.F. (eds.) SPIE Conference on Real-Time Image and Video Processing, SPIE, April 2016

  14. Le, A.V., Jung, S.W., Won, C.S.: Directional joint bilateral filter for depth images. Sensors 14(7), 11362–11378 (2014)

  15. Newcombe, R.A., et al.: Kinectfusion: real-time dense surface mapping and tracking. In: IEEE International Symposium on Mixed and Augmented Reality (ISMAR), pp. 127–136. IEEE, October 2011

  16. Bapat, A., Ravi, A., Raman, S.: An iterative, non-local approach for restoring depth maps in RGB-D images. In: Twenty First National Conference on Communications (NCC), pp. 1–6. IEEE (2015)

  17. Camplani, M., Salgado, L.: Adaptive spatio-temporal filter for low-cost camera depth maps. In: IEEE International Conference on Emerging Signal Processing Applications (ESPA), pp. 33–36. IEEE (2012)

  18. Schmeing, M., Jiang, X.: Color segmentation based depth image filtering. In: Jiang, X., Bellon, O.R.P., Goldgof, D., Oishi, T. (eds.) WDIA 2012. LNCS, vol. 7854, pp. 68–77. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40303-3_8

  19. Chen, L., Lin, H., Li, S.: Depth image enhancement for kinect using region growing and bilateral filter. In: 21st International Conference on Pattern Recognition (ICPR), pp. 3070–3073. IEEE (2012)

  20. Wu, C., Zollhöfer, M., Nießner, M., Stamminger, M., Izadi, S., Theobalt, C.: Real-time shading-based refinement for consumer depth cameras. ACM Trans. Graph. (TOG) 33(6), 200 (2014)

  21. Kopf, J., Cohen, M.F., Lischinski, D., Uyttendaele, M.: Joint bilateral upsampling. ACM Trans. Graph. (ToG) 26(3), 96 (2007)

  22. Min, D., Lu, J., Do, M.N.: Depth video enhancement based on weighted mode filtering. IEEE Trans. Image Process. 21(3), 1176–1190 (2012)

  23. Liu, R., et al.: Hole-filling based on disparity map and inpainting for depth-image-based rendering. Int. J. Hybrid Inf. Technol. 9(5), 145–164 (2016)

  24. Buyssens, P., Daisy, M., Tschumperlé, D., Lézoray, O.: Superpixel-based depth map inpainting for RGB-D view synthesis. In: IEEE International Conference on Image Processing (ICIP), pp. 4332–4336. IEEE (2015)

  25. Solh, M., AlRegib, G.: Hierarchical hole-filling for depth-based view synthesis in FTV and 3D video. IEEE J. Sel. Topics Sig. Process. 6(5), 495–504 (2012)

  26. Schnabel, R., Degener, P., Klein, R.: Completion and reconstruction with primitive shapes. Comput. Graph. Forum 28(2), 503–512 (2009)

  27. Biswas, J., Veloso, M.: Planar polygon extraction and merging from depth images. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 3859–3864. IEEE (2012)

  28. Labbé, M., Michaud, F.: Online global loop closure detection for large-scale multi-session graph-based slam. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2661–2666. IEEE, September 2014

  29. Hsiao, M., Westman, E., Zhang, G., Kaess, M.: Keyframe-based dense planar slam. In: IEEE International Conference on Robotics and Automation (ICRA), pp. 5110–5117. IEEE, May 2017

  30. Mur-Artal, R., Tardós, J.D.: ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Trans. Robot. 33(5), 1255–1262 (2017)

  31. Keller, M., Lefloch, D., Lambers, M., Izadi, S., Weyrich, T., Kolb, A.: Real-time 3D reconstruction in dynamic scenes using point-based fusion. In: International Conference on 3D Vision (3DV), pp. 1–8. IEEE, June 2013

  32. Salas-Moreno, R.F., Glocker, B., Kelly, P.H., Davison, A.J.: Dense planar slam. In: IEEE International Symposium on Mixed and Augmented Reality (ISMAR), pp. 157–164. IEEE, September 2014

  33. Elghor, H.E., Roussel, D., Ababsa, F., Bouyakhf, E.H.: Planes detection for robust localization and mapping in RGB-D slam systems. In: International Conference on 3D Vision (3DV), pp. 452–459. IEEE, October 2015

  34. Dou, M., Guan, L., Frahm, J.-M., Fuchs, H.: Exploring high-level plane primitives for indoor 3D reconstruction with a hand-held RGB-D camera. In: Park, J.-I., Kim, J. (eds.) ACCV 2012. LNCS, vol. 7729, pp. 94–108. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-37484-5_9

  35. Kaess, M.: Simultaneous localization and mapping with infinite planes. In: IEEE International Conference on Robotics and Automation (ICRA), pp. 4605–4611. IEEE, May 2015

  36. Gao, X., Zhang, T.: Robust RGB-D simultaneous localization and mapping using planar point features. Robot. Auton. Syst. 72, 1–14 (2015)

  37. Zhang, E., Cohen, M.F., Curless, B.: Emptying, refurnishing, and relighting indoor spaces. ACM Trans. Graph. (TOG) 35(6), 174 (2016)

  38. Nießner, M., Zollhöfer, M., Izadi, S., Stamminger, M.: Real-time 3D reconstruction at scale using voxel hashing. ACM Trans. Graph. (ToG) 32(6), 169 (2013)

  39. Dai, A., Nießner, M., Zollhöfer, M., Izadi, S., Theobalt, C.: Bundlefusion: real-time globally consistent 3D reconstruction using on-the-fly surface reintegration. ACM Trans. Graph. (TOG) 36(3), 24 (2017)

  40. Zhang, Y., Xu, W., Tong, Y., Zhou, K.: Online structure analysis for real-time indoor scene reconstruction. ACM Trans. Graph. (TOG) 34(5), 159 (2015)

  41. Dzitsiuk, M., Sturm, J., Maier, R., Ma, L., Cremers, D.: De-noising, stabilizing and completing 3D reconstructions on-the-go using plane priors. In: IEEE International Conference on Robotics and Automation (ICRA), pp. 3976–3983. IEEE, May 2017

  42. Huang, J., Dai, A., Guibas, L., Niessner, M.: 3Dlite: towards commodity 3D scanning for content creation. ACM Trans. Graph. (TOG) 36(6), 203 (2017)

  43. Kass, M., Solomon, J.: Smoothed local histogram filters. ACM Trans. Graph. (TOG) 29(4), 100 (2010)

  44. Serra, J.: Image Analysis and Mathematical Morphology. Academic Press, Inc., Cambridge (1983)

  45. Wallace, G.K.: The JPEG still picture compression standard. IEEE Trans. Consum. Electron. 38(1), xviii–xxxiv (1992)

  46. Nenci, F., Spinello, L., Stachniss, C.: Effective compression of range data streams for remote robot operations using H.264. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 3794–3799. IEEE, September 2014

  47. Cignoni, P., Rocchini, C., Scopigno, R.: Metro: measuring error on simplified surfaces. Comput. Graph. Forum 17(2), 167–174 (1998)

  48. Lefebvre, S., Hoppe, H.: Compressed random-access trees for spatially coherent data. In: Kautz, J., Pattanaik, S. (eds.) Proceedings of the 18th Eurographics Conference on Rendering Techniques, pp. 339–349. Eurographics Association (2007)

  49. Raposo, C., Lourenco, M., Goncalves Almeida Antunes, M., Barreto, J.P.: Plane-based odometry using an RGB-D camera. In: British Machine Vision Conference (BMVC). Elsevier, September 2013

  50. Dai, A., Chang, A.X., Savva, M., Halber, M., Funkhouser, T., Nießner, M.: Scannet: richly-annotated 3D reconstructions of indoor scenes. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, July 2017

Acknowledgements.

This work is partially supported by the French National Research Agency under grant ANR 16-LCV2-0009-01 ALLEGORI and by BPI France, under grant PAPAYA. We also wish to thank the authors of 3DLite [42], BundleFusion [39] and ScanNet [50] for providing the dataset we use.

Author information

Corresponding author

Correspondence to Adrien Kaiser.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 2 (avi 65557 KB)

Supplementary material 1 (pdf 23737 KB)

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Kaiser, A., Ybanez Zepeda, J.A., Boubekeur, T. (2018). Proxy Clouds for Live RGB-D Stream Processing and Consolidation. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds) Computer Vision – ECCV 2018. ECCV 2018. Lecture Notes in Computer Science, vol 11210. Springer, Cham. https://doi.org/10.1007/978-3-030-01231-1_16

  • DOI: https://doi.org/10.1007/978-3-030-01231-1_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01230-4

  • Online ISBN: 978-3-030-01231-1

  • eBook Packages: Computer Science, Computer Science (R0)