VolumeDeform: Real-Time Volumetric Non-rigid Reconstruction

  • Matthias Innmann
  • Michael Zollhöfer
  • Matthias Nießner
  • Christian Theobalt
  • Marc Stamminger
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9912)

Abstract

We present a novel approach for the reconstruction of dynamic geometric shapes using a single hand-held consumer-grade RGB-D sensor at real-time rates. Our method builds up the scene model from scratch during the scanning process and thus does not require a pre-defined shape template to start from. Geometry and motion are parameterized in a unified manner by a volumetric representation that encodes a distance field of the surface geometry as well as the non-rigid space deformation. Motion tracking is based on a set of extracted sparse color features in combination with a dense depth constraint. This enables accurate tracking and drastically reduces the drift inherent to standard model-to-depth alignment. We cast finding the optimal deformation of space as a non-linear regularized variational optimization problem by enforcing local smoothness and proximity to the input constraints. The problem is tackled in real-time at the camera’s capture rate using a data-parallel flip-flop optimization strategy. Our results demonstrate robust tracking even for fast motion and scenes that lack geometric features.
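The combination of constraint and smoothness terms solved with a data-parallel flip-flop (two-buffer) update can be illustrated with a toy example. The sketch below is not the paper's volumetric solver: the 1-D deformation field, the constraint weight `w_c`, and the per-node closed-form update are hypothetical stand-ins chosen to show the alternating read/write buffer pattern under a regularized energy of the form "fit the constraints + stay locally smooth".

```python
# Toy sketch of a data-parallel "flip-flop" optimization: every node update
# reads only from one buffer and writes to the other, so all updates in a
# sweep are independent (parallelizable); the buffers are then swapped.
# Energy (hypothetical 1-D analogue of the paper's formulation):
#   E(u) = sum_i w_i * (u_i - c_i)^2  +  lam * sum_i (u_i - u_{i+1})^2

def flip_flop_solve(n, constraints, lam=1.0, w_c=1e6, iters=500):
    """Minimize E(u) over a 1-D field of n nodes.

    constraints: dict {node index: target value} (proximity term)
    lam: smoothness weight, w_c: constraint weight, iters: sweeps
    """
    read = [0.0] * n
    write = [0.0] * n
    for _ in range(iters):
        for i in range(n):  # independent per-node update -> data-parallel
            num, den = 0.0, 0.0
            if i in constraints:          # proximity to input constraints
                num += w_c * constraints[i]
                den += w_c
            for j in (i - 1, i + 1):      # local smoothness regularizer
                if 0 <= j < n:
                    num += lam * read[j]
                    den += lam
            write[i] = num / den          # closed-form local minimizer
        read, write = write, read         # the "flip-flop": swap buffers
    return read

# Pin the endpoints to 0 and 1; smoothness pulls the interior toward
# a linear ramp, approximately [0, 0.25, 0.5, 0.75, 1].
field = flip_flop_solve(5, {0: 0.0, 4: 1.0})
```

Because each node's update depends only on the read buffer, a GPU can evaluate all nodes of a sweep concurrently, which is what makes this style of solver attractive for real-time rates.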

Acknowledgments

We thank Angela Dai for the video voice-over and Richard Newcombe for the DynamicFusion comparison sequences. This research is funded by the German Research Foundation (DFG) grant GRK-1773 (Heterogeneous Image Systems), the ERC Starting Grant 335545 CapReal, the Max Planck Center for Visual Computing and Communications (MPC-VCC), and the Bayerische Forschungsstiftung (For3D).

Supplementary material

Supplementary material 1 (mp4 51648 KB)

Supplementary material 2 (pdf 144 KB)


Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Matthias Innmann (1)
  • Michael Zollhöfer (2)
  • Matthias Nießner (3)
  • Christian Theobalt (2)
  • Marc Stamminger (1)
  1. University of Erlangen-Nuremberg, Erlangen, Germany
  2. Max-Planck-Institute for Informatics, Saarbrücken, Germany
  3. Stanford University, Stanford, USA