International Journal of Computer Vision, Volume 49, Issue 2–3, pp 175–214

Multi-View Scene Capture by Surfel Sampling: From Video Streams to Non-Rigid 3D Motion, Shape and Reflectance

  • Rodrigo L. Carceroni
  • Kiriakos N. Kutulakos

Abstract

In this paper we study the problem of recovering the 3D shape, reflectance, and non-rigid motion properties of a dynamic 3D scene. Because these properties are completely unknown and because the scene's shape and motion may be non-smooth, our approach uses multiple views to build a piecewise-continuous geometric and radiometric representation of the scene's trace in space-time. A basic primitive of this representation is the dynamic surfel, which (1) encodes the instantaneous local shape, reflectance, and motion of a small and bounded region in the scene, and (2) enables accurate prediction of the region's dynamic appearance under known illumination conditions. We show that complete surfel-based reconstructions can be created by repeatedly applying an algorithm called Surfel Sampling that combines sampling and parameter estimation to fit a single surfel to a small, bounded region of space-time. Experimental results with the Phong reflectance model and complex real scenes (clothing, shiny objects, skin) illustrate our method's ability to explain pixels and pixel variations in terms of their underlying causes: shape, reflectance, motion, illumination, and visibility.

stereoscopic vision, 3D reconstruction, multiple-view geometry, multi-view stereo, space carving, motion analysis, multi-view motion estimation, direct estimation methods, image warping, deformation analysis, 3D motion capture, reflectance modeling, illumination modeling, Phong reflectance model
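
To make the idea of a surfel that predicts appearance under known illumination more concrete, the following is a minimal sketch in Python. It is not the paper's formulation: the `Surfel` fields, the constant-velocity motion in `at`, the single point light, and the function name `phong_intensity` are all assumptions made for illustration, and the shading uses the textbook Phong model (diffuse term plus a specular lobe around the mirror direction).

```python
import math
from dataclasses import dataclass, replace
from typing import Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _normalize(a: Vec) -> Vec:
    length = math.sqrt(_dot(a, a))
    return (a[0] / length, a[1] / length, a[2] / length)


@dataclass(frozen=True)
class Surfel:
    """Hypothetical surfel: local shape, motion, and Phong reflectance."""
    position: Vec      # centre of the small surface patch
    normal: Vec        # patch orientation (need not be unit length)
    velocity: Vec      # assumed constant 3D velocity of the patch
    k_diffuse: float   # Phong diffuse coefficient
    k_specular: float  # Phong specular coefficient
    shininess: float   # Phong specular exponent

    def at(self, dt: float) -> "Surfel":
        """Return the surfel translated by velocity * dt (illustrative motion model)."""
        p, v = self.position, self.velocity
        return replace(self, position=(p[0] + v[0] * dt,
                                       p[1] + v[1] * dt,
                                       p[2] + v[2] * dt))


def phong_intensity(s: Surfel, light_pos: Vec, camera_pos: Vec,
                    light_intensity: float = 1.0) -> float:
    """Predict the intensity seen by a camera for one point light (textbook Phong)."""
    n = _normalize(s.normal)
    l = _normalize(_sub(light_pos, s.position))   # direction to the light
    v = _normalize(_sub(camera_pos, s.position))  # direction to the camera
    n_dot_l = _dot(n, l)
    if n_dot_l <= 0.0:
        return 0.0  # light is behind the patch
    # Mirror reflection of the light direction about the normal.
    r = (2.0 * n_dot_l * n[0] - l[0],
         2.0 * n_dot_l * n[1] - l[1],
         2.0 * n_dot_l * n[2] - l[2])
    specular = max(_dot(r, v), 0.0) ** s.shininess
    return light_intensity * (s.k_diffuse * n_dot_l + s.k_specular * specular)
```

Calling `phong_intensity(surfel.at(dt), light_pos, camera_pos)` for each camera and time step gives the kind of per-view, per-frame prediction that could be compared against observed pixels when fitting surfel parameters; visibility and the fitting procedure itself are left out of this sketch.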

Copyright information

© Kluwer Academic Publishers 2002

Authors and Affiliations

  • Rodrigo L. Carceroni (1)
  • Kiriakos N. Kutulakos (2)
  1. Departamento de Ciência da Computação, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
  2. Department of Computer Science, University of Toronto, Toronto, Canada