
International Journal of Computer Vision, Volume 62, Issue 3, pp 221–247

Shape-From-Silhouette Across Time Part I: Theory and Algorithms

  • Kong-man (German) Cheung (email author)
  • Simon Baker
  • Takeo Kanade

Abstract

Shape-From-Silhouette (SFS) is a shape reconstruction method which constructs a 3D shape estimate of an object using silhouette images of the object. The output of an SFS algorithm is known as the Visual Hull (VH). Traditionally, SFS is either performed on static objects, or separately at each time instant in the case of videos of moving objects. In this paper we develop a theory of performing SFS across time: estimating the shape of a dynamic object (with unknown motion) by combining all of the silhouette images of the object over time. We first introduce a one-dimensional element called a Bounding Edge to represent the Visual Hull. We then show that aligning two Visual Hulls using just their silhouettes is in general ambiguous and derive the geometric constraints (in terms of Bounding Edges) that govern the alignment. To break the alignment ambiguity, we combine stereo information with silhouette information and derive a Temporal SFS algorithm which consists of two steps: (1) estimate the motion of the object over time (Visual Hull Alignment) and (2) combine the silhouette information using the estimated motion (Visual Hull Refinement). The algorithm is first developed for rigid objects and then extended to articulated objects. In Part II of this paper we apply our Temporal SFS algorithm to two human-related applications: (1) the acquisition of detailed human kinematic models and (2) marker-less motion tracking.
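
The idea of refining a Visual Hull across time can be illustrated with a toy sketch. It is not the algorithm of the paper: it uses a small voxel grid instead of Bounding Edges, two hypothetical orthographic cameras (looking along the x and z axes), and it assumes the rigid motion between the two time instants is already known, whereas the paper estimates it by combining stereo with silhouette information. All names (`silhouettes`, `visual_hull`, `rotate`, `rotate_back`, the grid size `N`) are illustrative assumptions.

```python
from itertools import product

N = 4  # voxel grid resolution (hypothetical, chosen for the toy example)


def silhouettes(voxels):
    """Orthographic silhouettes seen by two fixed cameras (along x and along z)."""
    along_x = {(y, z) for x, y, z in voxels}
    along_z = {(x, y) for x, y, z in voxels}
    return along_x, along_z


def visual_hull(sils, n=N):
    """Keep every voxel whose projection lies inside both silhouettes."""
    along_x, along_z = sils
    return {
        (x, y, z)
        for x, y, z in product(range(n), repeat=3)
        if (y, z) in along_x and (x, y) in along_z
    }


def rotate(voxels, n=N):
    """Assumed known rigid motion: 90-degree rotation about the z axis."""
    return {(n - 1 - y, x, z) for x, y, z in voxels}


def rotate_back(voxels, n=N):
    """Inverse of rotate(): maps the hull at time 2 into the frame of time 1."""
    return {(y, n - 1 - x, z) for x, y, z in voxels}


# The same rigid object observed at two time instants.
obj_t1 = {(1, 1, 1), (2, 1, 2), (1, 2, 2)}
obj_t2 = rotate(obj_t1)

# Per-frame SFS: one Visual Hull per time instant.
hull_t1 = visual_hull(silhouettes(obj_t1))
hull_t2 = visual_hull(silhouettes(obj_t2))

# Align the second hull to the first frame, then intersect.
# In this toy setting the intersection plays the role of the refinement step.
refined = hull_t1 & rotate_back(hull_t2)

print(len(hull_t1), len(refined))  # 5 4
```

Because the object has moved between the two frames, the fixed cameras effectively see it from new directions, so the intersection of the aligned hulls is tighter than either single-frame hull while still containing the object.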

Keywords

3D reconstruction; Shape-From-Silhouette; Visual Hull; across time; stereo; temporal alignment; alignment ambiguity; visibility


Copyright information

© Springer Science + Business Media, Inc. 2004

Authors and Affiliations

  • Kong-man (German) Cheung (1), email author
  • Simon Baker (1)
  • Takeo Kanade (1)
  1. The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA
