International Journal of Computer Vision, Volume 90, Issue 3, pp 283–303

Multi-view Occlusion Reasoning for Probabilistic Silhouette-Based Dynamic Scene Reconstruction

  • Li Guan
  • Jean-Sébastien Franco
  • Marc Pollefeys
Article

Abstract

In this paper, we present an algorithm to probabilistically estimate object shapes in a 3D dynamic scene using silhouette information derived from multiple geometrically calibrated video camcorders. The scene is represented by a 3D volume. Every object in the scene is associated with a distinct label that represents its existence at every voxel location. The label links together automatically learned, view-specific appearance models of the respective object, which avoids the need for photometric calibration of the cameras. Generative probabilistic sensor models are derived by analyzing the dependencies between the sensor observations and the object labels. Bayesian reasoning is then applied to obtain a reconstruction that is robust to real-world challenges such as lighting variations and changing backgrounds. Our main contribution is to explicitly model the visual occlusion process and to show that: (1) static objects (such as trees or lamp posts), as parts of the pre-learned background model, can be automatically recovered as a byproduct of the inference; (2) ambiguities due to inter-occlusion between multiple dynamic objects can be alleviated, and the final reconstruction quality is drastically improved. Several indoor and outdoor real-world datasets are evaluated to verify our framework.
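
As a rough illustration of the kind of Bayesian silhouette fusion the abstract refers to, the sketch below computes the occupancy posterior of a single voxel from per-view binary silhouette observations. It is a simplified stand-in, not the paper's model: it assumes views are conditionally independent, ignores occlusion and per-object labels, and the function name `occupancy_posterior` and the parameters `prior`, `p_detect` and `p_false_alarm` are hypothetical values chosen for illustration.

```python
from math import prod


def occupancy_posterior(silhouette_obs, prior=0.5, p_detect=0.9, p_false_alarm=0.1):
    """Posterior probability that one voxel is occupied.

    silhouette_obs: one boolean per camera, True if the voxel projects
    inside that view's foreground silhouette.
    All probabilities are illustrative assumptions, not values from the paper.
    """
    # Likelihood of the observations if the voxel is occupied.
    like_occupied = prod(
        p_detect if seen else 1.0 - p_detect for seen in silhouette_obs
    )
    # Likelihood of the same observations if the voxel is empty.
    like_empty = prod(
        p_false_alarm if seen else 1.0 - p_false_alarm for seen in silhouette_obs
    )
    # Bayes' rule over the two hypotheses (occupied vs. empty).
    numerator = prior * like_occupied
    return numerator / (numerator + (1.0 - prior) * like_empty)


if __name__ == "__main__":
    # A voxel seen as foreground in two of three views.
    print(occupancy_posterior([True, True, False]))
```

Under these assumptions a single missed detection lowers the posterior without forcing it to zero, which is the practical difference between probabilistic fusion and a hard visual-hull intersection.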

Keywords

Multi-view 3D reconstruction, Bayesian inference, Graphical model, Shape-from-silhouette, Occlusion

Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Li Guan (1; email author)
  • Jean-Sébastien Franco (2)
  • Marc Pollefeys (1, 3)

  1. UNC-Chapel Hill, Chapel Hill, USA
  2. LaBRI, INRIA Sud-Ouest, University of Bordeaux, Talence Cedex, France
  3. ETH-Zürich, Zürich, Switzerland
