International Journal of Computer Vision, Volume 79, Issue 2, pp 137–158

Inter-Image Statistics for 3D Environment Modeling

  • Luz A. Torres-Méndez
  • Gregory Dudek


In this article we present a method for automatically recovering complete and dense depth maps of an indoor environment by fusing incomplete data, addressing the 3D environment modeling problem. The geometry of indoor environments is usually extracted by acquiring a huge amount of range data and registering it. By acquiring only a small set of intensity images and a very limited amount of range data, the acquisition process is considerably simplified, saving time and energy. In our method, the intensity and partial range data are first registered using an image-based registration algorithm. The missing geometric structures are then inferred using a statistical learning method that integrates and analyzes the statistical relationships between the visual data and the available depth in terms of small patches. Experiments on real-world data under a variety of sampling strategies demonstrate the feasibility of our method.
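The patch-based inference step can be illustrated with a small sketch. The code below is not the authors' algorithm. It is a greedy nearest-neighbour simplification that assumes the intensity image and partial depth map are already registered on the same grid, with missing depth marked as `None`. The function name `infer_missing_depth`, the `half` window parameter, and the cost function are hypothetical choices made for illustration.

```python
def infer_missing_depth(intensity, depth, half=1):
    """Fill None entries of `depth` by copying depth from the known pixel
    whose intensity/depth neighbourhood best matches (illustrative only)."""
    h, w = len(intensity), len(intensity[0])
    depth = [row[:] for row in depth]  # work on a copy
    offsets = [(dr, dc) for dr in range(-half, half + 1)
               for dc in range(-half, half + 1)]

    def inten(r, c):
        # Clamp coordinates so border pixels still have a full neighbourhood.
        return intensity[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]

    def dep(r, c):
        # Depth outside the image counts as unknown.
        return depth[r][c] if 0 <= r < h and 0 <= c < w else None

    def n_known(p):
        return sum(dep(p[0] + dr, p[1] + dc) is not None for dr, dc in offsets)

    def cost(t, s):
        # Mean squared intensity difference over the patch, plus mean squared
        # depth difference over positions where both patches have depth.
        i_cost, d_cost, overlap = 0.0, 0.0, 0
        for dr, dc in offsets:
            i_cost += (inten(t[0] + dr, t[1] + dc) - inten(s[0] + dr, s[1] + dc)) ** 2
            dt, ds = dep(t[0] + dr, t[1] + dc), dep(s[0] + dr, s[1] + dc)
            if dt is not None and ds is not None:
                d_cost += (dt - ds) ** 2
                overlap += 1
        return i_cost / len(offsets) + (d_cost / overlap if overlap else 0.0)

    unknown = [(r, c) for r in range(h) for c in range(w) if depth[r][c] is None]
    if len(unknown) == h * w:
        raise ValueError("at least one range sample is required")
    while unknown:
        # Visit the unknown pixel with the most known depth around it first.
        target = max(unknown, key=n_known)
        sources = [(r, c) for r in range(h) for c in range(w)
                   if depth[r][c] is not None]
        best = min(sources, key=lambda s: cost(target, s))
        depth[target[0]][target[1]] = depth[best[0]][best[1]]
        unknown.remove(target)
    return depth


# Synthetic scene: intensity varies by column and depth is a function of it.
intensity = [[float(c) for c in range(8)] for _ in range(8)]
truth = [[2.0 * v + 1.0 for v in row] for row in intensity]
partial = [row[:] for row in truth]
for r in (3, 4):
    for c in (3, 4):
        partial[r][c] = None  # simulate a hole in the range data
filled = infer_missing_depth(intensity, partial)
```

In this toy scene depth is fully determined by intensity, so copying from the best-matching patch recovers the hole. Real scenes would need a probabilistic model such as the Markov random field formulation named in the keywords, and a far more efficient search.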


3D environment modeling, Sensor fusion, Markov random fields



Supplementary material

11263_2007_108_MOESM1_ESM.pdf (1.6 mb)



Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  1. Centre for Intelligent Machines, McGill University, Montreal, Canada
