Top–Down Bayesian Inference of Indoor Scenes

  • Luca Del Pero
  • Kobus Barnard
Part of the Advances in Computer Vision and Pattern Recognition book series (ACVPR)


Inferring the 3D layout of indoor scenes from images has seen many recent advances. Understanding the basic 3D geometry of these environments is important for higher-level applications such as object recognition and robot navigation. In this chapter, we present our Bayesian generative model for understanding indoor environments. We model the 3D geometry of a room and the objects within it with non-overlapping 3D boxes, which approximate both the room boundary and objects such as tables and beds. We separately model the imaging process (camera parameters) and an image likelihood, thus providing a complete generative statistical model for image data. A key feature of this work is the use of prior information and constraints on the 3D geometry of the scene elements, which addresses ambiguities in the imaging process in a top–down fashion. We also describe and relate this work to other state-of-the-art approaches, and discuss techniques that have become standard in this field, such as estimating the camera pose from a triplet of vanishing points.
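
As a minimal sketch of the vanishing-point technique mentioned in the abstract, the following Python estimates the focal length from two vanishing points of orthogonal scene directions and back-projects a vanishing point to a 3D direction in camera coordinates. It assumes square pixels, zero skew, and a known principal point. These assumptions, the function names, and the sample values are illustrative and are not taken from the chapter.

```python
import math


def focal_length_from_vps(v1, v2, principal_point):
    """Estimate the focal length (in pixels) from two vanishing points.

    Assumes v1 and v2 are images of two orthogonal 3D directions, and that
    the camera has square pixels, zero skew, and the given principal point.
    Under those assumptions (v1 - c) . (v2 - c) = -f^2.
    """
    cx, cy = principal_point
    dot = (v1[0] - cx) * (v2[0] - cx) + (v1[1] - cy) * (v2[1] - cy)
    if dot >= 0:
        # No real focal length satisfies the orthogonality constraint.
        raise ValueError("vanishing points inconsistent with orthogonal directions")
    return math.sqrt(-dot)


def direction_from_vp(v, f, principal_point):
    """Back-project a vanishing point to a unit 3D direction (camera frame).

    The direction is proportional to K^{-1} [vx, vy, 1]^T and is only
    defined up to sign.
    """
    cx, cy = principal_point
    d = (v[0] - cx, v[1] - cy, f)
    norm = math.sqrt(sum(c * c for c in d))
    return tuple(c / norm for c in d)


# Hypothetical example: two vanishing points in a 640x480 image whose
# principal point is assumed to lie at the image center.
center = (320.0, 240.0)
vp_a = (570.0, 740.0)
vp_b = (-180.0, -10.0)

f = focal_length_from_vps(vp_a, vp_b, center)
print(f)  # 500.0
print(direction_from_vp(vp_a, f, center))
```

Applying `direction_from_vp` to all three vanishing points of a Manhattan-world scene gives the three columns of the camera rotation matrix, each determined only up to sign.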


Camera Parameter, Inference Process, World Coordinate System, Indoor Scene, Picture Frame
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



This material is based upon work supported by the National Science Foundation under Grant No. 0747511. We thank Joseph Schlecht for his contributions and suggestions in designing the code base. We also acknowledge the valuable help of Joshua Bowdish, Ernesto Brau, Andrew Emmott, Daniel Fried, Jinyan Guan, Emily Hartley, Bonnie Kermgard, and Philip Lee.



Copyright information

© Springer-Verlag London 2013

Authors and Affiliations

  1. University of Arizona, Tucson, USA
