Indoor Segmentation and Support Inference from RGBD Images

  • Nathan Silberman
  • Derek Hoiem
  • Pushmeet Kohli
  • Rob Fergus
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7576)


We present an approach to interpret the major surfaces, objects, and support relations of an indoor scene from an RGBD image. Most existing work ignores physical interactions or is applied only to tidy rooms and hallways. Our goal is to parse typical, often messy, indoor scenes into floor, walls, supporting surfaces, and object regions, and to recover support relationships. One of our main interests is to better understand how 3D cues can best inform a structured 3D interpretation. We also contribute a novel integer programming formulation to infer physical support relations. We offer a new dataset of 1449 RGBD images, capturing 464 diverse indoor scenes, with detailed annotations. Our experiments demonstrate our ability to infer support relations in complex scenes and verify that our 3D scene cues and inferred support lead to better object segmentation.


Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Nathan Silberman (1)
  • Derek Hoiem (2)
  • Pushmeet Kohli (3)
  • Rob Fergus (1)
  1. Courant Institute, New York University, USA
  2. Department of Computer Science, University of Illinois at Urbana-Champaign, USA
  3. Microsoft Research, Cambridge, UK
