Computational Visual Media, Volume 1, Issue 4, pp 267–278

3D indoor scene modeling from RGB-D data: a survey

  • Kang Chen
  • Yu-Kun Lai
  • Shi-Min Hu
Open Access
Review Article


3D scene modeling has long been a fundamental problem in computer graphics and computer vision. With the popularity of consumer-level RGB-D cameras, there is growing interest in digitizing real-world indoor 3D scenes. However, modeling indoor 3D scenes remains challenging because of the complex structure of interior objects and the poor quality of RGB-D data acquired by consumer-level sensors. Various methods have been proposed to tackle these challenges. In this survey, we provide an overview of recent advances in indoor scene modeling techniques, as well as public datasets and code libraries that can facilitate experiments and evaluation.


Keywords: RGB-D camera; 3D indoor scenes; geometric modeling; semantic modeling; survey



Copyright information

© The Author(s) 2015

Authors and Affiliations

  1. Tsinghua University, Beijing, China
  2. Cardiff University, Cardiff, UK
