Building 3D semantic maps for mobile robots using RGB-D camera

Zhao, Zhe; Chen, Xiaoping

doi:10.1007/s11370-016-0201-x

Original Research Paper
Published: 01 July 2016

Volume 9, pages 297–309, (2016)

Intelligent Service Robotics

Abstract

The wide availability of affordable RGB-D sensors has changed the landscape of indoor scene analysis. Years of research on simultaneous localization and mapping (SLAM) have made it possible to merge multiple RGB-D images into a single point cloud and provide a 3D model of a complete indoor scene. However, these reconstructed models contain only geometric information and no semantic knowledge. Endowing environment models with semantic knowledge can greatly enhance robot autonomy and the ability to carry out more complex tasks in unstructured environments. Towards this goal, we propose a novel approach to generating 3D semantic maps of an indoor scene. Our approach first reconstructs a 3D map from an RGB-D image sequence, then jointly infers the semantic object category and structural class for each point of the global map. Twelve object categories (e.g., walls, tables, chairs) and four structural classes (ground, structure, furniture and props) are labeled in the global map. In this way, the map captures both object and structural information. To obtain the semantic information, we compute a semantic segmentation for each RGB-D image and merge the labeling results with a dense conditional random field (CRF). Unlike previous techniques, we use temporal information and higher-order cliques to enforce label consistency for each image labeling result. Our experiments demonstrate that temporal information and higher-order cliques are significant for the semantic mapping procedure and improve the precision of the semantic mapping results.
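
The merging step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes each map point has already been observed in one or more frames with a per-class probability vector, fuses those observations by summing log-probabilities, and then smooths the result with a naive O(N^2) mean-field update for a fully connected CRF using a Potts model and a Gaussian kernel over 3D positions. The temporal terms and higher-order cliques of the paper are omitted. The function names, the three example classes, and the parameters `weight` and `sigma` are all hypothetical.

```python
import math
from collections import defaultdict


def _normalize_log(scores):
    """Turn unnormalized log-scores into a probability vector."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def fuse_frame_labels(observations, num_classes, eps=1e-6):
    """Fuse per-frame class probabilities for each map point.

    observations: iterable of (point_id, probs) pairs, one pair per
    frame in which the point was seen (assumed input format).
    Returns {point_id: fused probability vector}.
    """
    log_sum = defaultdict(lambda: [0.0] * num_classes)
    for point_id, probs in observations:
        acc = log_sum[point_id]
        for c in range(num_classes):
            acc[c] += math.log(max(probs[c], eps))
    return {pid: _normalize_log(acc) for pid, acc in log_sum.items()}


def mean_field(unary, positions, iters=5, weight=1.0, sigma=0.5):
    """Naive mean-field inference for a fully connected Potts CRF.

    unary: {point_id: probability vector} used as the unary term.
    positions: {point_id: (x, y, z)} used by the Gaussian kernel.
    Every pair of points is connected; cost is O(N^2) per iteration.
    """
    ids = list(unary)
    num_classes = len(unary[ids[0]])
    q = {i: list(unary[i]) for i in ids}
    for _ in range(iters):
        new_q = {}
        for i in ids:
            msg = [0.0] * num_classes
            for j in ids:
                if j == i:
                    continue
                d2 = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
                k = math.exp(-d2 / (2.0 * sigma ** 2))
                for c in range(num_classes):
                    msg[c] += k * q[j][c]
            # Under a Potts model, penalizing disagreement is equivalent
            # (up to a constant) to rewarding agreement with neighbors.
            scores = [math.log(max(unary[i][c], 1e-12)) + weight * msg[c]
                      for c in range(num_classes)]
            new_q[i] = _normalize_log(scores)
        q = new_q
    return q


def argmax(vec):
    return max(range(len(vec)), key=vec.__getitem__)


# Toy data: class indices 0, 1, 2 stand for wall, table, chair.
observations = [
    (0, [0.8, 0.1, 0.1]), (0, [0.7, 0.2, 0.1]),
    (1, [0.9, 0.05, 0.05]),
    (2, [0.4, 0.1, 0.5]), (2, [0.45, 0.15, 0.4]),
]
positions = {0: (0.0, 0.0, 0.0), 1: (0.1, 0.0, 0.0), 2: (0.2, 0.0, 0.0)}

fused = fuse_frame_labels(observations, num_classes=3)
refined = mean_field(fused, positions)
print({pid: argmax(p) for pid, p in refined.items()})
```

In the toy data, point 2 is ambiguous between wall and chair after frame fusion alone; the pairwise term pulls it towards the label of its two confident neighbors. A practical system would replace the double loop with an efficient filtering-based approximation, since the quadratic cost is prohibitive for a full point cloud.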

Author information

Authors and Affiliations

University of Science and Technology of China, No. 96, JinZhai Road, Baohe District, Hefei, 230026, Anhui, People’s Republic of China
Zhe Zhao & Xiaoping Chen

Corresponding author

Correspondence to Zhe Zhao.

Additional information

This research is supported by the USTC Key-Direction Research Fund under Grant WK0110000028 and the National Natural Science Foundation of China under Grant 61175057.

About this article

Cite this article

Zhao, Z., Chen, X. Building 3D semantic maps for mobile robots using RGB-D camera. Intel Serv Robotics 9, 297–309 (2016). https://doi.org/10.1007/s11370-016-0201-x

Received: 28 January 2016
Accepted: 20 June 2016
Published: 01 July 2016
Issue Date: October 2016
DOI: https://doi.org/10.1007/s11370-016-0201-x
