A Review of Segmentation Methods for 3D Semantic Mapping

  • Cristina Romero-González (Email author)
  • Jesus Martínez-Gómez
  • Ismael García-Varea
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1093)

Abstract

A 3D semantic map can be defined as a grid-based representation of the environment, where each bin stores a probability distribution over the possible elements to be found in it. This probability distribution can be obtained with any state-of-the-art image classifier, while the 3D position depends on the localization accuracy of the robot, the sensitivity of its RGB-D sensor, and the segmentation of the input image. In this paper, we focus on the last of these factors and explore different options for image segmentation that might improve 3D maps. We compare various approaches that use 2D and 3D information to find relevant clusters, and we evaluate their suitability for real-time applications.
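The grid-based definition above can be illustrated with a minimal sketch. It assumes a sparse voxel grid whose bins start from a uniform distribution and are updated by multiplying in each new classifier output and renormalizing. The class name `SemanticGrid`, the voxel size, the labels, and the fusion rule are all illustrative assumptions, not the method described in the paper.

```python
from collections import defaultdict
from math import floor


class SemanticGrid:
    """Sparse voxel grid; each bin holds a probability distribution over labels."""

    def __init__(self, labels, voxel_size=0.1):
        self.labels = list(labels)
        self.voxel_size = voxel_size  # bin edge length in metres (assumed value)
        uniform = 1.0 / len(self.labels)
        # Assumption: bins that were never observed start from a uniform prior.
        self.bins = defaultdict(lambda: [uniform] * len(self.labels))

    def voxel_index(self, point):
        """Map a 3D point (x, y, z) in metres to integer bin coordinates."""
        return tuple(floor(c / self.voxel_size) for c in point)

    def update(self, point, class_probs):
        """Fuse one classifier output into the bin that contains `point`."""
        key = self.voxel_index(point)
        # Assumption: product of distributions followed by renormalization.
        fused = [p * q for p, q in zip(self.bins[key], class_probs)]
        total = sum(fused)
        if total > 0:
            self.bins[key] = [f / total for f in fused]
        return key

    def most_likely(self, point):
        """Return the label with the highest probability at `point`."""
        probs = self.bins[self.voxel_index(point)]
        return self.labels[probs.index(max(probs))]


grid = SemanticGrid(["chair", "table", "wall"], voxel_size=0.1)
# Two observations whose 3D positions fall into the same 10 cm bin.
grid.update((0.52, 1.31, 0.07), [0.7, 0.2, 0.1])
grid.update((0.55, 1.38, 0.02), [0.6, 0.3, 0.1])
print(grid.most_likely((0.53, 1.35, 0.05)))  # prints "chair"
```

In this sketch, the segmentation step would decide which 3D points receive a given classifier output, which is why its quality affects where the probability mass ends up in the map.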

Keywords

Image segmentation, Object detection, Image classification, Robot vision, Deep learning

Notes

Acknowledgments

This work has been partially sponsored by the Regional Council of Education, Culture and Sports of Castilla-La Mancha under grant number SBPLY/17/180501/000493, supported with Feder funds.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Cristina Romero-González (1, Email author)
  • Jesus Martínez-Gómez (1)
  • Ismael García-Varea (1)
  1. University of Castilla-La Mancha, Albacete, Spain
