Boosting LiDAR-Based Semantic Labeling by Cross-modal Training Data Generation

  • Florian Piewak
  • Peter Pinggera
  • Manuel Schäfer
  • David Peter
  • Beate Schwarz
  • Nick Schneider
  • Markus Enzweiler
  • David Pfeiffer
  • Marius Zöllner
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11134)

Abstract

Mobile robots and autonomous vehicles rely on multi-modal sensor setups to perceive and understand their surroundings. Aside from cameras, LiDAR sensors represent a central component of state-of-the-art perception systems. In addition to accurate spatial perception, a comprehensive semantic understanding of the environment is essential for efficient and safe operation. In this paper, we present a novel deep neural network architecture called LiLaNet for point-wise, multi-class semantic labeling of semi-dense LiDAR data. The network utilizes virtual image projections of the 3D point clouds for efficient inference. Further, we propose an automated process for large-scale cross-modal training data generation called Autolabeling, in order to boost semantic labeling performance while keeping the manual annotation effort low. The effectiveness of the proposed network architecture as well as the automated data generation process is demonstrated on a manually annotated ground truth dataset. LiLaNet is shown to significantly outperform current state-of-the-art CNN architectures for LiDAR data. Applying our automatically generated large-scale training data yields a boost of up to 14 percentage points compared to networks trained on manually annotated data only.
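
As an illustration of what a virtual image projection of a 3D point cloud can look like, the following is a minimal sketch and not the authors' implementation. It assumes a spherical projection in which image rows index the elevation angle and columns index the azimuth angle. The function name `project_to_range_image`, the image size, and the field-of-view values are hypothetical choices made for this example only; the abstract does not specify the projection model or sensor parameters.

```python
import math


def project_to_range_image(points, height=4, width=8,
                           fov_up_deg=15.0, fov_down_deg=-15.0):
    """Project 3D points (x, y, z) onto a height x width grid of ranges.

    Assumed convention (hypothetical, for illustration only):
    rows index elevation (row 0 = highest angle), columns index azimuth
    (the forward direction +x falls in the middle column).
    Empty cells hold None; if several points hit one cell, the closest wins.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down
    image = [[None] * width for _ in range(height)]

    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # a point at the sensor origin has no defined direction
        azimuth = math.atan2(y, x)      # in [-pi, pi]
        elevation = math.asin(z / r)    # in [-pi/2, pi/2]
        if not (fov_down <= elevation <= fov_up):
            continue  # outside the assumed vertical field of view

        # Map the angles to integer pixel coordinates.
        col = int(0.5 * (1.0 - azimuth / math.pi) * width) % width
        row = min(int((fov_up - elevation) / fov * height), height - 1)

        # Keep the nearest return per pixel.
        if image[row][col] is None or r < image[row][col]:
            image[row][col] = r
    return image


if __name__ == "__main__":
    cloud = [(10.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 3.0, 0.0), (0.0, 0.0, 5.0)]
    for image_row in project_to_range_image(cloud):
        print(image_row)
```

A dense 2D grid of this kind is what allows standard image-based convolutional networks to be applied to LiDAR data. In a labeling setting, each pixel would additionally store the index of the point it came from, so that per-pixel predictions can be mapped back to individual 3D points.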

Keywords

Semantic point cloud labeling; Semantic segmentation; Semantic scene understanding; Automated training data generation; Automated label transfer

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Daimler AG, R&D, Stuttgart, Germany
  2. Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
  3. Forschungszentrum Informatik (FZI), Karlsruhe, Germany