Active Perception Using Light Curtains for Autonomous Driving

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12350)

Abstract

Most real-world 3D sensors such as LiDARs perform fixed scans of the entire environment, while being decoupled from the recognition system that processes the sensor data. In this work, we propose a method for 3D object recognition using light curtains, a resource-efficient controllable sensor that measures depth at user-specified locations in the environment. Crucially, we propose using the prediction uncertainty of a deep-learning-based 3D point cloud detector to guide active perception. Given a neural network’s uncertainty, we develop a novel optimization algorithm to optimally place light curtains to maximize coverage of uncertain regions. Efficient optimization is achieved by encoding the physical constraints of the device into a constraint graph, which is optimized with dynamic programming. We show how a 3D detector can be trained to detect objects in a scene by sequentially placing uncertainty-guided light curtains to successively improve detection accuracy. Links to code can be found on the project webpage.
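The placement step described above (maximize coverage of uncertain regions subject to the device's physical constraints, solved with dynamic programming) can be illustrated with a minimal sketch. The function name, the discretization into per-ray depth candidates, and the simple maximum-depth-step constraint between neighbouring rays are assumptions for illustration only; the paper encodes the actual device constraints in a constraint graph, which this toy constraint merely stands in for.

```python
import numpy as np

def place_light_curtain(uncertainty, depth_candidates, max_depth_step):
    """Hypothetical DP sketch: choose one depth per camera ray (column) so that
    the total detector uncertainty covered is maximized, subject to a limit on
    how much the curtain depth may change between adjacent rays.

    uncertainty:      (num_rays, num_depths) array of detector uncertainty scores
    depth_candidates: (num_depths,) array of candidate depths shared by all rays
    max_depth_step:   max allowed |depth change| between adjacent rays
                      (a stand-in for the device's physical constraint graph)
    """
    num_rays, num_depths = uncertainty.shape
    # best[i, j]: best total uncertainty of a curtain ending at depth j on ray i
    best = np.full((num_rays, num_depths), -np.inf)
    parent = np.zeros((num_rays, num_depths), dtype=int)
    best[0] = uncertainty[0]

    for i in range(1, num_rays):
        for j in range(num_depths):
            # feasible predecessors: depths reachable under the physical constraint
            feasible = np.abs(depth_candidates - depth_candidates[j]) <= max_depth_step
            prev_scores = np.where(feasible, best[i - 1], -np.inf)
            parent[i, j] = int(np.argmax(prev_scores))
            best[i, j] = prev_scores[parent[i, j]] + uncertainty[i, j]

    # backtrack the optimal depth profile, one target depth per camera ray
    profile = np.zeros(num_rays, dtype=int)
    profile[-1] = int(np.argmax(best[-1]))
    for i in range(num_rays - 1, 0, -1):
        profile[i - 1] = parent[i, profile[i]]
    return depth_candidates[profile]
```

In the paper's closed-loop setting, the uncertainty map would come from the detector's confidence on the current point cloud, the returned profile would be imaged by the light curtain, and the new measurements would be fed back to the detector before placing the next curtain.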

Keywords

Active vision · Robotics · Autonomous driving · 3D vision

Notes

Acknowledgements

We thank Matthew O’Toole for feedback on the initial draft of this paper. This material is based upon work supported by the National Science Foundation under Grants No. IIS-1849154, IIS-1900821 and by the United States Air Force and DARPA under Contract No. FA8750-18-C-0092.

Supplementary material

Supplementary material 1 (PDF, 391 KB)

Supplementary material 2 (MP4, 38,883 KB)


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

Carnegie Mellon University, Pittsburgh, USA
