
Testing the Safety of Self-driving Vehicles by Simulating Perception and Prediction

  • Conference paper
Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12371)


Abstract

We present a novel method for testing the safety of self-driving vehicles in simulation. Rather than relying on sensor simulation, which is expensive and suffers from large domain gaps, we directly simulate the outputs of the self-driving vehicle’s perception and prediction system, enabling realistic testing of motion planning. Specifically, we use paired data, in the form of ground truth labels and real perception and prediction outputs, to train a model that predicts what the online system will produce. Importantly, the inputs to our system consist of high-definition maps, bounding boxes, and trajectories, which a test engineer can sketch in a matter of minutes. This makes our approach a far more scalable solution. Quantitative results on two large-scale datasets demonstrate that we can realistically test motion planning using our simulations.

K. Wong and Q. Zhang contributed equally. Work done during Qiang Zhang’s internship at Uber ATG.


Notes

  1. We use the terms prediction and motion forecasting interchangeably.

  2. Actors’ future orientations are approximated from their predicted waypoints using finite differences, and their bounding box sizes remain constant over time.

  3. True positive, false positive, and false negative detections are determined by IoU, following the detection AP metric. In our experiments, we use an IoU threshold of 0.5 for cars and vehicles and 0.3 for pedestrians and bicyclists.

  4. Our representation uses bounding boxes and trajectories. Most self-driving datasets provide these as ground truth labels for the standard perception and prediction task. For perception and prediction simulation, we use these labels as inputs instead.

  5. As of nuScenes map v1.0.

  6. Note that ACC always uses the same driving behavior.
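The finite-difference heading approximation described in note 2 can be sketched as follows. This is an illustrative reconstruction, not the paper’s code: the function name and the 2D `(x, y)` waypoint format are assumptions.

```python
import math

def headings_from_waypoints(waypoints):
    """Approximate per-step headings (radians) from a list of (x, y)
    waypoints using forward finite differences. The final waypoint
    reuses the last computed heading so the output matches the input
    in length."""
    if len(waypoints) < 2:
        raise ValueError("need at least two waypoints")
    headings = []
    for (x0, y0), (x1, y1) in zip(waypoints, waypoints[1:]):
        # Heading of the segment from this waypoint to the next one.
        headings.append(math.atan2(y1 - y0, x1 - x0))
    headings.append(headings[-1])  # pad the final step
    return headings
```

For example, a trajectory that moves east and then turns north yields headings of 0 and π/2.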
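The IoU-based matching in note 3 can be illustrated with axis-aligned boxes. The paper’s detections are bird’s-eye-view boxes that may be rotated, so this simplified sketch is illustrative only; the `iou` helper and the `(x_min, y_min, x_max, y_max)` box format are assumptions.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes, each given
    as (x_min, y_min, x_max, y_max)."""
    ix0 = max(box_a[0], box_b[0])
    iy0 = max(box_a[1], box_b[1])
    ix1 = min(box_a[2], box_b[2])
    iy1 = min(box_a[3], box_b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)

    def area(b):
        return (b[2] - b[0]) * (b[3] - b[1])

    union = area(box_a) + area(box_b) - inter
    return inter / union if union > 0 else 0.0

def is_true_positive(detection, gt, threshold=0.5):
    """A detection counts as a true positive when its IoU with a
    ground-truth box meets the class-specific threshold
    (0.5 for vehicles, 0.3 for pedestrians and bicyclists)."""
    return iou(detection, gt) >= threshold
```

Two unit squares offset diagonally by one unit, for instance, overlap in a 1×1 region out of a union of 7, giving an IoU of 1/7 and hence a false positive at the 0.5 threshold.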


Author information


Corresponding author

Correspondence to Kelvin Wong.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 727 KB)


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Wong, K. et al. (2020). Testing the Safety of Self-driving Vehicles by Simulating Perception and Prediction. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12371. Springer, Cham. https://doi.org/10.1007/978-3-030-58574-7_19


  • DOI: https://doi.org/10.1007/978-3-030-58574-7_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58573-0

  • Online ISBN: 978-3-030-58574-7

  • eBook Packages: Computer Science (R0)
