
Testing the Safety of Self-driving Vehicles by Simulating Perception and Prediction

  • Conference paper
Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12371)


Abstract

We present a novel method for testing the safety of self-driving vehicles in simulation. Rather than relying on sensor simulation, which is expensive and suffers from large domain gaps, we directly simulate the outputs of the self-driving vehicle’s perception and prediction system, enabling realistic testing of motion planning. Specifically, we use paired data, in the form of ground truth labels and real perception and prediction outputs, to train a model that predicts what the online system will produce. Importantly, the inputs to our system consist of high-definition maps, bounding boxes, and trajectories, which a test engineer can sketch in a matter of minutes. This makes our approach a far more scalable solution. Quantitative results on two large-scale datasets demonstrate that we can realistically test motion planning using our simulations.

K. Wong and Q. Zhang contributed equally. Work done during Qiang Zhang’s internship at Uber ATG.


Notes

  1. We use the terms prediction and motion forecasting interchangeably.

  2. Actors’ future orientations are approximated from their predicted waypoints using finite differences, and their bounding box sizes remain constant over time.

  3. True positive, false positive, and false negative detections are determined by IoU, following the detection AP metric. In our experiments, we use an IoU threshold of 0.5 for cars and vehicles and 0.3 for pedestrians and bicyclists.

  4. Our representation uses bounding boxes and trajectories. Most self-driving datasets provide these as ground truth labels for the standard perception and prediction task. For perception and prediction simulation, we use these labels as inputs instead.

  5. As of nuScenes map v1.0.

  6. Note that ACC always uses the same driving behavior.
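The finite-difference heading approximation described in note 2 can be sketched as follows. This is an illustrative reconstruction, not the paper’s code: the function name and the 2D `(x, y)` waypoint format are assumptions.

```python
import math

def headings_from_waypoints(waypoints):
    """Approximate per-step headings (radians) from a list of (x, y)
    waypoints using forward finite differences. The final waypoint
    reuses the last computed heading so the output matches the input
    in length."""
    if len(waypoints) < 2:
        raise ValueError("need at least two waypoints")
    headings = []
    for (x0, y0), (x1, y1) in zip(waypoints, waypoints[1:]):
        # Heading of the segment from this waypoint to the next one.
        headings.append(math.atan2(y1 - y0, x1 - x0))
    headings.append(headings[-1])  # pad the final step
    return headings
```

For example, a trajectory that moves east and then turns north yields headings of 0 and π/2.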
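The IoU-based matching in note 3 can be illustrated with axis-aligned boxes. The paper’s detections are bird’s-eye-view boxes that may be rotated, so this simplified sketch is illustrative only; the `iou` helper and the `(x_min, y_min, x_max, y_max)` box format are assumptions.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes, each given
    as (x_min, y_min, x_max, y_max)."""
    ix0 = max(box_a[0], box_b[0])
    iy0 = max(box_a[1], box_b[1])
    ix1 = min(box_a[2], box_b[2])
    iy1 = min(box_a[3], box_b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)

    def area(b):
        return (b[2] - b[0]) * (b[3] - b[1])

    union = area(box_a) + area(box_b) - inter
    return inter / union if union > 0 else 0.0

def is_true_positive(detection, gt, threshold=0.5):
    """A detection counts as a true positive when its IoU with a
    ground-truth box meets the class-specific threshold
    (0.5 for vehicles, 0.3 for pedestrians and bicyclists)."""
    return iou(detection, gt) >= threshold
```

Two unit squares offset diagonally by one unit, for instance, overlap in a 1×1 region out of a union of 7, giving an IoU of 1/7 and hence a false positive at the 0.5 threshold.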


Author information


Corresponding author

Correspondence to Kelvin Wong.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 727 KB)


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Wong, K. et al. (2020). Testing the Safety of Self-driving Vehicles by Simulating Perception and Prediction. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12371. Springer, Cham. https://doi.org/10.1007/978-3-030-58574-7_19


  • DOI: https://doi.org/10.1007/978-3-030-58574-7_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58573-0

  • Online ISBN: 978-3-030-58574-7

  • eBook Packages: Computer Science (R0)
