Control and Safety of Autonomous Vehicles with Learning-Enabled Components

  • Somil Bansal
  • Claire J. Tomlin
Part of the Unmanned System Technologies book series (UST)


Real-world autonomous systems, such as autonomous vehicles, often operate in uncertain and partially observable environments. In such scenarios, designing a controller that achieves the desired system behavior is a challenging problem. The proven efficacy of learning-based control schemes strongly motivates their application to autonomous vehicles. However, guaranteeing that learning-based schemes operate correctly during and after the learning process remains an unresolved problem, and one of vital importance in safety-critical systems such as autonomous vehicles. Hamilton-Jacobi (HJ) reachability analysis is an important formal verification method for guaranteeing performance and safety properties of dynamical systems; it has been applied to many small-scale systems in the past decade. Its advantages include compatibility with general nonlinear system dynamics, formal treatment of bounded disturbances, and the availability of well-developed numerical tools.

In this chapter, we provide a brief overview of the challenges associated with system verification when learning is involved in the control loop, some recent attempts to address these challenges based on HJ reachability, and the open questions that remain to be answered.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, USA
