A Hybrid Dynamical Systems Perspective on Reinforcement Learning for Cyber-Physical Systems: Vistas, Open Problems, and Challenges

Handbook of Reinforcement Learning and Control

Part of the book series: Studies in Systems, Decision and Control (SSDC, volume 325)

Abstract

The next generation of autonomous cyber-physical systems will integrate a variety of heterogeneous computation, communication, and control algorithms. This integration will lead to closed-loop systems with highly intertwined interactions between the digital world and the physical world. For these systems, designing robust and optimal data-driven control algorithms necessitates fundamental breakthroughs at the intersection of different areas such as adaptive and learning-based control, optimal and robust control, hybrid dynamical systems theory, and network control, to name just a few. Motivated by this necessity, control techniques inspired by ideas of reinforcement learning have emerged as a promising paradigm that could potentially integrate most of the key desirable features. However, while significant results in reinforcement learning have been developed during the last decades, the literature is still missing a systematic framework for the design and analysis of reinforcement learning-based controllers that can safely and systematically integrate the intrinsic continuous-time and discrete-time dynamics that emerge in cyber-physical systems. Motivated by this limitation, and by recent theoretical frameworks developed for the analysis of hybrid systems, in this chapter we explore some vistas and open problems that could potentially be addressed by merging tools from reinforcement learning and hybrid dynamical systems theory, and which could have significant implications for the development of the next generation of autonomous cyber-physical systems.

Notes

  1. A solution is said to be Zeno if it has an infinite number of jumps in a finite interval of time.

  2. A sequence of sets converges if its inner limit and its outer limit exist and are equal. See [38, Definition 5.1].

  3. A function \(\beta :\mathbb {R}_{\ge 0}\times \mathbb {R}_{\ge 0}\rightarrow \mathbb {R}_{\ge 0}\) is of class \(\mathcal {K}\mathcal {L}\) if it is nondecreasing in its first argument, nonincreasing in its second argument, \(\lim _{r\rightarrow 0^+}\beta (r,s)=0\) for each \(s\in \mathbb {R}_{\ge 0}\), and \(\lim _{s\rightarrow \infty }\beta (r,s)=0\) for each \(r\in \mathbb {R}_{\ge 0}\).
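
As a small illustration of the class \(\mathcal {K}\mathcal {L}\) definition in Note 3 (an example added here for concreteness, not taken from the chapter), the function \(\beta (r,s)=re^{-s}\) can be checked against each of the four listed properties:

```latex
\begin{align*}
\beta(r,s) &= r\,e^{-s}, && r,s\in\mathbb{R}_{\ge 0},\\
\frac{\partial\beta}{\partial r}(r,s) &= e^{-s} > 0 && \text{(nondecreasing in the first argument)},\\
\frac{\partial\beta}{\partial s}(r,s) &= -r\,e^{-s} \le 0 && \text{(nonincreasing in the second argument)},\\
\lim_{r\to 0^+} r\,e^{-s} &= 0 && \text{for each } s\in\mathbb{R}_{\ge 0},\\
\lim_{s\to\infty} r\,e^{-s} &= 0 && \text{for each } r\in\mathbb{R}_{\ge 0}.
\end{align*}
```

This particular choice is only one convenient member of the class; it corresponds to the familiar case of a bound that decays exponentially in the second argument.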

References

  1. Lamnabhi-Lagarrigue, F., Annaswamy, A., Engel, S., Isaksson, A., Khargonekar, P., Murray, R., Nijmeijer, H., Samad, T., Tilbury, D., den Hof, P.V.: Systems & control for the future of humanity, research agenda: Current and future roles, impact and grand challenges. Ann. Rev. Control 43, 1–64 (2017)

  2. Xue, M., Wang, W., Roy, S.: Security concepts for the dynamics of autonomous vehicle networks. Automatica 50(3), 852–857 (2014)

  3. Ocampo-Martinez, C., Puig, V., Cembrano, G., Quevedo, J.: Application of predictive control strategies to the management of complex networks in the urban water cycle. IEEE Control Syst. Mag. 33(1), 15–41 (2013)

  4. Nie, Y., Wang, X., Cheng, K.: Multi-area self-adaptive pricing control in smart city with EV user participation. IEEE Trans. Intell. Transport. Syst. 99, 1–9 (2017)

  5. Pepyne, D.L., Cassandras, C.G.: Control of hybrid systems in manufacturing. Proceed. IEEE 88(7), 1108–1122 (2000)

  6. Allgöwer, F., Borges de Sousa, J., Kapinski, Mosterman, P., Oehlerking, J., Panciatici, P., Prandini, M., Rajhans, A., Tabuada, P., Wenzelburger, P.: Position paper on the challenges posed by modern applications to cyber-physical systems theory. Nonlinear Anal.: Hybrid Syst. 34, 147–165 (2019). https://doi.org/10.1016/j.nahs.2019.05.007

  7. Passino, K.: Biomimicry for Optimization, Control, and Automation. Springer, Berlin (2016)

  8. Alur, R., Forejt, V., Moarref, S., Trivedi, A.: Safe schedulability of bounded-rate multi-mode systems. In: Proceedings of the 16th International Conference on Hybrid Systems: Computation and Control, pp. 243–252 (2013)

  9. Hou, Z., Wang, Z.: From model-based control to data-driven control: survey, classification and perspective. Inf. Sci. 235(20), 3–35 (2013)

  10. Tao, G.: Multivariable adaptive control: a survey. Automatica 50, 2737–2764 (2014)

  11. Kim, J.W., Park, B.J., Yoo, H., Lee, J.H., Lee, J.M.: Deep reinforcement learning based finite-horizon optimal tracking control for nonlinear system. IFAC-PapersOnLine 51(25), 257–262 (2018). https://doi.org/10.1016/j.ifacol.2018.11.115

  12. Ravanbakhsh, H., Sankaranarayanan, S.: Learning control Lyapunov functions from counterexamples and demonstrations. Auton. Robots 43(2), 275–307 (2019)

  13. Bertsekas, D.: Reinforcement Learning and Optimal Control. Athena Scientific, Nashua (2019)

  14. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)

  15. Vrabie, D., Vamvoudakis, K.G., Lewis, F.L.: Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles. IET Press (2012)

  16. Lewis, F.L., Liu, D.: Reinforcement Learning and Approximate Dynamic Programming for Feedback Control. Computational Intelligence Series. John Wiley/IEEE Press, Hoboken (2012)

  17. Kiumarsi, B., Vamvoudakis, K.: Optimal and autonomous control using reinforcement learning: a survey. IEEE Trans. Neural Netw. Learn. Syst. 29(6), 2042–2062 (2018)

  18. Recht, B.: A tour of reinforcement learning: the view from continuous control. Ann. Rev. Control Robot. Auton. Syst. 2, 253–279 (2019)

  19. Görges, D.: Relations between model predictive control and reinforcement learning. IFAC-PapersOnLine 50(1), 4920–4928 (2017). https://doi.org/10.1016/j.ifacol.2017.08.747

  20. Lee, D., Hu, J.: Primal-dual Q-learning framework for LQR design. IEEE Trans. Autom. Control 64(9), 3756–3763 (2018)

  21. Vamvoudakis, K.G., Lewis, F.L.: Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem. Automatica 46(5), 878–888 (2010)

  22. Kamalapurkar, R., Klotz, J.R., Dixon, W.E.: Concurrent learning-based approximate feedback-Nash equilibrium solution of n-player nonzero-sum differential games. IEEE/CAA J. Autom. Sinica 1, 239–247 (2014)

  23. Kamalapurkar, R., Rosenfeld, J.A., Dixon, W.E.: Efficient model-based reinforcement learning for approximate online optimal control. Automatica 74, 247–258 (2016). https://doi.org/10.1016/j.automatica.2016.08.004

  24. Wang, Y., Velswamy, K., Huang, B.: A novel approach to feedback control with deep reinforcement learning. IFAC-PapersOnLine 51(18), 31–36 (2018). https://doi.org/10.1016/j.ifacol.2018.09.241

  25. Kamalapurkar, R., Walters, P., Rosenfeld, J., Dixon, W.: Reinforcement Learning for Optimal Feedback Control: A Lyapunov Based Approach. Springer, Berlin (2018)

  26. Branicky, M.S., Borkar, V.S., Mitter, S.K.: A unified framework for hybrid control: model and optimal control theory. IEEE Trans. Autom. Control 43(1), 31–45 (1998)

  27. Bensoussan, A., Menaldi, J.L.: Hybrid control and dynamic programming. Dyn. Contin. Discrete Impulsive Syst. 3(4), 395–442 (1997)

  28. Shaikh, M.S., Caines, P.E.: On the hybrid optimal control problem: theory and algorithms. IEEE Trans. Autom. Control 52(9), 1587–1603 (2007)

  29. Cassandras, C.G., Pepyne, D.L., Wardi, Y.: Optimal control of a class of hybrid systems. IEEE Trans. Autom. Control 46(3), 398–415 (2001)

  30. Pakniyat, A.: Optimal control of deterministic and stochastic hybrid systems: theory and applications. Ph.D. Dissertation, McGill University (2016)

  31. Forte, F., Marconi, L., Teel, A.R.: Robust nonlinear regulation: continuous-time internal models and hybrid identifiers. IEEE Trans. Autom. Control 62(7), 3136–3151 (2017)

  32. Poveda, J.I., Krstić, M.: Fixed-time gradient-based extremum seeking. Amer. Control Conf. 2838–2843 (2020)

  33. Poveda, J.I., Teel, A.R.: A framework for a class of hybrid extremum seeking controllers with dynamic inclusions. Automatica 76 (2017)

  34. Kutadinata, R.J., Moase, W., Manzie, C.: Extremum-seeking in singularly perturbed hybrid systems. IEEE Trans. Autom. Control 62(6), 3014–3020 (2017)

  35. Poveda, J.I., Kutadinata, R., Manzie, C., Nešić, D., Teel, A.R., Liao, C.: Hybrid extremum seeking for black-box optimization in hybrid plants: an analytical framework. In: 57th IEEE Conference on Decision and Control, pp. 2235–2240 (2018)

  36. Owens, D.H.: Iterative Learning Control: An Optimization Paradigm. Springer, London (2015)

  37. Poveda, J.I., Benosman, M., Teel, A.R.: Hybrid online learning control in networked multiagent systems: a survey. Int. J. Adapt. Control Signal Process. 33(2), 228–261 (2019)

  38. Goebel, R., Sanfelice, R.G., Teel, A.R.: Hybrid dynamical systems: modeling, stability, and robustness. Princeton University Press, Princeton (2012)

  39. Poveda, J.I., Li, N.: Robust hybrid zero-order optimization algorithms with acceleration via averaging in continuous time. Automatica 123, 109361 (2021)

  40. Teel, A.R., Poveda, J.I., Le, J.: First-order optimization algorithms with resets and Hamiltonian flows. In: 58th IEEE Conference on Decision and Control, pp. 5838–5843 (2019)

  41. Poveda, J.I., Teel, A.R.: A robust event-triggered approach for fast sampled-data extremization and learning. IEEE Trans. Autom. Control 62(10) (2017)

  42. Liu, J., Teel, A.R.: Lyapunov-based sufficient conditions for stability of hybrid systems with memory. IEEE Trans. Autom. Control 61(4), 1057–1062 (2016)

  43. Mayhew, C.G.: Hybrid control for topologically constrained systems. Ph.D. Dissertation, University of California, Santa Barbara (2010)

  44. Sanfelice, R.G., Messina, M.J., Tuna, S.E., Teel, A.R.: Robust hybrid controllers for continuous-time systems with applications to obstacle avoidance and regulation to disconnected set of points. In: Proceedings of American Control Conference, pp. 3352–3357 (2006)

  45. Poveda, J.I., Benosman, M., Sanfelice, R.G., Teel, A.R.: A hybrid adaptive feedback law for robust obstacle avoidance and coordination in multiple vehicle systems. In: Proceedings of American Control Conference, pp. 616–621 (2018)

  46. Strizic, T., Poveda, J.I., Teel, A.R.: Hybrid gradient descent for robust global optimization on the circle. In: 56th IEEE Conference on Decision and Control, pp. 2985–2990 (2017)

  47. Hespanha, J.P., Morse, A.S.: Stabilization of switched systems with average dwell-time. In: 38th IEEE Conference on Decision and Control, pp. 2655–2660 (1999)

  48. Vidyasagar, M.: Nonlinear Systems Analysis. Prentice Hall, Upper Saddle River (1993)

  49. Jakubczyk, B., Sontag, E.D.: Controllability of nonlinear discrete-time systems: a Lie-algebraic approach. SIAM J. Control Opt. 28(1), 1–33 (1990)

  50. Subbaraman, A., Teel, A.R.: On the equivalence between global recurrence and the existence of a smooth Lyapunov function for hybrid systems. Syst. & Control Lett. 88, 54–61 (2016)

  51. Teel, A.R., Subbaraman, A., Sferlazza, A.: Stability analysis for stochastic hybrid systems: a survey. Automatica 50(10), 2435–2456 (2014)

  52. Vamvoudakis, K.G., Fantini-Miranda, M., Hespanha, J.P.: Asymptotically stable adaptive-optimal control algorithm with saturating actuators and relaxed persistence of excitation. IEEE Trans. Neural Netw. Learn. Syst. 27(11), 2386–2398 (2016)

  53. Barto, A.G., Sutton, R.S., Anderson, C.W.: Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Trans. Syst. Man Cybern. SMC-13(5), 834–846 (1983)

  54. Lewis, F., Syrmos, V.: Optimal Control. Wiley, Boston (1995)

  55. Hespanha, J.P.: Linear Systems Theory. Princeton University Press, Princeton (2009)

  56. Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical Recipes: The Art of Scientific Computing, 3rd edn. Cambridge University Press, Cambridge (2007)

  57. Bhaya, A., Kaszkurewicz, E.: Control perspectives on numerical algorithms and matrix problems. SIAM (2006)

  58. Nocedal, J., Wright, S.: Numerical Optimization. Springer, Berlin (1999)

  59. Saridis, G.N., Lee, C.S.: An approximation theory of optimal control for trainable manipulators. IEEE Trans. Syst. Man Cybern. 9(3), 152–159 (1979)

  60. Hornik, K., Stinchcombe, M., White, H.: Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks. Neural Netw. 3, 551–560 (1990)

  61. Sastry, S., Bodson, M.: Adaptive Control: Stability, Convergence, and Robustness. Prentice-Hall, Englewood Cliffs (1989)

  62. Ioannou, P.A., Sun, J.: Robust Adaptive Control. Dover Publications Inc., Mineola (2012)

  63. Prokhorov, D.V., Wunsch, D.C.: Adaptive critic designs. IEEE Trans. Neural Netw. 8(5), 997–1007 (1997)

  64. Abouheaf, M.I., Lewis, F.L., Vamvoudakis, K.G., Haesaert, S., Babuska, R.: Multi-agent discrete-time graphical games and reinforcement learning solutions. Automatica 50(12), 3038–3053 (2014)

  65. Yang, Q., Vance, J.B., Jagannathan, S.: Control of nonaffine nonlinear discrete-time systems using reinforcement learning-based linearly parameterized neural networks. IEEE Trans. Syst. Man Cybern. Part B Cybern. 38(4), 994–1001 (2008)

  66. Yang, X., Liu, D., Wang, D., Wei, Q.: Discrete-time online learning control for a class of unknown nonaffine nonlinear systems using reinforcement learning. Neural Netw. 55, 30–41 (2014). https://doi.org/10.1016/j.neunet.2014.03.008

  67. Yang, Q., Jagannathan, S.: Reinforcement learning controller design for affine nonlinear discrete-time systems using online approximators. IEEE Trans. Syst. Man Cybern. Part B: Cybern. 42(2), 377–390 (2012)

  68. Vamvoudakis, K., Lewis, F.L.: Online solution of nonlinear two-player zero-sum games using synchronous policy iteration. Int. J. Robust Nonlinear Control 22, 1460–1483 (2011)

  69. Vamvoudakis, K.G., Lewis, F.L.: Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations. Automatica 47(8), 1556–1569 (2011). https://doi.org/10.1016/j.automatica.2011.03.005

  70. Kanellopoulos, A., Vamvoudakis, K.G.: Non-equilibrium dynamic games and cyber-physical security: a cognitive hierarchy approach. Syst. Control Lett. 125, 59–66 (2019). https://doi.org/10.1016/j.sysconle.2019.01.008

  71. Vamvoudakis, K.G.: Model-Free Learning of Nash Games With Applications to Network Security. Elsevier Inc., Amsterdam (2016). https://doi.org/10.1016/B978-0-12-805246-4.00010-0

  72. Vamvoudakis, K.G., Ferraz, H.: Model-free event-triggered control algorithm for continuous-time linear systems with optimal performance. Automatica 87, 412–420 (2018). https://doi.org/10.1016/j.automatica.2017.03.013

  73. Chen, C., Modares, H., Xie, K., Lewis, F.L., Wan, Y., Xie, S.: Reinforcement learning-based adaptive optimal exponential tracking control of linear systems with unknown dynamics. IEEE Trans. Autom. Control 64(11), 4423–4438 (2019)

  74. Vamvoudakis, K.G.: Event-triggered optimal adaptive control algorithm for continuous-time nonlinear systems. IEEE/CAA J. Autom. Sinica 1(3), 282–293 (2014)

  75. Su, H., Zhang, H., Sun, S., Cai, Y.: Integral reinforcement learning-based online adaptive event-triggered control for non-zero-sum games of partially unknown nonlinear systems. Neurocomputing (2019). https://doi.org/10.1016/j.neucom.2019.09.088

  76. Prieur, C., Teel, A.R.: Uniting local and global output feedback controllers. IEEE Trans. Autom. Control 56(7), 1636–1649 (2011)

  77. Prieur, C., Goebel, R., Teel, A.R.: Hybrid feedback control and robust stabilization of nonlinear systems. IEEE Trans. Autom. Control 52(11), 2103–2117 (2007)

  78. Mayhew, C.G., Sanfelice, R.G., Sheng, J., Arcak, M., Teel, A.R.: Quaternion-based hybrid feedback for robust global attitude synchronization. IEEE Trans. Autom. Control 57(8), 2122–2127 (2012)

  79. Nešić, D., Teel, A.R., Zaccarian, L.: Stability and performance of SISO control systems with first-order reset elements. IEEE Trans. Autom. Control 56(11), 2567–2582 (2011)

  80. Nešić, D., Teel, A.R., Kokotović, P.V.: Sufficient conditions for stabilization of sampled-data nonlinear systems via discrete-time approximations. Syst. & Control Lett. 38, 259–270 (1999)

  81. Nešić, D., Teel, A.R.: A framework for stabilization of nonlinear sampled-data systems based on their approximate discrete-time models. IEEE Trans. Autom. Control 49(7), 1103–1121 (2004)

  82. Nešić, D., Teel, A.R., Carnevale, D.: Explicit computation of the sampling period in emulation of controllers for nonlinear sampled-data systems. IEEE Trans. Autom. Control 54(3), 619–624 (2009)

  83. Khong, S.Z., Nešić, D., Tan, Y., Manzie, C.: Unified framework for sampled-data extremum seeking control: global optimisation and multi-unit systems. Automatica 49, 2720–2733 (2013)

  84. Chien, C.: A sampled-data iterative learning control using fuzzy network design. Int. J. Control 73, 902–913 (2000)

  85. Bai, E.W., Fu, L.C., Sastry, S.S.: Averaging analysis for discrete time and sampled data adaptive systems. IEEE Trans. Circuits Syst. 35(2) (1988)

  86. Chen, T., Francis, B.: Optimal Sampled-Data Control Systems. Springer, Berlin (1995)

  87. Vrabie, D., Lewis, F.: Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems. Neural Netw. 22, 237–246 (2009)

  88. Tabuada, P.: Event-triggered real-time scheduling of stabilizing control tasks. IEEE Trans. Autom. Control 52, 1680–1685 (2007)

  89. Postoyan, R., Tabuada, P., Nešić, D., Anta, A.: A framework for the event-triggered stabilization of nonlinear systems. IEEE Trans. Autom. Control 60(4), 982–996 (2015)

  90. Heemels, W.P.M.H., Donkers, M.C.F., Teel, A.R.: Periodic event-triggered control for linear systems. IEEE Trans. Autom. Control 58, 847–861 (2013)

  91. Narayanan, V., Jagannathan, S.: Event-triggered distributed control of nonlinear interconnected systems using online reinforcement learning with exploration. IEEE Trans. Cybern. 48(9), 2510–2519 (2018)

  92. Poveda, J.I., Teel, A.R.: Hybrid mechanisms for robust synchronization and coordination of multi-agent networked sampled-data systems. Automatica 99, 41–53 (2019)

  93. Persis, C.D., Postoyan, R.: A Lyapunov redesign of coordination algorithms for cyber-physical systems. IEEE Trans. Autom. Control 62(2), 808–823 (2017)

  94. Poveda, J., Teel, A.: A hybrid systems approach for distributed nonsmooth optimization in asynchronous multi-agent sampled-data systems. IFAC-PapersOnLine 49(18) (2016)

  95. Nešić, D., Teel, A.R., Zaccarian, L.: Stability and performance of SISO control systems with first-order reset elements. IEEE Trans. Autom. Control 56(11), 2567–2582 (2011)

  96. Prieur, C., Queinnec, I., Tarbouriech, S., Zaccarian, L.: Analysis and Synthesis of Reset Control Systems. Now Foundations and Trends (2018)

  97. Hustig-Schultz, D., Sanfelice, R.G.: A robust hybrid heavy ball algorithm for optimization with high performance. Amer. Control Conf. (2019). To appear

  98. Poveda, J.I., Li, N.: Inducing uniform asymptotic stability in non-autonomous accelerated optimization dynamics via hybrid regularization. In: 58th IEEE Conference on Decision and Control, pp. 3000–3005 (2019)

  99. Ochoa, D., Poveda, J.I., Uribe, C., Quijano, N.: Robust accelerated optimization on networks via distributed restarting of Nesterov’s-like ODE. IEEE Control Syst. Lett. 5(1) (2021)

  100. Scoy, B.V., Freeman, R.A., Lynch, K.M.: The fastest known globally convergent first-order method for minimizing strongly convex functions. IEEE Control Syst. Lett. 2(1), 49–54 (2018)

  101. Baradaran, M., Poveda, J.I., Teel, A.R.: Stochastic hybrid inclusions applied to global almost sure optimization on manifolds. In: IEEE 57th Conference on Decision and Control, pp. 6538–6543 (2018)

  102. Baradaran, M., Poveda, J.I., Teel, A.R.: Global optimization on the sphere: a stochastic hybrid systems approach. In: Proceedings of the 10th IFAC Symposium on Nonlinear Control Systems (2019). To appear

  103. Possieri, C., Teel, A.R.: LQ optimal control for a class of hybrid systems. In: IEEE 55th Conference on Decision and Control, pp. 604–609 (2016)

  104. Carnevale, D., Galeani, S., Sassano, M.: A linear quadratic approach to linear time invariant stabilization for a class of hybrid systems. In: Proceedings of the 22nd Mediterranean Conference on Control and Automation, pp. 545–550 (2014)

  105. Dharmatti, S., Ramaswamy, M.: Hybrid control systems and viscosity solutions. SIAM J. Control Opt. 44(4), 1259–1288 (2005)

  106. Barles, G., Dharmatti, S., Ramaswamy, M.: Unbounded viscosity solutions of hybrid control systems. ESAIM: Control Opt. Cal. Var. 16(1), 176–193 (2010)

  107. De Carolis, G., Saccon, A.: On linear quadratic optimal control for time-varying multimodal linear systems with time-triggered jumps. IEEE Control Syst. Lett. 4(1), 217–222 (2020)

  108. Hedlund, S., Rantzer, A.: Convex dynamic programming for hybrid systems. IEEE Trans. Autom. Control 47(9), 1536–1540 (2002)

  109. Passenberg, B., Caines, P.E., Leibold, M., Stursberg, O., Buss, M.: Optimal control for hybrid systems with partitioned state space. IEEE Trans. Autom. Control 58(8), 2131–2136 (2013)

  110. Bemporad, A., Giorgetti, N.: Logic-based solution methods for optimal control of hybrid systems. IEEE Trans. Autom. Control 51(6), 963–976 (2006)

  111. Chen, H.: Optimal control and reinforcement learning of switched systems. Ph.D. Dissertation, The Ohio State University (2018)

Acknowledgements

The first author would like to thank John Hauser, Ashutosh Trivedi, and Fabio Somenzi for fruitful conversations about optimal control, dynamic programming, and reinforcement learning from the controls and computer science perspective. The first author acknowledges support from NSF via the grant CNS-1947613. The second author acknowledges support from AFOSR via the grant FA9550-18-1-0246.

Corresponding author

Correspondence to Jorge I. Poveda.

Copyright information

© 2021 Springer Nature Switzerland AG

Cite this chapter

Poveda, J.I., Teel, A.R. (2021). A Hybrid Dynamical Systems Perspective on Reinforcement Learning for Cyber-Physical Systems: Vistas, Open Problems, and Challenges. In: Vamvoudakis, K.G., Wan, Y., Lewis, F.L., Cansever, D. (eds) Handbook of Reinforcement Learning and Control. Studies in Systems, Decision and Control, vol 325. Springer, Cham. https://doi.org/10.1007/978-3-030-60990-0_24
