Abstract
The next generation of autonomous cyber-physical systems will integrate a variety of heterogeneous computation, communication, and control algorithms. This integration will lead to closed-loop systems with highly intertwined interactions between the digital world and the physical world. For these systems, designing robust and optimal data-driven control algorithms necessitates fundamental breakthroughs at the intersection of different areas such as adaptive and learning-based control, optimal and robust control, hybrid dynamical systems theory, and network control, to name just a few. Motivated by this necessity, control techniques inspired by ideas of reinforcement learning have emerged as a promising paradigm that could potentially integrate most of the key desirable features. However, while significant results in reinforcement learning have been developed during the last decades, the literature is still missing a systematic framework for the design and analysis of reinforcement learning-based controllers that can safely and systematically integrate the intrinsic continuous-time and discrete-time dynamics that emerge in cyber-physical systems. Motivated by this limitation, and by recent theoretical frameworks developed for the analysis of hybrid systems, in this chapter we explore some vistas and open problems that could potentially be addressed by merging tools from reinforcement learning and hybrid dynamical systems theory, and which could have significant implications for the development of the next generation of autonomous cyber-physical systems.
Notes
1. A solution is said to be Zeno if it has an infinite number of jumps in a finite interval of time.
2. A sequence of sets converges if its inner limit and its outer limit exist and are equal. See [38, Definition 5.1].
3. A function \(\beta :\mathbb {R}_{\ge 0}\times \mathbb {R}_{\ge 0}\rightarrow \mathbb {R}_{\ge 0}\) is of class \(\mathcal {K}\mathcal {L}\) if it is nondecreasing in its first argument, nonincreasing in its second argument, \(\lim _{r\rightarrow 0^+}\beta (r,s)=0\) for each \(s\in \mathbb {R}_{\ge 0}\), and \(\lim _{s\rightarrow \infty }\beta (r,s)=0\) for each \(r\in \mathbb {R}_{\ge 0}\).
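As an illustration of the Zeno behavior defined in note 1, the following minimal sketch computes the impact times of an idealized bouncing ball, a standard textbook example of a hybrid system and not one taken from this chapter. The function names, the restitution coefficient, and the initial height are assumptions chosen for illustration. Each impact is a jump, and the jump times accumulate at a finite limit given by a geometric series.

```python
def bouncing_ball_jump_times(h0=1.0, g=9.81, restitution=0.5, n_jumps=50):
    """Return the first n_jumps impact (jump) times of an ideal bouncing ball.

    Assumed model: the ball is dropped from rest at height h0, flows under
    gravity g, and at each impact its speed is multiplied by `restitution`.
    """
    t = (2.0 * h0 / g) ** 0.5   # time of the first impact
    v = g * t                   # speed just before the first impact
    times = [t]
    for _ in range(n_jumps - 1):
        v *= restitution        # jump map: speed right after the impact
        t += 2.0 * v / g        # flow: flight time until the next impact
        times.append(t)
    return times


def zeno_time(h0=1.0, g=9.81, restitution=0.5):
    """Accumulation point of the jump times (limit of the geometric series)."""
    t0 = (2.0 * h0 / g) ** 0.5
    return t0 * (1.0 + restitution) / (1.0 - restitution)


if __name__ == "__main__":
    times = bouncing_ball_jump_times(n_jumps=30)
    print("30th jump time:", times[-1])
    print("Zeno time     :", zeno_time())
```

Every jump time stays below the value returned by `zeno_time`, so infinitely many jumps occur in a finite interval of time, which is exactly the situation described in note 1.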
References
Lamnabhi-Lagarrigue, F., Annaswamy, A., Engel, S., Isaksson, A., Khargonekar, P., Murray, R., Nijmeijer, H., Samad, T., Tilbury, D., Van den Hof, P.: Systems & control for the future of humanity, research agenda: Current and future roles, impact and grand challenges. Ann. Rev. Control 43, 1–64 (2017)
Xue, M., Wang, W., Roy, S.: Security concepts for the dynamics of autonomous vehicle networks. Automatica 50(3), 852–857 (2014)
Ocampo-Martinez, C., Puig, V., Cembrano, G., Quevedo, J.: Application of predictive control strategies to the management of complex networks in the urban water cycle. IEEE Control Syst. Mag. 33(1), 15–41 (2013)
Nie, Y., Wang, X., Cheng, K.: Multi-area self-adaptive pricing control in smart city with EV user participation. IEEE Trans. Intell. Transport. Syst. 99, 1–9 (2017)
Pepyne, D.L., Cassandras, C.G.: Control of hybrid systems in manufacturing. Proc. IEEE 88(7), 1108–1122 (2000)
Allgöwer, F., Borges de Sousa, J., Kapinski, Mosterman, P., Oehlerking, J., Panciatici, P., Prandini, M., Rajhans, A., Tabuada, P., Wenzelburger, P.: Position paper on the challenges posed by modern applications to cyber-physical systems theory. Nonlinear Anal.: Hybrid Syst. 34, 147–165 (2019). https://doi.org/10.1016/j.nahs.2019.05.007
Passino, K.: Biomimicry for Optimization, Control, and Automation. Springer, Berlin (2016)
Alur, R., Forejt, V., Moarref, S., Trivedi, A.: Safe schedulability of bounded-rate multi-mode systems. In: Proceedings of the 16th International Conference on Hybrid Systems: Computation and Control, pp. 243–252 (2013)
Hou, Z., Wang, Z.: From model-based control to data-driven control: survey, classification and perspective. Inf. Sci. 235(20), 3–35 (2013)
Tao, G.: Multivariable adaptive control: a survey. Automatica 50, 2737–2764 (2014)
Kim, J.W., Park, B.J., Yoo, H., Lee, J.H., Lee, J.M.: Deep reinforcement learning based finite-horizon optimal tracking control for nonlinear system. IFAC-PapersOnLine 51(25), 257–262 (2018). https://doi.org/10.1016/j.ifacol.2018.11.115
Ravanbakhsh, H., Sankaranarayanan, S.: Learning control Lyapunov functions from counterexamples and demonstrations. Auton. Robots 43(2), 275–307 (2019)
Bertsekas, D.: Reinforcement Learning and Optimal Control. Athena Scientific, Nashua (2019)
Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)
Vrabie, D., Vamvoudakis, K.G., Lewis, F.L.: Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles. IET Press (2012)
Lewis, F.L., Liu, D.: Reinforcement Learning and Approximate Dynamic Programming for Feedback Control. Computational Intelligence Series. John Wiley/IEEE Press, Hoboken (2012)
Kiumarsi, B., Vamvoudakis, K.: Optimal and autonomous control using reinforcement learning: a survey. IEEE Trans. Neural Netw. Learn. Syst. 29(6), 2042–2062 (2018)
Recht, B.: A tour of reinforcement learning: the view from continuous control. Ann. Rev. Control Robot. Auton. Syst. 2, 253–279 (2019)
Görges, D.: Relations between Model predictive control and reinforcement learning. IFAC-PapersOnLine 50(1), 4920–4928 (2017). https://doi.org/10.1016/j.ifacol.2017.08.747
Lee, D., Hu, J.: Primal-dual Q-learning framework for LQR design. IEEE Trans. Autom. Control 64(9), 3756–3763 (2018)
Vamvoudakis, K.G., Lewis, F.L.: Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem. Automatica 46(5), 878–888 (2010)
Kamalapurkar, R., Klotz, J.R., Dixon, W.E.: Concurrent learning-based approximate feedback-Nash equilibrium solution of n-player nonzero-sum differential games. IEEE/CAA J. Autom. Sinica 1, 239–247 (2014)
Kamalapurkar, R., Rosenfeld, J.A., Dixon, W.E.: Efficient model-based reinforcement learning for approximate online optimal control. Automatica 74, 247–258 (2016). https://doi.org/10.1016/j.automatica.2016.08.004
Wang, Y., Velswamy, K., Huang, B.: A novel approach to feedback control with deep reinforcement learning. IFAC-PapersOnLine 51(18), 31–36 (2018). https://doi.org/10.1016/j.ifacol.2018.09.241
Kamalapurkar, R., Walters, P., Rosenfeld, J., Dixon, W.: Reinforcement Learning for Optimal Feedback Control: A Lyapunov Based Approach. Springer, Berlin (2018)
Branicky, M.S., Borkar, V.S., Mitter, S.K.: A unified framework for hybrid control: model and optimal control theory. IEEE Trans. Autom. Control 43(1), 31–45 (1998)
Bensoussan, A., Menaldi, J.L.: Hybrid control and dynamic programming. Dyn. Contin. Discrete Impulsive Syst. 3(4), 395–442 (1997)
Shaikh, M.S., Caines, P.E.: On the hybrid optimal control problem: theory and algorithms. IEEE Trans. Autom. Control 52(9), 1587–1603 (2007)
Cassandras, C.G., Pepyne, D.L., Wardi, Y.: Optimal control of a class of hybrid systems. IEEE Trans. Autom. Control 46(3), 398–415 (2001)
Pakniyat, A.: Optimal control of deterministic and stochastic hybrid systems: theory and applications. Ph.D. Dissertation, McGill University (2016)
Forte, F., Marconi, L., Teel, A.R.: Robust nonlinear regulation: continuous-time internal models and hybrid identifiers. IEEE Trans. Autom. Control 62(7), 3136–3151 (2017)
Poveda, J.I., Krstić, M.: Fixed-time gradient-based extremum seeking. In: American Control Conference, pp. 2838–2843 (2020)
Poveda, J.I., Teel, A.R.: A framework for a class of hybrid extremum seeking controllers with dynamic inclusions. Automatica 76 (2017)
Kutadinata, R.J., Moase, W., Manzie, C.: Extremum-seeking in singularly perturbed hybrid systems. IEEE Trans. Autom. Control 62(6), 3014–3020 (2017)
Poveda, J.I., Kutadinata, R., Manzie, C., Nešić, D., Teel, A.R., Liao, C.: Hybrid extremum seeking for black-box optimization in hybrid plants: an analytical framework. In: 57th IEEE Conference on Decision and Control, pp. 2235–2240 (2018)
Owens, D.H.: Iterative Learning Control: An Optimization Paradigm. Springer, London (2015)
Poveda, J.I., Benosman, M., Teel, A.R.: Hybrid online learning control in networked multiagent systems: a survey. Int. J. Adapt. Control Signal Process. 33(2), 228–261 (2019)
Goebel, R., Sanfelice, R.G., Teel, A.R.: Hybrid dynamical systems: modeling, stability, and robustness. Princeton University Press, Princeton (2012)
Poveda, J.I., Li, N.: Robust hybrid zero-order optimization algorithms with acceleration via averaging in continuous time. Automatica 123, 109361 (2021)
Teel, A.R., Poveda, J.I., Le, J.: First-order optimization algorithms with resets and Hamiltonian flows. In: 58th IEEE Conference on Decision and Control, pp. 5838–5843 (2019)
Poveda, J.I., Teel, A.R.: A robust event-triggered approach for fast sampled-data extremization and learning. IEEE Trans. Autom. Control 62(10) (2017)
Liu, J., Teel, A.R.: Lyapunov-based sufficient conditions for stability of hybrid systems with memory. IEEE Trans. Autom. Control 61(4), 1057–1062 (2016)
Mayhew, C.G.: Hybrid control for topologically constrained systems. Ph.D. Dissertation, University of California, Santa Barbara (2010)
Sanfelice, R.G., Messina, M.J., Tuna, S.E., Teel, A.R.: Robust hybrid controllers for continuous-time systems with applications to obstacle avoidance and regulation to disconnected set of points. In: Proceedings of American Control Conference, pp. 3352–3357 (2006)
Poveda, J.I., Benosman, M., Sanfelice, R.G., Teel, A.R.: A hybrid adaptive feedback law for robust obstacle avoidance and coordination in multiple vehicle systems. In: Proceedings of American Control Conference, pp. 616–621 (2018)
Strizic, T., Poveda, J.I., Teel, A.R.: Hybrid gradient descent for robust global optimization on the circle. In: 56th IEEE Conference on Decision and Control, pp. 2985–2990 (2017)
Hespanha, J.P., Morse, A.S.: Stabilization of switched systems with average dwell-time. In: 38th IEEE Conference on Decision and Control, pp. 2655–2660 (1999)
Vidyasagar, M.: Nonlinear Systems Analysis. Prentice Hall, Upper Saddle River (1993)
Jakubczyk, B., Sontag, E.D.: Controllability of nonlinear discrete-time systems: a Lie-algebraic approach. SIAM J. Control Opt. 28(1), 1–33 (1990)
Subbaraman, A., Teel, A.R.: On the equivalence between global recurrence and the existence of a smooth Lyapunov function for hybrid systems. Syst. & Control Lett. 88, 54–61 (2016)
Teel, A.R., Subbaraman, A., Sferlazza, A.: Stability analysis for stochastic hybrid systems: a survey. Automatica 50(10), 2435–2456 (2014)
Vamvoudakis, K.G., Fantini-Miranda, M., Hespanha, J.P.: Asymptotically stable adaptive-optimal control algorithm with saturating actuators and relaxed persistence of excitation. IEEE Trans. Neural Netw. Learn. Syst. 27(11), 2386–2398 (2016)
Barto, A.G., Sutton, R.S., Anderson, C.W.: Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Trans. Syst. Man Cybern. SMC-13(5), 834–846 (1983)
Lewis, F., Syrmos, V.: Optimal Control. Wiley, Boston (1995)
Hespanha, J.P.: Linear Systems Theory. Princeton University Press, Princeton (2009)
Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical Recipes: The Art of Scientific Computing, 3rd edn. Cambridge University Press, Cambridge (2007)
Bhaya, A., Kaszkurewicz, E.: Control perspectives on numerical algorithms and matrix problems. SIAM (2006)
Nocedal, J., Wright, S.: Numerical Optimization. Springer, Berlin (1999)
Saridis, G.N., Lee, C.S.: An approximation theory of optimal control for trainable manipulators. IEEE Trans. Syst. Man Cybern. 9(3), 152–159 (1979)
Hornik, K., Stinchcombe, S., White, H.: Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks. Neural Netw. 3, 551–560 (1990)
Sastry, S., Bodson, M.: Adaptive Control: Stability, Convergence, and Robustness. Prentice-Hall, Englewood Cliffs (1989)
Ioannou, P.A., Sun, J.: Robust Adaptive Control. Dover Publications Inc., Mineola (2012)
Prokhorov, D.V., Wunsch, D.C.: Adaptive critic designs. IEEE Trans. Neural Netw. 8(5), 997–1007 (1997)
Abouheaf, M.I., Lewis, F.L., Vamvoudakis, K.G., Haesaert, S., Babuska, R.: Multi-agent discrete-time graphical games and reinforcement learning solutions. Automatica 50(12), 3038–3053 (2014)
Yang, Q., Vance, J.B., Jagannathan, S.: Control of nonaffine nonlinear discrete-time systems using reinforcement learning-based linearly parameterized neural networks. IEEE Trans. Syst. Man Cybern. Part B Cybern. 38(4), 994–1001 (2008)
Yang, X., Liu, D., Wang, D., Wei, Q.: Discrete-time online learning control for a class of unknown nonaffine nonlinear systems using reinforcement learning. Neural Netw. 55, 30–41 (2014). https://doi.org/10.1016/j.neunet.2014.03.008
Yang, Q., Jagannathan, S.: Reinforcement learning controller design for affine nonlinear discrete-time systems using online approximators. IEEE Trans. Syst. Man Cybern. Part B: Cybern. 42(2), 377–390 (2012)
Vamvoudakis, K., Lewis, F.L.: Online solution of nonlinear two-player zero-sum games using synchronous policy iteration. Int. J. Robust Nonlinear Control 22, 1460–1483 (2011)
Vamvoudakis, K.G., Lewis, F.L.: Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations. Automatica 47(8), 1556–1569 (2011). https://doi.org/10.1016/j.automatica.2011.03.005
Kanellopoulos, A., Vamvoudakis, K.G.: Non-equilibrium dynamic games and cyber-physical security: a cognitive hierarchy approach. Syst. Control Lett. 125, 59–66 (2019). https://doi.org/10.1016/j.sysconle.2019.01.008
Vamvoudakis, K.G.: Model-Free Learning of Nash Games With Applications to Network Security. Elsevier Inc., Amsterdam (2016). https://doi.org/10.1016/B978-0-12-805246-4.00010-0
Vamvoudakis, K.G., Ferraz, H.: Model-free event-triggered control algorithm for continuous-time linear systems with optimal performance. Automatica 87, 412–420 (2018). https://doi.org/10.1016/j.automatica.2017.03.013
Chen, C., Modares, H., Xie, K., Lewis, F.L., Wan, Y., Xie, S.: Reinforcement learning-based adaptive optimal exponential tracking control of linear systems with unknown dynamics. IEEE Trans. Autom. Control 64(11), 4423–4438 (2019)
Vamvoudakis, K.G.: Event-triggered optimal adaptive control algorithm for continuous-time nonlinear systems. IEEE/CAA J. Autom. Sinica 1(3), 282–293 (2014)
Su, H., Zhang, H., Sun, S., Cai, Y.: Integral reinforcement learning-based online adaptive event-triggered control for non-zero-sum games of partially unknown nonlinear systems. Neurocomputing (2019). https://doi.org/10.1016/j.neucom.2019.09.088
Prieur, C., Teel, A.R.: Uniting local and global output feedback controllers. IEEE Trans. Autom. Control 56(7), 1636–1649 (2011)
Prieur, C., Goebel, R., Teel, A.R.: Hybrid feedback control and robust stabilization of nonlinear systems. IEEE Trans. Autom. Control 52(11), 2103–2117 (2007)
Mayhew, C.G., Sanfelice, R.G., Sheng, J., Arcak, M., Teel, A.R.: Quaternion-based hybrid feedback for robust global attitude synchronization. IEEE Trans. Autom. Control 57(8), 2122–2127 (2012)
Nešić, D., Teel, A.R., Zaccarian, L.: Stability and performance of SISO control systems with first-order reset elements. IEEE Trans. Autom. Control 56(11), 2567–2582 (2011)
Nešić, D., Teel, A.R., Kokotović, P.V.: Sufficient conditions for stabilization of sampled-data nonlinear systems via discrete-time approximations. Syst. & Control Lett. 38, 259–270 (1999)
Nešić, D., Teel, A.R.: A framework for stabilization of nonlinear sampled-data systems based on their approximate discrete-time models. IEEE Trans. Autom. Control 49(7), 1103–1121 (2004)
Nešić, D., Teel, A.R., Carnevale, D.: Explicit computation of the sampling period in emulation of controllers for nonlinear sampled-data systems. IEEE Trans. Autom. Control 54(3), 619–624 (2009)
Khong, S.Z., Nešić, D., Tan, Y., Manzie, C.: Unified framework for sampled-data extremum seeking control: global optimisation and multi-unit systems. Automatica 49, 2720–2733 (2013)
Chien, C.: A sampled-data iterative learning control using fuzzy network design. Int. J. Control 73, 902–913 (2000)
Bai, E.W., Fu, L.C., Sastry, S.S.: Averaging analysis for discrete time and sampled data adaptive systems. IEEE Trans. Circuits Syst. 35(2) (1988)
Chen, T., Francis, B.: Optimal Sampled-Data Control Systems. Springer, Berlin (1995)
Vrabie, D., Lewis, F.: Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems. Neural Netw. 22, 237–246 (2009)
Tabuada, P.: Event-triggered real-time scheduling of stabilizing control tasks. IEEE Trans. Autom. Control 52, 1680–1685 (2007)
Postoyan, R., Tabuada, P., Nešić, D., Anta, A.: A framework for the event-triggered stabilization of nonlinear systems. IEEE Trans. Autom. Control 60(4), 982–996 (2015)
Heemels, W.P.M.H., Donkers, M.C.F., Teel, A.R.: Periodic event-triggered control for linear systems. IEEE Trans. Autom. Control 58, 847–861 (2013)
Narayanan, V., Jagannathan, S.: Event-triggered distributed control of nonlinear interconnected systems using online reinforcement learning with exploration. IEEE Trans. Cybern. 48(9), 2510–2519 (2018)
Poveda, J.I., Teel, A.R.: Hybrid mechanisms for robust synchronization and coordination of multi-agent networked sampled-data systems. Automatica 99, 41–53 (2019)
De Persis, C., Postoyan, R.: A Lyapunov redesign of coordination algorithms for cyber-physical systems. IEEE Trans. Autom. Control 62(2), 808–823 (2017)
Poveda, J., Teel, A.: A hybrid systems approach for distributed nonsmooth optimization in asynchronous multi-agent sampled-data systems. IFAC-PapersOnLine 49(18) (2016)
Nešić, D., Teel, A.R., Zaccarian, L.: Stability and performance of SISO control systems with first-order reset elements. IEEE Trans. Autom. Control 56(11), 2567–2582 (2011)
Prieur, C., Queinnec, I., Tarbouriech, S., Zaccarian, L.: Analysis and Synthesis of Reset Control Systems. Now Foundations and Trends (2018)
Hustig-Schultz, D., Sanfelice, R.G.: A robust hybrid heavy ball algorithm for optimization with high performance. In: American Control Conference (2019). To appear
Poveda, J.I., Li, N.: Inducing uniform asymptotic stability in non-autonomous accelerated optimization dynamics via hybrid regularization. In: 58th IEEE Conference on Decision and Control, pp. 3000–3005 (2019)
Ochoa, D., Poveda, J.I., Uribe, C., Quijano, N.: Robust accelerated optimization on networks via distributed restarting of Nesterov’s-like ODE. IEEE Control Syst. Lett. 5(1) (2021)
Van Scoy, B., Freeman, R.A., Lynch, K.M.: The fastest known globally convergent first-order method for minimizing strongly convex functions. IEEE Control Syst. Lett. 2(1), 49–54 (2018)
Baradaran, M., Poveda, J.I., Teel, A.R.: Stochastic hybrid inclusions applied to global almost sure optimization on manifolds. In: IEEE 57th Conference on Decision and Control, pp. 6538–6543 (2018)
Baradaran, M., Poveda, J.I., Teel, A.R.: Global optimization on the sphere: a stochastic hybrid systems approach. In: Proceedings of the 10th IFAC Symposium on Nonlinear Control Systems (2019). To appear
Possieri, C., Teel, A.R.: LQ optimal control for a class of hybrid systems. In: IEEE 55th Conference on Decision and Control, pp. 604–609 (2016)
Carnevale, D., Galeani, S., Sassano, M.: A linear quadratic approach to linear time invariant stabilization for a class of hybrid systems. In: Proceedings of the 22nd Mediterranean Conference on Control and Automation, pp. 545–550 (2014)
Dharmatti, S., Ramaswamy, M.: Hybrid control systems and viscosity solutions. SIAM J. Control Opt. 44(4), 1259–1288 (2005)
Barles, G., Dharmatti, S., Ramaswamy, M.: Unbounded viscosity solutions of hybrid control systems. ESAIM: Control Opt. Cal. Var. 16(1), 176–193 (2010)
De Carolis, G., Saccon, A.: On linear quadratic optimal control for time-varying multimodal linear systems with time-triggered jumps. IEEE Control Syst. Lett. 4(1), 217–222 (2020)
Hedlund, S., Rantzer, A.: Convex dynamic programming for hybrid systems. IEEE Trans. Autom. Control 47(9), 1536–1540 (2002)
Passenberg, B., Caines, P.E., Leibold, M., Stursberg, O., Buss, M.: Optimal control for hybrid systems with partitioned state space. IEEE Trans. Autom. Control 58(8), 2131–2136 (2013)
Bemporad, A., Giorgetti, N.: Logic-based solution methods for optimal control of hybrid systems. IEEE Trans. Autom. Control 51(6), 963–976 (2006)
Chen, H.: Optimal control and reinforcement learning of switched systems. Ph.D. Dissertation, The Ohio State University (2018)
Acknowledgements
The first author would like to thank John Hauser, Ashutosh Trivedi, and Fabio Somenzi for fruitful conversations about optimal control, dynamic programming, and reinforcement learning from the controls and computer science perspective. The first author acknowledges support from NSF via the grant CNS-1947613. The second author acknowledges support from AFOSR via the grant FA9550-18-1-0246.
Copyright information
© 2021 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Poveda, J.I., Teel, A.R. (2021). A Hybrid Dynamical Systems Perspective on Reinforcement Learning for Cyber-Physical Systems: Vistas, Open Problems, and Challenges. In: Vamvoudakis, K.G., Wan, Y., Lewis, F.L., Cansever, D. (eds) Handbook of Reinforcement Learning and Control. Studies in Systems, Decision and Control, vol 325. Springer, Cham. https://doi.org/10.1007/978-3-030-60990-0_24
DOI: https://doi.org/10.1007/978-3-030-60990-0_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60989-4
Online ISBN: 978-3-030-60990-0
eBook Packages: Intelligent Technologies and Robotics (R0)