Overview of Robust Adaptive Critic Control Design

  • Ding Wang
  • Chaoxu Mu
Chapter
Part of the Studies in Systems, Decision and Control book series (SSDC, volume 167)

Abstract

Adaptive dynamic programming (ADP) and reinforcement learning are closely related approaches to intelligent optimization. Both are regarded as promising methods built on the key components of evaluation and improvement, against a background of information technologies such as artificial intelligence, big data, and deep learning. Although great progress has been achieved and surveyed in addressing nonlinear optimal control problems, research on the robustness of ADP-based control strategies in uncertain environments has not been fully summarized. Hence, this chapter reviews the main recent results on adaptive-critic-based robust control design for continuous-time nonlinear systems. ADP-based nonlinear optimal regulation is reviewed first, followed by robust stabilization of nonlinear systems with matched uncertainties, guaranteed cost control design for unmatched plants, and decentralized stabilization of interconnected systems. Further discussions are then presented, covering event-based robust control design, improvement of the critic learning rule, nonlinear \(H_{\infty }\) control design, and several notes on future perspectives. This overview is intended to promote the development of adaptive critic control methods with robustness guarantees and the construction of higher-level intelligent systems.

References

  1. 1.
    Abu-Khalaf, M., Lewis, F.L.: Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach. Automatica 41(5), 779–791 (2005)MathSciNetzbMATHCrossRefGoogle Scholar
  2. 2.
    Abu-Khalaf, M., Lewis, F.L., Huang, J.: Policy iterations on the Hamilton-Jacobi-Isaacs equation for \(H_{\infty }\) state feedback control with input saturation. IEEE Trans. Autom. Control 51(12), 1989–1995 (2006)MathSciNetzbMATHCrossRefGoogle Scholar
  3. 3.
    Adhyaru, D.M., Kar, I.N., Gopal, M.: Fixed final time optimal control approach for bounded robust controller design using Hamilton-Jacobi-Bellman solution. IET Control Theory Appl. 3(9), 1183–1195 (2009)MathSciNetCrossRefGoogle Scholar
  4. 4.
    Adhyaru, D.M., Kar, I.N., Gopal, M.: Bounded robust control of nonlinear systems using neural network-based HJB solution. Neural Comput. Appl. 20(1), 91–103 (2011)Google Scholar
  5. 5.
    Al-Tamimi, A., Lewis, F.L., Abu-Khalaf, M.: Discrete-time nonlinear HJB solution using approximate dynamic programming: convergence proof. IEEE Trans. Syst. Man Cybern.-Part B: Cybern. 38(4), 943–949 (2008)CrossRefGoogle Scholar
  6. 6.
    Arel, I., Rose, D.C., Karnowski, T.P.: Deep machine learning-A new frontier in artificial intelligence research. IEEE Comput. Intell. Mag. 5, 13–18 (2010)CrossRefGoogle Scholar
  7. 7.
    Basar, T., Bernhard, P.: \(H_{\infty }\)-Optimal Control and Related Minimax Design Problems: A Dynamic Game Approach, 2nd edn. Birkhauser, Boston, MA (2008)zbMATHCrossRefGoogle Scholar
  8. 8.
    Beard, R.W., Saridis, G.N., Wen, J.T.: Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation. Automatica 33(12), 2159–2177 (1997)MathSciNetzbMATHCrossRefGoogle Scholar
  9. 9.
    Bellman, R.E.: Dynamic Programming. Princeton University Press, Princeton, New Jersey (1957)zbMATHGoogle Scholar
  10. 10.
    Bertsekas, D.P.: Abstract Dynamic Programming. Athena Scientific, Belmont, MA (2013)zbMATHGoogle Scholar
  11. 11.
    Bertsekas, D.P.: Value and policy iterations in optimal control and adaptive dynamic programming. IEEE Trans. Neural Netw. Learn. Syst. 28(3), 500–509 (2017)MathSciNetCrossRefGoogle Scholar
  12. 12.
    Bertsekas, D.P., Tsitsiklis, J.N.: Neuro-Dynamic Programming. Athena Scientific, Belmont, Massachusetts (1996)zbMATHGoogle Scholar
  13. 13.
    Bhasin, S., Kamalapurkar, R., Johnson, M., Vamvoudakis, K.G., Lewis, F.L., Dixon, W.E.: A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems. Automatica 49(1), 82–92 (2013)MathSciNetzbMATHCrossRefGoogle Scholar
  14. 14.
    Bian, T., Jiang, Y., Jiang, Z.P.: Decentralized adaptive optimal control of large-scale systems with application to power systems. IEEE Trans. Industr. Electron. 62(4), 2439–2447 (2015)CrossRefGoogle Scholar
  15. 15.
    Bian, T., Jiang, Z.P.: Value iteration and adaptive dynamic programming for data-driven adaptive optimal control design. Automatica 71, 348–360 (2016)MathSciNetzbMATHCrossRefGoogle Scholar
  16. 16.
    Busoniu, L., Babuska, R., Schutter, B.D., Ernst, D.: Reinforcement Learning and Dynamic Programming Using Function Approximators. CRC Press, Boca Raton, Florida (2010)zbMATHCrossRefGoogle Scholar
  17. 17.
    Chang, S.S.L., Peng, T.K.C.: Adaptive guaranteed cost control of systems with uncertain parameters. IEEE Trans. Autom. Control 17(4), 474–483 (1972)MathSciNetzbMATHCrossRefGoogle Scholar
  18. 18.
    Chen, X.W., Lin, X.: Big data deep learning: challenges and perspectives. IEEE Access 2, 514–525 (2014)CrossRefGoogle Scholar
  19. 19.
    Corless, M.J., Leitmann, G.: Continuous state feedback guaranteeing uniform ultimate boundedness for uncertain dynamic systems. IEEE Trans. Autom. Control 26(5), 1139–1144 (1981)MathSciNetzbMATHCrossRefGoogle Scholar
  20. 20.
    Dierks, T., Jagannathan, S.: Optimal control of affine nonlinear continuous-time systems. In: Proceedings of the American Control Conference, pp. 1568–1573 (2010)Google Scholar
  21. 21.
    Dierks, T., Thumati, B.T., Jagannathan, S.: Optimal control of unknown affine nonlinear discrete-time systems using offline-trained neural networks with proof of convergence. Neural Netw. 22(5–6), 851–860 (2009)zbMATHCrossRefGoogle Scholar
  22. 22.
    Dong, L., Tang, Y., He, H., Sun, C.: An event-triggered approach for load frequency control with supplementary ADP. IEEE Trans. Power Syst. 32(1), 581–589 (2017)CrossRefGoogle Scholar
  23. 23.
    Fan, Q.Y., Yang, G.H.: Adaptive actor-critic design-based integral sliding-mode control for partially unknown nonlinear systems with input disturbances. IEEE Trans. Neural Netw. Learn. Syst. 27(1), 165–177 (2016)MathSciNetCrossRefGoogle Scholar
  24. 24.
    Fu, J., He, H., Zhou, X.: Adaptive learning and control for MIMO system based on adaptive dynamic programming. IEEE Trans. Neural Netw. 22(7), 1133–1148 (2011)CrossRefGoogle Scholar
  25. 25.
    Gao, W., Jiang, Y., Jiang, Z.P., Chai, T.: Output-feedback adaptive optimal control of interconnected systems based on robust adaptive dynamic programming. Automatica 72, 37–45 (2016)MathSciNetzbMATHCrossRefGoogle Scholar
  26. 26.
    Gao, W., Jiang, Z.P.: Adaptive dynamic programming and adaptive optimal output regulation of linear systems. IEEE Trans. Autom. Control 61(12), 4164–4169 (2016)MathSciNetzbMATHCrossRefGoogle Scholar
  27. 27.
    Haddad, W.M., Chellaboina, V.: Nonlinear Dynamical Systems and Control: A Lyapunov-Based Approach. Princeton University Press, Princeton, New Jersey (2008)zbMATHGoogle Scholar
  28. 28.
    Haddad, W.M., Chellaboina, V., Fausz, J.L.: Robust nonlinear feedback control for uncertain linear systems with nonquadratic performance criteria. Syst. Control Lett. 33(5), 327–338 (1998)MathSciNetzbMATHCrossRefGoogle Scholar
  29. 29.
    Haddad, W.M., Chellaboina, V., Fausz, J.L., Leonessa, A.: Optimal non-linear robust control for nonlinear uncertain systems. Int. J. Control 73(4), 329–342 (2000)zbMATHCrossRefGoogle Scholar
  30. 30.
    Hanselmann, T., Noakes, L., Zaknich, A.: Continuous-time adaptive critics. IEEE Trans. Neural Netw. 18(3), 631–647 (2007)CrossRefGoogle Scholar
  31. 31.
    Haykin, S.: Neural Networks: A Comprehensive Foundation. Prentice-Hall, Upper Saddle River, New Jersey (1999)zbMATHGoogle Scholar
  32. 32.
    He, H., Ni, Z., Fu, J.: A three-network architecture for on-line learning and optimization based on adaptive dynamic programming. Neurocomputing 78, 3–13 (2012)CrossRefGoogle Scholar
  33. 33.
    He, W., Dong, Y., Sun, C.: Adaptive neural impedance control of a robotic manipulator with input saturation. IEEE Trans. Syst. Man Cybern.: Syst. 46(3), 334–344 (2016)CrossRefGoogle Scholar
  34. 34.
    Heydari, A.: Revisiting approximate dynamic programming and its convergence. IEEE Trans. Cybern. 44(12), 2733–2743 (2014)CrossRefGoogle Scholar
  35. 35.
    Heydari, A., Balakrishnan, S.N.: Finite-horizon control-constrained nonlinear optimal control using single network adaptive critics. IEEE Trans. Neural Netw. Learn. Syst. 24(1), 145–157 (2013)CrossRefGoogle Scholar
  36. 36.
    Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313, 504–507 (2006)MathSciNetzbMATHCrossRefGoogle Scholar
  37. 37.
    Hou, Z., Jin, S.: Data-driven model-free adaptive control for a class of MIMO nonlinear discrete-time systems. IEEE Trans. Neural Netw. 22, 2173–2188 (2011)CrossRefGoogle Scholar
  38. 38.
    Hou, Z., Wang, Z.: From model-based control to data-driven control: survey, classification and perspective. Inf. Sci. 235, 3–35 (2013)MathSciNetzbMATHCrossRefGoogle Scholar
  39. 39.
    Jagannathan, S.: Neural Network Control of Nonlinear Discrete-Time Systems. CRC Press, Boca Raton, FL (2006)zbMATHGoogle Scholar
  40. 40.
    Jiang, Y., Jiang, Z.P.: Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics. Automatica 48(10), 2699–2704 (2012)MathSciNetzbMATHCrossRefGoogle Scholar
  41. 41.
    Jiang, Y., Jiang, Z.P.: Robust adaptive dynamic programming for large-scale systems with an application to multimachine power systems. IEEE Trans. Circuits Syst.-II: Express Briefs 59(10), 693–697 (2012)CrossRefGoogle Scholar
  42. 42.
    Jiang, Y., Jiang, Z.P.: Robust adaptive dynamic programming and feedback stabilization of nonlinear systems. IEEE Trans. Neural Netw. Learn. Syst. 25(5), 882–893 (2014)CrossRefGoogle Scholar
  43. 43.
    Jiang, Y., Jiang, Z.P.: Global adaptive dynamic programming for continuous-time nonlinear systems. IEEE Trans. Autom. Control 60(11), 2917–2929 (2015)MathSciNetzbMATHCrossRefGoogle Scholar
  44. 44.
    Jiang, Y., Jiang, Z.P.: Robust Adaptive Dynamic Programming. Wiley-IEEE Press, Hoboken, NJ (2017)zbMATHCrossRefGoogle Scholar
  45. 45.
    Jiang, Z.P., Jiang, Y.: Robust adaptive dynamic programming for linear and nonlinear systems: an overview. Eur. J. Control 19(5), 417–425 (2013)MathSciNetzbMATHCrossRefGoogle Scholar
  46. 46.
    Jiang, Z.P., Teel, A.R., Praly, L.: Small-gain theorem for ISS systems and applications. Math. Control Signals Syst. 7(2), 95–120 (1994)MathSciNetzbMATHCrossRefGoogle Scholar
  47. 47.
    Kamalapurkar, R., Walters, P., Dixon, W.E.: Model-based reinforcement learning for approximate optimal regulation. Automatica 64, 94–104 (2016)MathSciNetzbMATHCrossRefGoogle Scholar
  48. 48.
    Khalil, H.K.: Nonlinear Systems, 3rd edn. Prentice-Hall, New Jersey (2002)zbMATHGoogle Scholar
  49. 49.
    Krstic, M., Kanellakopoulos, I., Kokotovic, P.: Nonlinear and Adaptive Control Design. Wiley, New York (1995)zbMATHGoogle Scholar
  50. 50.
    Lavretsky, E., Wise, K.A.: Robust and Adaptive Control with Aerospace Applications. Springer, London (2013)zbMATHCrossRefGoogle Scholar
  51. 51.
    Lecun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521, 436–444 (2015)CrossRefGoogle Scholar
  52. 52.
    Lee, J.Y., Park, J.B., Choi, Y.H.: Integral Q-learning and explorized policy iteration for adaptive optimal control of continuous-time linear systems. Automatica 48(11), 2850–2859 (2012)MathSciNetzbMATHCrossRefGoogle Scholar
  53. 53.
    Lee, J.Y., Park, J.B., Choi, Y.H.: On integral generalized policy iteration for continuous-time linear quadratic regulations. Automatica 50, 475–489 (2014)MathSciNetzbMATHCrossRefGoogle Scholar
  54. 54.
    Lee, J.Y., Park, J.B., Choi, Y.H.: Integral reinforcement learning for continuous-time input-affine nonlinear systems with simultaneous invariant explorations. IEEE Trans. Neural Netw. Learn. Syst. 26(5), 916–932 (2015)MathSciNetCrossRefGoogle Scholar
  55. 55.
    Lendaris, G.G.: A retrospective on adaptive dynamic programming for control. In: Proceedings of the International Joint Conference on Neural Networks, pp. 1750–1757 (2009)Google Scholar
  56. 56.
    Lewis, F.L., Jagannathan, S., Yesildirek, A.: Neural Network Control of Robot Manipulators and Nonlinear Systems. Taylor & Francis, London (1998)Google Scholar
  57. 57.
    Lewis, F.L., Liu, D.: Reinforcement Learning and Approximate Dynamic Programming for Feedback Control. Wiley, New Jersey (2013)Google Scholar
  58. 58.
    Lewis, F.L., Vrabie, D.: Reinforcement learning and adaptive dynamic programming for feedback control. IEEE Circuits Syst. Mag. 9(3), 32–50 (2009)CrossRefGoogle Scholar
  59. 59.
    Lewis, F.L., Vrabie, D., Syrmos, V.L.: Optimal Control, 3rd edn. Wiley, New York (2012)zbMATHCrossRefGoogle Scholar
  60. 60.
    Lewis, F.L., Vrabie, D., Vamvoudakis, K.G.: Reinforcement learning and feedback control: using natural decision methods to design optimal adaptive controllers. IEEE Control Syst. Mag. 32(6), 76–105 (2012)MathSciNetCrossRefGoogle Scholar
  61. 61.
    Liang, J., Venayagamoorthy, G.K., Harley, R.G.: Wide-area measurement based dynamic stochastic optimal power flow control for smart grids with high variability and uncertainty. IEEE Trans. Smart Grid 3(1), 59–69 (2012)CrossRefGoogle Scholar
  62. 62.
    Lincoln, B., Rantzer, A.: Relaxing dynamic programming. IEEE Trans. Autom. Control 51, 1249–1260 (2006)MathSciNetzbMATHCrossRefGoogle Scholar
  63. 63.
    Lin, F.: Robust Control Design: An Optimal Control Approach. Wiley, New York (2007)CrossRefGoogle Scholar
  64. 64.
    Lin, F., Brand, R.D., Sun, J.: Robust control of nonlinear systems: compensating for uncertainty. Int. J. Control 56(6), 1453–1459 (1992)MathSciNetzbMATHCrossRefGoogle Scholar
  65. 65.
    Liu, D.: Approximate dynamic programming for self-learning control. Acta Automatica Sinica 31(1), 13–18 (2005)MathSciNetGoogle Scholar
  66. 66.
    Liu, D., Li, C., Li, H., Wang, D., Ma, H.: Neural-network-based decentralized control of continuous-time nonlinear interconnected systems with unknown dynamics. Neurocomputing 165, 90–98 (2015)CrossRefGoogle Scholar
  67. 67.
    Liu, D., Li, H., Wang, D.: Data-based self-learning optimal control: research progress and prospects. Acta Automatica Sinica 39(11), 1858–1870 (2013)MathSciNetzbMATHCrossRefGoogle Scholar
  68. 68.
    Liu, D., Li, H., Wang, D.: Error bounds of adaptive dynamic programming algorithms for solving undiscounted optimal control problems. IEEE Trans. Neural Netw. Learn. Syst. 26(6), 1323–1334 (2015)MathSciNetCrossRefGoogle Scholar
  69. 69.
    Liu, D., Wang, D., Li, H.: Decentralized stabilization for a class of continuous-time nonlinear interconnected systems using online learning optimal control approach. IEEE Trans. Neural Netw. Learn. Syst. 25(2), 418–428 (2014)CrossRefGoogle Scholar
  70. 70.
    Liu, D., Wang, D., Wang, F.Y., Li, H., Yang, X.: Neural-network-based online HJB solution for optimal robust guaranteed cost control of continuous-time uncertain nonlinear systems. IEEE Trans. Cybern. 44(12), 2834–2847 (2014)CrossRefGoogle Scholar
  71. 71.
    Liu, D., Wang, D., Zhao, D., Wei, Q., Jin, N.: Neural-network-based optimal control for a class of unknown discrete-time nonlinear systems using globalized dual heuristic programming. IEEE Trans. Autom. Sci. Eng. 9(3), 628–634 (2012)CrossRefGoogle Scholar
  72. 72.
    Liu, D., Wei, Q., Wang, D., Yang, X., Li, H.: Adaptive Dynamic Programming with Applications in Optimal Control. Springer, London (2017)zbMATHCrossRefGoogle Scholar
  73. 73.
    Liu, D., Xu, Y., Wei, Q., Liu, X.: Residential energy scheduling for variable weather solar energy based on adaptive dynamic programming. IEEE/CAA J. Autom. Sinica 5(1), 36–46 (2018)CrossRefGoogle Scholar
  74. 74.
    Liu, D., Yang, X., Li, H.: Adaptive optimal control for a class of continuous-time affine nonlinear systems with unknown internal dynamics. Neural Comput. Appl. 23(7–8), 1843–1850 (2013)CrossRefGoogle Scholar
  75. 75.
    Liu, D., Yang, X., Wang, D., Wei, Q.: Reinforcement-learning-based robust controller design for continuous-time uncertain nonlinear systems subject to input constraints. IEEE Trans. Cybern. 45(7), 1372–1385 (2015)CrossRefGoogle Scholar
  76. 76.
    Liu, D., Zhang, H.: A neural dynamic programming approach for learning control of failure avoidance problems. Int. J. Intell. Control Syst. 10(1), 21–32 (2005)Google Scholar
  77. 77.
    Liu, L., Wang, Z., Zhang, H.: Adaptive fault-tolerant tracking control for MIMO discrete-time systems via reinforcement learning algorithm with less learning parameters. IEEE Trans. Autom. Sci. Eng. 14(1), 299–313 (2017)CrossRefGoogle Scholar
  78. 78.
    Liu, Y.J., Tong, S., Chen, C.L.P., Li, D.J.: Neural controller design-based adaptive control for nonlinear MIMO systems with unknown hysteresis inputs. IEEE Trans. Cybern. 46(1), 9–19 (2016)CrossRefGoogle Scholar
  79. 79.
    Luo, B., Huang, T., Wu, H.N., Yang, X.: Data-driven \(H_{\infty }\) control for nonlinear distributed parameter systems. IEEE Trans. Neural Netw. Learn. Syst. 26(11), 2949–2961 (2015)MathSciNetCrossRefGoogle Scholar
  80. 80.
    Luo, B., Wu, H.N., Huang, T.: Off-policy reinforcement learning for \(H_{\infty }\) control design. IEEE Trans. Cybern. 45(1), 65–76 (2015)CrossRefGoogle Scholar
  81. 81.
    Luo, B., Wu, H.N., Huang, T., Liu, D.: Data-based approximate policy iteration for affine nonlinear continuous-time optimal control design. Automatica 50(12), 3281–3290 (2014)MathSciNetzbMATHCrossRefGoogle Scholar
  82. 82.
    Lv, Y., Na, J., Yang, Q., Wu, X., Guo, Y.: Online adaptive optimal control for continuous-time nonlinear systems with completely unknown dynamics. Int. J. Control 89(1), 99–112 (2016)MathSciNetzbMATHCrossRefGoogle Scholar
  83. 83.
    Lyshevski, S.E.: Nonlinear discrete-time systems: constrained optimization and application of nonquadratic costs. In: Proceedings of American Control Conference, pp. 3699–3703 (1998)Google Scholar
  84. 84.
    Ma, H., Wang, Z., Wang, D., Liu, D., Yan, P., Wei, Q.: Neural-network-based distributed adaptive robust control for a class of nonlinear multiagent systems with time delays and external noises. IEEE Trans. Syst. Man Cybern.: Syst. 46, 750–758 (2016)CrossRefGoogle Scholar
  85. 85.
    Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., Hassabis, D.: Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015)CrossRefGoogle Scholar
  86. 86.
    Modares, H., Lewis, F.L.: Linear quadratic tracking control of partially-unknown continuous-time systems using reinforcement learning. IEEE Trans. Autom. Control 59(11), 3051–3056 (2014)MathSciNetzbMATHCrossRefGoogle Scholar
  87. 87.
    Modares, H., Lewis, F.L.: Optimal tracking control of nonlinear partially-unknown constrained-input systems using integral reinforcement learning. Automatica 50(7), 1780–1792 (2014)MathSciNetzbMATHCrossRefGoogle Scholar
  88. 88.
    Modares, H., Lewis, F.L., Naghibi-Sistani, M.B.: Adaptive optimal control of unknown constrained-input systems using policy iteration and neural networks. IEEE Trans. Neural Netw. Learn. Syst. 24(10), 1513–1525 (2013)CrossRefGoogle Scholar
  89. 89.
    Modares, H., Lewis, F.L., Sistani, M.B.N.: Online solution of nonquadratic two-player zero-sum games arising in the \(H_{\infty }\) control of constrained input systems. Int. J. Adapt. Control Signal Process. 28(3–5), 232–254 (2014)MathSciNetzbMATHCrossRefGoogle Scholar
  90. 90.
    Mu, C., Ni, Z., Sun, C., He, H.: Air-breathing hypersonic vehicle tracking control based on adaptive dynamic programming. IEEE Trans. Neural Netw. Learn. Syst. 28(3), 584–598 (2017)MathSciNetCrossRefGoogle Scholar
  91. 91.
    Mu, C., Sun, C., Song, A., Yu, H.: Iterative GDHP-based approximate optimal tracking control for a class of discrete-time nonlinear systems. Neurocomputing 214, 775–784 (2016)CrossRefGoogle Scholar
  92. 92.
    Mu, C., Sun, C., Wang, D., Song, A., Qian, C.: Decentralized adaptive optimal stabilization of nonlinear systems with matched interconnections. Soft. Comput. 22(8), 2705–2715 (2018)CrossRefGoogle Scholar
  93. 93.
    Mu, C., Tang, Y., He, H.: Improved sliding mode design for load frequency control of power system integrated an adaptive learning strategy. IEEE Trans. Industr. Electron. 64(8), 6742–6751 (2017)CrossRefGoogle Scholar
  94. 94.
    Mu, C., Wang, D.: Neural-network-based adaptive guaranteed cost control of nonlinear dynamical systems with matched uncertainties. Neurocomputing 245, 46–54 (2017)CrossRefGoogle Scholar
  95. 95.
    Mu, C., Wang, D., He, H.: Data-driven finite-horizon approximate optimal control for discrete-time nonlinear systems using iterative HDP approach. IEEE Trans. Cybern. (2017).  https://doi.org/10.1109/TCYB.2017.2752845
  96. 96.
    Mu, C., Wang, D., He, H.: Novel iterative neural dynamic programming for data-based approximate optimal control design. Automatica 81, 240–252 (2017)MathSciNetzbMATHCrossRefGoogle Scholar
  97. 97.
    Murray, J.J., Cox, C.J., Lendaris, G.G., Saeks, R.: Adaptive dynamic programming. IEEE Trans. Syst. Man Cybern.-Part C: Appl. Rev. 32(2), 140–153 (2002)CrossRefGoogle Scholar
  98. 98.
    Na, J., Herrmann, G.: Online adaptive approximate optimal tracking control with simplified dual approximation structure for continuoustime unknown nonlinear systems. IEEE/CAA J. Autom. Sinica 1(4), 412–422 (2014)CrossRefGoogle Scholar
  99. 99.
    Ni, Z., He, H., Zhao, D., Xu, X., Prokhorov, D.V.: GrDHP: a general utility function representation for dual heuristic dynamic programming. IEEE Trans. Neural Netw. Learn. Syst. 26(3), 614–627 (2015)MathSciNetCrossRefGoogle Scholar
  100. 100.
    Nodland, D., Zargarzadeh, H., Jagannathan, S.: Neural network-based optimal adaptive output feedback control of a helicopter UAV. IEEE Trans. Neural Netw. Learn. Syst. 24(7), 1061–1073 (2013)CrossRefGoogle Scholar
  101. 101.
    Padhi, R., Unnikrishnan, N., Wang, X., Balakrishnan, S.N.: A single network adaptive critic (SNAC) architecture for optimal control synthesis for a class of nonlinear systems. Neural Netw. 19(10), 1648–1660 (2006)zbMATHCrossRefGoogle Scholar
  102. 102.
    Prokhorov, D.V., Wunsch, D.C.: Adaptive critic designs. IEEE Trans. Neural Netw. 8(5), 997–1007 (1997)Google Scholar
  103. 103.
    Qin, C., Zhang, H., Luo, Y.: Model-free \(H_{\infty }\) control design for unknown continuous-time linear system by using adaptive dynamic programming. Asian J. Control 18(2), 609–618 (2016)MathSciNetzbMATHCrossRefGoogle Scholar
  104. 104.
    Qin, C., Zhang, H., Wang, Y., Luo, Y.: Neural network-based online \(H_{\infty }\) control for discrete-time affine nonlinear system using adaptive dynamic programming. Neurocomputing 198, 91–99 (2016)CrossRefGoogle Scholar
  105. 105.
    Qiu, J., Wu, Q., Ding, G., Xu, Y.: Feng S (2016) A survey of machine learning for big data processing. EURASIP J. Adv. Signal Process. 1, 67 (2016)CrossRefGoogle Scholar
  106. 106.
    Rantzer, A.: Relaxed dynamic programming in switching systems. IEE Proc.-Control Theory. Appl. 153(5), 567–574 (2006)MathSciNetCrossRefGoogle Scholar
  107. 107.
    Saberi, A.: On optimality of decentralized control for a class of nonlinear interconnected systems. Automatica 24, 101–104 (1988)MathSciNetzbMATHCrossRefGoogle Scholar
  108. 108.
    Sahoo, A., Xu, H., Jagannathan, S.: Neural network-based event-triggered state feedback control of nonlinear continuous-time systems. IEEE Trans. Neural Netw. Learn. Syst. 27(3), 497–509 (2016)MathSciNetCrossRefGoogle Scholar
  109. 109.
    Santiago, R.A., Werbos, P.J.: New progress towards truly brain-like intelligent control. In: Proceedings of the World Congress on Neural Networks, pp. 27–33 (1994)Google Scholar
  110. 110.
    Saridis, G.N., Wang, F.Y.: Suboptimal control of nonlinear stochastic systems. Control Theory Adv. Technol. 10(4), 847–871 (1994)MathSciNetGoogle Scholar
  111. 111.
    Schmidhuber, J.: Deep learning in neural networks: an overview. Neural Netw. 61, 85–117 (2015)CrossRefGoogle Scholar
  112. 112.
    Si, J., Barto, A.G., Powell, W.B., Wunsch, D.C.: Handbook of Learning and Approximate Dynamic Programming. Wiley-IEEE Press, New Jersey (2004)CrossRefGoogle Scholar
  113. 113.
    Si, J., Wang, Y.T.: On-line learning control by association and reinforcement. IEEE Trans. Neural Netw. 12(2), 264–276 (2001)CrossRefGoogle Scholar
  114. 114.
    Siljak, D.D.: Decentralized Control of Complex Systems. Academic Press, Boston, Massachusetts (2012)zbMATHGoogle Scholar
  115. 115.
    Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., Dieleman, S., Grewe, D., Nham, J., Kalchbrenner, N., Sutskever, I., Lillicrap, T., Leach, M., Kavukcuoglu, K., Graepel, T., Hassabis, D.: Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016)CrossRefGoogle Scholar
  116. 116.
    Sokolov, Y., Kozma, R., Werbos, L.D., Werbos, P.J.: Complete stability analysis of a heuristic approximate dynamic programming control design. Automatica 59, 9–18 (2015)MathSciNetzbMATHCrossRefGoogle Scholar
  117. 117.
    Song, R., Lewis, F.L., Wei, Q., Zhang, H.: Off-policy actor-critic structure for optimal control of unknown systems with disturbances. IEEE Trans. Cybern. 46(5), 1041–1050 (2016)CrossRefGoogle Scholar
  118. 118.
    Song, R., Xiao, W., Wei, Q., Sun, C.: Neural-network-based approach to finite-time optimal control for a class of unknown nonlinear systems. Soft. Comput. 18, 1645–1653 (2014)zbMATHCrossRefGoogle Scholar
  119. 119.
    Sun, J., Liu, C., Ye, Q.: Robust differential game guidance laws design for uncertain interceptor-target engagement via adaptive dynamic programming. Int. J. Control 90(5), 990–1004 (2017)MathSciNetzbMATHCrossRefGoogle Scholar
  120. 120.
    Sutton, R.S., Barto, A.G.: Reinforcement Learning - An Introduction. MIT Press, Cambridge, Massachusetts (1998)Google Scholar
  121. 121.
    Tabuada, P.: Event-triggered real-time scheduling of stabilizing control tasks. IEEE Trans. Autom. Control 52(9), 1680–1685 (2007)MathSciNetzbMATHCrossRefGoogle Scholar
  122. 122.
    Tallapragada, P., Chopra, N.: On event triggered tracking for nonlinear systems. IEEE Trans. Autom. Control 58(9), 2343–2348 (2013)MathSciNetzbMATHCrossRefGoogle Scholar
  123. 123.
    Tang, Y., He, H., Ni, Z., Zhong, X., Zhao, D., Xu, X.: Fuzzy-based goal representation adaptive dynamic programming. IEEE Trans. Fuzzy Syst. 24(5), 1159–1175 (2016)CrossRefGoogle Scholar
  124. 124.
    Tang, Y., He, H., Wen, J., Liu, J.: Power system stability control for a wind farm based on adaptive dynamic programming. IEEE Trans. Smart Grid 6(1), 166–177 (2015)CrossRefGoogle Scholar
  125. 125.
    Vamvoudakis, K.G.: Event-triggered optimal adaptive control algorithm for continuous-time nonlinear systems. IEEE/CAA J. Autom. Sinica 1(3), 282–293 (2014)CrossRefGoogle Scholar
  126. 126.
    Vamvoudakis, K.G., Ferraz, H.: Event-triggered H-infinity control for unknown continuous-time linear systems using Q-learning. In: Proceedings of IEEE Conference on Decision and Control, pp. 1376–1381 (2016)Google Scholar
  127. 127.
    Vamvoudakis, K.G., Lewis, F.L.: Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem. Automatica 46(5), 878–888 (2010)MathSciNetzbMATHCrossRefGoogle Scholar
  128. 128.
    Vamvoudakis, K.G., Lewis, F.L.: Online solution of nonlinear two-player zero-sum games using synchronous policy iteration. Int. J. Robust Nonlinear Control 22(13), 1460–1483 (2012)MathSciNetzbMATHCrossRefGoogle Scholar
  129. 129.
    Vamvoudakis, K.G., Miranda, M.F., Hespanha, J.P.: Asymptotically stable adaptive-optimal control algorithm with saturating actuators and relaxed persistence of excitation. IEEE Trans. Neural Netw. Learn. Syst. 27(11), 2386–2398 (2016)MathSciNetCrossRefGoogle Scholar
  130. 130.
    Vrabie, D., Lewis, F.L.: Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems. Neural Netw. 22(3), 237–246 (2009)zbMATHCrossRefGoogle Scholar
  131. 131.
    Vrabie, D., Vamvoudakis, K.G., Lewis, F.L.: Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles. IET, London (2013)zbMATHGoogle Scholar
  132. 132.
    Wang, C., Liu, D., Wei, Q., Zhao, D., Xia, Z.: Iterative adaptive dynamic programming approach to power optimal control for smart grid with energy storage devices. Acta Automatica Sinica 40(9), 1984–1990 (2014)zbMATHGoogle Scholar
  133. 133.
    Wang, D.: Adaptation-oriented near-optimal control and robust synthesis of an overhead crane system. In: Proceedings of 24th International Conference on Neural Information Processing, Part VI, Guangzhou, China, November 2017, pp. 42–50 (2017)Google Scholar
  134. 134.
    Wang, D., He, H., Liu, D.: Adaptive critic nonlinear robust control: a survey. IEEE Trans. Cybern. 47(10), 3429–3451 (2017)CrossRefGoogle Scholar
  135. 135.
    Wang, D., He, H., Mu, C., Liu, D.: Intelligent critic control with disturbance attenuation for affine dynamics including an application to a micro-grid system. IEEE Trans. Industr. Electron. 64(6), 4935–4944 (2017)CrossRefGoogle Scholar
  136. 136.
    Wang, D., He, H., Zhao, B., Liu, D.: Adaptive near-optimal controllers for non-linear decentralised feedback stabilisation problems. IET Control Theory Appl. 11(6), 799–806 (2017)MathSciNetCrossRefGoogle Scholar
  137. 137.
    Wang, D., He, H., Zhong, X., Liu, D.: Event-driven nonlinear discounted optimal regulation involving a power system application. IEEE Trans. Industr. Electron. 64(10), 8177–8186 (2017)
  138.
    Wang, D., Li, C., Liu, D., Mu, C.: Data-based robust optimal control of continuous-time affine nonlinear systems with matched uncertainties. Inf. Sci. 366, 121–133 (2016)
  139.
    Wang, D., Liu, D., Li, H.: Policy iteration algorithm for online design of robust control for a class of continuous-time nonlinear systems. IEEE Trans. Autom. Sci. Eng. 11(2), 627–632 (2014)
  140.
    Wang, D., Liu, D., Li, H., Luo, B., Ma, H.: An approximate optimal control approach for robust stabilization of a class of discrete-time nonlinear systems with uncertainties. IEEE Trans. Syst. Man Cybern.: Syst. 46(5), 713–717 (2016)
  141.
    Wang, D., Liu, D., Li, H., Ma, H.: Neural-network-based robust optimal control design for a class of uncertain nonlinear systems via adaptive dynamic programming. Inf. Sci. 282, 167–179 (2014)
  142.
    Wang, D., Liu, D., Li, H., Ma, H.: Adaptive dynamic programming for infinite horizon optimal robust guaranteed cost control of a class of uncertain nonlinear system. In: Proceedings of American Control Conference, pp. 2900–2905 (2015)
  143.
    Wang, D., Liu, D., Wei, Q., Zhao, D., Jin, N.: Optimal control of unknown nonaffine nonlinear discrete-time systems based on adaptive dynamic programming. Automatica 48(8), 1825–1832 (2012)
  144.
    Wang, D., Liu, D., Zhang, Q., Zhao, D.: Data-based adaptive critic designs for nonlinear robust optimal control with uncertain dynamics. IEEE Trans. Syst. Man Cybern.: Syst. 46(11), 1544–1555 (2016)
  145.
    Wang, D., Mu, C.: A novel neural optimal control framework with nonlinear dynamics: closed-loop stability and simulation verification. Neurocomputing 266, 353–360 (2017)
  146.
    Wang, D., Mu, C.: Developing nonlinear adaptive optimal regulators through an improved neural learning mechanism. Sci. China Inf. Sci. 60(5), 058201:1–058201:3 (2017)
  147.
    Wang, D., Mu, C., He, H., Liu, D.: Adaptive-critic-based event-driven nonlinear robust state feedback. In: Proceedings of 55th IEEE Conference on Decision and Control, pp. 5813–5818 (2016)
  148.
    Wang, D., Mu, C., He, H., Liu, D.: Event-driven adaptive robust control of nonlinear systems with uncertainties through NDP strategy. IEEE Trans. Syst. Man Cybern.: Syst. 47(7), 1358–1370 (2017)
  149.
    Wang, D., Mu, C., Liu, D.: Data-driven nonlinear near-optimal regulation based on iterative neural dynamic programming. Acta Automatica Sinica 43(3), 366–375 (2017)
  150.
    Wang, D., Mu, C., Liu, D., Ma, H.: On mixed data and event driven design for adaptive-critic-based nonlinear \(H_{\infty }\) control. IEEE Trans. Neural Netw. Learn. Syst. 29(4), 993–1005 (2018)
  151.
    Wang, D., Mu, C., Zhang, Q., Liu, D.: Event-based input-constrained nonlinear \(H_{\infty }\) state feedback with adaptive critic and neural implementation. Neurocomputing 214, 848–856 (2016)
  152.
    Wang, F.Y.: Parallel control: a method for data-driven and computational control. Acta Automatica Sinica 39(4), 293–302 (2013)
  153.
    Wang, F.Y., Jin, N., Liu, D., Wei, Q.: Adaptive dynamic programming for finite-horizon optimal control of discrete-time nonlinear systems with \(\varepsilon \)-error bound. IEEE Trans. Neural Netw. 22(1), 24–36 (2011)
  154.
    Wang, F.Y., Zhang, H., Liu, D.: Adaptive dynamic programming: an introduction. IEEE Comput. Intell. Mag. 4(2), 39–47 (2009)
  155.
    Wang, F.Y., Zhang, J.J., Zheng, X., Wang, X., Yuan, Y., Dai, X., Zhang, J., Yang, L.: Where does AlphaGo go: from Church-Turing thesis to AlphaGo thesis and beyond. IEEE/CAA J. Autom. Sinica 3(2), 113–120 (2016)
  156.
    Wang, J., Xu, X., Liu, D., Sun, Z., Chen, Q.: Self-learning cruise control using kernel-based least squares policy iteration. IEEE Trans. Control Syst. Technol. 22(3), 1078–1087 (2014)
  157.
    Wang, Y., Cheng, L., Hou, Z.G., Yu, J., Tan, M.: Optimal formation of multi-robot systems based on a recurrent neural network. IEEE Trans. Neural Netw. Learn. Syst. 27(2), 322–333 (2016)
  158.
    Wang, Z., Liu, D.: A data-based state feedback control method for a class of nonlinear systems. IEEE Trans. Industr. Inf. 9, 2284–2292 (2013)
  159.
    Watkins, C., Dayan, P.: Q-learning. Mach. Learn. 8, 279–292 (1992)
  160.
    Wei, Q., Liu, D.: Adaptive dynamic programming for optimal tracking control of unknown nonlinear systems with application to coal gasification. IEEE Trans. Autom. Sci. Eng. 11(4), 1020–1036 (2014)
  161.
    Wei, Q., Liu, D.: A novel policy iteration based deterministic Q-learning for discrete-time nonlinear systems. Sci. China Inf. Sci. 58(12), 1–15 (2015)
  162.
    Wei, Q., Liu, D.: Data-driven neuro-optimal temperature control of water gas shift reaction using stable iterative adaptive dynamic programming. IEEE Trans. Industr. Electron. 61(11), 6399–6408 (2014)
  163.
    Wei, Q., Liu, D., Yang, X.: Infinite horizon self-learning optimal control of nonaffine discrete-time nonlinear systems. IEEE Trans. Neural Netw. Learn. Syst. 26(4), 866–879 (2015)
  164.
    Werbos, P.J.: Beyond regression: new tools for prediction and analysis in the behavioural sciences. Ph.D. dissertation, Harvard University 29(18), 65–78 (1974)
  165.
    Werbos, P.J.: Advanced forecasting methods for global crisis warning and models of intelligence. Gen. Syst. Yearbook 22, 25–38 (1977)
  166.
    Werbos, P.J.: Building and understanding adaptive systems: a statistical/numerical approach to factory automation and brain research. IEEE Trans. Syst. Man Cybern. 17(1), 7–20 (1987)
  167.
    Werbos, P.J.: Approximate dynamic programming for real-time control and neural modeling. Neural, Fuzzy, and Adaptive Approaches, Handbook of Intelligent Control, pp. 493–526 (1992)
  168.
    Werbos, P.J.: Using ADP to understand and replicate brain intelligence: the next level design. In: Proceedings of the IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning, pp. 209–216 (2007)
  169.
    Werbos, P.J.: ADP: the key direction for future research in intelligent control and understanding brain intelligence. IEEE Trans. Syst. Man. Cybern.-Part B: Cybern. 38(4), 898–900 (2008)
  170.
    Werbos, P.J.: Intelligence in the brain: a theory of how it works and how to build it. Neural Netw. 22(3), 200–212 (2009)
  171.
    Werbos, P.J.: Computational intelligence for the smart grid - history, challenges, and opportunities. IEEE Comput. Intell. Mag. 6, 14–21 (2011)
  172.
    Wu, H.N., Li, M., Guo, L.: Finite-horizon approximate optimal guaranteed cost control of uncertain nonlinear systems with application to Mars entry guidance. IEEE Trans. Neural Netw. Learn. Syst. 26(7), 1456–1467 (2015)
  173.
    Xu, B.: Robust adaptive neural control of flexible hypersonic flight vehicle with dead-zone input nonlinearity. Nonlinear Dyn. 80(3), 1509–1520 (2015)
  174.
    Xu, B., Yang, C., Shi, Z.: Reinforcement learning output feedback NN control using deterministic learning technique. IEEE Trans. Neural Netw. Learn. Syst. 25(3), 635–641 (2014)
  175.
    Xu, X., Hou, Z., Lian, C., He, H.: Online learning control using adaptive critic designs with sparse kernel machines. IEEE Trans. Neural Netw. Learn. Syst. 24(5), 762–775 (2013)
  176.
    Yan, J., He, H., Zhong, X., Tang, Y.: Q-learning based vulnerability analysis of smart grid against sequential topology attacks. IEEE Trans. Inf. Forensics Secur. 12(1), 200–210 (2017)
  177.
    Yan, P., Liu, D., Wang, D., Ma, H.: Data-driven controller design for general MIMO nonlinear systems via virtual reference feedback tuning and neural networks. Neurocomputing 171, 815–825 (2016)
  178.
    Yang, X., Liu, D., Luo, B., Li, C.: Data-based robust adaptive control for a class of unknown nonlinear constrained-input systems via integral reinforcement learning. Inf. Sci. 369, 731–747 (2016)
  179.
    Yang, X., Liu, D., Ma, H., Xu, Y.: Online approximate solution of HJI equation for unknown constrained-input nonlinear continuous-time systems. Inf. Sci. 328, 435–454 (2016)
  180.
    Yang, X., Liu, D., Wei, Q., Wang, D.: Guaranteed cost neural tracking control for a class of uncertain nonlinear systems using adaptive dynamic programming. Neurocomputing 198, 80–90 (2016)
  181.
    Yu, W.: Recent Advances in Intelligent Control Systems. Springer, London (2009)
  182.
    Zargarzadeh, H., Dierks, T., Jagannathan, S.: Optimal control of nonlinear continuous-time systems in strict-feedback form. IEEE Trans. Neural Netw. Learn. Syst. 26(10), 2535–2549 (2015)
  183.
    Zhang, H., Cui, L., Luo, Y.: Near-optimal control for nonzero-sum differential games of continuous-time nonlinear systems using single-network ADP. IEEE Trans. Cybern. 43(1), 206–216 (2013)
  184.
    Zhang, H., Cui, L., Zhang, X., Luo, Y.: Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method. IEEE Trans. Neural Netw. 22(12), 2226–2236 (2011)
  185.
    Zhang, H., Feng, T., Yang, G.H., Liang, H.: Distributed cooperative optimal control for multiagent systems on directed graphs: an inverse optimal approach. IEEE Trans. Cybern. 45(7), 1315–1326 (2015)
  186.
    Zhang, H., Liu, D., Luo, Y., Wang, D.: Adaptive Dynamic Programming for Control: Algorithms and Stability. Springer, London (2013)
  187.
    Zhang, H., Luo, Y., Liu, D.: Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints. IEEE Trans. Neural Netw. 20(9), 1490–1503 (2009)
  188.
    Zhang, H., Qin, C., Jiang, B., Luo, Y.: Online adaptive policy learning algorithm for \(H_{\infty }\) state feedback control of unknown affine nonlinear discrete-time systems. IEEE Trans. Cybern. 44(12), 2706–2718 (2014)
  189.
    Zhang, H., Qin, C., Luo, Y.: Neural-network-based constrained optimal control scheme for discrete-time switched nonlinear system using dual heuristic programming. IEEE Trans. Autom. Sci. Eng. 11(3), 839–849 (2014)
  190.
    Zhang, H., Wei, Q., Liu, D.: An iterative adaptive dynamic programming method for solving a class of nonlinear zero-sum differential games. Automatica 47(1), 207–214 (2011)
  191.
    Zhang, H., Zhang, J., Yang, G.H., Luo, Y.: Leader-based optimal coordination control for the consensus problem of multiagent differential games via fuzzy adaptive dynamic programming. IEEE Trans. Fuzzy Syst. 23, 152–163 (2015)
  192.
    Zhang, H., Zhang, X., Luo, Y., Yang, J.: An overview of research on adaptive dynamic programming. Acta Automatica Sinica 39(4), 303–311 (2013)
  193.
    Zhang, Q., Zhao, D., Wang, D.: Event-based robust control for uncertain nonlinear systems using adaptive dynamic programming. IEEE Trans. Neural Netw. Learn. Syst. 29(1), 37–50 (2018)
  194.
    Zhang, Q., Zhao, D., Zhu, Y.: Event-triggered \(H_{\infty }\) control for continuous-time nonlinear system via concurrent learning. IEEE Trans. Syst. Man Cybern.: Syst. 47(7), 1071–1081 (2017)
  195.
    Zhao, D., Dai, Y., Zhang, Z.: Computational intelligence in urban traffic signal control: a survey. IEEE Trans. Syst. Man. Cybern.-Part C: Appl. Rev. 42, 485–494 (2012)
  196.
    Zhao, D., Liu, D., Yi, J.: An overview on the adaptive dynamic programming based urban city traffic signal optimal control. Acta Automatica Sinica 35(6), 676–681 (2009)
  197.
    Zhao, Q., Xu, H., Jagannathan, S.: Neural network-based finite-horizon optimal control of uncertain affine nonlinear discrete-time systems. IEEE Trans. Neural Netw. Learn. Syst. 26(3), 486–499 (2015)
  198.
    Zhong, X., He, H.: An event-triggered ADP control approach for continuous-time system with unknown internal states. IEEE Trans. Cybern. 47(3), 683–694 (2017)
  199.
    Zhong, X., He, H., Prokhorov, D.V.: Robust controller design of continuous-time nonlinear system using neural network. In: Proceedings of International Joint Conference on Neural Networks, Dallas, pp. 1–8 (2013)
  200.
    Zhong, X., Ni, Z., He, H.: A theoretical foundation of goal representation heuristic dynamic programming. IEEE Trans. Neural Netw. Learn. Syst. 27(12), 2513–2525 (2016)
  201.
    Zhu, Y., Zhao, D., He, H., Ji, J.: Event-triggered optimal control for partially-unknown constrained-input systems via adaptive dynamic programming. IEEE Trans. Industr. Electron. 64(5), 4101–4109 (2017)
  202.
    Zhu, Y., Zhao, D., Li, X.: Iterative adaptive dynamic programming for solving unknown nonlinear zero-sum game based on online data. IEEE Trans. Neural Netw. Learn. Syst. 28(3), 714–725 (2017)

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
  2. School of Electrical and Information Engineering, Tianjin University, Tianjin, China
