Overview

  • Huaguang Zhang
  • Derong Liu
  • Yanhong Luo
  • Ding Wang
Part of the Communications and Control Engineering book series (CCE)

Abstract

In this chapter, a brief introduction to the background and development of adaptive dynamic programming (ADP) is provided. The review begins with the origin of ADP, and the basic structures and algorithms are then described in chronological order. After that, we turn our attention to control problems based on ADP, presented from two aspects: feedback control based on ADP and nonlinear games based on ADP. In each case we mention a few iterative algorithms from the recent literature and point out some open problems.
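The iterative algorithms mentioned above all build on the Bellman recursion V(x) ← min_u [c(x, u) + V(f(x, u))], which ADP methods approximate with function approximators when the state space is large or continuous. A minimal value-iteration sketch on a toy shortest-path problem (illustrative only — the problem and all names are assumptions, not from the chapter) is:

```python
# Toy problem: states 0..4 on a line, actions move left/right,
# unit cost per step, state 4 is an absorbing zero-cost goal.
N_STATES = 5
GOAL = 4

def step(x: int, u: int) -> int:
    """Deterministic dynamics f(x, u); the goal state is absorbing."""
    if x == GOAL:
        return x
    return min(max(x + u, 0), N_STATES - 1)

V = [0.0] * N_STATES
for _ in range(50):  # iterate the Bellman backup until it stops changing
    V_new = [0.0] * N_STATES
    for x in range(N_STATES):
        if x == GOAL:
            continue  # zero terminal cost at the goal
        # Bellman update: V(x) <- min_u [ c(x,u) + V(f(x,u)) ], c = 1 per step
        V_new[x] = min(1.0 + V[step(x, u)] for u in (-1, 1))
    if V_new == V:
        break
    V = V_new

print(V)  # V[x] = steps from x to the goal: [4.0, 3.0, 2.0, 1.0, 0.0]
```

On a finite problem this tabular backup converges exactly; the ADP structures surveyed in the chapter replace the table V with a trainable critic network so the same recursion can be applied to nonlinear continuous-state systems.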

Keywords

Optimal control problem · Cerebellar model articulation controller · Critic network · Approximate dynamic programming · Adaptive critic

Copyright information

© Springer-Verlag London 2013

Authors and Affiliations

  • Huaguang Zhang (1)
  • Derong Liu (2)
  • Yanhong Luo (1)
  • Ding Wang (2)
  1. College of Information Science and Engineering, Northeastern University, Shenyang, People's Republic of China
  2. Institute of Automation, Laboratory of Complex Systems, Chinese Academy of Sciences, Beijing, People's Republic of China
