Part of the book series: Communications and Control Engineering ((CCE))

Abstract

This chapter provides a brief introduction to the background and development of adaptive dynamic programming (ADP). The review begins with the origin of ADP, and the basic structures and algorithms are surveyed in chronological order. We then turn our attention to control problems based on ADP, addressing two aspects: feedback control based on ADP and non-linear games based on ADP. In each case, we mention a few iterative algorithms from the recent literature and point out some open problems.
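As a generic illustration of the iterative idea behind ADP (this sketch is not taken from the chapter, and all system parameters below are hypothetical), value iteration can be run on a scalar linear-quadratic problem x_{k+1} = a x_k + b u_k with stage cost q x^2 + r u^2. The quadratic value function V_i(x) = p_i x^2 is updated by the Bellman recursion until p_i converges to the fixed point of the discrete Riccati equation:

```python
# Minimal sketch of ADP-style value iteration on a scalar linear system
# x_{k+1} = a*x_k + b*u_k with stage cost q*x^2 + r*u^2 (hypothetical example).
# The quadratic value function V_i(x) = p_i * x^2 is iterated via the
# Bellman recursion until p_i reaches the Riccati fixed point.

def value_iteration(a, b, q, r, iters=200):
    p = 0.0  # V_0(x) = 0: start from the zero value function
    for _ in range(iters):
        denom = r + b * b * p
        k = a * b * p / denom                           # greedy feedback gain, u = -k*x
        p = q + a * a * p - (a * b * p) ** 2 / denom    # Bellman (value-iteration) update
    return p, k

# Unstable open-loop plant (a = 1.1); the learned gain should stabilize it.
p, k = value_iteration(a=1.1, b=1.0, q=1.0, r=1.0)
print(abs(1.1 - 1.0 * k) < 1.0)  # closed-loop |a - b*k| < 1, i.e. stabilizing
```

The recursion mirrors the ADP principle the chapter surveys: the cost (critic) update and the greedy policy (actor) update alternate, without ever solving the optimality equation in closed form.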




Copyright information

© 2013 Springer-Verlag London

About this chapter

Cite this chapter

Zhang, H., Liu, D., Luo, Y., Wang, D. (2013). Overview. In: Adaptive Dynamic Programming for Control. Communications and Control Engineering. Springer, London. https://doi.org/10.1007/978-1-4471-4757-2_1

  • DOI: https://doi.org/10.1007/978-1-4471-4757-2_1

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-4756-5

  • Online ISBN: 978-1-4471-4757-2

  • eBook Packages: Engineering (R0)
