A neural-network-based iterative GDHP approach for solving a class of nonlinear optimal control problems with control constraints

  • ISNN 2011
  • Neural Computing and Applications

Abstract

In this paper, a novel neural-network-based iterative adaptive dynamic programming (ADP) algorithm is proposed for solving the optimal control problem of a class of nonlinear discrete-time systems with control constraints. By introducing a generalized nonquadratic functional, the iterative ADP algorithm is developed using the globalized dual heuristic programming (GDHP) technique to design the optimal controller, and its convergence is analyzed. Three neural networks are constructed as parametric structures to facilitate the implementation of the iterative algorithm. At each iteration, they approximate the cost function, the optimal control law, and the controlled nonlinear discrete-time system, respectively. A simulation example is provided to verify the effectiveness of the control scheme in solving the constrained optimal control problem.
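The iteration described in the abstract can be illustrated with a minimal sketch. This is not the authors' GDHP implementation: the three neural networks are replaced by a plain state grid with linear interpolation, so it is closer to tabular value iteration. The scalar plant in `step`, the bound `U_BAR`, and the weights `Q` and `R` are hypothetical. The arctanh-based form of the nonquadratic control penalty is a common choice in the constrained-control literature and is assumed here, since the abstract does not state the functional.

```python
import numpy as np

U_BAR = 0.5      # assumed actuator bound, |u| <= U_BAR
Q, R = 1.0, 1.0  # assumed state and control weights


def step(x, u):
    """Hypothetical scalar plant x_{k+1} = 0.8 sin(x_k) + u_k."""
    return 0.8 * np.sin(x) + u


def control_penalty(u):
    """Assumed nonquadratic penalty W(u) = 2 R * int_0^u U_BAR * arctanh(s / U_BAR) ds.

    Closed form: R * U_BAR^2 * [(1 + t) ln(1 + t) + (1 - t) ln(1 - t)], t = u / U_BAR.
    It is zero at u = 0 and nonnegative on the open interval |u| < U_BAR.
    """
    t = u / U_BAR
    return R * U_BAR**2 * ((1 + t) * np.log1p(t) + (1 - t) * np.log1p(-t))


def value_iteration(n_iters=50, n_x=201, n_u=101):
    """Grid-based iteration V_{i+1}(x) = min_u [Q x^2 + W(u) + V_i(x_next)], V_0 = 0."""
    xs = np.linspace(-2.0, 2.0, n_x)
    # Candidate controls stay strictly inside the bound so the logarithms are finite.
    us = np.linspace(-0.999 * U_BAR, 0.999 * U_BAR, n_u)
    X, Uc = np.meshgrid(xs, us, indexing="ij")
    stage_cost = Q * X**2 + control_penalty(Uc)
    x_next = step(X, Uc)

    V = np.zeros(n_x)
    history = [V.copy()]
    for _ in range(n_iters):
        # Evaluate the previous cost estimate at the successor states.
        V_next = np.interp(x_next.ravel(), xs, V).reshape(x_next.shape)
        q_values = stage_cost + V_next
        V = q_values.min(axis=1)
        history.append(V.copy())

    # Greedy control law with respect to the last cost estimate.
    policy = us[q_values.argmin(axis=1)]
    return xs, np.array(history), policy


if __name__ == "__main__":
    xs, history, policy = value_iteration()
    print("largest change in final sweep:", np.abs(history[-1] - history[-2]).max())
    print("largest |u| in greedy policy:", np.abs(policy).max())
```

Starting from a zero cost estimate, each sweep can only raise the estimate, which mirrors the kind of monotone convergence argument usually made for iterative ADP schemes. The control bound holds by construction because only admissible controls are searched; in a neural-network implementation the same effect is typically obtained by giving the action network a bounded output.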

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grants 60874043, 60904037, 60921061, and 61034002, by Beijing Natural Science Foundation under Grant 4102061, and by the National Science Foundation of USA under Grant ECCS-1027602.

Author information

Corresponding author

Correspondence to Ding Wang.

About this article

Cite this article

Wang, D., Liu, D., Zhao, D. et al. A neural-network-based iterative GDHP approach for solving a class of nonlinear optimal control problems with control constraints. Neural Comput & Applic 22, 219–227 (2013). https://doi.org/10.1007/s00521-011-0707-2

  • DOI: https://doi.org/10.1007/s00521-011-0707-2
