
Finite horizon optimal control of discrete-time nonlinear systems with unfixed initial state using adaptive dynamic programming

Journal of Control Theory and Applications

Abstract

In this paper, we solve the finite horizon optimal control problem for a class of discrete-time nonlinear systems with unfixed initial state using the adaptive dynamic programming (ADP) approach. A new ɛ-optimal control algorithm based on iterative ADP is proposed, which makes the performance index function converge iteratively, within a finite number of iterations, to within an error ɛ of the greatest lower bound of all performance indices. For the situation where the initial state of the system is unfixed, the proposed ɛ-optimal control algorithm also yields the optimal number of control steps. To facilitate the implementation of the algorithm, neural networks are used to approximate the performance index function and to compute the optimal control policy, respectively. Finally, a simulation example is given to illustrate the effectiveness of the proposed method.
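
The abstract describes the iterative ADP scheme only at a high level. As a rough, hypothetical illustration of the kind of iteration it refers to, the sketch below runs plain value iteration with an ɛ stopping test on a made-up scalar affine system; the dynamics f and g, the state and control grids, the weights Q and R, and the tolerance eps are all assumptions chosen for illustration and are not taken from the paper, which represents the performance index function and the control law with neural networks rather than a tabular grid.

```python
import numpy as np

# Hypothetical scalar affine system x_{k+1} = f(x_k) + g(x_k) * u_k
f = lambda x: 0.8 * np.sin(x)          # drift term (assumed for illustration)
g = lambda x: 1.0                      # input gain (assumed for illustration)
Q, R = 1.0, 1.0                        # weights of the quadratic utility
U = lambda x, u: Q * x**2 + R * u**2   # one-step utility U(x_k, u_k)

xs = np.linspace(-2.0, 2.0, 201)       # discretized state space (stand-in for a critic network)
us = np.linspace(-1.5, 1.5, 61)        # finite set of admissible controls
eps = 1e-4                             # epsilon in the stopping criterion

V = np.zeros_like(xs)                  # V_0(x) = 0 for all x
for i in range(500):
    V_next = np.empty_like(V)
    for j, x in enumerate(xs):
        x_succ = f(x) + g(x) * us      # successor state for every candidate control
        # V_{i+1}(x) = min_u [ U(x, u) + V_i(f(x) + g(x) u) ], with V_i read off the grid
        V_next[j] = np.min(U(x, us) + np.interp(x_succ, xs, V))
    if np.max(np.abs(V_next - V)) <= eps:   # max_x |V_{i+1}(x) - V_i(x)| <= epsilon
        print(f"epsilon criterion satisfied after {i + 1} iterations")
        V = V_next
        break
    V = V_next
```

Roughly speaking, the iteration index at which the ɛ criterion is first satisfied plays the role of the finite horizon: the abstract's statement that the algorithm also yields the optimal number of control steps for an unfixed initial state refers to determining this index from the iterated performance index functions rather than fixing it in advance.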



Author information

Corresponding author

Correspondence to Qinglai Wei.

Additional information

This work was partly supported by the National Natural Science Foundation of China (Nos. 60904037, 60921061, 61034002) and the Beijing Natural Science Foundation (No. 4102061).

Qinglai WEI received his B.S. degree in Automation and his M.S. and Ph.D. degrees in Control Theory and Control Engineering from Northeastern University, Shenyang, China, in 2002, 2005, and 2008, respectively. He is currently a postdoctoral fellow with the Key Laboratory of Complex Systems and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, Beijing, China. His research interests include neural-networks-based control, nonlinear control, adaptive dynamic programming, and their industrial applications.

Derong LIU received his Ph.D. degree in Electrical Engineering from the University of Notre Dame, Notre Dame, IN, in 1994. Dr. Liu was a staff fellow with General Motors Research and Development Center, Warren, MI, from 1993 to 1995. He was an assistant professor in the Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ, from 1995 to 1999. He joined the University of Illinois at Chicago in 1999, where he became a full professor of Electrical and Computer Engineering and of Computer Science in 2006. He was selected for the ‘100 Talents Program’ by the Chinese Academy of Sciences in 2008.

Currently, Dr. Liu is the editor-in-chief of the IEEE Transactions on Neural Networks and an associate editor of several other journals. He received the Michael J. Birck Fellowship from the University of Notre Dame (1990), the Harvey N. Davis Distinguished Teaching Award from Stevens Institute of Technology (1997), the Faculty Early Career Development (CAREER) Award from the National Science Foundation (1999), the University Scholar Award from the University of Illinois (2006), and the Overseas Outstanding Young Scholar Award from the National Natural Science Foundation of China (2008).


About this article

Cite this article

Wei, Q., Liu, D. Finite horizon optimal control of discrete-time nonlinear systems with unfixed initial state using adaptive dynamic programming. J. Control Theory Appl. 9, 381–390 (2011). https://doi.org/10.1007/s11768-011-0181-5
