Approximate dynamic programming solutions with a single network adaptive critic for a class of nonlinear systems

Ding, Jie; Balakrishnan, S. N.

doi:10.1007/s11768-011-0191-3

Published: 19 July 2011

Volume 9, pages 370–380, (2011)

Journal of Control Theory and Applications

Abstract

Approximate dynamic programming (ADP) formulation implemented with an adaptive critic (AC)-based neural network (NN) structure has evolved as a powerful technique for solving the Hamilton-Jacobi-Bellman (HJB) equations. As interest in ADP and AC solutions grows, there is a pressing need to consider factors that enable their implementation. A typical AC structure consists of two interacting NNs, which is computationally expensive. In this paper, a new architecture, called the ‘cost-function-based single network adaptive critic (J-SNAC)’, is presented, which eliminates one of the networks in a typical AC structure. This approach is applicable to a wide class of nonlinear systems in engineering. To demonstrate the benefits of the J-SNAC and the control synthesis with it, two problems have been solved with both the AC and the J-SNAC approaches. The results show that the J-SNAC saves about 50% of the computational cost while matching the accuracy of the dual network structure in solving for the optimal control. Furthermore, convergence of the J-SNAC iterations, which reduce to a least-squares problem, is discussed; for linear systems, the iterative process is shown to reduce to solving the familiar algebraic Riccati equation.
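The last claim of the abstract can be illustrated with a small numerical sketch. This is not the authors' J-SNAC algorithm: it assumes a discrete-time linear system (the abstract does not say whether the formulation is discrete or continuous), replaces the neural network with a quadratic basis, and uses hypothetical matrices `A`, `B`, `Q`, `R` chosen only for illustration. It keeps a single cost-function approximator (no separate action network), refits its weights by least squares at every iteration, and then checks that the fitted cost matrix satisfies the discrete-time algebraic Riccati equation.

```python
import numpy as np

def quad_features(X):
    # Quadratic basis [x1^2, x1*x2, x2^2] for each row of X (shape N x 2).
    return np.column_stack([X[:, 0] ** 2, X[:, 0] * X[:, 1], X[:, 1] ** 2])

def weights_to_P(w):
    # Symmetric matrix P such that x^T P x = w . quad_features(x).
    return np.array([[w[0], w[1] / 2.0], [w[1] / 2.0, w[2]]])

def critic_only_value_iteration(A, B, Q, R, n_samples=200, n_iters=500, seed=0):
    # Cost-function-only iteration: the control is derived from the current
    # cost estimate, so no second (action) approximator is trained.
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n_samples, 2))  # sampled training states
    Phi = quad_features(X)
    w = np.zeros(3)  # start from J_0(x) = 0
    for _ in range(n_iters):
        P = weights_to_P(w)
        # Greedy control u = -K x minimizing stage cost plus current cost-to-go.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        U = -X @ K.T
        X_next = X @ A.T + U @ B.T
        stage = (np.einsum("ni,ij,nj->n", X, Q, X)
                 + np.einsum("ni,ij,nj->n", U, R, U))
        target = stage + quad_features(X_next) @ w
        # Least-squares refit of the cost weights to the new targets.
        w, *_ = np.linalg.lstsq(Phi, target, rcond=None)
    return weights_to_P(w)

def dare_residual(P, A, B, Q, R):
    # Residual of the discrete-time algebraic Riccati equation at P.
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return Q + A.T @ P @ A - A.T @ P @ B @ K - P

# Hypothetical example: a discretized double integrator (assumed values).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

P_fit = critic_only_value_iteration(A, B, Q, R)
print("max |Riccati residual|:", np.max(np.abs(dare_residual(P_fit, A, B, Q, R))))
```

Because the targets are exactly quadratic in the state for a linear system, each least-squares fit reproduces one step of the Riccati recursion, which is why the residual becomes small. For a nonlinear system the basis would no longer represent the cost exactly, and the fit would only be approximate.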

Author information

Authors and Affiliations

Department of Mechanical and Aerospace Engineering, Missouri University of Science and Technology, Rolla, MO, 65401, USA
Jie Ding & S. N. Balakrishnan

Corresponding author

Correspondence to Jie Ding.

Additional information

This work was supported by the National Aeronautics and Space Administration (NASA) (No. ARMD NRA NNH07ZEA001N-IRAC1) and the National Science Foundation (NSF).

Jie DING is currently working toward the Ph.D. degree at the Department of Mechanical and Aerospace Engineering, Missouri University of Science and Technology.

S. N. BALAKRISHNAN received his Ph.D. degree from the University of Texas, Austin. He is currently a Curators’ Professor of Aerospace Engineering at Missouri University of Science and Technology. His research interests include neural networks, optimal control, and large-scale and impulse systems. His papers on the development of techniques in these areas include applications to missiles, spacecraft, aircraft, robotics, temperature control, and animal population control.

About this article

Cite this article

Ding, J., Balakrishnan, S.N. Approximate dynamic programming solutions with a single network adaptive critic for a class of nonlinear systems. J. Control Theory Appl. 9, 370–380 (2011). https://doi.org/10.1007/s11768-011-0191-3

Received: 06 August 2010
Revised: 21 March 2011
Published: 19 July 2011
Issue Date: August 2011
DOI: https://doi.org/10.1007/s11768-011-0191-3
