Adaptive optimal control for a class of continuous-time affine nonlinear systems with unknown internal dynamics
 Derong Liu,
 Xiong Yang,
 Hongliang Li
Abstract
This paper develops an online algorithm based on policy iteration for optimal control with an infinite-horizon cost for continuous-time nonlinear systems. The method employs a discounted value function, which is considered a more general case for optimal control problems. Without knowledge of the internal system dynamics, the algorithm converges uniformly online to the optimal control, which is the solution of the modified Hamilton–Jacobi–Bellman equation. Using two neural networks, the algorithm finds suitable approximations of both the optimal control and the optimal cost. Uniform convergence to the optimal control is shown, guaranteeing the stability of the nonlinear system. A simulation example is provided to illustrate the effectiveness and applicability of the approach.
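To make the idea of policy iteration with a discounted cost concrete, the following is a minimal sketch on a toy problem. It is not the paper's algorithm: it assumes a scalar linear system dx/dt = a*x + b*u with fully known dynamics, a cost of the form integral of exp(-gamma*t) * (q*x^2 + r*u^2), and a linear policy u = -k*x, so that the value function is V(x) = p*x^2 and no neural networks are needed. The symbols a, b, q, r, gamma, k and p, and all function names, are illustrative assumptions that do not come from the abstract.

```python
import math


def evaluate_policy(k, a, b, q, r, gamma):
    """Policy evaluation: discounted cost coefficient p of u = -k*x.

    Under the toy assumptions, x(t) = x0 * exp((a - b*k) * t), so the
    discounted cost is p * x0**2 with p = (q + r*k**2) / (gamma - 2*(a - b*k)).
    """
    rate = gamma - 2.0 * (a - b * k)
    if rate <= 0.0:
        raise ValueError("gain k gives an unbounded discounted cost")
    return (q + r * k * k) / rate


def improve_policy(p, b, r):
    """Policy improvement: u = -(1/(2r)) * b * dV/dx with V = p*x**2."""
    return b * p / r


def policy_iteration(a, b, q, r, gamma, k0=0.0, tol=1e-12, max_iter=100):
    """Alternate evaluation and improvement until the gain stops changing."""
    k = k0
    for _ in range(max_iter):
        p = evaluate_policy(k, a, b, q, r, gamma)
        k_next = improve_policy(p, b, r)
        if abs(k_next - k) < tol:
            return p, k_next
        k = k_next
    return p, k


def closed_form_p(a, b, q, r, gamma):
    """Positive root of the discounted scalar Riccati equation
    (b**2 / r) * p**2 + (gamma - 2*a) * p - q = 0, used as a reference."""
    s = 2.0 * a - gamma
    return r * (s + math.sqrt(s * s + 4.0 * b * b * q / r)) / (2.0 * b * b)


if __name__ == "__main__":
    p, k = policy_iteration(a=-1.0, b=1.0, q=1.0, r=1.0, gamma=0.5)
    print("policy iteration p:", p, "gain k:", k)
    print("closed-form p:     ", closed_form_p(-1.0, 1.0, 1.0, 1.0, 0.5))
```

In this toy setting the fixed point of the iteration satisfies the scalar discounted Riccati equation, which plays the role that the modified Hamilton–Jacobi–Bellman equation plays in the nonlinear case. The approach described in the abstract differs in that it runs online, does not require the internal dynamics, and approximates the cost and the control with two neural networks.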
 Title
 Adaptive optimal control for a class of continuous-time affine nonlinear systems with unknown internal dynamics
 Journal

Neural Computing and Applications
Volume 23, Issue 7–8, pp 1843–1850
 Cover Date
 2013-12-01
 DOI
 10.1007/s00521-012-1249-y
 Print ISSN
 0941-0643
 Online ISSN
 1433-3058
 Publisher
 Springer London
 Keywords

 Adaptive dynamic programming
 Reinforcement learning
 Policy iteration
 Adaptive optimal control
 Neural network
 Online control
 Nonlinear system
 Authors

 Derong Liu ^{(1)}
 Xiong Yang ^{(1)}
 Hongliang Li ^{(1)}
 Author Affiliations

 1. State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, People’s Republic of China