Adaptive optimal control for a class of continuous-time affine nonlinear systems with unknown internal dynamics
 Derong Liu,
 Xiong Yang,
 Hongliang Li
This paper develops an online policy iteration algorithm for infinite-horizon optimal control of continuous-time nonlinear systems. The method employs a discounted value function, which is considered a more general case for optimal control problems. Without knowledge of the internal system dynamics, the algorithm converges uniformly online to the optimal control, which is the solution of the modified Hamilton–Jacobi–Bellman equation. Using two neural networks, the algorithm finds suitable approximations of both the optimal control and the optimal cost. Uniform convergence to the optimal control is shown, guaranteeing the stability of the nonlinear system. A simulation example illustrates the effectiveness and applicability of the approach.
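As a rough illustration of the idea summarized above, the sketch below runs policy iteration with a discounted cost on a scalar linear plant, evaluating each policy from measured trajectory data rather than from the drift term. Everything in it is an assumption made for illustration and is not taken from the paper: the plant x' = a*x + b*u, the constants, the quadratic cost, the names `rollout` and `policy_iteration`, and the single scalar weight p (with V(x) = p*x^2) standing in for the two neural networks.

```python
import math

# Hypothetical scalar plant x' = A_TRUE*x + B*u. Only the simulator reads
# A_TRUE; the learner uses B and measured data, mirroring the
# "unknown internal dynamics" setting.
A_TRUE, B = 1.0, 1.0
Q, R, GAMMA = 1.0, 1.0, 0.5   # assumed cost weights and discount rate


def rollout(x0, k, T, n=2000):
    """Apply u = -k*x for T seconds; return x(T) and the discounted cost."""
    dt = T / n
    x, cost = x0, 0.0
    for i in range(n):
        u = -k * x
        # Accumulate exp(-gamma*t) * (Q*x^2 + R*u^2) with a left Riemann sum.
        cost += math.exp(-GAMMA * i * dt) * (Q * x * x + R * u * u) * dt
        # Forward Euler step of the plant (simulator side only).
        x += (A_TRUE * x + B * u) * dt
    return x, cost


def policy_iteration(k0, x0=1.0, T=0.5, iters=10):
    """Alternate data-based policy evaluation and policy improvement."""
    k = k0
    p = 0.0
    for _ in range(iters):
        xT, cost = rollout(x0, k, T)
        # Evaluation: p*x0^2 = cost + exp(-gamma*T) * p * xT^2, solved for p.
        p = cost / (x0 * x0 - math.exp(-GAMMA * T) * xT * xT)
        # Improvement: u = -(1/2) * (1/R) * B * dV/dx = -(B*p/R) * x.
        k = B * p / R
    return p, k


if __name__ == "__main__":
    p, k = policy_iteration(k0=3.0)
    print(f"p = {p:.4f}, k = {k:.4f}")
```

For these constants the discounted Riccati equation 0 = q - gamma*p + 2*a*p - b^2*p^2/r has the positive root p = 2, so the iteration should settle near p = 2 and k = 2, up to Euler discretization error. The initial gain has to keep the discounted cost finite, which here means a - b*k0 < gamma/2.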
 Journal: Neural Computing and Applications
Volume 23, Issue 7–8, pp 1843–1850
 Cover date: 2013-12-01
 DOI: 10.1007/s00521-012-1249-y
 Print ISSN: 0941-0643
 Online ISSN: 1433-3058
 Publisher: Springer London
 Adaptive dynamic programming
 Reinforcement learning
 Policy iteration
 Adaptive optimal control
 Neural network
 Online control
 Nonlinear system
 Derong Liu ^{(1)}
 Xiong Yang ^{(1)}
 Hongliang Li ^{(1)}
 1. State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, People’s Republic of China