Abstract
In industrial process control, there may be multiple performance objectives, depending on salient features of the input-output data. To address this situation, this chapter proposes multiple actor-critic structures that obtain the optimal control of unknown nonlinear systems from input-output data. A shunting inhibitory artificial neural network (SIANN) classifies the input-output data into one of several categories, and a different performance measure function may be defined for each category. Within each category, the optimal control is established by an ADP algorithm consisting of a model module, a critic network, and an action network. A recurrent neural network (RNN) model reconstructs the unknown system dynamics from the input-output data, and the critic and action networks are each approximated by a neural network. It is proven that the model error and the closed-loop system are uniformly ultimately bounded (UUB). Simulation results demonstrate the performance of the proposed optimal control scheme for the unknown nonlinear system.
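The idea of classifying input-output data and then applying a category-specific performance measure can be sketched in a few lines. This is only an illustration under stated assumptions, not the chapter's implementation: a nearest-centroid rule stands in for the SIANN classifier, the utility is assumed to be quadratic, and the names `CENTROIDS`, `CATEGORY_WEIGHTS`, `classify`, and `utility` as well as all numeric values are hypothetical.

```python
import numpy as np

# Hypothetical per-category weights (Q_c, R_c) for an assumed quadratic
# utility U_c(x, u) = x^T Q_c x + u^T R_c u. Values are illustrative only.
CATEGORY_WEIGHTS = {
    0: (np.eye(2), 1.0 * np.eye(1)),
    1: (10.0 * np.eye(2), 0.1 * np.eye(1)),
}

# Stand-in for the SIANN classifier: one centroid per category in the
# joint (state, control) space. A trained SIANN would replace this rule.
CENTROIDS = np.array([
    [0.0, 0.0, 0.0],
    [2.0, 2.0, 1.0],
])


def classify(x, u):
    """Return the index of the centroid nearest to the sample (x, u)."""
    z = np.concatenate([x, u])
    distances = np.linalg.norm(CENTROIDS - z, axis=1)
    return int(np.argmin(distances))


def utility(x, u):
    """Pick the category for (x, u) and evaluate that category's utility."""
    c = classify(x, u)
    Q, R = CATEGORY_WEIGHTS[c]
    return c, float(x @ Q @ x + u @ R @ u)
```

In a full scheme, each category would also own its critic and action networks, so the category index returned here would select which actor-critic pair is trained and applied.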
Copyright information
© 2019 Science Press, Beijing and Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Song, R., Wei, Q., Li, Q. (2019). Multiple Actor-Critic Optimal Control via ADP. In: Adaptive Dynamic Programming: Single and Multiple Controllers. Studies in Systems, Decision and Control, vol 166. Springer, Singapore. https://doi.org/10.1007/978-981-13-1712-5_5
DOI: https://doi.org/10.1007/978-981-13-1712-5_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1711-8
Online ISBN: 978-981-13-1712-5
eBook Packages: Intelligent Technologies and Robotics (R0)