Multiple Actor-Critic Optimal Control via ADP


Part of the book series: Studies in Systems, Decision and Control (SSDC, volume 166)

Abstract

In industrial process control there may be multiple performance objectives, depending on salient features of the input-output data. To address this situation, this chapter proposes a multiple actor-critic structure that obtains the optimal control from input-output data for unknown nonlinear systems. A shunting inhibitory artificial neural network (SIANN) classifies the input-output data into one of several categories, and a different performance measure function may be defined for each category. Within each category, an ADP algorithm consisting of a model module, a critic network, and an action network is used to derive the optimal control. A recurrent neural network (RNN) model reconstructs the unknown system dynamics from the input-output data, and neural networks implement the critic and action networks. It is proven that the model error and the closed-loop states of the unknown system are uniformly ultimately bounded (UUB). Simulation results demonstrate the performance of the proposed optimal control scheme for an unknown nonlinear system.
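To make the structure concrete, the following is a minimal sketch (not the authors' code) of how such a multiple actor-critic scheme can be organized: a classifier routes each data point to one of several categories, each category carries its own utility function and its own critic/actor pair, and the critic and actor are updated with simple HDP-style steps. The scalar plant, the threshold classifier, the quadratic critic, and the linear actor used here are illustrative stand-ins for the SIANN classifier, the RNN model, and the neural-network critic and action networks described in the chapter.

    # Conceptual sketch (hypothetical, not the chapter's implementation):
    # route data to one of several actor-critic pairs, each with its own
    # utility function, and update each pair with HDP-style steps.
    import numpy as np

    rng = np.random.default_rng(0)

    # Stand-in "model module": a scalar nonlinear plant (would be an RNN
    # model identified from input-output data in the chapter).
    def plant(x, u):
        return 0.8 * np.sin(x) + u

    # Stand-in classifier: pick a category from a salient feature of the data
    # (the chapter uses an SIANN for this step).
    def classify(x):
        return 0 if abs(x) < 1.0 else 1

    # Per-category utility (performance measure) functions.
    utilities = [
        lambda x, u: x**2 + u**2,            # category 0: balanced cost
        lambda x, u: 5.0 * x**2 + 0.1 * u**2 # category 1: aggressive regulation
    ]

    gamma, lr = 0.95, 0.05
    # Critic of category i: V_i(x) ~ w[i] * x^2; actor: u_i(x) = -k[i] * x.
    w = np.ones(2)
    k = 0.1 * np.ones(2)

    x = 2.0
    for step in range(2000):
        i = classify(x)                 # route the data to category i
        u = -k[i] * x                   # action from the category-i actor
        x_next = plant(x, u)
        U = utilities[i](x, u)

        # Critic update: temporal-difference step on V_i.
        td = w[i] * x**2 - (U + gamma * w[i] * x_next**2)
        w[i] -= lr * td * x**2

        # Actor update: finite-difference step to reduce U + gamma * V_i(x_next).
        eps = 1e-3
        u_p = -(k[i] + eps) * x
        J_p = utilities[i](x, u_p) + gamma * w[i] * plant(x, u_p)**2
        J_0 = U + gamma * w[i] * x_next**2
        k[i] -= lr * (J_p - J_0) / eps

        x = x_next if abs(x_next) < 10 else rng.uniform(-2, 2)

    print("critic weights:", w, "actor gains:", k)

Replacing these stand-ins with an SIANN classifier, an RNN model trained from input-output data, and neural-network critic and action networks recovers the overall structure of the proposed scheme.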



Author information


Corresponding author

Correspondence to Ruizhuo Song.


Copyright information

© 2019 Science Press, Beijing and Springer Nature Singapore Pte Ltd.

About this chapter


Cite this chapter

Song, R., Wei, Q., Li, Q. (2019). Multiple Actor-Critic Optimal Control via ADP. In: Adaptive Dynamic Programming: Single and Multiple Controllers. Studies in Systems, Decision and Control, vol 166. Springer, Singapore. https://doi.org/10.1007/978-981-13-1712-5_5
