
Knowledge-based reinforcement learning controller with fuzzy-rule network: experimental validation

  • Original Article
  • Published in Neural Computing and Applications

Abstract

A model-free controller for a general class of output-feedback nonlinear discrete-time systems is established by action–critic networks and reinforcement learning with human knowledge expressed as IF–THEN rules. The action network is designed as a single-input fuzzy-rules emulated network (FREN) whose IF–THEN rules encode the relation between the control effort and the plant output, such as "IF the output is high, THEN the control effort should be reduced." The critic network is constructed as a multi-input FREN (MiFREN) that estimates an unknown long-term cost function. The IF–THEN rules for the MiFREN are defined from general knowledge of optimization, such as "IF the quadratic values of the control effort and the tracking error are high, THEN the cost function should be high." Convergence of the tracking error and boundedness of the closed-loop signals are guaranteed by the Lyapunov direct method under general assumptions that are reasonable for practical plants. A computer simulation is first provided to demonstrate the design method and the performance of the proposed controller. Furthermore, an experiment on a prototype DC-motor current-control system demonstrates the effectiveness of the control scheme.
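The architecture summarized above can be illustrated with a minimal sketch: a single-input FREN acts on the tracking error, a two-input MiFREN estimates the long-term cost from the squared error and squared control effort, and both are adapted online from data. This is only a hedged illustration under assumed details; the plant model, rule centers, membership widths, stage cost, and learning rates below are hypothetical and are not the paper's exact update laws.

```python
import numpy as np

def mu_gauss(x, centers, width):
    """Gaussian membership values, one per IF-THEN rule antecedent."""
    return np.exp(-((x - centers) ** 2) / (2.0 * width ** 2))

class FREN:
    """Single-input fuzzy-rules emulated network: normalized rule firing
    strengths weighted by adjustable consequent parameters."""
    def __init__(self, centers, weights, width):
        self.c = np.asarray(centers, dtype=float)
        self.w = np.asarray(weights, dtype=float)
        self.width = width

    def phi(self, x):
        m = mu_gauss(x, self.c, self.width)
        return m / (m.sum() + 1e-12)

    def __call__(self, x):
        return float(self.phi(x) @ self.w)

class MiFREN:
    """Multi-input FREN: rule strengths from the outer product of the
    per-input memberships; used here to approximate the long-term cost."""
    def __init__(self, c1, c2, width):
        self.c1 = np.asarray(c1, dtype=float)
        self.c2 = np.asarray(c2, dtype=float)
        self.width = width
        self.w = np.zeros((len(c1), len(c2)))

    def phi(self, x1, x2):
        m = np.outer(mu_gauss(x1, self.c1, self.width),
                     mu_gauss(x2, self.c2, self.width))
        return m / (m.sum() + 1e-12)

    def __call__(self, x1, x2):
        return float((self.phi(x1, x2) * self.w).sum())

# Action network: rule base ordered so a large positive tracking error
# (output far below the reference) yields a large control effort.
actor = FREN(centers=[-2, -1, 0, 1, 2], weights=[-2, -1, 0, 1, 2], width=1.0)
# Critic network: inputs are e^2 and u^2, following the rule "IF the
# quadratic values ... are high THEN the cost function should be high".
critic = MiFREN(c1=[0.0, 1.0, 4.0], c2=[0.0, 1.0, 4.0], width=1.5)

gamma, lr_c, lr_a, delta = 0.9, 0.05, 0.002, 0.1
y, r = 0.0, 1.0                      # plant output and constant reference
errors = []
for k in range(80):
    e = r - y
    errors.append(e)
    u = actor(e)
    y_next = 0.6 * y + 0.5 * u       # hypothetical first-order plant
    e_next = r - y_next
    stage = e * e + 0.1 * u * u      # assumed one-step (utility) cost
    # Temporal-difference update of the critic's long-term cost estimate.
    td = (stage + gamma * critic(e_next ** 2, actor(e_next) ** 2)
          - critic(e ** 2, u ** 2))
    critic.w += lr_c * td * critic.phi(e ** 2, u ** 2)
    # Nudge the actor down the critic's estimated cost gradient
    # (finite-difference approximation of dJ/du).
    dJ_du = (critic(e ** 2, (u + delta) ** 2) - critic(e ** 2, u ** 2)) / delta
    actor.w -= lr_a * dJ_du * actor.phi(e)
    y = y_next

print(abs(errors[0]), abs(errors[-1]))
```

With these illustrative gains the closed loop is stable and the tracking error shrinks from its initial value, while the critic's weights grow toward a consistent cost estimate; the small actor learning rate keeps the online adaptation from destabilizing the loop.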





Acknowledgements

This research was supported by Fundamental Research Funds for CINVESTAV-IPN 2017 and Mexican Research Organization CONACyT Grant # 257253.

Author information

Corresponding author

Correspondence to Chidentree Treesatayapun.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Treesatayapun, C. Knowledge-based reinforcement learning controller with fuzzy-rule network: experimental validation. Neural Comput & Applic 32, 9761–9775 (2020). https://doi.org/10.1007/s00521-019-04509-x

