Indirect adaptive fuzzy-regulated optimal control for unknown continuous-time nonlinear systems

Zhang, Haiyun; Meng, Deyuan; Wang, Jin; Lu, Guodong

doi:10.1631/FITEE.1900610

面向未知连续非线性系统的间接自适应模糊规划最优控制方法

Published: 08 January 2021

Volume 22, pages 155–169, (2021)

Frontiers of Information Technology & Electronic Engineering

Abstract

We present a novel indirect adaptive fuzzy-regulated optimal control scheme for continuous-time nonlinear systems with unknown dynamics, mismatches, and disturbances. First, the Hamilton-Jacobi-Bellman (HJB) equation associated with the performance function is derived for the original nonlinear system. Unlike existing adaptive dynamic programming (ADP) approaches, the scheme uses a special non-quadratic variable performance function as the reinforcement medium in the actor-critic architecture. An adaptive fuzzy-regulated critic structure is constructed to configure the weighting matrix of the performance function so as to approximate and balance the HJB equation. A concurrent self-organizing learning technique adaptively updates the critic weights. Based on this critic, an adaptive optimal feedback controller is developed as the actor, with a new form of augmented Riccati equation, to optimize the fuzzy-regulated variable performance function in real time. The result is an online indirect adaptive optimal control mechanism, implemented as an actor-critic structure, that adapts both the optimal cost and the optimal control policy in continuous time. Convergence and closed-loop stability of the proposed scheme are proved. Simulation examples and comparisons show the effectiveness and advantages of the proposed method.
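For orientation, the HJB equation mentioned in the abstract can be written out in its standard textbook form. The abstract gives neither the system model nor the performance function, so everything below is an assumption: an input-affine system with drift $f$, input gain $g$, a fixed state penalty $Q(x)$, and a constant control weight $R$. It is not the paper's non-quadratic variable performance function, whose weighting matrix is adjusted by the fuzzy-regulated critic.

```latex
% Assumed input-affine dynamics and a fixed (textbook) performance function
\dot{x} = f(x) + g(x)\,u, \qquad
V(x(t)) = \int_t^{\infty} \bigl( Q(x) + u^{\top} R\, u \bigr)\, d\tau

% HJB equation for the optimal value function V^*
0 = \min_{u} \Bigl[ Q(x) + u^{\top} R\, u
      + (\nabla V^{*})^{\top} \bigl( f(x) + g(x)\,u \bigr) \Bigr]

% Minimizing control, obtained by setting the gradient with respect to u to zero
u^{*} = -\tfrac{1}{2}\, R^{-1} g(x)^{\top} \nabla V^{*}

% HJB equation after substituting u^*
0 = Q(x) + (\nabla V^{*})^{\top} f(x)
    - \tfrac{1}{4}\, (\nabla V^{*})^{\top} g(x)\, R^{-1} g(x)^{\top} \nabla V^{*}
```

In this textbook setting the critic approximates $V^{*}$ and the actor implements $u^{*}$. For linear dynamics and quadratic $Q$, the last equation reduces to the algebraic Riccati equation.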

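Abstract (translated from the Chinese)

For continuous-time nonlinear systems with unknown dynamics, mismatches, and disturbances, a new indirect adaptive fuzzy-regulated optimal control scheme is proposed. First, the Hamilton-Jacobi-Bellman (HJB) equation of the nonlinear system and its matching performance function are established. Unlike existing adaptive dynamic programming (ADP) methods, the proposed scheme uses a special non-quadratic variable performance function as the reinforcement medium within the actor-critic architecture. An adaptive fuzzy-regulated critic structure is constructed to configure the weighting matrix of the performance function in order to approximate and balance the nonlinear HJB equation. A concurrent self-organizing learning technique is designed to adaptively update the critic weights. On this basis, an adaptive optimal feedback controller with a new form of augmented Riccati equation is proposed as the actor, optimizing the fuzzy-regulated performance function in real time. The resulting actor-critic architecture yields an online indirect adaptive optimal control mechanism that adjusts both the optimal cost function and the optimal control policy continuously in real time. The convergence and closed-loop stability of the method are proved. Finally, simulations and comparisons demonstrate the effectiveness and reliability of the proposed scheme.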
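The alternation between a critic (cost evaluation) and an actor (policy update) described in the abstract can be illustrated on a much simpler problem. The sketch below is not the authors' fuzzy-regulated scheme: it is classical model-based policy iteration (Kleinman's algorithm) on a known scalar linear system with a fixed quadratic cost. All names and parameter values (`a`, `b`, `q`, `r`, `k0`) are hypothetical and chosen for illustration only.

```python
import math


def evaluate_policy(a, b, q, r, k):
    """Critic step: cost coefficient p of the policy u = -k*x, with V(x) = p*x**2."""
    closed_loop = a - b * k
    if closed_loop >= 0:
        raise ValueError("gain k must stabilize the system (a - b*k < 0)")
    # Scalar Lyapunov equation: 2*(a - b*k)*p + q + r*k**2 = 0
    return (q + r * k**2) / (-2.0 * closed_loop)


def improve_policy(b, r, p):
    """Actor step: gain that minimizes the Hamiltonian for the current critic p."""
    return b * p / r


def policy_iteration(a, b, q, r, k0, iterations=10):
    """Alternate critic and actor steps, starting from a stabilizing gain k0."""
    k = k0
    for _ in range(iterations):
        p = evaluate_policy(a, b, q, r, k)
        k = improve_policy(b, r, p)
    return p, k


def riccati_solution(a, b, q, r):
    """Closed-form positive root of the scalar Riccati equation 2*a*p - (b*p)**2/r + q = 0."""
    return r * (a + math.sqrt(a**2 + b**2 * q / r)) / b**2


if __name__ == "__main__":
    # Assumed toy system dx/dt = x + u with cost integrand x**2 + u**2
    p, k = policy_iteration(a=1.0, b=1.0, q=1.0, r=1.0, k0=2.0)
    print(p, k, riccati_solution(1.0, 1.0, 1.0, 1.0))
```

Each pass evaluates the current gain and then improves it, and the iterates approach the Riccati solution. The scheme in the abstract differs in that the dynamics are unknown, the critic is tuned online, and the performance function itself is varied.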

Author information

Authors and Affiliations

State Key Laboratory of Fluid Power and Mechatronic Systems, Zhejiang University, Hangzhou, 310027, China
Haiyun Zhang (张海运), Jin Wang (王进) & Guodong Lu (陆国栋)
Department of Mechatronic Engineering, China University of Mining and Technology, Xuzhou, 221116, China
Haiyun Zhang (张海运) & Deyuan Meng (孟德远)

Contributions

Haiyun ZHANG and Jin WANG designed and conducted the research. Deyuan MENG processed the data. Haiyun ZHANG drafted the manuscript. Guodong LU helped organize the manuscript. Haiyun ZHANG and Jin WANG revised and finalized the paper.

Corresponding author

Correspondence to Jin Wang (王进).

Ethics declarations

Haiyun ZHANG, Deyuan MENG, Jin WANG, and Guodong LU declare that they have no conflict of interest.

Additional information

Project supported by the National Natural Science Foundation of China (Nos. 51805531 and 51675470), the Natural Science Foundation of Jiangsu Province, China (No. BK20150200), the Key R&D Program of Zhejiang Province, China (No. 2020C01026), and the China Postdoctoral Science Foundation (No. 2020M671706).

About this article

Cite this article

Zhang, H., Meng, D., Wang, J. et al. Indirect adaptive fuzzy-regulated optimal control for unknown continuous-time nonlinear systems. Front Inform Technol Electron Eng 22, 155–169 (2021). https://doi.org/10.1631/FITEE.1900610

Received: 11 November 2019
Accepted: 27 March 2020
Published: 08 January 2021
Issue Date: February 2021
DOI: https://doi.org/10.1631/FITEE.1900610

CLC number

TP13
