Abstract
This paper proposes a reinforcement learning sliding mode strategy for time-varying formation control of a quadrotor unmanned aerial vehicle (UAV) system. Previous research relied on sliding mode control for swarm coordination but encountered significant challenges, notably the chattering phenomenon and increased energy consumption. To overcome these issues, this work introduces a reinforcement learning algorithm that uses a critic neural network to approximate the performance index function and an actor neural network to redesign the switching control strategy within the traditional sliding mode framework. In addition, control thresholds are incorporated so that the system operates smoothly when executing time-varying formation tasks. Stability is analyzed using the Lyapunov method. The evaluation results indicate that, with the proposed reinforcement learning sliding mode controller, the quadrotor UAV system achieves time-varying formation control, while chattering and power consumption are significantly reduced.
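To make the chattering and energy issue concrete, the following is a minimal sketch and not the controller proposed in the paper. It assumes a one-dimensional double integrator with a bounded sinusoidal disturbance, and it swaps in a classical tanh boundary layer in place of the learned actor network, purely to illustrate why replacing the discontinuous switching term can reduce chattering and control effort. The function name `simulate`, the gains `k` and `lam`, the bound `u_max`, and the boundary-layer width 0.05 are all hypothetical choices made for this illustration.

```python
import numpy as np

def simulate(switch, k=1.0, lam=2.0, u_max=5.0, dt=1e-3, t_end=5.0):
    """Regulate x'' = u + d to the origin with a sliding mode law.

    `switch` maps the sliding variable s to the switching signal.
    All parameters are illustrative assumptions, not values from the paper.
    """
    x, v = 1.0, 0.0                      # initial position and velocity
    n = int(round(t_end / dt))
    u_hist = np.empty(n)
    for i in range(n):
        t = i * dt
        s = v + lam * x                  # sliding variable
        u = -lam * v - k * switch(s)     # equivalent term plus switching term
        # Simple stand-in for a control threshold (input bound); with these
        # gains the bound is not expected to dominate the response.
        u = float(np.clip(u, -u_max, u_max))
        d = 0.2 * np.sin(2.0 * t)        # bounded disturbance, |d| < k
        v += (u + d) * dt                # semi-implicit Euler integration
        x += v * dt
        u_hist[i] = u
    chatter = float(np.sum(np.abs(np.diff(u_hist))))  # total variation of u
    energy = float(np.sum(u_hist ** 2) * dt)          # integral of u^2
    return abs(x), chatter, energy

# Discontinuous switching (classical sliding mode control).
err_sign, chatter_sign, energy_sign = simulate(np.sign)
# Smooth switching via a tanh boundary layer (assumed substitute technique).
err_tanh, chatter_tanh, energy_tanh = simulate(lambda s: np.tanh(s / 0.05))

print(f"sign: error={err_sign:.4f} chatter={chatter_sign:.1f} energy={energy_sign:.3f}")
print(f"tanh: error={err_tanh:.4f} chatter={chatter_tanh:.1f} energy={energy_tanh:.3f}")
```

In this toy setting both laws drive the state close to the origin, but the discontinuous law keeps switching at nearly every step once the sliding surface is reached, which shows up as a much larger total variation and energy of the control signal. The smooth law trades a small steady-state offset for a far calmer input, which is the kind of trade-off the redesigned switching strategy in the abstract targets.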
Data Availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Funding
This work was supported by the National Natural Science Foundation of China under Grant 62203356.
Author information
Contributions
Chi Ma: Data curation, Conceptualization, Methodology, Writing - review & editing. Yizhe Cao: Writing - original draft, Investigation, Formal analysis. Dianbiao Dong: Conceptualization, Methodology, Supervision, Writing - review & editing.
Ethics declarations
Competing interests
The authors declare that they have no personal or financial relationships with other organizations or people that could inappropriately influence the work reported in this paper.
About this article
Cite this article
Ma, C., Cao, Y. & Dong, D. Reinforcement learning based time-varying formation control for quadrotor unmanned aerial vehicles system with input saturation. Appl Intell 53, 28730–28744 (2023). https://doi.org/10.1007/s10489-023-05050-0
DOI: https://doi.org/10.1007/s10489-023-05050-0