Reinforcement learning based time-varying formation control for quadrotor unmanned aerial vehicles system with input saturation

Ma, Chi; Cao, Yizhe; Dong, Dianbiao

doi:10.1007/s10489-023-05050-0

Published: 12 October 2023

Volume 53, pages 28730–28744, (2023)

Applied Intelligence

Abstract

This paper proposes a reinforcement learning sliding mode strategy for time-varying formation control of a quadrotor unmanned aerial vehicle (UAV) system. Previous research relied on sliding mode control for swarm coordination but encountered significant challenges, including the chattering phenomenon and increased energy consumption. To overcome these issues, this work introduces a reinforcement learning algorithm that uses a critic neural network to approximate the performance index function and an actor neural network to redesign the switching control strategy within the traditional sliding mode framework. In addition, control thresholds are incorporated to keep the system operating smoothly while it executes time-varying formation tasks. Stability is analyzed using the Lyapunov method. The evaluation results indicate that, with the proposed reinforcement learning sliding mode controller, the quadrotor UAV system achieves time-varying formation control. At the same time, chattering and power consumption are significantly reduced.
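
To make the chattering and control-threshold ideas concrete, the following is a minimal sketch and not the authors' controller. It assumes a single-axis double-integrator error model with a bounded sinusoidal disturbance, and it uses a fixed `tanh` term as a stand-in for the learned actor network; the names `simulate`, `sign_switch`, `smooth_switch`, and all gains are hypothetical. It compares a classical sign-based switching law with a smooth one under the same input saturation, using the total variation of the control signal as a rough chattering measure and the integral of the squared control as a rough energy measure.

```python
import math

def saturate(u, u_max):
    """Clip the control input to the threshold [-u_max, u_max]."""
    return max(-u_max, min(u_max, u))

def sign_switch(s):
    """Classical discontinuous switching term of sliding mode control."""
    return (s > 0) - (s < 0)

def smooth_switch(s, eps=0.1):
    """Smooth stand-in for a learned switching term (assumption, not the paper's actor)."""
    return math.tanh(s / eps)

def simulate(switch, steps=4000, dt=0.005, lam=2.0, k=3.0, u_max=2.5):
    """Drive a formation-tracking error e to zero for e'' = u + d(t).

    Returns (final |e|, total variation of u, integral of u^2).
    """
    e, e_dot = 1.0, 0.0            # initial tracking error and its rate
    prev_u = 0.0
    total_variation = 0.0          # rough chattering measure
    energy = 0.0                   # rough control-effort measure
    for i in range(steps):
        t = i * dt
        s = e_dot + lam * e        # sliding surface
        u = saturate(-lam * e_dot - k * switch(s), u_max)
        d = 0.5 * math.sin(2.0 * t)  # assumed bounded disturbance
        e_dot += (u + d) * dt      # semi-implicit Euler integration
        e += e_dot * dt
        total_variation += abs(u - prev_u)
        energy += u * u * dt
        prev_u = u
    return abs(e), total_variation, energy

if __name__ == "__main__":
    for name, switch in (("sign", sign_switch), ("smooth", smooth_switch)):
        err, tv, en = simulate(switch)
        print(f"{name:>6}: |e|={err:.4f}  variation={tv:.1f}  energy={en:.1f}")
```

In this toy setting both laws drive the error close to zero, but the sign-based law keeps switching between the saturation limits once the sliding surface is reached, which inflates both measures. A fixed smoothing term only illustrates why redesigning the switching control can help; the paper's approach learns that term with actor and critic networks.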

Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Funding

This work was supported by the National Natural Science Foundation of China under Grant 62203356.

Author information

Authors and Affiliations

School of Mechanical Engineering, Northwestern Polytechnical University, Xi’an, 710072, China
Chi Ma, Yizhe Cao & Dianbiao Dong

Contributions

Chi Ma: Data curation, Conceptualization, Methodology, Writing - review & editing. Yizhe Cao: Writing - original draft, Investigation, Formal analysis. Dianbiao Dong: Conceptualization, Methodology, Supervision, Writing - review & editing.

Corresponding author

Correspondence to Dianbiao Dong.

Ethics declarations

Competing interests

The authors declare that they have no personal or financial relationships with other organizations or people that could inappropriately influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ma, C., Cao, Y. & Dong, D. Reinforcement learning based time-varying formation control for quadrotor unmanned aerial vehicles system with input saturation. Appl Intell 53, 28730–28744 (2023). https://doi.org/10.1007/s10489-023-05050-0

Accepted: 26 September 2023
Published: 12 October 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s10489-023-05050-0
