Reinforcement learning based time-varying formation control for quadrotor unmanned aerial vehicles system with input saturation

Applied Intelligence

Abstract

This paper proposes a reinforcement learning sliding mode strategy for time-varying formation control of a quadrotor unmanned aerial vehicle (UAV) system. Previous research relied on sliding mode control for swarm coordination but encountered significant challenges, including the chattering phenomenon and increased energy consumption. To overcome these issues, this work introduces a reinforcement learning algorithm that uses a critic neural network to approximate the performance index function and an actor neural network to redesign the switching control strategy within the traditional sliding mode framework. In addition, control thresholds are incorporated to keep the system operating smoothly while it executes time-varying formation tasks. Stability is analyzed using the Lyapunov method. The evaluation results indicate that, with the proposed reinforcement learning sliding mode controller, the quadrotor UAV system achieves time-varying formation control, and both the chattering phenomenon and power consumption are significantly reduced.
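The chattering and energy issues mentioned in the abstract can be illustrated with a toy example. The sketch below is not the controller from the paper: it assumes a one-dimensional double integrator in place of the quadrotor formation dynamics, and it uses a hand-picked `tanh` term as a stand-in for the learned actor output. The gains `lam` and `k`, the limit `u_max`, the boundary width and the disturbance are all illustrative assumptions. It compares a discontinuous sign-based switching law with the smooth bounded one, both under a hard input clip.

```python
import math


def simulate(switch, steps=5000, dt=0.001, lam=2.0, k=5.0, u_max=8.0):
    """Regulate a 1-D double integrator x'' = u + d(t) to the origin.

    `switch` maps the sliding variable s to a value in [-1, 1].
    Returns (final position error, control total variation, control effort).
    All numeric values are illustrative assumptions, not taken from the paper.
    """
    x, v = 1.0, 0.0      # initial position and velocity errors
    u_prev = None
    variation = 0.0      # sum of |u_k - u_{k-1}|, a crude chattering measure
    effort = 0.0         # integral of u^2 dt, a crude energy proxy
    for i in range(steps):
        t = i * dt
        s = v + lam * x                     # sliding variable
        u = -lam * v - k * switch(s)        # equivalent term plus switching term
        u = max(-u_max, min(u_max, u))      # hard clip standing in for the input limit
        d = 0.5 * math.sin(2.0 * t)         # bounded disturbance with |d| < k
        v += (u + d) * dt                   # Euler-type integration step
        x += v * dt
        if u_prev is not None:
            variation += abs(u - u_prev)
        effort += u * u * dt
        u_prev = u
    return x, variation, effort


def sign(s):
    """Discontinuous switching law of classical sliding mode control."""
    return (s > 0) - (s < 0)


def smooth(s, width=0.05):
    """Smooth bounded switching term; a hand-picked stand-in for a learned actor."""
    return math.tanh(s / width)


if __name__ == "__main__":
    for name, law in (("sign", sign), ("smooth", smooth)):
        x_end, variation, effort = simulate(law)
        print(f"{name:>6}: |x_end|={abs(x_end):.4f} "
              f"variation={variation:.1f} effort={effort:.1f}")
```

Both laws drive the error close to zero in this toy setting, but the sign-based law flips the control at almost every step once the sliding surface is reached, which inflates both the variation and the effort measures. In the paper the switching term is produced by an actor network trained against a critic rather than fixed by hand, so this comparison only conveys the motivation.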

Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Funding

This work was supported by the National Natural Science Foundation of China under Grant 62203356.

Author information

Contributions

Chi Ma: Data curation, Conceptualization, Methodology, Writing - review & editing. Yizhe Cao: Writing - original draft, Investigation, Formal analysis. Dianbiao Dong: Conceptualization, Methodology, Supervision, Writing - review & editing.

Corresponding author

Correspondence to Dianbiao Dong.

Ethics declarations

Competing interests

The authors declare that they have no personal or financial relationships with other organizations or people that could inappropriately influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ma, C., Cao, Y. & Dong, D. Reinforcement learning based time-varying formation control for quadrotor unmanned aerial vehicles system with input saturation. Appl Intell 53, 28730–28744 (2023). https://doi.org/10.1007/s10489-023-05050-0

  • DOI: https://doi.org/10.1007/s10489-023-05050-0
