
Robust flight control system design of a fixed wing UAV using optimal dynamic programming


The innovative and intricate designs of new-generation UAVs call for control laws built on intelligent techniques that do not depend on the underlying dynamic model and remain robust to a changing environment. This research presents a novel control architecture for maximizing the glide range of a UAV with an unconventional design. To handle the control complexities arising from this unique design, a distinct RL technique named ‘optimal dynamic programming’ is proposed; it is computationally acceptable and effectively controls the UAV over its entire flight regime. The methodology has been specifically modified to pose the problem in continuous state and control spaces. The results and performance characteristics demonstrate the algorithm's ability to adapt dynamically to a changing environment, making it suitable for UAV applications. Nonlinear simulations performed under different environmental conditions demonstrate the effectiveness of the proposed methodology over conventional classical approaches.
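The idea behind dynamic programming for glide-range maximization can be illustrated with a deliberately small sketch. It is not the authors' algorithm: the abstract describes a model-free method in continuous state and control spaces, whereas the code below swaps in plain tabular, model-based backward induction on a discretized altitude. The action names, sink rates, distances, and headwind layer are all hypothetical values chosen only to make Bellman's principle of optimality visible.

```python
# Toy glide model. Every number below is an illustrative assumption,
# not data from the article.
ACTIONS = {
    # name: (altitude levels lost per step, still-air distance gained per step)
    "best_glide": (1, 8.0),
    "fast": (2, 14.0),
}
HEADWIND_ABOVE = 5    # altitude levels above this one see a headwind
HEADWIND_LOSS = 3.0   # distance lost per step while in the headwind


def momentary_reward(h, action):
    """Distance gained in one step when `action` is started at altitude level h."""
    _, distance = ACTIONS[action]
    return distance - (HEADWIND_LOSS if h > HEADWIND_ABOVE else 0.0)


def solve_glide(max_level):
    """Backward dynamic programming over altitude levels 0..max_level.

    value[h] is the largest total reward (glide range) reachable from level h,
    and policy[h] is the action that achieves it.
    """
    value = [0.0] * (max_level + 1)   # value[0] = 0: on the ground, no range left
    policy = [None] * (max_level + 1)
    for h in range(1, max_level + 1):
        best_total, best_action = float("-inf"), None
        for action, (sink, _) in ACTIONS.items():
            if sink > h:
                continue  # not enough altitude left for this action
            # Bellman backup: momentary reward plus optimal value of the next state
            total = momentary_reward(h, action) + value[h - sink]
            if total > best_total:
                best_total, best_action = total, action
        value[h], policy[h] = best_total, best_action
    return value, policy


if __name__ == "__main__":
    value, policy = solve_glide(10)
    for h in range(10, 0, -1):
        print(h, policy[h], value[h])
```

In this toy setup the table selects the faster action at some altitude levels inside the headwind layer and the best-glide action below it, which is the kind of environment-dependent behaviour the abstract attributes to the proposed controller. A continuous-space method would replace the table with a function approximator and learn from interaction rather than from a known model.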



Data Availability

Data are available from the authors upon reasonable request.



API :

Application programming interface

b :

Wing span (m)

\(\tilde{c}\) :

Mean aerodynamic chord (m)


CAD :

Computer aided design


CFD :

Computational fluid dynamics

\(C_{{\text {M}}_x}\) :

Rolling moment coefficient

\(C_{{\text {M}}_y}\) :

Pitching moment coefficient

\(C_{{\text {M}}_z}\) :

Yawing moment coefficient

\(C_{{\text {F}}_x}\) :

X-direction force coefficient

\(C_{{\text {F}}_y}\) :

Y-direction force coefficient

\(C_{{\text {F}}_z}\) :

Z-direction force coefficient


DoF :

Degree of freedom

g :

Acceleration due to gravity \(({\text {m/sec}}^2)\)

h :

Altitude (m)


LCF :

Left control fin


ODP :

Optimal dynamic programming


ML :

Machine learning

m :

Mass of the vehicle (kg)

\(P_{\text {E}}\) :

Position vector along East direction (km)

\(P_{\text {N}}\) :

Position vector along North direction (km)

P :

Roll rate, \({\text {(deg/sec)}}\)

Q :

Pitch rate, \({\text {(deg/sec)}}\)

R :

Yaw rate, \({\text {(deg/sec)}}\)


RL :

Reinforcement learning


RCF :

Right control fin

S :

Wing area \(({\text {m}}^2)\)


UAV :

Unmanned aerial vehicle

\(V_T\) :

Free stream velocity (m/sec)

w :

Numerical Weights

\(x_{\text {curr}}\) :

Current X-position (m)

\(z_{\text {curr}}\) :

Current Z-position (m)

r :

Momentary reward


Total reward



\(\alpha \) :

Angle of attack \(({\text {deg}})\)

\(\beta \) :

Sideslip angle \(({\text {deg}})\)

\(\gamma \) :

Flight path angle \(({\text {deg}})\)

\(\psi \) :

Yaw angle \(({\text {deg}})\)

\(\phi \) :

Roll angle \(({\text {deg}})\)

\(\theta \) :

Pitch angle \(({\text {deg}})\)

\(\delta _L\) :

LCF deflection \(({\text {deg}})\)

\(\delta _R\) :

RCF deflection \(({\text {deg}})\)

\(\rho \) :

Density of the air \(({\text {kg/m}}^3)\)
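As a reading aid, the symbols listed above combine in the usual non-dimensional form sketched below. This is the textbook convention, assumed here rather than taken from the article: the dynamic pressure \(\bar{q}\), the dimensional forces \(F_i\) and moments \(M_i\), and the choice of reference length for each moment axis are all assumptions that this excerpt does not state.

```latex
\begin{align}
  \bar{q} &= \tfrac{1}{2}\,\rho\,V_T^{2} \\
  F_i &= \bar{q}\,S\,C_{{\text{F}}_i}, \qquad i \in \{x, y, z\} \\
  M_x &= \bar{q}\,S\,b\,C_{{\text{M}}_x}, \qquad
  M_y = \bar{q}\,S\,\tilde{c}\,C_{{\text{M}}_y}, \qquad
  M_z = \bar{q}\,S\,b\,C_{{\text{M}}_z}
\end{align}
```

Under this convention the rolling and yawing moments scale with the wing span \(b\), and the pitching moment scales with the mean aerodynamic chord \(\tilde{c}\).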




Funding

The authors have not disclosed any funding.

Author information

Authors and Affiliations


Corresponding authors

Correspondence to Imran Mir or Faiza Gul.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Din, A.F.U., Mir, I., Gul, F. et al. Robust flight control system design of a fixed wing UAV using optimal dynamic programming. Soft Comput 27, 3053–3064 (2023).


  • Accepted:

  • Published:

  • Issue Date:

  • DOI:


Keywords

  • Flight dynamics
  • Control law
  • Machine learning
  • Reinforcement learning
  • Bellman’s principle of optimality
  • Reinforcement learning-based dynamic programming
  • Modified model free dynamic programming
  • Optimal reinforcement learning
  • Nonlinear simulations