Abstract
Innovation in UAV design technologies over the past decade and a half has yielded capabilities that have fostered the development of unique, multi-mission-capable UAVs. These distinctive new designs require intelligent and robust control laws that are insensitive to inherent plant variations and adaptive to environmental changes if the desired design objectives are to be achieved. The present research develops a control framework that maximizes the glide range of an experimental UAV using a reinforcement learning (RL)-based intelligent control architecture. A distinct model-free RL technique, abbreviated 'MRL', is proposed that handles the UAV control complications while keeping the computational cost low. At its core, the basic RL dynamic programming (DP) algorithm is carefully modified to accommodate the continuous state and control spaces of the problem at hand. Analysis of the performance results demonstrates the ability of the presented algorithm to adapt dynamically to a changing environment, making it suitable for complex UAV applications. Nonlinear simulations under varying environmental conditions illustrate the effectiveness of the proposed methodology and its advantage over conventional classical approaches.
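The paper's actual MRL algorithm is not reproduced on this page; as a hedged illustration only, the following minimal sketch shows the general idea of adapting a tabular RL update to a continuous state by discretization. All names, the toy one-dimensional glide dynamics, the bin counts, and the learning parameters are assumptions for illustration, not the authors' method.

```python
import numpy as np

def discretize(value, low, high, bins):
    """Map a continuous value onto one of `bins` integer indices."""
    idx = int((value - low) / (high - low) * bins)
    return min(max(idx, 0), bins - 1)

# Hypothetical 1-D glide problem: state = flight-path angle gamma (deg),
# action = pitch command in {down, hold, up}; reward favors a shallow glide.
N_STATES, N_ACTIONS = 20, 3
Q = np.zeros((N_STATES, N_ACTIONS))
alpha_lr, gamma_disc, eps = 0.1, 0.95, 0.1   # illustrative hyperparameters
rng = np.random.default_rng(0)

def step(gamma_deg, action):
    """Toy placeholder dynamics: actions nudge the flight-path angle."""
    gamma_new = np.clip(gamma_deg + (action - 1) * 2.0, -30.0, 0.0)
    reward = np.cos(np.radians(gamma_new))   # shallower glide -> more range
    return gamma_new, reward

gamma_deg = -15.0
for _ in range(5000):
    s = discretize(gamma_deg, -30.0, 0.0, N_STATES)
    # epsilon-greedy exploration over the discretized action set
    a = int(rng.integers(N_ACTIONS)) if rng.random() < eps else int(np.argmax(Q[s]))
    gamma_deg, r = step(gamma_deg, a)
    s2 = discretize(gamma_deg, -30.0, 0.0, N_STATES)
    # standard model-free temporal-difference (Q-learning) update
    Q[s, a] += alpha_lr * (r + gamma_disc * np.max(Q[s2]) - Q[s, a])
```

The discretization step is the crude baseline that the paper's modified DP algorithm is claimed to improve upon for continuous state and control domains.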
Data Availability Statement
Data are available from the authors upon reasonable request.
Abbreviations
- b :: Wing span (m)
- \(\tilde{c}\) :: Mean aerodynamic chord (m)
- CAD :: Computer-aided design
- CFD :: Computational fluid dynamics
- \(C_{M_x}\) :: Coefficient of rolling moment
- \(C_{M_y}\) :: Coefficient of pitching moment
- \(C_{M_z}\) :: Coefficient of yawing moment
- \(C_{F_x}\) :: Force coefficient in the X-direction
- \(C_{F_y}\) :: Force coefficient in the Y-direction
- \(C_{F_z}\) :: Force coefficient in the Z-direction
- DoF :: Degree of freedom
- DDD :: Dull, dirty and dangerous
- g :: Acceleration due to gravity \((m/sec^2)\)
- h :: Altitude (m)
- LF :: Left-side control fin
- MRL :: Model-free reinforcement learning
- ML :: Machine learning
- m :: Mass of the vehicle (kg)
- \(P_E\) :: East position vector (km)
- \(P_N\) :: North position vector (km)
- P :: Roll rate (deg/sec)
- Q :: Pitch rate (deg/sec)
- R :: Yaw rate (deg/sec)
- RL :: Reinforcement learning
- RF :: Right-side control fin
- S :: Wing area \((m^2)\)
- UAV :: Unmanned aerial vehicle
- \(V_T\) :: Far stream velocity (m/sec)
- n :: Numerical weights
- xpos :: Current X-position (m)
- zpos :: Current Z-position (m)
- r :: Momentary reward
- R :: Total reward
- pny :: Penalty
- \(\alpha \) :: Angle of attack (deg)
- \(\beta \) :: Sideslip angle (deg)
- \(\gamma \) :: Flight path angle (deg)
- \(\psi \) :: Yaw angle (deg)
- \(\phi \) :: Roll angle (deg)
- \(\theta \) :: Pitch angle (deg)
- \(\delta _L\) :: LF deflection (deg)
- \(\delta _R\) :: RF deflection (deg)
- \(\rho \) :: Air density \((kg/m^3)\)
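The glossary's reward symbols (momentary reward r, total reward R, penalty pny, positions xpos and zpos) suggest a range-maximizing reward with penalty terms. A minimal sketch of one plausible shaping follows; the numeric weights, the altitude-floor condition, and the function names are assumptions for illustration, not the paper's actual reward design.

```python
def momentary_reward(xpos, zpos, xpos_prev, z_min=50.0, pny=100.0):
    """Reward forward progress; penalize descending below a floor altitude.

    All numeric values and the floor condition are illustrative assumptions,
    not the published reward function.
    """
    r = xpos - xpos_prev          # distance gained this step (m)
    if zpos < z_min:              # hypothetical low-altitude penalty
        r -= pny
    return r

def total_reward(rewards):
    """Undiscounted episode return: R is the sum of momentary rewards r."""
    return sum(rewards)
```

For example, a step that gains 10 m of range at safe altitude yields r = 10, while the same gain below the floor yields r = 10 - pny.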
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
Ethical Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed Consent
Informed consent was obtained from all individual participants included in the study.
Cite this article
Din, A.F.U., Mir, I., Gul, F. et al. Reinforced Learning-Based Robust Control Design for Unmanned Aerial Vehicle. Arab J Sci Eng 48, 1221–1236 (2023). https://doi.org/10.1007/s13369-022-06746-0