Trading-Off Safety with Agility Using Deep Pose Error Estimation and Reinforcement Learning for Perception-Driven UAV Motion Planning

Kaymaz, Mehmetcan; Ayzit, Recep; Akgün, Onur; Atik, Kamil Canberk; Erdem, Mustafa; Yalcin, Baris; Cetin, Gürkan; Ure, Nazım Kemal

doi:10.1007/s10846-024-02085-4

Regular paper
Open access
Published: 27 March 2024

Volume 110, article number 55 (2024)

Journal of Intelligent & Robotic Systems

Mehmetcan Kaymaz¹ (ORCID: orcid.org/0009-0001-1218-384X),
Recep Ayzit¹,
Onur Akgün²,
Kamil Canberk Atik¹,
Mustafa Erdem²,
Baris Yalcin⁴,
Gürkan Cetin⁴ &
Nazım Kemal Ure³

Abstract

Navigation and planning for unmanned aerial vehicles (UAVs) based on visual-inertial sensors has been a popular research area in recent years. However, most visual sensors are prone to high error rates when exposed to disturbances such as excessive brightness and blur, which can lead to catastrophic performance drops in perception and motion planning systems. This study proposes a novel framework to address the coupled perception-planning problem in high-risk environments. This is achieved by developing algorithms that automatically adjust the agility of the UAV's maneuvers based on the predicted error rate of the pose estimation system. The fundamental idea behind our work is that highly agile maneuvers become infeasible to execute when visual measurements are noisy; agility should therefore be traded off against safety to enable efficient risk management. Our study focuses on navigating a quadcopter through a sequence of gates on an unknown map, and we rely on existing deep learning methods for visual gate-pose estimation. In addition, we develop an architecture for estimating the pose error under highly disturbed visual inputs. We use the estimated pose errors to train a reinforcement learning agent that tunes the parameters of the motion planning algorithm so that the UAV navigates the environment safely while minimizing the track completion time. Simulation results demonstrate that our proposed approach yields significantly fewer crashes and higher track completion rates compared to approaches that do not utilize reinforcement learning.
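
The safety-agility trade-off described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the paper trains a reinforcement learning agent to tune the motion planner's parameters, whereas the snippet below swaps in a fixed linear rule purely to show the direction of the adjustment. The function name `max_speed_for_error`, the speed limits, and the error scale are all hypothetical and do not come from the article.

```python
def max_speed_for_error(predicted_error: float,
                        v_min: float = 1.0,
                        v_max: float = 8.0,
                        error_limit: float = 0.5) -> float:
    """Return a speed cap (m/s) that shrinks as the predicted pose error grows.

    All parameters are illustrative assumptions:
      predicted_error -- estimated gate-pose error from the perception stack (m)
      v_min, v_max    -- slowest and fastest speed caps the planner may use
      error_limit     -- error at or above which the slowest cap is applied
    """
    if error_limit <= 0:
        raise ValueError("error_limit must be positive")
    if v_min > v_max:
        raise ValueError("v_min must not exceed v_max")

    # Normalize the error to a risk level in [0, 1].
    risk = min(max(predicted_error / error_limit, 0.0), 1.0)

    # Linear interpolation: low risk keeps v_max, high risk falls back to v_min.
    return v_max - risk * (v_max - v_min)
```

With these assumed defaults, a predicted error of zero leaves the cap at `v_max`, while any error at or above `error_limit` forces the most conservative cap `v_min`. A learned policy, as used in the paper, would replace this hand-written rule with a mapping tuned against crash rate and track completion time.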

Availability of Code and Data

The code and datasets generated and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Acknowledgements

This work is supported by Havelsan and the Scientific Research Project Unit (BAP) of Istanbul Technical University, Grant No. MOA-2019-42321.

Funding

This work is supported by Havelsan, Grant No. MOA-2019-42321. Mehmetcan Kaymaz has received research support.

Author information

Authors and Affiliations

Faculty of Aeronautics and Astronautics, Istanbul Technical University, Maslak, Istanbul, 38000, Turkey
Mehmetcan Kaymaz, Recep Ayzit & Kamil Canberk Atik
Faculty of Engineering, Mechatronics Engineering, Turkish-German University, Beykoz, Istanbul, 38000, Turkey
Onur Akgün & Mustafa Erdem
Faculty of Computer and Informatics Engineering, Istanbul Technical University, Maslak, Istanbul, 38000, Turkey
Nazım Kemal Ure
Havelsan, Havelsan, Çankaya, Ankara, 38000, Turkey
Baris Yalcin & Gürkan Cetin

Contributions

Mehmetcan Kaymaz conceived the research, wrote the article, and contributed to reinforcement learning, motion planning, modeling, control, and simulation. Recep Ayzit conceived the research, wrote the article, and contributed to the perception system and simulation. Onur Akgün wrote the article, surveyed the literature, and contributed to simulation. Kamil Canberk Atik wrote the article, surveyed the literature, and contributed to the perception system. Mustafa Erdem wrote the article, surveyed the literature, and contributed to simulation. Baris Yalcin and Gürkan Cetin supervised the research. Nazım Kemal Ure wrote the article and supervised the research.

Corresponding author

Correspondence to Mehmetcan Kaymaz.

Ethics declarations

Competing interests

The authors have no relevant financial or nonfinancial interests to disclose.

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Kaymaz, M., Ayzit, R., Akgün, O. et al. Trading-Off Safety with Agility Using Deep Pose Error Estimation and Reinforcement Learning for Perception-Driven UAV Motion Planning. J Intell Robot Syst 110, 55 (2024). https://doi.org/10.1007/s10846-024-02085-4

Received: 04 April 2023
Accepted: 28 February 2024
Published: 27 March 2024
DOI: https://doi.org/10.1007/s10846-024-02085-4
