A Deep Reinforcement Learning Strategy for UAV Autonomous Landing on a Moving Platform

Rodriguez-Ramos, Alejandro; Sampedro, Carlos; Bavle, Hriday; de la Puente, Paloma; Campoy, Pascual

doi:10.1007/s10846-018-0891-8

Published: 03 July 2018

Volume 93, pages 351–366, (2019)

Journal of Intelligent & Robotic Systems

Alejandro Rodriguez-Ramos ORCID: orcid.org/0000-0002-3257-4602¹,
Carlos Sampedro¹,
Hriday Bavle¹,
Paloma de la Puente¹ &
Pascual Campoy¹

Abstract

The use of multi-rotor UAVs in industrial and civil applications has been extensively encouraged by the rapid innovation in all the technologies involved. In particular, deep learning techniques for motion control have recently taken a major qualitative step since the successful application of Deep Q-Learning to the continuous action domain in Atari-like games. Based on these ideas, the Deep Deterministic Policy Gradients (DDPG) algorithm was able to provide outstanding results with continuous state and action domains, which are a requirement in most robotics-related tasks. In this context, the research community still lacks an integration of realistic simulation systems with the reinforcement learning paradigm that would enable the application of deep reinforcement learning algorithms to the robotics field. In this paper, a versatile Gazebo-based reinforcement learning framework has been designed and validated with a continuous UAV landing task. The UAV landing maneuver on a moving platform has been solved by means of the novel DDPG algorithm, which has been integrated into our reinforcement learning framework. Several experiments have been performed in a wide variety of conditions for both simulated and real flights, demonstrating the generality of the approach. As an indirect result, a powerful workflow for robotics has been validated, in which robots can learn in simulation and perform properly in real operation environments. To the best of the authors' knowledge, this is the first work that addresses the continuous UAV landing maneuver on a moving platform by means of a state-of-the-art deep reinforcement learning algorithm, trained in simulation and tested in real flights.

Acknowledgements

This work was supported by the Spanish Ministry of Science (Project DPI2014-60139-R). The LAL UPM and the MONCLOA Campus of International Excellence are also acknowledged for funding the predoctoral contract of one of the authors.

An introductory version of this paper was presented at the 2017 International Conference on Unmanned Aircraft Systems (ICUAS), held in Miami, FL, USA, on 13–16 June 2017.

Author information

Authors and Affiliations

Computer Vision and Aerial Robotics (CVAR), Centre for Automation and Robotics (CAR), Universidad Politécnica de Madrid (UPM), Madrid, Spain
Alejandro Rodriguez-Ramos, Carlos Sampedro, Hriday Bavle, Paloma de la Puente & Pascual Campoy

Corresponding author

Correspondence to Alejandro Rodriguez-Ramos.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Rodriguez-Ramos, A., Sampedro, C., Bavle, H. et al. A Deep Reinforcement Learning Strategy for UAV Autonomous Landing on a Moving Platform. J Intell Robot Syst 93, 351–366 (2019). https://doi.org/10.1007/s10846-018-0891-8

Received: 29 September 2017
Accepted: 18 June 2018
Published: 03 July 2018
Issue Date: 15 February 2019
DOI: https://doi.org/10.1007/s10846-018-0891-8
