Teaching UAVs to Race: End-to-End Regression of Agile Controls in Simulation

  • Matthias Müller
  • Vincent Casser
  • Neil Smith
  • Dominik L. Michels
  • Bernard Ghanem
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11130)


Automating the navigation of unmanned aerial vehicles (UAVs) in diverse scenarios has gained much attention in recent years. However, teaching UAVs to fly in challenging environments remains an unsolved problem, mainly due to the lack of training data. In this paper, we train a deep neural network to predict UAV controls from raw image data for the task of autonomous UAV racing in a photo-realistic simulation. Training is done through imitation learning with data augmentation to allow for the correction of navigation mistakes. Extensive experiments demonstrate that our trained network (when sufficient data augmentation is used) outperforms state-of-the-art methods and flies more consistently than many human pilots. Additionally, we show that our optimized network architecture can run in real-time on embedded hardware, allowing for efficient on-board processing critical for real-world deployment. From a broader perspective, our results underline the importance of extensive data augmentation techniques to improve robustness in end-to-end learning setups.



This work was supported by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research through the Visual Computing Center (VCC) funding.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Matthias Müller (1), corresponding author
  • Vincent Casser (1)
  • Neil Smith (1)
  • Dominik L. Michels (1)
  • Bernard Ghanem (1)
  1. Visual Computing Center, King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
