
Vision-Based 2D Navigation of Unmanned Aerial Vehicles in Riverine Environments with Imitation Learning


Abstract

Many researchers have studied how to enable unmanned aerial vehicles (UAVs) to navigate autonomously in complex natural environments. In this paper, we develop an imitation learning framework and use it to train navigation policies for a UAV flying inside complex, GPS-denied riverine environments. The UAV relies on a forward-facing camera to perform reactive maneuvers and navigates in 2D space by adjusting its heading. We compare the performance of a linear regression-based controller, an end-to-end neural network controller, and a variational autoencoder (VAE)-based controller trained with a data aggregation method in the simulation environments. The results show that the VAE-based controller outperforms the other two controllers in both training and testing environments: it navigates the UAV over a longer traveling distance and with a lower rate of pilot intervention.
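The data-aggregation training loop mentioned above can be sketched roughly as follows. This is a minimal illustration only: the expert policy, toy dynamics, feature names, and the linear controller here are hypothetical stand-ins, not the paper's actual simulator, VAE, or controllers. At each iteration the learner's own policy drives the rollout, the expert labels every visited state with a corrective heading command, and the aggregated dataset is used to retrain the policy.

```python
import numpy as np

rng = np.random.default_rng(0)

def expert_policy(obs):
    # Stand-in expert: steer back toward the river centerline
    # (obs[0] plays the role of a lateral-offset feature).
    return -0.5 * obs[0]

def fit_linear(X, y):
    # Least-squares fit of a linear heading controller, y ~= X @ w.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def rollout(w, n_steps=50):
    # Visit states under the LEARNER's policy, but record the
    # EXPERT's action at every state (the data-aggregation labels).
    X, y = [], []
    obs = rng.normal(size=2)
    for _ in range(n_steps):
        action = float(obs @ w)
        X.append(obs.copy())
        y.append(expert_policy(obs))
        # Toy dynamics: lateral offset drifts unless corrected.
        obs[0] = 0.9 * obs[0] + action + 0.05 * rng.normal()
    return np.array(X), np.array(y)

# Aggregate data across iterations and retrain on the growing dataset.
X_all = np.empty((0, 2))
y_all = np.empty(0)
w = np.zeros(2)
for _ in range(5):
    X, y = rollout(w)
    X_all = np.vstack([X_all, X])
    y_all = np.concatenate([y_all, y])
    w = fit_linear(X_all, y_all)

print(w)  # learned weights recover the expert's gain on the offset feature
```

The key design point, as in DAgger-style methods generally, is that training states come from the learner's own trajectories rather than the expert's, which reduces the distribution mismatch between training and deployment.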




Acknowledgements

The authors would like to thank all human subjects who helped with the data collection in the simulation environments. We would also like to thank Greesan Gurumurthy and Marie Cor Croz from the Cyber-Human-Physical Systems lab for their kind help with the project.


Funding

The work was supported by the Office of Naval Research (ONR) under the NEPTUNE 2.0 program (No. N00014-20-1-2268).

Author information




P.W., R.L., A.M., and Z.K. designed the study. R.L. created high-fidelity environments in the simulation. P.W. and A.M. developed different vision-based controllers and the imitation learning framework. P.W. and A.M. wrote the script. P.W. and Z.K. designed the experiment. P.W. coordinated the experiment and analyzed the data. All authors wrote the final manuscript.

Corresponding author

Correspondence to Zhaodan Kong.

Ethics declarations

Ethics Approval

This study was approved by the Institutional Review Board (IRB) of the University of California, Davis.

Consent to Participate

Freely given, informed consent to participate in the study was obtained from all human subjects.

Consent for Publication

All human subjects have consented to have their data published in a journal article.

Conflict of Interests

The authors declare that they have no conflict of interest for this work.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit


About this article


Cite this article

Wei, P., Liang, R., Michelmore, A. et al. Vision-Based 2D Navigation of Unmanned Aerial Vehicles in Riverine Environments with Imitation Learning. J Intell Robot Syst 104, 47 (2022).



Keywords

  • Imitation learning
  • Vision-based control
  • Unmanned aerial vehicle
  • Riverine environments