Artificial Life and Robotics, Volume 19, Issue 2, pp 157–169

Learning to navigate in a virtual world using optic flow and stereo disparity signals

  • Florian Raudies
  • Schuyler Eldridge
  • Ajay Joshi
  • Massimiliano Versace
Original Article


Navigating in a complex world is challenging because a rich, real environment presents a very large number of sensory states that can immediately precede a collision. Biological organisms such as rodents solve this problem, navigating effortlessly in enclosed spaces by encoding, in neural representations, the distance to walls or obstacles along given directions. This paper presents a method, usable by virtual (simulated) or robotic agents, that learns collision avoidance from states similar to such neural representations. Unlike other approaches, our reinforcement learning approach uses a small number of states defined by discretized distances along three fixed directions. These distances are estimated from either optic flow or binocular stereo information: parameterized templates for optic flow or disparity are matched against the input flow or input disparity to estimate the distances. Simulations in a virtual environment demonstrate learning of collision avoidance. Our results show that learning with stereo information alone is superior to learning with optic flow alone. This work motivates the use of abstract state descriptions for learning visual navigation. Future work will focus on fusing optic flow and stereo information and on transferring these models to robotic platforms.
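The abstract describes a compact tabular reinforcement learning setup: the state is a triple of discretized distances along three fixed view directions, and the agent learns to avoid collisions. A minimal sketch of such a setup is below, using standard Q-learning. The bin count, action set, reward values, and learning parameters are illustrative assumptions, not the paper's actual choices, and the distance estimation from flow or disparity is abstracted away.

```python
import random
from collections import defaultdict

N_BINS = 3            # distance bins per direction: 0 = near ... 2 = far (assumed)
ACTIONS = (-1, 0, 1)  # turn left, go straight, turn right (assumed action set)

def q_learning_update(Q, state, action, reward, next_state,
                      alpha=0.1, gamma=0.9):
    """One standard Q-learning step:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    td_error = reward + gamma * best_next - Q[(state, action)]
    Q[(state, action)] += alpha * td_error

def epsilon_greedy(Q, state, epsilon=0.1):
    """Explore with probability epsilon, otherwise act greedily."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

# Toy demonstration: with an obstacle dead ahead (center distance bin 0),
# repeated collision penalties should make turning preferable to going straight.
Q = defaultdict(float)
obstacle_ahead = (2, 0, 2)   # (left, center, right) distance bins
clear_path = (2, 2, 2)
for _ in range(200):
    # going straight into the obstacle yields a collision penalty
    q_learning_update(Q, obstacle_ahead, 0, -1.0, obstacle_ahead)
    # turning right reaches a clear path with a small positive reward
    q_learning_update(Q, obstacle_ahead, 1, 0.1, clear_path)
print(Q[(obstacle_ahead, 1)] > Q[(obstacle_ahead, 0)])  # True
```

Because the state space is only `N_BINS ** 3` states times three actions, a plain lookup table suffices; this smallness is exactly what the abstract argues makes learning tractable compared to operating on raw visual input.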


Keywords: Learning of navigation · Optic flow · Stereo disparity · Virtual world



Copyright information

© ISAROB 2014

Authors and Affiliations

  • Florian Raudies (1)
  • Schuyler Eldridge (2)
  • Ajay Joshi (2)
  • Massimiliano Versace (1)
  1. Center for Computational Neuroscience and Neural Technology (CompNet), Boston University, Boston, USA
  2. Department of Electrical and Computer Engineering, Boston University, Boston, USA
