
Autonomous Robots, Volume 29, Issue 2, pp. 137–149

Efficient vision-based navigation

Learning about the influence of motion blur
  • Armin Hornung
  • Maren Bennewitz
  • Hauke Strasdat

Abstract

In this article, we present a novel approach to learning efficient navigation policies for mobile robots that use visual features for localization. Since fast movements of a mobile robot typically introduce motion blur into the acquired images, the robot's uncertainty about its pose increases in such situations. As a result, it can no longer be ensured that a navigation task is executed efficiently, because the robot's pose estimate might not correspond to its true location. We present a reinforcement learning approach that determines a navigation policy for reaching the destination reliably and, at the same time, as fast as possible. Using our technique, the robot learns to trade off velocity against localization accuracy and implicitly takes the impact of motion blur on its observations into account. We furthermore present a method to compress the learned policy via clustering, which significantly reduces the size of the policy representation; this is especially desirable on memory-constrained systems. Extensive simulated and real-world experiments carried out with two different robots demonstrate that our learned policy significantly outperforms policies that use a constant velocity as well as more advanced heuristics. We furthermore show that the policy generalizes to different indoor and outdoor scenarios with varying landmark densities and to navigation tasks of different complexity.
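To make the trade-off concrete, below is a minimal, hypothetical sketch of this kind of policy learning in Python. The state couples the robot's discretized localization uncertainty with its remaining distance to the goal, the actions are translational velocities, and faster motion tends to blur the camera image and thus inflate the uncertainty. The transition model, reward, action set, discretization, and all constants are illustrative assumptions made for this sketch, not the formulation used in the article.

```python
import random
from collections import defaultdict

# Assumed action set and state discretization (illustrative only).
VELOCITIES = [0.2, 0.5, 1.0]   # translational velocities in m/s
N_UNC, N_DIST = 5, 10          # levels of pose uncertainty / distance to goal

def step(state, v):
    """Toy transition: faster motion covers more ground per step, but it
    blurs the camera image, so localization uncertainty tends to grow."""
    unc, dist = state
    unc = min(N_UNC - 1, max(0, unc + (1 if random.random() < v else -1)))
    # A well-localized robot moving fast makes the most progress; a robot
    # that is uncertain about its pose loses the benefit of driving fast.
    progress = 1 + int(v >= 0.5 and unc < N_UNC - 1)
    dist = max(0, dist - progress)
    return (unc, dist), -1.0, dist == 0   # constant time penalty per step

Q = defaultdict(float)                    # tabular action values
alpha, gamma, eps = 0.1, 0.99, 0.1        # learning rate, discount, exploration

for episode in range(5000):
    state, done = (0, N_DIST - 1), False  # start well localized, far from goal
    while not done:
        # epsilon-greedy selection over the velocity set
        if random.random() < eps:
            v = random.choice(VELOCITIES)
        else:
            v = max(VELOCITIES, key=lambda a: Q[state, a])
        nxt, r, done = step(state, v)
        target = r if done else r + gamma * max(Q[nxt, a] for a in VELOCITIES)
        Q[state, v] += alpha * (target - Q[state, v])
        state = nxt

# Greedy policy: one velocity per visited state.
policy = {s: max(VELOCITIES, key=lambda a: Q[s, a])
          for s in {sa[0] for sa in Q}}
```

Under these toy dynamics, the learned policy drives fast while the pose estimate is sharp and slows down once uncertainty accumulates, which is exactly the velocity/accuracy trade-off described above. The abstract's second contribution, compressing the policy, can be imitated by clustering states whose action values are similar and storing a single velocity per cluster; the plain k-means below (with an assumed cluster count) stands in for the article's clustering approach, which the abstract does not specify:

```python
import numpy as np

# Continues the sketch above (uses Q and VELOCITIES from it).
states = sorted({sa[0] for sa in Q})
qmat = np.array([[Q[s, a] for a in VELOCITIES] for s in states])

k = 4                                          # cluster count (assumed)
rng = np.random.default_rng(0)
centers = qmat[rng.choice(len(qmat), size=k, replace=False)]
for _ in range(20):                            # Lloyd's k-means iterations
    labels = ((qmat[:, None, :] - centers[None]) ** 2).sum(-1).argmin(1)
    centers = np.array([qmat[labels == c].mean(0) if np.any(labels == c)
                        else centers[c] for c in range(k)])

# Store one velocity per cluster instead of one per state.
cluster_action = {c: VELOCITIES[int(np.argmax(centers[c]))] for c in range(k)}
state_to_cluster = dict(zip(states, labels.tolist()))
```

A memory-constrained robot would then keep only the cluster centers and the per-cluster velocities, mapping each state it encounters to its nearest center at run time.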

Keywords

Navigation · Reinforcement learning · Vision · Motion blur


Supplementary material

(MPG 11.3 MB)

(MPG 18.1 MB)


Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Armin Hornung (1)
  • Maren Bennewitz (1)
  • Hauke Strasdat (2)

  1. Department of Computer Science, University of Freiburg, Freiburg, Germany
  2. Department of Computing, Imperial College London, London, UK
