An Automatic Scene Recognition Using TD-Learning for Mobile Robot Localization in an Outdoor Environment

  • Xiaochun Wang
  • Xiali Wang
  • Don Mitchell Wilkes


Navigation is an essential ability of mobile robots, and localization within the environment is the first step toward achieving it. In this chapter, building on the extensive research already conducted for known indoor environments, we apply a natural landmark-based localization strategy to a mobile robot working in an unknown outdoor environment. In particular, we pursue a real-time scene recognition scheme that uses the objects segmented from a scene as natural landmarks, and we explore the suitability of configural representation for automatic scene recognition in robot navigation. Experiments that use TD-learning to infer a semantic prediction of a scene from different configurations of its stimuli are conducted, and the results demonstrate the effectiveness of the proposed location learning method.
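The idea of predicting a scene label from configurations of its stimuli can be illustrated with a small sketch. This is not the chapter's implementation, which the abstract does not describe. It assumes a linear value function over "configural" units (one unit per stimulus plus one unit per co-occurring pair) and a one-step TD update in which the value of the terminal next state is zero. The landmark names `tree` and `bench`, the reward values, and all function names are hypothetical.

```python
from itertools import combinations


def configural_features(stimuli):
    """Return one unit per stimulus plus one unit per co-occurring pair."""
    present = sorted(stimuli)
    units = [(s,) for s in present]           # elemental units
    units += list(combinations(present, 2))   # pairwise configural units
    return units


def predict(weights, stimuli):
    """Linear prediction: sum of the weights of all active units."""
    return sum(weights.get(u, 0.0) for u in configural_features(stimuli))


def td_update(weights, stimuli, reward, next_value=0.0, alpha=0.1, gamma=0.9):
    """One TD(0) step: move active weights along the TD error."""
    delta = reward + gamma * next_value - predict(weights, stimuli)
    for u in configural_features(stimuli):
        weights[u] = weights.get(u, 0.0) + alpha * delta
    return delta


def train(trials, epochs=500, alpha=0.1):
    """Repeatedly present (stimuli, reward) trials and learn the weights."""
    weights = {}
    for _ in range(epochs):
        for stimuli, reward in trials:
            # Each trial is a one-step episode, so next_value stays at 0.
            td_update(weights, stimuli, reward, alpha=alpha)
    return weights


# Hypothetical negative-patterning task: each landmark alone signals the
# target location (reward 1), but the two seen together do not (reward 0).
trials = [({"tree"}, 1.0), ({"bench"}, 1.0), ({"tree", "bench"}, 0.0)]
weights = train(trials)

for stimuli, _ in trials:
    print(sorted(stimuli), round(predict(weights, stimuli), 3))
```

With only elemental units, no set of weights can predict 1 for each landmark alone and 0 for the pair, because the pair's prediction would be the sum of the two single-landmark predictions. The pairwise unit is what lets the learner treat a configuration as something different from the sum of its parts, which is the property that makes a configural representation interesting for scene recognition.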


Keywords: Localization; Visual landmarks; Scene recognition; Configural representation; TD-learning


Copyright information

© Xi'an Jiaotong University Press 2020

Authors and Affiliations

  • Xiaochun Wang (1)
  • Xiali Wang (2)
  • Don Mitchell Wilkes (3)

  1. School of Software Engineering, Xi’an Jiaotong University, Xi’an, China
  2. School of Information Engineering, Chang’an University, Xi’an, China
  3. Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, USA
