Supervised reinforcement learning: Application to a wall following behaviour in a mobile robot
In this work we describe the design of a control approach based on supervised reinforcement learning, in which the capacity to learn is combined with prior knowledge of the task at hand. The result is rapid convergence to the desired behaviour and a more stable learning process. We have applied the approach to the design of a basic behaviour in mobile robotics, namely wall following. Several experiments gave good results, which confirm the utility and advantages of the approach.