Abstract
This paper proposes and implements a reinforcement learning algorithm for an agent that learns to navigate an indoor, initially unknown environment. The agent learns a trajectory between an initial state and a goal state through interactions with the environment. Environmental knowledge is encoded in two surfaces: a reward surface and a penalty surface. The former deals primarily with planning to reach the goal, whilst the latter deals mainly with reacting to avoid obstacles. Temporal difference learning is the chosen strategy to construct both surfaces. The proposed algorithm is tested on different environments and types of obstacles. The simulation results suggest that the agent is able to reach a target from any point within the environment while avoiding local minima. Furthermore, the agent can improve an initial solution, employing a variable learning rate, through repeated visits to the same spatial positions.
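The abstract's core mechanism can be illustrated with a small sketch. The following is a hypothetical, minimal reading of the idea, not the authors' exact formulation: a tabular temporal-difference update of one value surface over grid cells, where the learning rate decays with the visit count of a cell, so repeated visits refine rather than overwrite earlier estimates. All names, the 1/n decay schedule, and the toy corridor environment are assumptions introduced for illustration.

```python
# Hypothetical sketch (not the paper's exact algorithm): tabular TD(0)
# updates of a "reward surface" V, with a variable learning rate that
# shrinks as a cell is revisited.
import numpy as np

def td_update(V, visits, s, s_next, r, gamma=0.9):
    """One temporal-difference update of surface V at state s.

    alpha = 1 / visits[s] is one simple choice of variable learning
    rate: early visits move the estimate a lot, later visits only
    refine it.
    """
    visits[s] += 1
    alpha = 1.0 / visits[s]                  # assumed decay schedule
    td_error = r + gamma * V[s_next] - V[s]  # standard TD(0) error
    V[s] += alpha * td_error
    return V[s]

# Toy usage: a 1-D corridor of 5 cells, goal at cell 4, reward only
# on the transition that reaches the goal. Repeated sweeps build a
# surface that rises monotonically toward the goal.
V = np.zeros(5)
visits = np.zeros(5, dtype=int)
for _ in range(100):
    for s in range(4):
        r = 1.0 if s + 1 == 4 else 0.0
        td_update(V, visits, s, s + 1, r)
```

A gradient-following agent could then move from any cell toward the neighbour with the highest value; the paper's second (penalty) surface would be built the same way from obstacle collisions and subtracted for reactive avoidance.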
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
Cite this paper
de S. Braga, A.P., Araújo, A.F.R. (1998). Goal-Directed Reinforcement Learning Using Variable Learning Rate. In: de Oliveira, F.M. (ed.) Advances in Artificial Intelligence. SBIA 1998. Lecture Notes in Computer Science, vol. 1515. Springer, Berlin, Heidelberg. https://doi.org/10.1007/10692710_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65190-1
Online ISBN: 978-3-540-49523-9