Abstract
A novel method was designed to solve reinforcement learning problems with an artificial potential field (APF). Firstly, a reinforcement learning problem was transformed into a path planning problem by modeling it with an APF, which is a very appropriate way to represent a reinforcement learning problem. Secondly, a new APF algorithm based on a virtual water-flow concept was proposed to overcome the local-minimum problem of potential field methods. The performance of the new method was tested on a gridworld problem called the key-and-door maze. The experimental results show that good, deterministic policies are found within 45 trials in almost all simulations. In comparison with WIERING's HQ-learning system, which needs 20 000 trials for a stable solution, the proposed method obtains an optimal and stable policy far more quickly. Therefore, the new method offers a simple and effective way to obtain an optimal solution to a reinforcement learning problem.
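The core idea of the abstract can be illustrated with a minimal sketch (not the authors' exact algorithm; all names are hypothetical): a gridworld where the goal exerts an attractive potential and obstacle cells a repulsive one, and the agent greedily descends the potential. The stall condition in the descent loop is exactly the local-minimum problem that the paper's virtual water-flow concept is designed to escape.

```python
# Illustrative sketch of artificial-potential-field path planning on a
# gridworld. Assumed setup, not the paper's implementation: attractive
# potential = Manhattan distance to the goal; obstacles get a large
# repulsive constant so steepest descent routes around them.

def build_potential(width, height, goal, obstacles, repulse=100.0):
    """Return a dict mapping each cell to its potential value."""
    field = {}
    for x in range(width):
        for y in range(height):
            if (x, y) in obstacles:
                field[(x, y)] = repulse
            else:
                field[(x, y)] = abs(x - goal[0]) + abs(y - goal[1])
    return field

def greedy_descent(field, start, goal, max_steps=100):
    """Follow the steepest descent of the potential and return the path.
    Plain descent like this can stall in a local minimum between
    obstacles, which is the failure mode the virtual water-flow
    concept addresses."""
    path = [start]
    pos = start
    for _ in range(max_steps):
        if pos == goal:
            break
        x, y = pos
        neighbors = [(x + dx, y + dy)
                     for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                     if (x + dx, y + dy) in field]
        best = min(neighbors, key=lambda n: field[n])
        if field[best] >= field[pos]:  # no lower neighbor: local minimum
            break
        pos = best
        path.append(pos)
    return path

# Example: a 5x5 grid with a short wall of obstacles.
goal = (4, 4)
obstacles = {(2, 1), (2, 2), (2, 3)}
field = build_potential(5, 5, goal, obstacles)
path = greedy_descent(field, (0, 0), goal)
print(path[-1])  # → (4, 4): the goal is reached on this simple map
```

On maps with concave obstacles the descent would instead terminate at the local-minimum check; a water-flow-style method would then modify the potential so the search can "flow" around the trap.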
References
KAELBLING L P, LITTMAN M L, MOORE A W. Reinforcement learning: A survey [J]. Journal of Artificial Intelligence Research, 1996, 4: 237–285.
SUTTON R S, BARTO A. Reinforcement learning: An introduction [M]. Cambridge: MIT Press, 1998.
BANERJEE B, STONE P. General game learning using knowledge transfer [C]// Proceedings of the 20th International Joint Conference on Artificial Intelligence. California: AAAI Press, 2007: 672–677.
ASADI M, HUBER M. Effective control knowledge transfer through learning skill and representation hierarchies [C]// Proceedings of the 20th International Joint Conference on Artificial Intelligence. California: AAAI Press, 2007: 2054–2059.
KONIDARIS G, BARTO A. Autonomous shaping: Knowledge transfer in reinforcement learning [C]// Proceedings of the 23rd International Conference on Machine Learning. Pittsburgh: ACM Press, 2006: 489–496.
MEHTA N, NATARAJAN S, TADEPALLI P, FERN A. Transfer in variable-reward hierarchical reinforcement learning [C]// Workshop on Transfer Learning at Neural Information Processing Systems. Oregon: ACM Press, 2005: 20–23.
WILSON A, FERN A, RAY S, TADEPALLI P. Multi-Task reinforcement learning: A hierarchical Bayesian approach [C]// Proceedings of the 24th International Conference on Machine Learning. Oregon: ACM Press, 2007: 923–930.
GOEL S, HUBER M. Subgoal discovery for hierarchical reinforcement learning using learned policies [C]// Proceedings of the 16th International FLAIRS Conference. Florida: AAAI Press, 2003: 346–350.
TAYLOR M E, STONE P. Behavior transfer for value-function-based reinforcement learning [C]// The Fourth International Joint Conference on Autonomous Agents and Multiagent Systems. New York: ACM Press, 2005: 53–59.
HENGST B. Discovering hierarchy in reinforcement learning with HexQ [C]// Proceedings of the 19th International Conference on Machine Learning. San Francisco: Morgan Kaufmann, 2002: 243–250.
DIUK C, STREHL A L, LITTMAN M L. A hierarchical approach to efficient reinforcement learning in deterministic domains [C]// Proceedings of the 5th International Joint Conference on Autonomous Agents and Multiagent Systems. New York: ACM Press, 2006: 313–319.
ZHOU W, COGGINS R. A biologically inspired hierarchical reinforcement learning system [J]. Cybernetics and Systems, 2005, 36(1): 1–44.
BARTO A, MAHADEVAN S. Recent advances in hierarchical reinforcement learning [J]. Discrete Event Dynamic Systems: Theory and Applications, 2003, 13(1): 41–77.
KEARNS M, KOLLER D. Efficient reinforcement learning in factored MDPs [C]// Proceedings of the 6th International Joint Conference on Artificial Intelligence. Stockholm: Morgan Kaufmann, 1999: 740–747.
WEN Zhi-qiang, CAI Zi-xing. Global path planning approach based on ant colony optimization algorithm [J]. Journal of Central South University of Technology, 2006, 13(6): 707–712.
ZHU Xiao-cai, DONG Guo-hua, CAI Zi-xing. Robust simultaneous tracking and stabilization of wheeled mobile robots not satisfying nonholonomic constraint [J]. Journal of Central South University of Technology, 2007, 14(4): 537–545.
ZOU Xiao-bing, CAI Zi-xing, SUN Guo-rong. Non-smooth environment modeling and global path planning for mobile robots [J]. Journal of Central South University of Technology, 2003, 10(3): 248–254.
ANDREWS J R, HOGAN N. Impedance control as a framework for implementing obstacle avoidance in a manipulator [C]// Proceedings of Control of Manufacturing Process and Robotic System. New York: ASME Press, 1983: 243–251.
KHATIB O. Real-time obstacle avoidance for manipulators and mobile robots [J]. International Journal of Robotics Research, 1986, 5(1): 90–98.
HUANG W H, FAJEN B R, FINK J R. Visual navigation and obstacle avoidance using a steering potential function [J]. Journal of Robotics and Autonomous Systems, 2006, 54(4): 288–299.
PARK M G, LEE M C. Artificial potential field based path planning for mobile robots using a virtual obstacle concept [C]// Proceedings of IEEE/ASME International Conference on Advanced Intelligent Mechatronics. Victoria: IEEE Press, 2003: 735–740.
LIU C Q, KRISHNAN H, YONG L S. Virtual obstacle concept for local-minimum-recovery in potential-field based navigation [C]// Proceedings of the IEEE International Conference on Robotics & Automation. San Francisco: IEEE Press, 2000: 983–988.
BROCK O, KHATIB O. High-speed navigation using the global dynamic window approach [C]// Proceedings of the IEEE International Conference on Robotics and Automation. Detroit: IEEE Press, 1999: 341–346.
KONOLIGE K. A gradient method for real time robot control [C]// Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems. Victoria: IEEE Press, 2000: 639–646.
RIMON E, KODITSCHEK D. Exact robot navigation using artificial potential functions [J]. IEEE Transactions on Robotics and Automation, 1992, 8(5): 501–518.
WIERING M, SCHMIDHUBER J. HQ-learning [J]. Adaptive Behavior, 1998, 6(2): 219–246.
Foundation item: Projects(30270496, 60075019, 60575012) supported by the National Natural Science Foundation of China
Xie, Lj., Xie, Gr., Chen, Hw. et al. Solution to reinforcement learning problems with artificial potential field. J. Cent. South Univ. Technol. 15, 552–557 (2008). https://doi.org/10.1007/s11771-008-0104-x