Study of Cooperation Strategy of Robot Based on Parallel Q-Learning Algorithm

  • Shuda Wang
  • Feng Si
  • Jing Yang
  • Shuoning Wang
  • Jun Yang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5314)

Abstract

This paper addresses how MR (Multi-Robot) systems acquire knowledge in a dynamic environment in order to complete a task or solve a problem, where the robots may share a common goal or pursue different goals. Two architectures better suited to MR learning are proposed, and on top of them an improved Q-learning algorithm for MR is designed that resolves coordination and cooperation problems such as credit assignment, resource and task allocation, and conflict resolution. Each robot learns independently in its environment; the results are fused after each learning cycle, and the fused result is shared by all robots and passed into the next learning cycle as a common reference, which increases the learning opportunities between the robots and the environment. Simulation results show that the learning algorithm enables the robot group to learn rapidly and quickly surround a moving target, achieving better effectiveness.
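The abstract describes a cycle-based parallel scheme: each robot learns independently for one cycle, the individual results are fused, and the fused result is shared as the reference for the next cycle. The sketch below illustrates that loop; the toy grid environment, the epsilon-greedy exploration, and the element-wise averaging fusion rule are illustrative assumptions, not the paper's exact method.

```python
# Minimal sketch of cycle-based parallel Q-learning with Q-table fusion.
# Assumptions: a toy 5x5 grid task, epsilon-greedy exploration, and
# averaging as the fusion rule; the paper does not fix these details here.
import numpy as np

N_STATES, N_ACTIONS = 25, 4          # assumed 5x5 grid world, 4 moves
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1    # assumed learning parameters
N_ROBOTS, CYCLES, STEPS = 3, 50, 200

rng = np.random.default_rng(0)

def step(s, a):
    """Toy environment: move on the grid; the last state yields reward 1."""
    s2 = (s + (1, -1, 5, -5)[a]) % N_STATES
    return s2, 1.0 if s2 == N_STATES - 1 else 0.0

def learn_cycle(q):
    """One independent learning cycle for a single robot (epsilon-greedy Q-learning)."""
    s = int(rng.integers(N_STATES))
    for _ in range(STEPS):
        a = int(rng.integers(N_ACTIONS)) if rng.random() < EPS else int(np.argmax(q[s]))
        s2, r = step(s, a)
        q[s, a] += ALPHA * (r + GAMMA * q[s2].max() - q[s, a])
        s = s2
    return q

shared_q = np.zeros((N_STATES, N_ACTIONS))
for _ in range(CYCLES):
    # Each robot starts the cycle from the shared table and learns independently
    # (simulated sequentially here; in the paper each robot runs in parallel).
    local_qs = [learn_cycle(shared_q.copy()) for _ in range(N_ROBOTS)]
    # Fuse the results and share them as the reference for the next cycle.
    shared_q = np.mean(local_qs, axis=0)

print("greedy policy per state:", shared_q.argmax(axis=1))
```

Because every robot contributes experience to the shared table each cycle, the group effectively multiplies its interactions with the environment, which is the source of the speed-up the abstract claims.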

Keywords

Multi-Robots · Reinforcement Learning · Q-learning · Dynamic Programming · Parallel Learning



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Shuda Wang 1
  • Feng Si 1
  • Jing Yang 1
  • Shuoning Wang 1
  • Jun Yang 1
  1. College of Computer and Information Engineering, Harbin University of Commerce, Harbin, China
