A Reinforcement Learning Algorithm in Cooperative Multi-Robot Domains

Fernández, Fernando; Borrajo, Daniel; Parker, Lynne E.

doi:10.1007/s10846-005-5137-x

Published: 12 September 2005

Volume 43, pages 161–174 (2005)

Journal of Intelligent and Robotic Systems

Fernando Fernández¹,
Daniel Borrajo¹ &
Lynne E. Parker²

Abstract

Reinforcement learning has been applied to a diverse set of learning tasks, from board games to robot behaviours. In some of these tasks results have been very successful, but others have characteristics that make reinforcement learning harder to apply. One such area is multi-robot learning, which raises two important problems. The first is credit assignment: how to define the reinforcement signal for each robot in a cooperative team given the results achieved by the team as a whole. The second is working with large domains, where the amount of data can be large and can vary from one moment of a learning step to the next. This paper studies both issues in a multi-robot environment, showing that domain knowledge and machine learning algorithms can be combined to achieve successful cooperative behaviours.

Author information

Authors and Affiliations

Universidad Carlos III de Madrid, Avda/de la Universidad 30, 28911-Leganés, Madrid, Spain
Fernando Fernández & Daniel Borrajo
University of Tennessee, 203 Claxton Complex, 1122 Volunteer Blvd, Knoxville, TN, 37996-3450, U.S.A.
Lynne E. Parker

Corresponding author

Correspondence to Fernando Fernández.

Additional information

Fernando Fernández: This work has been partially funded by a grant from the Spanish Science and Technology Department.

Daniel Borrajo: This work has been partially funded by grants TAP1999-0535-C02-02 and TIC2002-04146-C05-05 from the Spanish Science and Technology Department.

About this article

Cite this article

Fernández, F., Borrajo, D. & Parker, L.E. A Reinforcement Learning Algorithm in Cooperative Multi-Robot Domains. J Intell Robot Syst 43, 161–174 (2005). https://doi.org/10.1007/s10846-005-5137-x

Received: 02 April 2004
Revised: 16 March 2005
Published: 12 September 2005
Issue Date: August 2005
DOI: https://doi.org/10.1007/s10846-005-5137-x
