Abstract
One of the main questions concerning learning in a Multi-Agent Systems environment is: "(How) can agents benefit from mutual interaction during the learning process?" This paper describes a technique that enables a heterogeneous group of Learning Agents (LAs) to improve their learning performance by exchanging advice. The technique uses supervised learning (back-propagation), where the desired response is not given by the environment but is based on advice given by peers with better performance scores. The LAs face problems with similar structure, in environments where only reinforcement information is available, and each LA applies a different, well-known learning technique. The problem used to evaluate the LAs' performance is a simplified traffic-control simulation. The paper presents a summarized description of the traffic simulation and the Learning Agents (focusing on the advice-exchange mechanism), a discussion of the first results obtained, and suggested techniques to overcome the problems that have been observed.
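The advice-exchange mechanism described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's actual implementation: the `LearningAgent` class, the `exchange_advice` function, and the table-based policy are hypothetical stand-ins (the paper trains on the advice with back-propagation, and each agent uses a different learning algorithm).

```python
class LearningAgent:
    """Minimal stand-in for one heterogeneous learning agent.

    Each agent keeps a running performance score and a policy;
    here the policy is a trivial state->action lookup table
    rather than a real learner.
    """

    def __init__(self, name):
        self.name = name
        self.score = 0.0   # running performance measure
        self.policy = {}   # state -> action

    def advise(self, state):
        """Return this agent's current best response for `state`."""
        return self.policy.get(state, 0)

    def learn_from_advice(self, state, advised_action):
        """Supervised update: treat the advisor's response as the
        desired output (the paper does this via back-propagation)."""
        self.policy[state] = advised_action


def exchange_advice(agents, state):
    """Agents whose score lags behind the best-scoring peer request
    that peer's response for the current state and train on it."""
    best = max(agents, key=lambda a: a.score)
    for agent in agents:
        if agent is not best and agent.score < best.score:
            agent.learn_from_advice(state, best.advise(state))
    return best
```

In this sketch, advice only flows from the best-scoring agent to worse performers, which captures the paper's core idea that the advisee's "desired response" comes from a more successful peer rather than from the environment's reinforcement signal.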
© 2003 Springer-Verlag Berlin Heidelberg
Cite this paper
Nunes, L., Oliveira, E. (2003). Cooperative Learning Using Advice Exchange. In: Alonso, E., Kudenko, D., Kazakov, D. (eds) Adaptive Agents and Multi-Agent Systems (AAMAS 2001, AAMAS 2002). Lecture Notes in Computer Science, vol. 2636. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44826-8_3
Print ISBN: 978-3-540-40068-4
Online ISBN: 978-3-540-44826-6