Using reciprocity to adapt to others
Interacting with and adapting to other agents is an essential part of the behavior of intelligent agents. In most practical multiagent systems, a single agent is likely to interact with a number of agents representing different preferences and interests. Although it is recognized that peak system performance could be realized if all agents cooperated, it is impractical to assume benevolence from an arbitrary group of agents. More realistically, in an open system, agents should be self-interested. Can we then design environments where the best self-interested actions lead to cooperation between agents? In this paper, we investigate environments where reciprocative actions allow self-interested agents to maximize local as well as global utility. We argue that the traditional choice of deterministic reciprocity is insufficient, and propose a probabilistic reciprocity scheme for deciding whether or not to help another agent. We present experimental results showing that agents using this probabilistic reciprocity scheme can both approach optimal global behavior and resist exploitation by selfish agents. The reciprocation scheme is thus found to be both efficient and stable, and it allows agents to adapt to others using past experience.
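The abstract does not state the decision rule itself, so the following is only an illustrative sketch of what a probabilistic reciprocity decision might look like. It assumes, hypothetically, that each agent keeps a running `balance` with every other agent (savings received from it minus costs spent on it) and helps with a probability given by a logistic function of `balance - cost`. The names `help_probability`, `decide_to_help`, `balance`, `cost`, and `tau` are assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def help_probability(cost, balance, tau=1.0):
    """Probability of helping, as a logistic function of (balance - cost).

    cost:    extra cost this agent would incur by helping now (assumed unit).
    balance: savings previously received from the requester minus costs
             previously spent on it (hypothetical bookkeeping).
    tau:     temperature; smaller values approach a deterministic rule.
    """
    x = (balance - cost) / tau
    # Numerically stable logistic: avoid overflow in exp for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decide_to_help(cost, balance, tau=1.0, rng=random):
    """Draw a yes/no decision from the help probability."""
    return rng.random() < help_probability(cost, balance, tau)
```

Under these assumptions, an agent with no history with the requester (`balance = 0`) still helps occasionally, which lets cooperation start, whereas a strictly deterministic "help only if helped first" rule would never initiate it. A requester that repeatedly takes help without returning it drives `balance` down, and the probability of helping it falls toward zero.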