A Note on Working Memory in Agent Learning
Working memory is an important dimension of system and mechanism design, yet it has received insufficient scholarly attention. The existing literature reports mixed findings on how the amount of working memory affects system efficiency. In this note, we investigate this relationship with a computational approach. We design an intelligent agent system in which three agents, one buyer and two bidders, repeatedly play an Exchange Game. The buyer agent decides whether to list a request for proposal, while the bidders bid for it independently. Only one bidder can win on a given round of play. Once the buyer chooses the winning bidder, a transaction takes place: the two parties to the trade can either cooperate or defect. Their decisions are made simultaneously, and the payoffs follow the Prisoner's Dilemma game. We find that the relationship between working memory and system efficiency has an inverted U-shape; that is, there appears to be an optimal memory size. When we mix agents with different memory sizes, agents with the same amount of working memory generate the most efficient outcome in terms of total payoffs.
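The round protocol described above can be sketched in code. This is a minimal illustration, not the paper's implementation: the paper does not specify the payoff values or the buyer's selection rule, so the standard Prisoner's Dilemma payoffs (T=5, R=3, P=1, S=0) and a lowest-bid-wins rule are assumptions made here for concreteness.

```python
# Illustrative sketch of one round of the Exchange Game.
# Assumption: standard Prisoner's Dilemma payoffs (T=5, R=3, P=1, S=0);
# the paper does not state the exact values.
PD_PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),  # mutual cooperation (R, R)
    ("cooperate", "defect"):    (0, 5),  # sucker vs. temptation (S, T)
    ("defect",    "cooperate"): (5, 0),  # temptation vs. sucker (T, S)
    ("defect",    "defect"):    (1, 1),  # mutual defection (P, P)
}

def play_round(buyer_lists_rfp, bids, buyer_move, bidder_moves):
    """One round: the buyer may list a request for proposal, the bidders
    bid independently, the buyer picks a single winner, and then buyer
    and winner simultaneously play a one-shot Prisoner's Dilemma."""
    if not buyer_lists_rfp:
        return None  # no transaction this round
    # Assumption: the buyer selects the lowest bid; the paper leaves
    # the selection rule to the buyer agent's learned policy.
    winner = min(bids, key=bids.get)
    buyer_payoff, winner_payoff = PD_PAYOFFS[(buyer_move, bidder_moves[winner])]
    return {"winner": winner, "buyer": buyer_payoff, winner: winner_payoff}

result = play_round(
    buyer_lists_rfp=True,
    bids={"bidder_1": 4, "bidder_2": 6},
    buyer_move="cooperate",
    bidder_moves={"bidder_1": "defect", "bidder_2": "cooperate"},
)
# bidder_1 wins with the lower bid; the buyer cooperates, bidder_1 defects,
# so the buyer receives the sucker payoff and bidder_1 the temptation payoff.
```

In the paper's setting each agent would choose its moves with a reinforcement-learning policy conditioned on its working memory of past rounds; here the moves are passed in directly to keep the round mechanics visible.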
Keywords: Reinforcement Learning, Multiagent System, Memory Size, Total Surplus, Agent Learning