Learning and Tacit Collusion by Artificial Agents in Cournot Duopoly Games

  • Steven O. Kimbrough
  • Ming Lu
  • Frederic Murphy
Part of the International Handbooks on Information Systems book series (INFOSYS)


We examine learning by artificial agents in repeated play of Cournot duopoly games. Our learning model is simple and cognitively realistic. The model departs from standard reinforcement learning models, as applied to agents in games, in that it credits the agent with a form of conceptual ascent, whereby the agent is able to learn from a consideration set of strategies spanning more than one period of play. The resulting behavior is markedly different from the behavior predicted by classical economics for the single-shot (unrepeated) Cournot duopoly game. In repeated play under our learning regime, agents are able to arrive at a tacit form of collusion and set production levels close to those of a monopolist. We note that Cournot duopoly games are reasonable approximations for many real-world arrangements, including hourly spot markets for electricity.
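To make the contrast between the single-shot prediction and the collusive outcome concrete, the following is a minimal sketch of the two benchmarks for a textbook Cournot duopoly. It assumes a linear inverse demand P = a - b(q1 + q2) and a constant marginal cost c shared by both firms; this functional form, the function name `cournot_benchmarks`, and the parameter values in the usage note are illustrative assumptions, not taken from the chapter, and the sketch does not implement the authors' learning model.

```python
def cournot_benchmarks(a, b, c):
    """Benchmark quantities and profits for a symmetric Cournot duopoly.

    Assumed (illustrative) model: inverse demand P = a - b * (q1 + q2),
    constant marginal cost c for both firms, no fixed costs.
    """
    if b <= 0 or a <= c:
        raise ValueError("need b > 0 and a > c for a meaningful market")

    def profit(q_own, q_other):
        # Price cannot fall below zero when total output is very large.
        price = max(a - b * (q_own + q_other), 0.0)
        return (price - c) * q_own

    def best_response(q_other):
        # Maximizer of (a - b*(q + q_other) - c) * q over q >= 0.
        return max((a - c - b * q_other) / (2 * b), 0.0)

    # One-shot Nash equilibrium: each firm best-responds to the other.
    q_nash = (a - c) / (3 * b)
    # Monopoly output, split evenly between the two firms under collusion.
    q_collusive = (a - c) / (4 * b)
    # Most profitable one-period deviation against a colluding rival.
    q_deviate = best_response(q_collusive)

    return {
        "q_nash": q_nash,
        "q_collusive": q_collusive,
        "q_deviate": q_deviate,
        "profit_nash": profit(q_nash, q_nash),
        "profit_collusive": profit(q_collusive, q_collusive),
        "profit_deviate": profit(q_deviate, q_collusive),
    }


result = cournot_benchmarks(a=400.0, b=1.0, c=40.0)
```

With these assumed parameters, each firm produces 120 units in the one-shot Nash equilibrium and 90 units under an even split of the monopoly output, and the collusive split earns each firm more than the Nash outcome. A firm facing a colluding rival would nonetheless earn still more in a single period by deviating, which is why the collusive outcome is not an equilibrium of the unrepeated game and why its emergence under repeated play is notable.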


Keywords (added by machine, not by the authors): Reinforcement Learning, Electricity Market, Future Market, Repeated Game, Spot Market





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Steven O. Kimbrough (1)
  • Ming Lu (2)
  • Frederic Murphy (3)
  1. University of Pennsylvania, Philadelphia, USA
  2. University of Pennsylvania, Philadelphia, USA
  3. Temple University, Philadelphia, USA
