International Journal of Game Theory, Volume 42, Issue 3, pp 695–723

Wisdom of crowds versus groupthink: learning in groups and in isolation

  • Conor Mayo-Wilson
  • Kevin Zollman
  • David Danks


We evaluate the asymptotic performance of boundedly-rational strategies in multi-armed bandit problems, where performance is measured in terms of the tendency (in the limit) to play optimal actions in either (i) isolation or (ii) networks of other learners. We show that, for many strategies commonly employed in economics, psychology, and machine learning, performance in isolation and performance in networks are essentially unrelated. Our results suggest that the performance of various, common boundedly-rational strategies depends crucially upon the social context (if any) in which such strategies are to be employed.


Bandit problems · Networks · Reinforcement learning · Simulated annealing · Epsilon greedy
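To make the isolation-versus-network contrast concrete, the following is a minimal sketch of epsilon-greedy learners on a Bernoulli multi-armed bandit. It is an illustration under stated assumptions, not the authors' model: the function name `run_epsilon_greedy`, the fixed exploration rate, the arm probabilities, and the rule that each agent pools its neighbors' observed plays and payoffs with its own are all hypothetical choices made for this example. The sketch reports how often each agent plays the optimal arm over the final tenth of the run, a rough finite-horizon stand-in for the limiting behavior the abstract describes.

```python
import random


def run_epsilon_greedy(probs, neighbors, steps, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy learners on a Bernoulli bandit (illustrative only).

    probs      -- success probability of each arm (assumed values)
    neighbors  -- neighbors[i] lists the agents whose plays and payoffs
                  agent i observes; include i itself
    Returns, per agent, the fraction of optimal-arm plays in the last
    tenth of the rounds.
    """
    rng = random.Random(seed)
    n_agents, n_arms = len(neighbors), len(probs)
    best = max(range(n_arms), key=lambda a: probs[a])
    counts = [[0] * n_arms for _ in range(n_agents)]
    sums = [[0.0] * n_arms for _ in range(n_agents)]
    tail_start = steps - max(1, steps // 10)
    optimal_plays = [0] * n_agents

    for t in range(steps):
        plays = []
        for i in range(n_agents):
            no_data = all(c == 0 for c in counts[i])
            if no_data or rng.random() < epsilon:
                arm = rng.randrange(n_arms)  # explore uniformly at random
            else:
                # Exploit: pick the arm with the highest observed mean payoff
                # (arms never observed are scored 0.0, an assumption).
                means = [sums[i][a] / counts[i][a] if counts[i][a] else 0.0
                         for a in range(n_arms)]
                arm = max(range(n_arms), key=lambda a: means[a])
            reward = 1.0 if rng.random() < probs[arm] else 0.0
            plays.append((arm, reward))
            if t >= tail_start and arm == best:
                optimal_plays[i] += 1
        # After all agents have acted, each agent records what it observed.
        for i in range(n_agents):
            for j in neighbors[i]:
                arm, reward = plays[j]
                counts[i][arm] += 1
                sums[i][arm] += reward

    tail_len = steps - tail_start
    return [k / tail_len for k in optimal_plays]


# One learner in isolation versus four learners in a complete network.
isolated = run_epsilon_greedy([0.4, 0.6], neighbors=[[0]], steps=2000)
networked = run_epsilon_greedy([0.4, 0.6],
                               neighbors=[[0, 1, 2, 3]] * 4, steps=2000)
print("isolated:", isolated)
print("networked:", networked)
```

A single simulated run like this cannot establish any asymptotic claim; it only shows how the same update rule can be placed in two social contexts by changing the `neighbors` argument.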





Copyright information

© Springer-Verlag 2012

Authors and Affiliations

  1. Department of Philosophy, Baker Hall 135, Carnegie Mellon University, Pittsburgh, USA
  2. Department of Philosophy, Baker Hall 155D, Carnegie Mellon University, Pittsburgh, USA
  3. Department of Philosophy, Baker Hall 161E, Carnegie Mellon University, Pittsburgh, USA
