Wisdom of crowds versus groupthink: learning in groups and in isolation

Abstract

We evaluate the asymptotic performance of boundedly-rational strategies in multi-armed bandit problems, where performance is measured in terms of the tendency (in the limit) to play optimal actions in either (i) isolation or (ii) networks of other learners. We show that, for many strategies commonly employed in economics, psychology, and machine learning, performance in isolation and performance in networks are essentially unrelated. Our results suggest that the performance of many common boundedly-rational strategies depends crucially upon the social context (if any) in which such strategies are employed.
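As a rough illustration of the "isolation" setting described above, the following minimal sketch simulates a single epsilon-greedy learner on a Bernoulli bandit and reports how often it plays the optimal arm. It is not taken from the paper: the function name `epsilon_greedy_run`, the fixed exploration rate `epsilon`, the Bernoulli payoffs, and the finite horizon `steps` are all assumptions made for illustration, whereas the paper's results concern limiting behavior.

```python
import random


def epsilon_greedy_run(success_probs, steps, epsilon, seed=0):
    """Simulate one isolated epsilon-greedy learner on a Bernoulli bandit.

    Hypothetical helper, not the paper's formal definition.
    Returns the fraction of rounds in which the optimal arm was played.
    """
    rng = random.Random(seed)
    n_arms = len(success_probs)
    counts = [0] * n_arms    # number of times each arm has been played
    means = [0.0] * n_arms   # running average payoff observed for each arm
    best_arm = max(range(n_arms), key=lambda a: success_probs[a])
    optimal_plays = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: choose an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: choose the arm with the highest observed average
            # (ties go to the lowest-indexed arm).
            arm = max(range(n_arms), key=lambda a: means[a])

        reward = 1.0 if rng.random() < success_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running average for the played arm.
        means[arm] += (reward - means[arm]) / counts[arm]

        if arm == best_arm:
            optimal_plays += 1

    return optimal_plays / steps


if __name__ == "__main__":
    # Assumed example: two arms with success probabilities 0.4 and 0.6.
    print(epsilon_greedy_run([0.4, 0.6], steps=10_000, epsilon=0.1))
```

With `epsilon=0` the learner in this sketch never explores, so it can lock onto a suboptimal arm indefinitely. A network version would additionally let each learner update `means` from its neighbors' observed plays and payoffs, which is the contrast the abstract draws.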

Author information

Corresponding author

Correspondence to Conor Mayo-Wilson.

About this article

Cite this article

Mayo-Wilson, C., Zollman, K. & Danks, D. Wisdom of crowds versus groupthink: learning in groups and in isolation. Int J Game Theory 42, 695–723 (2013). https://doi.org/10.1007/s00182-012-0329-7

Keywords

  • Bandit problems
  • Networks
  • Reinforcement learning
  • Simulated annealing
  • Epsilon greedy