Abstract
Decisions often benefit from learned expectations about the sequential structure of the evidence. Here we show that individual differences in this learning process can reflect different implicit assumptions about sequence complexity, leading to performance trade-offs. For a task requiring decisions about dynamic evidence streams, human subjects with more flexible, history-dependent choices (low bias) had greater trial-to-trial choice variability (high variance). In contrast, subjects with more history-independent choices (high bias) were more predictable (low variance). We accounted for these behaviours using models in which assumed complexity was encoded by the size of the hypothesis space over the latent rate of change of the source of evidence. The most parsimonious model used an efficient sampling algorithm in which the range of sampled hypotheses represented an information bottleneck that gave rise to a bias–variance trade-off. This trade-off, which is well known in machine learning, may thus also have broad applicability to human decision-making.
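The trade-off described above can be illustrated with a toy simulation. This is a minimal sketch, not the authors' sampling model: it assumes a learner that estimates a fixed rate of change from counts of observed changes, using a posterior mean over a discrete set of candidate rates. The function names, the candidate sets `narrow` and `wide`, the true rate of 0.1, and the trial and session counts are all hypothetical choices made for illustration.

```python
import math
import random


def posterior_mean_rate(changes, trials, hypotheses):
    """Posterior mean of the change rate over a discrete hypothesis set.

    Assumes a uniform prior over `hypotheses` and a Bernoulli likelihood
    for each trial (a change occurred or it did not).
    """
    log_w = [
        changes * math.log(h) + (trials - changes) * math.log(1 - h)
        for h in hypotheses
    ]
    top = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - top) for lw in log_w]
    return sum(wi * h for wi, h in zip(w, hypotheses)) / sum(w)


def bias_and_variance(true_rate, hypotheses, trials=50, sessions=500, seed=0):
    """Simulate many sessions; return (bias, variance) of the rate estimate."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(sessions):
        changes = sum(rng.random() < true_rate for _ in range(trials))
        estimates.append(posterior_mean_rate(changes, trials, hypotheses))
    mean = sum(estimates) / sessions
    variance = sum((e - mean) ** 2 for e in estimates) / sessions
    return mean - true_rate, variance


# Hypothetical hypothesis spaces: a small one that excludes the true rate,
# and a large one that covers the whole unit interval.
narrow = [0.4, 0.5, 0.6]
wide = [i / 100 for i in range(1, 100)]

narrow_bias, narrow_var = bias_and_variance(0.1, narrow)
wide_bias, wide_var = bias_and_variance(0.1, wide)

print(f"narrow space: bias={narrow_bias:+.3f}, variance={narrow_var:.6f}")
print(f"wide space:   bias={wide_bias:+.3f}, variance={wide_var:.6f}")
```

Under these assumptions, the narrow hypothesis space yields estimates that barely move from session to session but sit far from the true rate, whereas the wide space tracks the true rate more closely at the cost of more session-to-session variability.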
Acknowledgements
We thank G. Kroch and T. Kim for help with data collection and K. Krishnamurthy for comments. Funded by NSF-NCS 1533623. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Author information
Contributions
C.M.G., J.W.K. and J.I.G. designed the experiment; C.M.G. collected and analysed the data and implemented the models; A.L.S.F. implemented the complexity analysis; all five authors interpreted the results and drafted and/or revised the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Figures 1–6.
About this article
Cite this article
Glaze, C.M., Filipowicz, A.L.S., Kable, J.W. et al. A bias–variance trade-off governs individual differences in on-line learning in an unpredictable environment. Nat Hum Behav 2, 213–224 (2018). https://doi.org/10.1038/s41562-018-0297-4
DOI: https://doi.org/10.1038/s41562-018-0297-4