Abstract
Over the last two decades, the model-based approach to analysing functional magnetic resonance imaging (fMRI) data has been adopted across the cognitive neurosciences to study how computations are implemented in the brain. In this time, methods have advanced in both the computational modelling and neuroimaging domains. This chapter provides an introduction to the general method of integrating computational models into fMRI analyses, together with a discussion of contemporary considerations arising from these advances. The chapter begins with an exposition of how qualitative psychological hypotheses are formalised into quantitatively testable and falsifiable computational models. We ground this discussion in examples from the conditioning and reinforcement learning literature, given the origins of model-based fMRI in uncovering the neural correlates of learning processes. We then provide an overview of the methodological approach underlying model-based fMRI. This extends to pragmatic considerations when working in this domain, with an eye towards recent developments in both fMRI and computational modelling, such as multivariate analyses and assessing model quality, respectively. Finally, we provide examples in which the computations described in the first section of the chapter were successfully bridged with fMRI analyses to provide a richer understanding of reinforcement learning in the brain. This chapter is therefore aimed both at cognitive neuroscientists seeking to adapt computational approaches to their neuroimaging research and at those specifically interested in learning and decision-making across levels of analysis.
Notes
1. There are a variety of ways in which \(\lambda ^{US}\) might be set depending on the experimental context. A simple coding would be to define \(\lambda ^{US}\) as 1 on trials in which a reward is presented and 0 otherwise, for example, in experiments where reward magnitudes are not manipulated. Alternatively, the magnitude of the UR elicited by the US may be a useful metric.
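The two codings of \(\lambda ^{US}\) described in this note can be sketched for a hypothetical trial sequence (the outcomes and magnitudes below are illustrative, not from any experiment):

```python
# Two possible codings of lambda^US for a hypothetical sequence of five trials
rewards = [1, 0, 1, 1, 0]               # reward delivered (1) or omitted (0)
magnitudes = [0.8, 0.0, 1.2, 0.5, 0.0]  # hypothetical UR magnitudes on each trial

lam_binary = [float(r) for r in rewards]  # simple binary coding: 1 if rewarded, else 0
lam_magnitude = list(magnitudes)          # magnitude-based coding: lambda^US = UR magnitude
```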
2. It should be noted that the full formulation of the R-W model includes a second learning rate parameter associated with the US to incorporate the assumption that the rate of learning may also depend on the particular US in the experiment (Rescorla, 1972).
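The full R-W update with both a CS-specific salience (\(\alpha\)) and the US-specific learning rate (\(\beta\)) mentioned in this note can be sketched as follows (a minimal illustration; the CS names and parameter values are hypothetical):

```python
def rescorla_wagner_update(V, alpha_cs, beta_us, lam):
    """One trial of the Rescorla-Wagner update with a US-specific learning rate.

    V:        dict mapping each CS to its current associative strength
    alpha_cs: dict mapping each CS present on this trial to its salience (alpha)
    beta_us:  learning-rate parameter associated with the US (beta)
    lam:      asymptote of learning set by the US on this trial (lambda^US)
    """
    total_v = sum(V[cs] for cs in alpha_cs)  # summed strength of the present CSs
    delta = lam - total_v                    # prediction error
    for cs, alpha in alpha_cs.items():
        V[cs] += alpha * beta_us * delta     # delta-V = alpha * beta * (lambda - sum V)
    return V

# A single CS repeatedly paired with reward converges towards lambda^US
V = {"light": 0.0}
for _ in range(100):
    V = rescorla_wagner_update(V, {"light": 0.3}, beta_us=0.5, lam=1.0)
```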
3. Here we ignore some important aspects of the TD framework for simplicity, such as temporal discounting and eligibility traces. References to relevant papers on these aspects are presented in Sect. 7.
4. The term ‘policy’ refers to how an animal or human acts given the situation they face.
5. In the case of two alternatives, the softmax function reduces to a logistic sigmoid function of their difference.
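This equivalence is easy to verify numerically: for two alternatives, the softmax probability of choosing the first option equals the sigmoid of the (inverse-temperature-scaled) value difference. A short sketch, with arbitrary illustrative values:

```python
import numpy as np

def softmax(values, beta=1.0):
    """Softmax choice probabilities over a set of action values."""
    z = beta * np.asarray(values, dtype=float)
    z -= z.max()  # subtract max for numerical stability (does not change the result)
    p = np.exp(z)
    return p / p.sum()

def sigmoid(x):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-x))

v1, v2, beta = 2.0, 1.5, 3.0
p_softmax = softmax([v1, v2], beta)[0]   # P(choose option 1) via softmax
p_sigmoid = sigmoid(beta * (v1 - v2))    # same probability via the sigmoid of the difference
```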
6. It can be important to check whether the parametric regressor is normalised or mean-centred. If not, it can be artificially highly correlated with the corresponding onset regressor, particularly if the parametric variable has only positive values. Some, but not all, common fMRI software packages automatically scale this regressor.
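The collinearity problem described in this note can be demonstrated with a toy simulation. Everything below is hypothetical: the event timing and parametric values are random, and a crude Gaussian bump stands in for a real canonical HRF:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy event-related design: 40 events on a 400-sample timeline
n_scans, n_events = 400, 40
onsets = np.sort(rng.choice(np.arange(10, n_scans - 20), n_events, replace=False))
value = rng.uniform(1.0, 5.0, n_events)  # strictly positive parametric variable

hrf = np.exp(-0.5 * ((np.arange(15) - 5) / 2.0) ** 2)  # crude Gaussian stand-in for an HRF

def build_regressor(weights):
    """Place per-event weights at the onsets and convolve with the toy HRF."""
    stick = np.zeros(n_scans)
    stick[onsets] = weights
    return np.convolve(stick, hrf)[:n_scans]

onset_reg = build_regressor(np.ones(n_events))          # unmodulated onset regressor
raw_pm = build_regressor(value)                          # parametric regressor, NOT mean-centred
centred_pm = build_regressor(value - value.mean())       # parametric regressor, mean-centred

r_raw = np.corrcoef(onset_reg, raw_pm)[0, 1]
r_centred = np.corrcoef(onset_reg, centred_pm)[0, 1]
# The uncentred parametric regressor is far more collinear with the onset
# regressor than the mean-centred one: |r_raw| >> |r_centred|
```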
7. This can be done easily with most standard fMRI software packages, which include functions to estimate the efficiency of the GLM design (Fig. 2c).
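The computation that such package functions perform can be sketched directly, assuming the common trace formulation of design efficiency (the design matrices below are hypothetical random regressors, not real fMRI timeseries):

```python
import numpy as np

def design_efficiency(X, contrast):
    """GLM design efficiency for a contrast vector c:
    eff = 1 / trace(c (X'X)^-1 c'); higher values mean the contrast is
    estimated with lower variance."""
    c = np.atleast_2d(np.asarray(contrast, dtype=float))
    xtx_inv = np.linalg.pinv(X.T @ X)
    return 1.0 / np.trace(c @ xtx_inv @ c.T)

rng = np.random.default_rng(0)
x1 = rng.standard_normal(200)
noise = rng.standard_normal(200)

# Two hypothetical two-regressor designs: uncorrelated vs. highly collinear columns
X_orth = np.column_stack([x1, noise])
X_corr = np.column_stack([x1, 0.9 * x1 + np.sqrt(0.19) * noise])

eff_orth = design_efficiency(X_orth, [0, 1])
eff_corr = design_efficiency(X_corr, [0, 1])
# Collinearity between regressors lowers the efficiency with which the
# second regressor's effect can be estimated: eff_corr < eff_orth
```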
8. In the case of Kahnt et al. (2011), values were computed from a model based on the objective features of the choice alternatives, rather than a model of subjective valuation that depends on parameters fit to participants’ data. In the latter case, it may be advantageous to binarise computationally derived continuous variables, thereby turning the problem from one of regression into one of classification, depending on the reliability of the parameter estimates. Further simulation work is required to test these different approaches.
References
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716–723.
Barto, A. G., Sutton, R. S., & Anderson, C. W. (1983). Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man, and Cybernetics, 13(5), 834–846.
Botvinick, M. M., Niv, Y., & Barto, A. G. (2009). Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective. Cognition, 113(3), 262–280.
Büchel, C., Bornhövd, K., Quante, M., Glauche, V., Bromm, B., & Weiller, C. (2002). Dissociable neural responses related to pain intensity, stimulus intensity, and stimulus awareness within the anterior cingulate cortex: a parametric single-trial laser functional magnetic resonance imaging study. Journal of Neuroscience, 22(3), 970–976.
Büchel, C., Holmes, A., Rees, G., & Friston, K. (1998). Characterizing stimulus–response functions using nonlinear regressors in parametric fMRI experiments. Neuroimage, 8(2), 140–148.
Bush, R. R., & Mosteller, F. (1951). A mathematical model for simple learning. Psychological Review, 58(5), 313.
Caplin, A., & Dean, M. (2008). Axiomatic methods, dopamine and reward prediction error. Current Opinion in Neurobiology, 18(2), 197–202.
Casella, G., & Berger, R. L. (2021). Statistical inference. Cengage Learning.
Chan, S. C., Niv, Y., & Norman, K. A. (2016). A probability distribution over latent causes, in the orbitofrontal cortex. Journal of Neuroscience, 36(30), 7817–7828.
Cohen, J. D., Daw, N., Engelhardt, B., Hasson, U., Li, K., Niv, Y., Norman, K. A., Pillow, J., Ramadge, P. J., Turk-Browne, N. B., et al. (2017). Computational approaches to fMRI analysis. Nature Neuroscience, 20(3), 304–313.
Colas, J. T., Pauli, W. M., Larsen, T., Tyszka, J. M., & O’Doherty, J. P. (2017). Distinct prediction errors in mesostriatal circuits of the human brain mediate learning about the values of both states and actions: Evidence from high-resolution fMRI. PLoS Computational Biology, 13(10), e1005810.
Collins, A. G., & Frank, M. J. (2014). Opponent actor learning (OpAL): Modeling interactive effects of striatal dopamine on reinforcement learning and choice incentive. Psychological Review, 121(3), 337.
Cross, L., Cockburn, J., Yue, Y., & O’Doherty, J. P. (2020). Using deep reinforcement learning to reveal how the brain encodes abstract state-space representations in high-dimensional environments. Neuron, 109(4), 724–738.
Davis, T., LaRocque, K. F., Mumford, J. A., Norman, K. A., Wagner, A. D., & Poldrack, R. A. (2014). What do differences between multi-voxel and univariate analysis mean? how subject-, voxel-, and trial-level variance impact fMRI analysis. Neuroimage, 97, 271–283.
Daw, N. D. (2011). Trial-by-trial data analysis using computational models. Decision Making, Affect, and Learning: Attention and Performance XXIII, 23(1), 3–38.
Daw, N. D., O’Doherty, J. P., Dayan, P., Seymour, B., & Dolan, R. J. (2006). Cortical substrates for exploratory decisions in humans. Nature, 441(7095), 876–879.
Daw, N. D., & Tobler, P. N. (2014). Value learning through reinforcement: the basics of dopamine and reinforcement learning. In Neuroeconomics (pp. 283–298). Elsevier.
Diedrichsen, J., & Kriegeskorte, N. (2017). Representational models: A common framework for understanding encoding, pattern-component, and representational-similarity analysis. PLoS Computational Biology, 13(4), e1005508.
Dolan, R. J., & Dayan, P. (2013). Goals and habits in the brain. Neuron, 80(2), 312–325.
Edelman, S., Grill-Spector, K., Kushnir, T., & Malach, R. (1998). Toward direct visualization of the internal shape representation space by fMRI. Psychobiology, 26(4), 309–321.
Friston, K. J., Holmes, A. P., Price, C., Büchel, C., & Worsley, K. (1999). Multisubject fMRI studies and conjunction analyses. Neuroimage, 10(4), 385–396.
Friston, K. J., Holmes, A. P., Worsley, K. J., Poline, J.-P., Frith, C. D., & Frackowiak, R. S. (1994). Statistical parametric maps in functional imaging: A general linear approach. Human Brain Mapping, 2(4), 189–210.
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2013). Bayesian data analysis. CRC Press.
Gelman, A., Meng, X.-L., & Stern, H. (1996). Posterior predictive assessment of model fitness via realized discrepancies. Statistica Sinica, 6(4), 733–760.
Geman, S., Bienenstock, E., & Doursat, R. (1992). Neural networks and the bias/variance dilemma. Neural Computation, 4(1), 1–58.
Gershman, S. J. (2016). Empirical priors for reinforcement learning models. Journal of Mathematical Psychology, 71, 1–6.
Gittins, J. C., & Jones, D. M. (1979). A dynamic allocation index for the discounted multiarmed bandit problem. Biometrika, 66(3), 561–565.
Gläscher, J. P., & O’Doherty, J. P. (2010). Model-based approaches to neuroimaging: combining reinforcement learning theory with fMRI data. Wiley Interdisciplinary Reviews: Cognitive Science, 1(4), 501–510.
Glaser, J. I., Benjamin, A. S., Chowdhury, R. H., Perich, M. G., Miller, L. E., & Kording, K. P. (2020). Machine learning for neural decoding. eNeuro, 7(4), 1–16.
Güçlü, U., & van Gerven, M. A. (2015). Deep neural networks reveal a gradient in the complexity of neural representations across the ventral stream. Journal of Neuroscience, 35(27), 10005–10014.
Hampton, A. N., Bossaerts, P., & O’Doherty, J. P. (2006). The role of the ventromedial prefrontal cortex in abstract state-based inference during decision making in humans. Journal of Neuroscience, 26(32), 8360–8367.
Hampton, A. N., Bossaerts, P., & O’Doherty, J. P. (2008). Neural correlates of mentalizing-related computations during strategic interactions in humans. Proceedings of the National Academy of Sciences, 105(18), 6741–6746.
Haxby, J. V., Gobbini, M. I., Furey, M. L., Ishai, A., Schouten, J. L., & Pietrini, P. (2001). Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science, 293(5539), 2425–2430.
Haxby, J. V., Gobbini, M. I., & Nastase, S. A. (2020). Naturalistic stimuli reveal a dominant role for agentic action in visual representation. Neuroimage, 216, 116561.
Haynes, J.-D. (2015). A primer on pattern-based approaches to fMRI: principles, pitfalls, and perspectives. Neuron, 87(2), 257–270.
Haynes, J.-D., & Rees, G. (2006). Decoding mental states from brain activity in humans. Nature Reviews Neuroscience, 7(7), 523–534.
Holland, P. C., & Rescorla, R. A. (1975). Second-order conditioning with food unconditioned stimulus. Journal of Comparative and Physiological Psychology, 88(1), 459.
Hull, C. L. (1939). The problem of stimulus equivalence in behavior theory. Psychological Review, 46(1), 9.
Hunt, L. T., Malalasekera, W. N., de Berker, A. O., Miranda, B., Farmer, S. F., Behrens, T. E., & Kennerley, S. W. (2018). Triple dissociation of attention and decision computations across prefrontal cortex. Nature Neuroscience, 21(10), 1471–1481.
Hutcherson, C. A., Bushong, B., & Rangel, A. (2015). A neurocomputational model of altruistic choice and its implications. Neuron, 87(2), 451–462.
Kahnt, T., Heinzle, J., Park, S. Q., & Haynes, J.-D. (2011). Decoding different roles for VMPFC and DLPFC in multi-attribute decision making. Neuroimage, 56(2), 709–715.
Kamin, L. (1969). Predictability, surprise, attention, and conditioning. In B. A. Campbell, & R. M. Church (Eds.), Punishment and aversive behavior (pp. 279–296). New York: Appleton-Century-Crofts.
Khaligh-Razavi, S.-M., & Kriegeskorte, N. (2014). Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Computational Biology, 10(11), e1003915.
Kriegeskorte, N. (2015). Deep neural networks: A new framework for modeling biological vision and brain information processing. Annual Review of Vision Science, 1, 417–446.
Kriegeskorte, N., Goebel, R., & Bandettini, P. (2006). Information-based functional brain mapping. Proceedings of the National Academy of Sciences, 103(10), 3863–3868.
Kriegeskorte, N., & Kievit, R. A. (2013). Representational geometry: integrating cognition, computation, and the brain. Trends in Cognitive Sciences, 17(8), 401–412.
Kriegeskorte, N., Mur, M., & Bandettini, P. A. (2008). Representational similarity analysis-connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience, 2, 4.
Lau, B., & Glimcher, P. W. (2005). Dynamic response-by-response models of matching behavior in rhesus monkeys. Journal of the Experimental Analysis of Behavior, 84(3), 555–579.
Lebreton, M., Bavard, S., Daunizeau, J., & Palminteri, S. (2019). Assessing inter-individual differences with task-related functional neuroimaging. Nature Human Behaviour, 3(9), 897–905.
Lee, M. D., & Wagenmakers, E.-J. (2014). Bayesian cognitive modeling: A practical course. Cambridge University Press.
Mack, M. L., Preston, A. R., & Love, B. C. (2013). Decoding the brain’s algorithm for categorization from its neural implementation. Current Biology, 23(20), 2023–2027.
Marr, D., & Poggio, T. (1976). From understanding computation to understanding neural circuitry.
Miller, R. R., Barnet, R. C., & Grahame, N. J. (1995). Assessment of the Rescorla-Wagner model. Psychological Bulletin, 117(3), 363.
Milosavljevic, M., Malmaud, J., Huth, A., Koch, C., & Rangel, A. (2010). The drift diffusion model can account for value-based choice response times under high and low time pressure. Judgment and Decision Making, 5(6), 437–449.
Montague, P. R., Dayan, P., & Sejnowski, T. J. (1996). A framework for mesencephalic dopamine systems based on predictive Hebbian learning. Journal of Neuroscience, 16(5), 1936–1947.
Mumford, J. A., Davis, T., & Poldrack, R. A. (2014). The impact of study design on pattern estimation for single-trial multivariate pattern analysis. Neuroimage, 103, 130–138.
Mumford, J. A., Poline, J.-B., & Poldrack, R. A. (2015). Orthogonalization of regressors in fMRI models. PloS One, 10(4), e0126255.
Mumford, J. A., Turner, B. O., Ashby, F. G., & Poldrack, R. A. (2012). Deconvolving bold activation in event-related designs for multivoxel pattern classification analyses. Neuroimage, 59(3), 2636–2643.
Myung, I. J. (2003). Tutorial on maximum likelihood estimation. Journal of Mathematical Psychology, 47(1), 90–100.
Naselaris, T., Prenger, R. J., Kay, K. N., Oliver, M., & Gallant, J. L. (2009). Bayesian reconstruction of natural images from human brain activity. Neuron, 63(6), 902–915.
Nastase, S. A., Goldstein, A., & Hasson, U. (2020). Keep it real: Rethinking the primacy of experimental control in cognitive neuroscience. NeuroImage, 222, 117254.
Niv, Y., Daniel, R., Geana, A., Gershman, S. J., Leong, Y. C., Radulescu, A., & Wilson, R. C. (2015). Reinforcement learning in multidimensional environments relies on attention mechanisms. Journal of Neuroscience, 35(21), 8145–8157.
Niv, Y., & Langdon, A. (2016). Reinforcement learning with Marr. Current Opinion in Behavioral Sciences, 11, 67–73.
Niv, Y., & Schoenbaum, G. (2008). Dialogues on prediction errors. Trends in Cognitive Sciences, 12(7), 265–272.
Norman, K. A., Polyn, S. M., Detre, G. J., & Haxby, J. V. (2006). Beyond mind-reading: Multi-voxel pattern analysis of fMRI data. Trends in Cognitive Sciences, 10(9), 424–430.
O’Doherty, J., Dayan, P., Schultz, J., Deichmann, R., Friston, K., & Dolan, R. J. (2004). Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science, 304(5669), 452–454.
O’Doherty, J. P., Cockburn, J., & Pauli, W. M. (2017). Learning, reward, and decision making. Annual Review of Psychology, 68, 73–100.
O’Doherty, J. P., Dayan, P., Friston, K., Critchley, H., & Dolan, R. J. (2003). Temporal difference models and reward-related learning in the human brain. Neuron, 38(2), 329–337.
O’Doherty, J. P., Hampton, A., & Kim, H. (2007). Model-based fMRI and its application to reward learning and decision making. Annals of the New York Academy of Sciences, 1104(1), 35–53.
O’Doherty, J. P., Lee, S., Tadayonnejad, R., Cockburn, J., Iigaya, K., & Charpentier, C. J. (2021). Why and how the brain weights contributions from a mixture of experts. Neuroscience & Biobehavioral Reviews, 123, 14–23.
Palminteri, S., Wyart, V., & Koechlin, E. (2017). The importance of falsification in computational cognitive modeling. Trends in Cognitive Sciences, 21(6), 425–433.
Parr, R., & Russell, S. (1998). Reinforcement learning with hierarchies of machines. Advances in Neural Information Processing Systems, 10, 1043–1049.
Pavlov, I. P., & Anrep, G. V. (1927). Conditioned reflexes: An investigation of the physiological activity of the cerebral cortex (Vol. 3). London: Oxford University Press.
Piray, P., Dezfouli, A., Heskes, T., Frank, M. J., & Daw, N. D. (2019). Hierarchical Bayesian inference for concurrent model fitting and comparison for group studies. PLoS Computational Biology, 15(6), e1007043.
Polyn, S. M., Natu, V. S., Cohen, J. D., & Norman, K. A. (2005). Category-specific cortical activity precedes retrieval during memory search. Science, 310(5756), 1963–1966.
Pouget, A., Dayan, P., & Zemel, R. (2000). Information processing with population codes. Nature Reviews Neuroscience, 1(2), 125–132.
Rescorla, R. A. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In Classical conditioning II: Current research and theory (pp. 64–99).
Rizley, R. C., & Rescorla, R. A. (1972). Associations in second-order conditioning and sensory preconditioning. Journal of Comparative and Physiological Psychology, 81(1), 1.
Rutledge, R. B., Dean, M., Caplin, A., & Glimcher, P. W. (2010). Testing the reward prediction error hypothesis with an axiomatic model. Journal of Neuroscience, 30(40), 13525–13536.
Schoenmakers, S., Barth, M., Heskes, T., & Van Gerven, M. (2013). Linear reconstruction of perceived images from human brain activity. NeuroImage, 83, 951–961.
Schuck, N. W., Cai, M. B., Wilson, R. C., & Niv, Y. (2016). Human orbitofrontal cortex represents a cognitive map of state space. Neuron, 91(6), 1402–1412.
Schultz, W., Dayan, P., & Montague, P. R. (1997). A neural substrate of prediction and reward. Science, 275(5306), 1593–1599.
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6(2), 461–464.
Skinner, B. F. (1963). Operant behavior. American Psychologist, 18(8), 503.
Sonkusare, S., Breakspear, M., & Guo, C. (2019). Naturalistic stimuli in neuroscience: critically acclaimed. Trends in Cognitive Sciences, 23(8), 699–714.
Stephan, K. E., Penny, W. D., Daunizeau, J., Moran, R. J., & Friston, K. J. (2009). Bayesian model selection for group studies. Neuroimage, 46(4), 1004–1017.
Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3(1), 9–44.
Sutton, R. S. (1995). TD models: Modeling the world at a mixture of time scales. In Machine Learning Proceedings 1995 (pp. 531–539). Elsevier.
Sutton, R. S., & Barto, A. G. (1981). Toward a modern theory of adaptive networks: Expectation and prediction. Psychological Review, 88(2), 135.
Sutton, R. S., & Barto, A. G. (1987). A temporal-difference model of classical conditioning. In Proceedings of the Ninth Annual Conference of the Cognitive Science Society (pp. 355–378). Seattle, WA.
Sutton, R. S., & Barto, A. G. (1998). Introduction to reinforcement learning (Vol. 135). Cambridge: MIT Press.
Sutton, R. S., Precup, D., & Singh, S. (1999). Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112(1-2), 181–211.
Thorndike, E. L. (1898). Animal intelligence: an experimental study of the associative processes in animals. The Psychological Review: Monograph Supplements, 2(4), i.
Tolman, E. C. (1948). Cognitive maps in rats and men. Psychological Review, 55(4), 189.
Turner, B. M., Forstmann, B. U., Love, B. C., Palmeri, T. J., & Van Maanen, L. (2017). Approaches to analysis in model-based cognitive neuroscience. Journal of Mathematical Psychology, 76, 65–79.
Wilson, R. C., & Collins, A. G. (2019). Ten simple rules for the computational modeling of behavioral data. Elife, 8, e49547.
Wilson, R. C., & Niv, Y. (2015). Is model fitting necessary for model-based fMRI? PLoS Computational Biology, 11(6), e1004237.
Witten, I. H. (1977). An adaptive optimal controller for discrete-time Markov environments. Information and Control, 34(4), 286–295.
Worsley, K. J., Liao, C. H., Aston, J., Petre, V., Duncan, G., Morales, F., & Evans, A. (2002). A general statistical analysis for fMRI data. Neuroimage, 15(1), 1–15.
© 2024 Springer Nature Switzerland AG
Man, V., O’Doherty, J.P. (2024). Reinforcement Learning. In: Forstmann, B.U., Turner, B.M. (eds) An Introduction to Model-Based Cognitive Neuroscience. Springer, Cham. https://doi.org/10.1007/978-3-031-45271-0_3
Print ISBN: 978-3-031-45270-3
Online ISBN: 978-3-031-45271-0
eBook Packages: Behavioral Science and Psychology (R0)