  • Connections between Computational and Neurobiological Perspectives on Decision Making
  • Published: December 2008

Decision theory, reinforcement learning, and the brain

  • Peter Dayan1 & Nathaniel D. Daw2

Cognitive, Affective, & Behavioral Neuroscience, volume 8, pages 429–453 (2008)


Abstract

Decision making is a core competence for animals and humans acting and surviving in environments they only partially comprehend, gaining rewards and punishments for their troubles. Decision-theoretic concepts permeate experiments and computational models in ethology, psychology, and neuroscience. Here, we review a well-known, coherent Bayesian approach to decision making, showing how it unifies issues in Markovian decision problems, signal detection psychophysics, sequential sampling, and optimal exploration, and we discuss paradigmatic psychological and neural examples of each problem. We discuss computational issues concerning what subjects know about their task and how ambitious they are in seeking optimal solutions; we address algorithmic topics concerning model-based and model-free methods for making choices; and we highlight key aspects of the neural implementation of decision making.
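
The full text sits behind the download link, but the algorithmic contrast the abstract draws, between model-based planning and model-free value learning, can be made concrete. Below is a minimal Python sketch (ours, not the authors'): a model-free agent on a toy two-armed bandit, where the payoff probabilities, learning rate, and epsilon-greedy rule are all illustrative assumptions, and epsilon-greedy stands in as a simple heuristic for the optimal exploration strategies (such as Gittins indices) that the paper actually discusses.

    import random

    # Toy two-armed bandit: arm 0 pays off with probability 0.3, arm 1 with 0.7.
    # (These payoffs, and all parameters below, are illustrative assumptions.)
    PAYOFF = [0.3, 0.7]

    def pull(arm):
        """Sample a binary reward from the chosen arm."""
        return 1.0 if random.random() < PAYOFF[arm] else 0.0

    def run(trials=1000, alpha=0.1, epsilon=0.1):
        """Model-free value learning with epsilon-greedy exploration.

        Q[a] estimates the expected reward of arm a. The update
        Q[a] += alpha * (r - Q[a]) is the delta rule (temporal-difference
        learning collapsed to a one-step problem); the prediction error
        r - Q[a] is the signal this literature links to phasic dopamine.
        """
        Q = [0.0, 0.0]
        for _ in range(trials):
            if random.random() < epsilon:             # occasionally explore
                a = random.randrange(2)
            else:                                     # otherwise exploit
                a = max(range(2), key=lambda i: Q[i])
            r = pull(a)
            Q[a] += alpha * (r - Q[a])                # prediction-error update
        return Q

    print(run())  # estimates should approach the true payoffs [0.3, 0.7]

Swapping the epsilon-greedy line for a value-of-information rule, or learning a transition model and planning over it, would move this sketch from the model-free toward the model-based end of the spectrum the abstract describes.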

Author information

Authors and Affiliations

  1. Gatsby Computational Neuroscience Unit, University College London, Room 407, Alexandra House, 17 Queen Square, London WC1N 3AR, England

    Peter Dayan

  2. Center for Neural Science, Department of Psychology, and Center for Neuroeconomics, New York University, 4 Washington Place, New York, NY 10003

    Nathaniel D. Daw

Corresponding authors

Correspondence to Peter Dayan or Nathaniel D. Daw.

Additional information

Funding came from the Gatsby Charitable Foundation (to P.D.).

About this article

Cite this article

Dayan, P., Daw, N.D. Decision theory, reinforcement learning, and the brain. Cognitive, Affective, & Behavioral Neuroscience 8, 429–453 (2008). https://doi.org/10.3758/CABN.8.4.429

  • Received: 19 March 2008

  • Accepted: 24 June 2008

  • Issue Date: December 2008

  • DOI: https://doi.org/10.3758/CABN.8.4.429

Keywords

  • Belief State
  • Markov Decision Problem
  • Lateral Intraparietal Area
  • Temporal Difference Model
  • Affective State