Psychopharmacology, Volume 191, Issue 3, pp 507–520

Tonic dopamine: opportunity costs and the control of response vigor

  • Yael Niv
  • Nathaniel D. Daw
  • Daphna Joel
  • Peter Dayan
Original Investigation

Abstract

Rationale

Dopamine neurotransmission has long been known to exert a powerful influence over the vigor, strength, or rate of responding. However, there exists no clear understanding of the computational foundation for this effect; predominant accounts of dopamine’s computational function focus on a role for phasic dopamine in controlling the discrete selection between different actions and have nothing to say about response vigor or indeed the free-operant tasks in which it is typically measured.

Objectives

We seek to accommodate free-operant behavioral tasks within the realm of models of optimal control and thereby capture how dopaminergic and motivational manipulations affect response vigor.

Methods

We construct an average reward reinforcement learning model in which subjects choose both which action to perform and also the latency with which to perform it. Optimal control balances the costs of acting quickly against the benefits of getting reward earlier and thereby chooses a best response latency.

Results

In this framework, the long-run average rate of reward plays a key role as an opportunity cost and mediates motivational influences on rates and vigor of responding. We review evidence suggesting that the average reward rate is reported by tonic levels of dopamine putatively in the nucleus accumbens.
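
As an illustration of how an opportunity cost can set response latency, the following minimal sketch makes an assumption that this abstract does not state: acting with latency tau incurs a vigor cost of the form `vigor_cost / tau`, while the time spent is priced at the average reward rate, giving an opportunity cost `reward_rate * tau`. Under that assumed form the total cost is minimized at tau* = sqrt(vigor_cost / reward_rate), so a higher average reward rate implies faster responding. All function and parameter names are hypothetical.

```python
import math


def total_cost(latency, vigor_cost, reward_rate):
    """Cost of one response emitted with the given latency.

    vigor_cost / latency   : ASSUMED cost of acting quickly (grows as latency shrinks)
    reward_rate * latency  : opportunity cost of the elapsed time, priced at
                             the long-run average reward rate
    """
    return vigor_cost / latency + reward_rate * latency


def optimal_latency(vigor_cost, reward_rate):
    """Closed-form minimizer of total_cost under the assumed cost form."""
    if vigor_cost <= 0 or reward_rate <= 0:
        raise ValueError("vigor_cost and reward_rate must be positive")
    return math.sqrt(vigor_cost / reward_rate)


def grid_search_latency(vigor_cost, reward_rate, lo=0.01, hi=20.0, steps=20000):
    """Numerical check: latency on a uniform grid with the lowest total cost."""
    grid = (lo + i * (hi - lo) / steps for i in range(steps + 1))
    return min(grid, key=lambda t: total_cost(t, vigor_cost, reward_rate))


if __name__ == "__main__":
    # Raising the average reward rate shortens the optimal latency.
    for rate in (0.5, 1.0, 2.0, 4.0):
        print(rate, optimal_latency(4.0, rate))
```

With `vigor_cost = 4.0`, raising `reward_rate` from 1.0 to 4.0 halves the optimal latency from 2.0 to 1.0, which is the qualitative link between average reward rate and vigor that the abstract describes.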

Conclusions

Our extension of reinforcement learning models to free-operant tasks unites psychologically and computationally inspired ideas about the role of tonic dopamine in striatum, explaining from a normative point of view why higher levels of dopamine might be associated with more vigorous responding.

Keywords

Dopamine, Motivation, Response rate, Energizing, Reinforcement learning, Free operant

Copyright information

© Springer-Verlag 2006

Authors and Affiliations

  • Yael Niv (1, 2)
  • Nathaniel D. Daw (2)
  • Daphna Joel (3)
  • Peter Dayan (2)

  1. Interdisciplinary Center for Neural Computation, The Hebrew University of Jerusalem, Jerusalem, Israel
  2. Gatsby Computational Neuroscience Unit, University College London, London, UK
  3. Department of Psychology, Tel Aviv University, Tel Aviv, Israel
