Journal of Computational Neuroscience, Volume 6, Issue 3, pp 191–214

A Predictive Reinforcement Model of Dopamine Neurons for Learning Approach Behavior

  • José L. Contreras-Vidal
  • Wolfram Schultz


A neural network model of how dopamine and prefrontal cortex activity guides short- and long-term information processing within the cortico-striatal circuits during reward-related learning of approach behavior is proposed. The model predicts two types of reward-related neuronal responses generated during learning: (1) cell activity signaling errors in the prediction of the expected time of reward delivery and (2) neural activations coding for errors in the prediction of the amount and type of reward or stimulus expectancies. The former type of signal is consistent with the responses of dopaminergic neurons, while the latter signal is consistent with reward expectancy responses reported in the prefrontal cortex. It is shown that a neural network architecture that satisfies the design principles of the adaptive resonance theory of Carpenter and Grossberg (1987) can account for the dopamine responses to novelty, generalization, and discrimination of appetitive and aversive stimuli. These hypotheses are scrutinized via simulations of the model in relation to the delivery of free food outside a task, the timed contingent delivery of appetitive and aversive stimuli, and an asymmetric, instructed delay response task.

Neural network, prefrontal, reinforcement learning, striatum, timing




  1. Alexander GE, DeLong MR, Strick PL (1986) Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annu. Rev. Neurosci. 9:357–381.
  2. Apicella P, Ljungberg T, Scarnati E, Schultz W (1991) Responses to reward in monkey dorsal and ventral striatum. Exp. Brain Res. 85:491–500.
  3. Apicella P, Scarnati E, Ljungberg T, Schultz W (1992) Neuronal activity in monkey striatum related to the expectation of predictable environmental events. J. Neurophysiol. 68:945–960.
  4. Bowman EM, Aigner TG, Richmond BJ (1996) Neural signals in the monkey ventral striatum related to motivation for juice and cocaine rewards. J. Neurophysiol. 75:1061–1073.
  5. Brown RG, Marsden CD (1990) Cognitive function in Parkinson's disease: From description to theory. Trends in Neurosci. 13:21–29.
  6. Brown VJ, Schwarz U, Bowman EM, Fuhr P, Robinson DL, Hallett M (1993) Dopamine dependent reaction time deficits with Parkinson's disease are task specific. Neuropsychologia 31:459–469.
  7. Buonomano DV, Mauk MD (1994) Neural network model of the cerebellum: Temporal discrimination and the timing of motor responses. Neural Computation 6:38–55.
  8. Calabresi P, Maj R, Pisani A, Mercuri NB, Bernardi G (1992) Long-term synaptic depression in the striatum: Physiological and pharmacological characterization. J. Neurosci. 12:4224–4233.
  9. Canavan AGM, Passingham RE, Marsden CD, Quinn N, Wyke M, Polkey CE (1989) The performance on learning tasks of patients in the early stages of Parkinson's disease. Neuropsychologia 27:141–156.
  10. Carpenter GA (1997) Distributed learning, recognition, and prediction by ART and ARTMAP neural networks. Neural Networks 10:1473–1494.
  11. Carpenter GA, Grossberg S (1987) ART 2: Self-organization of stable category recognition codes for analog input patterns. Applied Optics 26:4919–4930.
  12. Carpenter GA, Grossberg S (1990) ART-3: Hierarchical search using chemical transmitters in self-organizing pattern recognition architectures. Neural Networks 3:129–152.
  13. Contreras-Vidal JL, Schultz W (1996) A neural network model of reward-related learning, motivation and orienting behavior (Abstract). Soc. Neurosci. Abs. 22:2029.
  14. Contreras-Vidal JL, Stelmach GE (1995) A neural model of basal ganglia-thalamocortical relations in normal and parkinsonian movement. Biol. Cybern. 73:467–476.
  15. Eblen F, Graybiel AM (1995) Highly restricted origin of prefrontal cortical inputs to striosomes in the macaque monkey. J. Neurosci. 15:5999–6013.
  16. Fiala JC, Grossberg S, Bullock D (1996) Metabotropic glutamate receptor activation in cerebellar Purkinje cells as substrate for adaptive timing of the classically conditioned eye-blink response. J. Neurosci. 16:3760–3774.
  17. Gaffan D, Murray EA, Fabre-Thorpe M (1993) Interaction of the amygdala with the frontal lobe in reward memory. Eur. J. Neurosci. 5:968–975.
  18. Gariano RF, Groves PM (1988) Burst firing induced in midbrain dopamine neurons by stimulation of the medial prefrontal and anterior cingulate cortices. Brain Res. 462:194–198.
  19. Gaspar P, Stepniewska I, Kaas J (1992) Topography and collateralization of the dopaminergic projections to motor and lateral prefrontal cortex in Owl monkeys. J. Comp. Neurol. 325:1–21.
  20. Gerfen CR (1989) The neostriatal mosaic: Striatal patch-matrix organization is related to cortical lamination. Science 246:385–388.
  21. Gerfen CR (1992) The neostriatal mosaic: Multiple levels of compartmental organization in the basal ganglia. Annu. Rev. Neurosci. 15:285–320.
  22. Gerfen CR, Herkenham M, Thibault J (1987) The neostriatal mosaic. II. Patch- and matrix-directed mesostriatal dopaminergic and nondopaminergic systems. J. Neurosci. 7:3935–3944.
  23. Goldman-Rakic PS, Porrino LJ (1985) The primate mediodorsal (MD) nucleus and its projection to the frontal lobe. J. Comp. Neurol. 242:535–560.
  24. Gotham AM, Brown RG, Marsden CD (1988) “Frontal” cognitive function in patients with Parkinson's disease “on” and “off” levodopa. Brain 111:299–321.
  25. Grace AA, Bunney BS (1985) Opposing effects of striatonigral feedback pathways on midbrain dopamine cell activity. Brain Res. 333:271–284.
  26. Graveland GA, DiFiglia M (1985) The frequency and distribution of medium-sized neurons with indented nuclei in the primate and rodent neostriatum. Brain Res. 327:307–311.
  27. Graybiel AM (1990) Neurotransmitters and neuromodulators in the basal ganglia. Trends in Neurosci. 13:244–254.
  28. Groenewegen HJ (1988) Organization of the afferent connections of the mediodorsal thalamic nucleus in the rat, related to the mediodorsal-prefrontal topography. Neuroscience 24:379–431.
  29. Grossberg S, Merrill JWL (1992) A neural network model of adaptively timed reinforcement learning and hippocampal dynamics. Cogn. Brain Res. 1:3–38.
  30. Haber SN, Lynd E, Klein C, Groenewegen HJ (1990) Topographic organization of the ventral striatal efferent projections in the rhesus monkey: An anterograde tracing study. J. Comp. Neurol. 293:282–298.
  31. Hodgkin AL, Huxley AF (1952) A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 117:500–544.
  32. Hollerman J, Schultz W (1996) Activity of dopamine neurons during learning in a familiar task context (Abstract). Soc. Neurosci. Abs. 22:1388.
  33. Hoover JE, Strick PL (1993) Multiple output channels in the basal ganglia. Science 259:819–821.
  34. Houk JC, Adams JL, Barto AG (1995) A model of how basal ganglia generate and use neural signals that predict reinforcement. In: JC Houk, JL Davis, DG Beiser, eds. Models of Information Processing in the Basal Ganglia. MIT Press, Cambridge, MA. pp. 249–270.
  35. Ivry RB (1996) The representation of temporal information in perception and motor control. Current Opinion in Neurobiology 6:851–857.
  36. Jaeger D, Kita H, Wilson CJ (1994) Surround inhibition among projection neurons is weak or nonexistent in the rat neostriatum. J. Neurophysiol. 72:2555–2558.
  37. Jimenez-Castellanos J, Graybiel AM (1987) Subdivisions of the dopamine-containing A8-A9-A10 complex identified by their differential mesostriatal innervation of striosomes and extrastriosomal matrix. Neuroscience 23:223–242.
  38. Jimenez-Castellanos J, Graybiel AM (1989) Evidence that histochemically distinct zones of the primate substantia nigra pars compacta are related to patterned distributions of nigrostriatal projection neurons and striatonigral fibers. Exp. Brain Res. 74:227–238.
  39. Jueptner M, Rijntjes M, Weiller C, Faiss JH, Timmann D, Mueller SP, Diener HC (1995) Localization of a cerebellar timing process using PET. Neurology 45:1540–1545.
  40. Knowlton BJ, Mangels JA, Squire LR (1996) A neostriatal habit learning system in humans. Science 273:1399–1402.
  41. Kornhuber J, Kim J-S, Kornhuber ME, Kornhuber HH (1984) The cortico-nigral projection: Reduced glutamate content in the substantia nigra following frontal cortex ablation in the rat. Brain Res. 322:124–126.
  42. Kötter R, Wickens J (1995) Interactions of glutamate and dopamine in a computational model of the striatum. J. Computational Neuroscience 2:195–214.
  43. Künzle H (1978) An autoradiographic analysis of the efferent connections from premotor and adjacent prefrontal regions (Areas 6 and 9) in Macaca fascicularis. Brain Behav. Evol. 15:185–234.
  44. Levine DS, Prueitt PS (1989) Modeling some effects of frontal lobe damage: Novelty and perseveration. Neural Networks 2:103–116.
  45. Linden A, Bracke-Tolkmitt R, Lutzenberger W, Canavan AGM, Scholz E, Diener H-C, Birbaumer N (1990) Slow cortical potentials in parkinsonian patients during the course of an associative learning task. J. Psychophysiol. 4:145–162.
  46. Ljungberg T, Apicella P, Schultz W (1992) Responses of monkey dopamine neurons during learning of behavioral reactions. J. Neurophysiol. 67:145–163.
  47. Maricq AV, Church RM (1983) The differential effects of haloperidol and methamphetamine on time estimation in the rat. Psychopharmacology 79:10–15.
  48. Meck WH (1996) Neuropharmacology of timing and time perception. Cogn. Brain Res. 3:227–242.
  49. Milner B (1963) Effects of different brain lesions on card sorting. Arch. Neurol. 9:90–100.
  50. Milner B (1964) Some effects of frontal lobectomy in man. In: J Warren, K Akert, eds. The Frontal Granular Cortex and Behavior. McGraw-Hill, New York. pp. 313–334.
  51. Mirenowicz J, Schultz W (1994) Importance of unpredictability for reward responses in primate dopamine neurons. J. Neurophysiol. 72:1024–1027.
  52. Mirenowicz J, Schultz W (1996) Preferential activation of midbrain dopamine neurons by appetitive rather than aversive stimuli. Nature 379:449–451.
  53. Montague PR, Dayan P, Sejnowski TJ (1996) A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J. Neurosci. 16:1936–1947.
  54. Murase S, Grenhoff J, Chouvet G, Gonon FG, Svensson TH (1993) Prefrontal cortex regulates burst firing and transmitter release in rat mesolimbic dopamine neurons studied in vivo. Neurosci. Lett. 157:53–56.
  55. Nishino H, Ono T, Fukuda M, Sasaki K, Muramoto KI (1981) Single unit activity in monkey caudate nucleus during operant bar pressing feeding behavior. Neurosci. Lett. 21:105–110.
  56. Parent A, Hazrati L-N (1995) Functional anatomy of the basal ganglia. I. The cortico-basal ganglia-thalamo-cortical loop. Brain Res. Rev. 20:91–127.
  57. Pastor MA, Artieda J, Jahanshahi M, Obeso JA (1992) Time estimation and reproduction is abnormal in Parkinson's disease. Brain 115:211–225.
  58. Perret SP, Ruiz BP, Mauk MD (1993) Cerebellar cortex lesions disrupt learning-dependent timing of conditioned eyelid responses. J. Neurosci. 13:1708–1718.
  59. Plenz D, Aertsen A (1996) Neural dynamics in cortex-striatum cocultures. II. Spatiotemporal characteristics of neuronal activity. Neuroscience 70:893–924.
  60. Porrino LJ, Goldman-Rakic PS (1982) Brainstem innervation of prefrontal and anterior cingulate cortex in the Rhesus monkey revealed by retrograde transport of HRP. J. Comp. Neurol. 205:63–76.
  61. Pucak ML, Grace AA (1994) Regulation of substantia nigra dopamine neurons. Critical Rev. Neurobiol. 9:67–89.
  62. Raijmakers MEJ, van der Maas HLJ, Molenaar PCM (1996) Numerical bifurcation analysis of distance-dependent on-center off-surround shunting neural networks. Biol. Cybern. 75:495–507.
  63. Rebec GV, Curtis SD (1988) Reciprocal zones of excitation and inhibition in the neostriatum. Synapse 2:633–635.
  64. Romo R, Schultz W (1990) Dopamine neurons of the monkey midbrain: Contingencies of responses to active touch during self-initiated arm movements. J. Neurophysiol. 63:592–606.
  65. Russchen FT, Bakst I, Amaral DG, Price JL (1985) The amygdalostriatal projections in the monkey: An anterograde tracing study. Brain Res. 329:241–257.
  66. Sahakian B, Morris R, Evenden J, Heald A, Levy R, Philpot M, Robbins T (1988) A comparative study of visuo-spatial memory and learning in Alzheimer-type dementia and Parkinson's disease. Brain 111:695–718.
  67. Schultz W (1986) Activity of pars reticulata neurons of monkey substantia nigra in relation to motor, sensory, and complex events. J. Neurophysiol. 55:660–677.
  68. Schultz W, Apicella P, Ljungberg T (1993) Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task. J. Neurosci. 13:900–913.
  69. Schultz W, Apicella P, Scarnati E, Ljungberg T (1992) Neuronal activity in monkey ventral striatum related to the expectation of reward. J. Neurosci. 12:4595–4610.
  70. Schultz W, Dayan P, Montague PR (1997) A neural substrate of prediction and reward. Science 275:1593–1598.
  71. Schultz W, Romo R (1990) Dopamine neurons of the monkey midbrain: Contingencies of responses to stimuli eliciting immediate behavioral reactions. J. Neurophysiol. 63:607–624.
  72. Sherman SM, Guillery RW (1996) Functional organization of thalamocortical relays. J. Neurophysiol. 76:1367–1395.
  73. Smith Y, Charara A, Parent A (1996) Synaptic innervation of midbrain dopaminergic neurons by glutamate-enriched terminals in the squirrel monkey. J. Comp. Neurol. 364:231–253.
  74. Sprengelmeyer R, Canavan AGM, Lange HW, Homberg V (1995) Associative learning in degenerative neostriatal disorders: Contrasts in explicit and implicit remembering between Parkinson's and Huntington's diseases. Mov. Disorders 10:51–65.
  75. Strick PL, Dum RP, Mushiake H (1995) Basal ganglia “loops” with the cerebral cortex. In: M Kimura, AM Graybiel, eds. Functions of the Cortico-Basal Ganglia Loop. Springer-Verlag, Tokyo. pp. 106–124.
  76. Suri RE, Schultz W (1996) A neural learning model based on the activity of primate dopamine neurons (Abstract). Soc. Neurosci. Abs. 22:1389.
  77. Sutton RS (1988) Learning to predict by the methods of temporal differences. Machine Learning 3:9–44.
  78. Sutton RS, Barto AG (1981) Toward a modern theory of adaptive networks: Expectation and prediction. Psychol. Rev. 88:135–171.
  79. Tong ZY, Overton PG, Clark D (1996) Stimulation of the prefrontal cortex in the rat induces patterns of activity in midbrain dopaminergic neurons which resemble natural burst events. Synapse 22:195–208.
  80. Watanabe M (1990) Prefrontal unit activity during associative learning in the monkey. Exp. Brain Res. 80:296–309.
  81. Watanabe M (1996) Reward expectancy in primate prefrontal neurons. Nature 382:629–632.
  82. Wichmann T, Vitek JL, DeLong MR (1995) Parkinson's disease and the basal ganglia: Lessons from the laboratory and from neurosurgery. Neuroscientist 1:236–244.
  83. Wickens JR, Alexander ME, Miller R (1991) Two dynamic modes of striatal function under dopaminergic-cholinergic control: Simulation and analysis of a model. Synapse 8:1–12.
  84. Wickens JR, Begg AJ, Arbuthnott GW (1996) Dopamine reverses the depression of rat corticostriatal synapses which normally follows high-frequency stimulation of cortex in vitro. Neuroscience 70:1–5.
  85. Young WS III, Alheid GF, Heimer L (1984) The ventral pallidal projection to the mediodorsal thalamus: A study with fluorescent retrograde tracers and immunohisto-fluorescence. J. Neurosci. 4:1626–1638.

Copyright information

© Kluwer Academic Publishers 1999

Authors and Affiliations

  • José L. Contreras-Vidal (1)
  • Wolfram Schultz (2)

  1. Motor Control Laboratory, Arizona State University, Tempe, USA
  2. Institute of Physiology, University of Fribourg, Fribourg, Switzerland
