Abstract
This chapter considers addiction from a purely theoretical point of view. It tries to substantiate the idea that addictive behaviour is a natural consequence of abnormal perceptual learning. In short, addictive behaviours emerge when behaviour confounds its own acquisition. Specifically, we consider what would happen if behaviour interfered with the neurotransmitter systems responsible for optimising the conditional certainty or precision of inferences about causal structure in the world. We will pursue this within a rather abstract framework provided by free-energy formulations of action and perception. Although this treatment does not touch upon many of the neurobiological or psychosocial issues in addiction research, it provides a principled framework within which to understand exchanges with the environment and how they can be disturbed. Our focus will be on behaviour as active inference and the key role of prior expectations. These priors play the role of policies in reinforcement learning and place crucial constraints on perceptual inference and subsequent action. A dynamical treatment of these policies suggests a fundamental distinction between fixed-point policies that lead to a single attractive state and itinerant policies that support wandering behavioural orbits among sets of attractive states. Itinerant policies may provide a useful metaphor for many forms of behaviour and, in particular, addiction. Under these sorts of policies, neuromodulatory (e.g., dopaminergic) perturbations can lead to false inference and consequent learning, which produce addictive and perseverative behaviour.
References
Abeles M, Hayon G, Lehmann D (2004) Modeling compositionality by dynamic binding of synfire chains. J Comput Neurosci 17(2):179–201
Ahmed SH, Graupner M, Gutkin B (2009) Computational approaches to the neurobiology of drug addiction. Pharmacopsychiatry 42(Suppl 1):S144–S152
Alcaro A, Huber R, Panksepp J (2007) Behavioral functions of the mesolimbic dopaminergic system: an affective neuroethological perspective. Brain Res Rev 56(2):283–321
Ballard DH, Hinton GE, Sejnowski TJ (1983) Parallel visual computation. Nature 306:21–26
Bellman R (1952) On the theory of dynamic programming. Proc Natl Acad Sci USA 38:716–719
Berke JD, Hyman SE (2000) Addiction, dopamine, and the molecular mechanisms of memory. Neuron 25(3):515–532
Birkhoff GD (1931) Proof of the ergodic theorem. Proc Natl Acad Sci USA 17:656–660
Breakspear M, Stam CJ (2005) Dynamics of a neural system with a multiscale architecture. Philos Trans R Soc Lond B, Biol Sci 360(1457):1051–1074
Bressler SL, Tognoli E (2006) Operational principles of neurocognitive networks. Int J Psychophysiol 60(2):139–148
Bromberg-Martin ES, Hikosaka O (2009) Midbrain dopamine neurons signal preference for advance information about upcoming rewards. Neuron 63:119–126
Coricelli G, Dolan RJ, Sirigu A (2007) Brain, emotion and decision making: the paradigmatic example of regret. Trends Cogn Sci 11(6):258–265
Camerer CF (2003) Behavioural studies of strategic thinking in games. Trends Cogn Sci 7(5):225–231
Chen X, Zelinsky GJ (2006) Real-world visual search is dominated by top-down guidance. Vis Res 46(24):4118–4133
Colliaux D, Molter C, Yamaguchi Y (2009) Working memory dynamics and spontaneous activity in a flip-flop oscillations network model with a Milnor attractor. Cogn Neurodyn 3(2):141–151
Crauel H (1999) Global random attractors are uniquely determined by attracting deterministic compact sets. Ann Mat Pura Appl 176(4):57–72
Crauel H, Flandoli F (1994) Attractors for random dynamical systems. Probab Theory Relat Fields 100:365–393
Davidson TL (1993) The nature and function of interoceptive signals to feed: toward integration of physiological and learning perspectives. Psychol Rev 100(4):640–657
Daw ND, Doya K (2006) The computational neurobiology of learning and reward. Curr Opin Neurobiol 16(2):199–204
Daw ND, O’Doherty JP, Dayan P, Seymour B, Dolan RJ (2006) Cortical substrates for exploratory decisions in humans. Nature 441(7095):876–879
Dayan P, Daw ND (2008) Decision theory, reinforcement learning, and the brain. Cogn Affect Behav Neurosci 8(4):429–453
Dayan P, Hinton GE, Neal RM (1995) The Helmholtz machine. Neural Comput 7:889–904
Dommett E, Coizet V, Blaha CD, Martindale J, Lefebvre V, Walton N, Mayhew JE, Overton PG, Redgrave P (2005) How visual stimuli activate dopaminergic neurons at short latency. Science 307:1476–1479
Eldredge N, Gould SJ (1972) Punctuated equilibria: an alternative to phyletic gradualism. In: Schopf TJM (ed) Models in paleobiology. Freeman, San Francisco, pp 82–115
Evans DJ (2003) A non-equilibrium free energy theorem for deterministic systems. Mol Phys 101:1551–1554
Feynman RP (1972) Statistical mechanics. Benjamin, Reading
Freeman WJ (1994) Characterization of state transitions in spatially distributed, chaotic, nonlinear, dynamical systems in cerebral cortex. Integr Physiol Behav Sci 29(3):294–306
Friston KJ, Tononi G, Reeke GN Jr, Sporns O, Edelman GM (1994) Value-dependent selection in the brain: simulation in a synthetic neural model. Neuroscience 59(2):229–243
Friston KJ (2000) The labile brain. II. Transients, complexity and selection. Phil Trans Biol Sci 355(1394):237–252
Friston K (2005) A theory of cortical responses. Philos Trans R Soc Lond B, Biol Sci 360(1456):815–836
Friston K (2008) Hierarchical models in the brain. PLoS Comput Biol 4(11):e1000211
Friston K, Kilner J, Harrison L (2006) A free energy principle for the brain. J Physiol Paris 100(1–3):70–87
Friston KJ, Daunizeau J, Kiebel SJ (2009) Reinforcement learning or active inference? PLoS ONE 4(7):e6421
Friston KJ, Daunizeau J, Kilner J, Kiebel SJ (2010) Action and behavior: a free-energy formulation. Biol Cybern [Epub ahead of print]
Fiorillo CD, Tobler PN, Schultz W (2003) Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299(5614):1898–1902
Fiorillo CD (2008) Towards a general theory of neural computation based on prediction by single neurons. PLoS ONE 3:e3298
Goto Y, Yang CR, Otani S (2010) Functional and dysfunctional synaptic plasticity in prefrontal cortex: roles in psychiatric disorders. Biol Psychiatry 67(3):199–207
Gregory RL (1968) Perceptual illusions and brain models. Proc R Soc Lond B 171:179–196
Gregory RL (1980) Perceptions as hypotheses. Phil Trans R Soc Lond B 290:181–197
Gros C (2009) Cognitive computation with autonomously active neural networks: an emerging field. Cogn Comput 1:77–99
Haile PA, Hortaçsu A, Kosenok G (2008) On the empirical content of quantal response equilibrium. Am Econ Rev 98:180–200
Haken H (1983) Synergetics: an introduction. Non-equilibrium phase transition and self-organisation in physics, chemistry and biology, 3rd edn. Springer, Berlin
Herrmann JM, Pawelzik K, Geisel T (1999) Self-localization of autonomous robots by hidden representations. Auton Robots 7:31–40
Hinton GE, van Camp D (1993) Keeping neural networks simple by minimising the description length of weights. In: Proceedings of COLT-93, pp 5–13
von Helmholtz H (1866) Concerning the perceptions in general. In: Treatise on physiological optics, vol III, 3rd edn (translated by J.P.C. Southall 1925 Opt Soc Am Section 26, reprinted New York, Dover, 1962)
Henry DJ, White FJ (1995) The persistence of behavioral sensitization to cocaine parallels enhanced inhibition of nucleus accumbens neurons. J Neurosci 15(9):6287–6299
Hull C (1943) Principles of behavior. Appleton/Century-Crofts, New York
Hsu M, Bhatt M, Adolphs R, Tranel D, Camerer CF (2005) Neural systems responding to degrees of uncertainty in human decision-making. Science 310(5754):1680–1683
Jirsa VK, Friedrich R, Haken H, Kelso JA (1994) A theoretical model of phase transitions in the human brain. Biol Cybern 71(1):27–35
Johnson A, van der Meer MA, Redish AD (2007) Integrating hippocampus and striatum in decision-making. Curr Opin Neurobiol 17(6):692–697
Kelley AE, Berridge KC (2002) The neuroscience of natural rewards: relevance to addictive drugs. J Neurosci 22(9):3306–3311
Kersten D, Mamassian P, Yuille A (2004) Object perception as Bayesian inference. Annu Rev Psychol 55:271–304
Khoshbouei H, Wang H, Lechleiter JD, Javitch JA, Galli A (2003) Amphetamine-induced dopamine efflux. A voltage-sensitive and intracellular Na+-dependent mechanism. J Biol Chem 278(14):12070–12077
Knill DC, Pouget A (2004) The Bayesian brain: the role of uncertainty in neural coding and computation. Trends Neurosci 27(12):712–719
Lapish CC, Seamans JK, Chandler LJ (2006) Glutamate-dopamine cotransmission and reward processing in addiction. Alcohol Clin Exp Res 30:1451–1465
Lee TS, Mumford D (2003) Hierarchical Bayesian inference in the visual cortex. J Opt Soc Am A, Opt Image Sci Vis 20:1434–1448
Lee HJ, Youn JM, O MJ, Gallagher M, Holland PC (2006) Role of substantia nigra-amygdala connections in surprise-induced enhancement of attention. J Neurosci 26(22):6077–6081
Liss B, Roeper J (2008) Individual dopamine midbrain neurons: functional diversity and flexibility in health and disease. Brain Res Rev 58(2):314–321
Lodge DJ, Grace AA (2006) The hippocampus modulates dopamine neuron responsivity by regulating the intensity of phasic neuron activation. Neuropsychopharmacology 31:1356–1361
MacKay DM (1956) The epistemological problem for automata. In: Shannon CE, McCarthy J (eds) Automata studies. Princeton University Press, Princeton, pp 235–251
MacKay DJC (1995) Free-energy minimisation algorithm for decoding and cryptanalysis. Electron Lett 31:445–447
Matheron G (1975) Random sets and integral geometry. Wiley, New York
Matsumoto M, Hikosaka O (2009) Two types of dopamine neuron distinctly convey positive and negative motivational signals. Nature 459:837–841
Maturana HR, Varela F (1980) De máquinas y seres vivos. Editorial Universitaria, Santiago. English version: Autopoiesis: the organization of the living, in Maturana, HR, and Varela, FG, Autopoiesis and Cognition. Dordrecht, Netherlands: Reidel
Maynard Smith J (1992) Byte-sized evolution. Nature 355:772–773
McDonald RJ, Ko CH, Hong NS (2002) Attenuation of context-specific inhibition on reversal learning of a stimulus-response task in rats with neurotoxic hippocampal damage. Behav Brain Res 136(1):113–126
McKelvey R, Palfrey T (1995) Quantal response equilibria for normal form games. Games Econ Behav 10:6–38
Montague PR, Dayan P, Person C, Sejnowski TJ (1995) Bee foraging in uncertain environments using predictive Hebbian learning. Nature 377(6551):725–728
Montague PR, Dayan P, Sejnowski TJ (1996) A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J Neurosci 16:1936–1947
Moore CC (1966) Ergodicity of flows on homogeneous spaces. Am J Math 88:154–178
Morris R (1984) Developments of a water-maze procedure for studying spatial learning in the rat. J Neurosci Methods 11(1):47–60
Mumford D (1992) On the computational architecture of the neocortex. II. The role of cortico-cortical loops. Biol Cybern 66:241–251
Nara S (2003) Can potentially useful dynamics to solve complex problems emerge from constrained chaos and/or chaotic itinerancy? Chaos 13(3):1110–1121
Neisser U (1967) Cognitive psychology. Appleton/Century-Crofts, New York
Nestler EJ (2005) Is there a common molecular pathway for addiction? Nat Neurosci 8(11):1445–1449
Niesink RJ, Van Ree JM (1989) Involvement of opioid and dopaminergic systems in isolation-induced pinning and social grooming of young rats. Neuropharmacology 28(4):411–418
Niv Y, Schoenbaum G (2008) Dialogues on prediction errors. Trends Cogn Sci 12(7):265–272
Nowak M, Sigmund K (1993) A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner’s Dilemma game. Nature 364:56–58
O’Keefe J, Dostrovsky J (1971) The hippocampus as a spatial map. Preliminary evidence from unit activity in the freely-moving rat. Brain Res 34(1):171–175
Panksepp J, Siviy S, Normansell L (1984) The psychobiology of play: theoretical and methodological perspectives. Neurosci Biobehav Rev 8(4):465–492
Panksepp J, Knutson B, Burgdorf J (2002) The role of brain emotional systems in addictions: a neuro-evolutionary perspective and new ‘self-report’ animal model. Addiction 97(4):459–469
Pasquale V, Massobrio P, Bologna LL, Chiappalone M, Martinoia S (2008) Self-organization and neuronal avalanches in networks of dissociated cortical neurons. Neuroscience 153(4):1354–1369
Pierce RC, Kalivas PW (1997) A circuitry model of the expression of behavioural sensitization to amphetamine-like psychostimulants. Brain Res Brain Res Rev 25(2):192–216
Porr B, Wörgötter F (2003) Isotropic sequence order learning. Neural Comput 15(4):831–864
Rabinovich M, Huerta R, Laurent G (2008) Neuroscience. Transient dynamics for neural processing. Science 321(5885):48–50
Rao RP, Ballard DH (1999) Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nat Neurosci 2(1):79–87
Redgrave P, Gurney K (2006) The short-latency dopamine signal: a role in discovering novel actions? Nat Rev, Neurosci 7(12):967–975
Redish AD (2004) Addiction as a computational process gone awry. Science 306:1944–1947
Rescorla RA, Wagner AR (1972) A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement. In: Black AH, Prokasy WF (eds) Classical conditioning II: current research and theory. Appleton/Century Crofts, New York, pp 64–99
Robbe D, Buzsáki G (2009) Alteration of theta timescale dynamics of hippocampal place cells by a cannabinoid is associated with memory impairment. J Neurosci 29(40):12597–12605
Salzman CD, Belova MA, Paton JJ (2005) Beetles, boxes and brain cells: neural mechanisms underlying valuation and learning. Curr Opin Neurobiol 15(6):721–729
Schultz W (1998) Predictive reward signal of dopamine neurons. J Neurophysiol 80(1):1–27
Schultz W, Dickinson A (2000) Neuronal coding of prediction errors. Annu Rev Neurosci 23:473–500
Schultz W, Dayan P, Montague PR (1997) A neural substrate of prediction and reward. Science 275:1593–1599
Seip KM, Pereira M, Wansaw MP, Reiss JI, Dziopa EI, Morrell JI (2008) Incentive salience of cocaine across the postpartum period of the female rat. Psychopharmacology 199(1):119–130
Sheynikhovich D, Chavarriaga R, Strösslin T, Arleo A, Gerstner W (2009) Is there a geometric module for spatial orientation? Insights from a rodent navigation model. Psychol Rev 116(3):540–566
Shreve S, Soner HM (1994) Optimal investment and consumption with transaction costs. Ann Appl Probab 4:609–692
Sutton RS, Barto AG (1981) Toward a modern theory of adaptive networks: expectation and prediction. Psychol Rev 88(2):135–170
Takahashi Y, Schoenbaum G, Niv Y (2008) Silencing the critics: understanding the effects of cocaine sensitization on dorsolateral and ventral striatum in the context of an actor/critic model. Front Neurosci 2:86–99
Tani J, Ito M, Sugita Y (2004) Self-organization of distributedly represented multiple behavior schemata in a mirror system: reviews of robot experiments using RNNPB. Neural Netw 17:1273–1289
Thiagarajan TC, Lebedev MA, Nicolelis MA, Plenz D (2010) Coherence potentials: loss-less all-or-none network events in the cortex. PLoS Biol 8(1):e1000278
Todorov E (2006) Linearly-solvable Markov decision problems. In: Scholkopf et al. (ed) Advances in neural information processing systems, vol 19, pp 1369–1376. MIT Press, Cambridge
Traulsen A, Claussen JC, Hauert C (2006) Coevolutionary dynamics in large, but finite populations. Phys Rev E, Stat Nonlinear Soft Matter Phys 74(1 Pt 1):011901
Tschacher W, Haken H (2007) Intentionality in non-equilibrium systems? The functional aspects of self-organised pattern formation. New Ideas Psychol 25:1–15
Tsuda I (2001) Toward an interpretation of dynamic neural activity in terms of chaotic dynamical systems. Behav Brain Sci 24(5):793–810
Tyukin I, van Leeuwen C, Prokhorov D (2003) Parameter estimation of sigmoid superpositions: dynamical system approach. Neural Comput 15(10):2419–2455
Tyukin I, Tyukina T, van Leeuwen C (2009) Invariant template matching in systems with spatiotemporal coding: a matter of instability. Neural Netw 22(4):425–449
van Leeuwen C (2008) Chaos breeds autonomy: connectionist design between bias and baby-sitting. Cogn Process 9(2):83–92
Verschure PF, Voegtlin T, Douglas RJ (2003) Environmentally mediated synergy between perception and behavior in mobile robots. Nature 425:620–624
Watkins CJCH, Dayan P (1992) Q-learning. Mach Learn 8:279–292
Wittmann BC, Bunzeck N, Dolan RJ, Duzel E (2007) Anticipation of novelty recruits reward system and hippocampus while promoting recollection. Neuroimage 38:194–202
Zack M, Poulos CX (2009) Parallel roles for dopamine in pathological gambling and psychostimulant addiction. Curr Drug Abus Rev 2(1):11–25
Zhao Y, Kerscher N, Eysel U, Funke K (2001) Changes of contrast gain in cat dorsal lateral geniculate nucleus by dopamine receptor agonists. Neuroreport 12(13):2939–2945
Zink CF, Pagnoni G, Chappelow J, Martin-Skurski M, Berns GS (2006) Human striatal activation reflects degree of stimulus saliency. Neuroimage 29:977–983
Acknowledgements
This work was funded by the Wellcome Trust. The author thanks Marcia Bennett for help in preparing this manuscript.
Appendices
Appendix A: Parameter Optimisation and Newton’s Method
There is a close connection between the updates implied by Eq. (9.9) and Newton’s method for optimisation. Consider the update under a local linearisation, assuming \(\mathcal{L}_{\varphi}\approx\mathcal{F}_{\varphi}\)
As time proceeds, the change in generalised mean becomes
The first line means the motion cancels itself and becomes zero, while the change in the conditional mean \(\Delta\mu^{(\varphi )} = -\mathcal{L}_{\varphi \varphi} ^{ - 1}\mathcal{L}_{\varphi}\) becomes a classical Newton update. The conditional expectations of the parameters were updated after every simulated exposure using this scheme, as described in Friston (2008).
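The limit described above can be checked numerically. The following Python fragment is a minimal sketch, not a reproduction of Eq. (9.9): it assumes a single scalar parameter, a damping constant (called kappa here) and a second-order flow in which the motion decays while the mean descends the gradient. The function names, the flow and the toy objective cosh(φ − 1) are illustrative assumptions.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(a, terms=18):
    """Matrix exponential via scaling and squaring of a truncated Taylor series."""
    n = len(a)
    norm = max(sum(abs(x) for x in row) for row in a)
    squarings = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scaled = [[x / 2 ** squarings for x in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def generalised_update(grad, hess, mu_dot, kappa, t):
    """Change in (mu, mu') after time t under the local linearisation.

    Assumed flow (hypothetical, for illustration only):
    d(mu)/dt = mu' and d(mu')/dt = -grad - kappa * mu',
    so the Jacobian is J = [[0, 1], [-hess, -kappa]].
    """
    jac = [[0.0, 1.0], [-hess, -kappa]]
    flow = [mu_dot, -grad - kappa * mu_dot]
    # J^{-1} flow from the closed-form 2x2 inverse (det J = hess).
    j_inv_flow = [(-kappa * flow[0] - flow[1]) / hess, flow[0]]
    e = expm([[x * t for x in row] for row in jac])
    e_minus_i = [[e[0][0] - 1.0, e[0][1]], [e[1][0], e[1][1] - 1.0]]
    return [row[0] * j_inv_flow[0] + row[1] * j_inv_flow[1] for row in e_minus_i]


# Toy objective L(phi) = cosh(phi - 1), evaluated at the current mean mu = 0.
grad, hess = math.sinh(-1.0), math.cosh(-1.0)
newton_step = -grad / hess

if __name__ == "__main__":
    for t in (0.5, 5.0, 60.0):
        d_mu, d_mu_dot = generalised_update(grad, hess, mu_dot=0.3, kappa=4.0, t=t)
        print(f"t={t:5.1f}  d_mu={d_mu:.6f}  d_mu'={d_mu_dot:.6f}")
    print(f"Newton step: {newton_step:.6f}")
```

Under these assumptions, the change in the mean approaches the Newton step and the change in the motion approaches minus its current value as the integration time grows, provided the curvature and the damping are both positive so that the linearised flow is stable.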
Appendix B: Simulating Action and Perception
The simulations in this paper involve integrating time-varying states in the environment and the agent. This is the solution to the following ordinary differential equation
To update these states we use a local linearisation: \(\Delta u =(\exp(\Delta t\Im) - I)\Im(t)^{-1}\dot{u}\) over time steps of Δt, where \(\Im = \partial\dot{u} / \partial u\) is evaluated at the current conditional expectation (Friston et al. 2010).
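The local linearisation update can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the code used for the simulations: the helper names are hypothetical, the Jacobian is assumed to be invertible, and the harmonic oscillator used as a test system is chosen only because the update is exact for linear flows.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(a, terms=18):
    """Matrix exponential via scaling and squaring of a truncated Taylor series."""
    n = len(a)
    norm = max(sum(abs(x) for x in row) for row in a)
    squarings = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scaled = [[x / 2 ** squarings for x in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def local_linear_step(flow, jacobian, u, dt):
    """Return u + du, with du = (exp(dt * J) - I) J^{-1} flow(u)."""
    n = len(u)
    jac = jacobian(u)  # J evaluated at the current state
    e = expm([[x * dt for x in row] for row in jac])
    e_minus_i = [[e[i][j] - float(i == j) for j in range(n)] for i in range(n)]
    j_inv_flow = solve(jac, flow(u))
    du = [sum(e_minus_i[i][j] * j_inv_flow[j] for j in range(n)) for i in range(n)]
    return [ui + dui for ui, dui in zip(u, du)]


def oscillator_flow(u):
    """Test system (an assumption for illustration): harmonic oscillator."""
    return [u[1], -u[0]]


def oscillator_jacobian(u):
    """Constant Jacobian of the oscillator flow."""
    return [[0.0, 1.0], [-1.0, 0.0]]


if __name__ == "__main__":
    u = [1.0, 0.0]
    for step in range(4):
        u = local_linear_step(oscillator_flow, oscillator_jacobian, u, math.pi / 2)
        print(step + 1, [round(x, 6) for x in u])
```

For a nonlinear flow the same step would be applied repeatedly with the Jacobian re-evaluated at each new state, so the accuracy then depends on the step size.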
Copyright information
© 2012 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Friston, K. (2012). Policies and Priors. In: Gutkin, B., Ahmed, S. (eds) Computational Neuroscience of Drug Addiction. Springer Series in Computational Neuroscience, vol 10. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-0751-5_9
DOI: https://doi.org/10.1007/978-1-4614-0751-5_9
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-0750-8
Online ISBN: 978-1-4614-0751-5
eBook Packages: Biomedical and Life Sciences (R0)