Neurobiological Fundamentals of Strategy Change: A Core Competence of Companion-Systems

  • Andreas L. Schulz
  • Marie L. Woldeit
  • Frank W. Ohl
Part of the Cognitive Technologies book series (COGTECH)


Companion-Systems interact with users via flexible, goal-directed dialogs. During a dialog, both the user and the Companion-System can identify and communicate their goals iteratively. In that sense, both can be conceptualized as communication partners, each equipped with a processing scheme that produces actions as outputs in response to (1) inputs from the other communication partner and (2) internally represented goals. A very general core competence of communication partners is the capability for strategy change, defined as the modification of action planning under the boundary condition of maintaining a constant goal. Interestingly, the biological fundamentals of this capability are largely unknown. Here we describe a research program that employs an animal model of strategy change to (1) investigate its underlying neuronal mechanisms and (2) describe these mechanisms in an algorithmic syntax suitable for implementation in technical Companion-Systems. It is crucial for this research program that the investigated scenarios be sufficiently complex to contain all relevant aspects of strategy change, yet simple enough to allow the detailed neurophysiological analysis that is obtainable only in animal models. To this end, two forms of strategy change are considered in detail: strategy change caused by modified feature selection, and strategy change caused by modified action assignment.
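The two forms of strategy change named above can be made concrete with a small sketch. It is an illustration only and not the algorithmic description developed in the chapter: the class names `Strategy` and `Agent`, the stimulus features `pitch` and `duration`, and the actions `go` and `nogo` are all hypothetical. The sketch assumes that a strategy consists of one attended stimulus feature plus a mapping from feature values to actions, and that the goal is stored separately so that it stays constant across both kinds of change.

```python
from dataclasses import dataclass, replace
from typing import Mapping

# A stimulus is described by named features (hypothetical example features).
Stimulus = Mapping[str, str]


@dataclass(frozen=True)
class Strategy:
    feature: str                   # which stimulus feature is selected
    action_for: Mapping[str, str]  # feature value -> action

    def act(self, stimulus: Stimulus) -> str:
        # Look up the action assigned to the value of the selected feature.
        return self.action_for[stimulus[self.feature]]


@dataclass(frozen=True)
class Agent:
    goal: str            # internally represented goal, held constant
    strategy: Strategy   # current way of pursuing that goal

    def act(self, stimulus: Stimulus) -> str:
        return self.strategy.act(stimulus)


def change_feature_selection(
    agent: Agent, feature: str, action_for: Mapping[str, str]
) -> Agent:
    """Strategy change by attending to a different stimulus feature."""
    return replace(agent, strategy=Strategy(feature, dict(action_for)))


def reverse_action_assignment(agent: Agent) -> Agent:
    """Strategy change by swapping the two actions (a simple reversal)."""
    mapping = agent.strategy.action_for
    if len(mapping) != 2:
        raise ValueError("reversal is only defined here for two feature values")
    (v1, a1), (v2, a2) = mapping.items()
    swapped = {v1: a2, v2: a1}
    return replace(agent, strategy=replace(agent.strategy, action_for=swapped))


if __name__ == "__main__":
    stimulus = {"pitch": "high", "duration": "short"}
    agent = Agent("obtain reward", Strategy("pitch", {"high": "go", "low": "nogo"}))
    reversed_agent = reverse_action_assignment(agent)
    print(agent.act(stimulus), reversed_agent.act(stimulus), reversed_agent.goal)
```

In both functions only the `strategy` field is replaced, while `goal` is carried over unchanged, which mirrors the definition of strategy change as modified action planning under a constant goal.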



This work was done within the Transregional Collaborative Research Centre SFB/TRR 62 “Companion-Technology for Cognitive Technical Systems” funded by the German Research Foundation (DFG).



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Andreas L. Schulz (1)
  • Marie L. Woldeit (1)
  • Frank W. Ohl (1, 2, 3)
  1. Leibniz Institute for Neurobiology, Magdeburg, Germany
  2. Otto-von-Guericke University, Magdeburg, Germany
  3. Center for Behavioral Brain Sciences (CBBS), Magdeburg, Germany
