Model-Based and Model-Free Mechanisms of Human Motor Learning

Conference paper
Part of the Advances in Experimental Medicine and Biology book series (volume 782)

Abstract

Motor learning can be framed theoretically as a problem of optimizing a movement policy in a potentially uncertain or changing environment. This is precisely the general problem studied in the field of reinforcement learning. Reinforcement learning theory proposes two distinct approaches to solving this problem. Model-based approaches first identify the dynamics of the task or environment and then use this knowledge to compute the optimal movement policy. Model-free approaches, by contrast, identify successful policies directly through a process of trial and error. Here, we review the existing literature on motor control in light of this distinction. Motor learning research in the last decade has been dominated by studies that elicit learning through adaptation paradigms and find the results to be consistent with a model-based framework. Studies of patient behavior in such adaptation paradigms have implicated the cerebellum as a prime candidate for the neural substrate of the internal models that subserve model-based control. A growing body of experimental results, however, demonstrates that not all motor learning in conventional paradigms can be explained within model-based frameworks, and that the unexplained portion can instead be understood in terms of an additional component of learning driven by model-free reinforcement of successful actions. We conclude that the brain maintains distinct model-based and model-free learning systems, with distinct neural substrates, which act in competitive balance to direct behavior.
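
The distinction between the two approaches can be made concrete with a minimal sketch. It is not taken from the paper: the toy task (a cursor rotated by an unknown angle, so that cursor = aim + rotation), the function names, the learning rates, the candidate aim directions, and the hit tolerance are all assumptions chosen for illustration. The model-based learner estimates the rotation from prediction errors and plans through that estimate; the model-free learner never represents the rotation and simply reinforces aims that hit the target.

```python
# Toy task (an assumption for illustration): a cursor is rotated by an unknown
# angle, so cursor = aim + rotation. The goal is to land the cursor on a target.

def model_based_learner(rotation, target=0.0, learning_rate=0.5, trials=20):
    """Learn an internal estimate of the rotation, then plan through it."""
    rotation_estimate = 0.0
    for _ in range(trials):
        aim = target - rotation_estimate            # policy computed from the model
        predicted_cursor = aim + rotation_estimate  # what the model expects to see
        observed_cursor = aim + rotation            # what the environment returns
        prediction_error = observed_cursor - predicted_cursor
        rotation_estimate += learning_rate * prediction_error
    return target - rotation_estimate               # final aim direction


def model_free_learner(rotation, target=0.0, learning_rate=0.5, trials=20,
                       candidate_aims=(-40.0, -30.0, -20.0, -10.0, 0.0,
                                       10.0, 20.0, 30.0, 40.0),
                       tolerance=5.0):
    """Reinforce aims that hit the target, with no model of the rotation."""
    # Optimistic initial values make a greedy learner try untested aims first.
    values = {aim: 1.0 for aim in candidate_aims}
    for _ in range(trials):
        aim = max(values, key=values.get)           # greedy choice among candidates
        cursor = aim + rotation
        reward = 1.0 if abs(cursor - target) <= tolerance else 0.0
        values[aim] += learning_rate * (reward - values[aim])
    return max(values, key=values.get)              # most valued aim direction


if __name__ == "__main__":
    print(model_based_learner(30.0))
    print(model_free_learner(30.0))
```

With a 30-degree rotation, both learners end up aiming close to -30 degrees, but by different routes: the first corrects an explicit estimate of the perturbation, while the second discards unrewarded aims until one succeeds and then keeps repeating it.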

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Department of Neurology, Johns Hopkins University, Baltimore, USA
  2. Departments of Neurology and Neuroscience, Johns Hopkins University, Baltimore, USA
