Adaptive dynamic programming as a theory of sensorimotor control

Jiang, Yu; Jiang, Zhong-Ping

doi:10.1007/s00422-014-0613-7

Original Paper
Published: 25 June 2014

Volume 108, pages 459–473 (2014)
Biological Cybernetics

Yu Jiang¹ & Zhong-Ping Jiang¹

Abstract

Many characteristics of sensorimotor control can be explained by models based on optimization and optimal control theories. However, most previous models assume that the central nervous system has precise knowledge of the sensorimotor system and of the environment it interacts with. This assumption is difficult to justify theoretically and has not been convincingly validated by experiments. To address this problem, this paper presents a new computational mechanism for sensorimotor control from the perspective of adaptive dynamic programming (ADP), which shares some features with reinforcement learning. The ADP-based model suggests that the command signal for human movement is derived directly from real-time sensory data, without the need to identify the system dynamics. An iterative learning scheme based on the proposed ADP theory is developed, along with a rigorous convergence analysis. The computational model advocated here reproduces the motor learning behavior observed in experiments where a divergent force field or a velocity-dependent force field was present. In addition, this modeling strategy provides a clear way to perform stability analysis of the overall system. Hence, we conjecture that human sensorimotor systems use an ADP-type mechanism to control movements and to adapt successfully to uncertainties in the environment.
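The abstract does not spell out the learning scheme, so the following is only an illustrative sketch of what "deriving the command from data without identifying the dynamics" can look like in the simplest setting. It assumes a hypothetical scalar linear plant dx/dt = a x + b u with quadratic cost, an initial stabilizing gain K0, and a sinusoidal exploration signal; none of these values come from the paper. The learner sees only the measured state and command, and repeatedly solves a least-squares problem for the value weight p and the next feedback gain k. The plant parameters appear only inside the simulator.

```python
import math

# Hypothetical scalar plant dx/dt = a*x + b*u. A_TRUE and B_TRUE are used only
# to simulate "sensory data"; the learner below never reads them.
A_TRUE, B_TRUE = 1.0, 1.0
Q, R = 1.0, 1.0            # weights of the cost: integral of Q*x^2 + R*u^2
K0 = 2.0                   # assumed initial stabilizing gain (a - b*K0 < 0)
DT, WINDOW, N_WINDOWS = 1e-4, 0.1, 40


def collect_data():
    """Run u = -K0*x + exploration once and return per-window statistics."""
    steps = round(WINDOW / DT)
    x, t, data = 1.0, 0.0, []
    for _ in range(N_WINDOWS):
        x_start, ixx, ixu = x, 0.0, 0.0
        for _ in range(steps):
            u = -K0 * x + math.sin(7.0 * t) + math.sin(3.0 * t)
            ixx += x * x * DT                      # integral of x^2
            ixu += x * u * DT                      # integral of x*u
            x += DT * (A_TRUE * x + B_TRUE * u)    # forward-Euler step
            t += DT
        data.append((x * x - x_start * x_start, ixx, ixu))
    return data


def policy_iteration(data, k=K0, iterations=8):
    """Fit (p, k_next) from data by least squares, then repeat with k_next.

    For any gain k, trajectories satisfy
        p*delta - 2*R*k_next*(ixu + k*ixx) = -(Q + R*k^2)*ixx
    where delta is the change of x^2 over a window. The two unknowns are
    found from the 2x2 normal equations, without using a or b.
    """
    p = 0.0
    for _ in range(iterations):
        s11 = s12 = s22 = t1 = t2 = 0.0
        for delta, ixx, ixu in data:
            c1 = delta
            c2 = -2.0 * R * (ixu + k * ixx)
            y = -(Q + R * k * k) * ixx
            s11 += c1 * c1
            s12 += c1 * c2
            s22 += c2 * c2
            t1 += c1 * y
            t2 += c2 * y
        det = s11 * s22 - s12 * s12
        p = (t1 * s22 - t2 * s12) / det
        k = (s11 * t2 - s12 * t1) / det
    return p, k


if __name__ == "__main__":
    p, k = policy_iteration(collect_data())
    print("learned value weight p:", p)
    print("learned feedback gain k:", k)
```

For these assumed values the model-based optimum solves p² - 2p - 1 = 0, so both p and k should settle close to 1 + √2, up to the small error from the Euler discretization. The same recorded data is reused at every iteration, which is why a single exploratory run is enough in this toy case.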



Acknowledgments

We would like to thank the Editor and the anonymous reviewers for their constructive comments, which helped improve the presentation of this paper.

Author information

Authors and Affiliations

Control and Networks Laboratory, Department of Electrical and Computer Engineering, Polytechnic School of Engineering, New York University, 5 Metrotech Center, Brooklyn, NY 11201, USA
Yu Jiang & Zhong-Ping Jiang

Corresponding author

Correspondence to Zhong-Ping Jiang.

Additional information

This work has been supported in part by the National Science Foundation Grants DMS-0906659, ECCS-1101401, and ECCS-1230040.

About this article

Cite this article

Jiang, Y., Jiang, ZP. Adaptive dynamic programming as a theory of sensorimotor control. Biol Cybern 108, 459–473 (2014). https://doi.org/10.1007/s00422-014-0613-7

Received: 27 March 2013
Accepted: 22 May 2014
Published: 25 June 2014
Issue Date: August 2014
DOI: https://doi.org/10.1007/s00422-014-0613-7
