Adaptive Optimal Feedback Control with Learned Internal Dynamics Models

Mitrovic, Djordje; Klanke, Stefan; Vijayakumar, Sethu

doi:10.1007/978-3-642-05181-4_4

Part of the book series: Studies in Computational Intelligence (SCI, volume 264)

Abstract

Optimal Feedback Control (OFC) has been proposed as an attractive movement generation strategy in goal-reaching tasks for anthropomorphic manipulator systems. Recent developments, such as the Iterative Linear Quadratic Gaussian (ILQG) algorithm, have focused on the case of non-linear, but still analytically available, dynamics. For realistic control systems, however, the dynamics may often be unknown, difficult to estimate, or subject to frequent systematic changes. In this chapter, we combine the ILQG framework with learning the forward dynamics for simulated arms that exhibit large redundancies, both in kinematics and in actuation. We demonstrate how our approach can compensate for complex dynamic perturbations in an online fashion. The specific adaptive framework introduced lends itself to a computationally more efficient implementation of the ILQG optimisation without sacrificing control accuracy, allowing the method to scale to large DoF systems.
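
The idea of planning with a learned rather than an analytic forward model can be illustrated with a minimal sketch. It is not the chapter's implementation: a linear least-squares fit stands in for the learned forward dynamics, a single finite-horizon LQR backward pass stands in for the ILQG iteration, and the plant (`true_step`, a damped unit point mass), the cost weights, and all function names are assumptions chosen only for illustration.

```python
import numpy as np

def true_step(x, u, dt=0.01, damping=0.5):
    # Hypothetical plant, treated as unknown by the controller:
    # a unit point mass with viscous damping, state x = [position, velocity].
    pos, vel = x
    acc = u[0] - damping * vel
    return np.array([pos + dt * vel, vel + dt * acc])

def fit_forward_model(X, U, X_next):
    # Assumption: an affine least-squares fit stands in for the learned
    # forward dynamics model x_next = f(x, u).
    Z = np.hstack([X, U, np.ones((len(X), 1))])
    W, *_ = np.linalg.lstsq(Z, X_next, rcond=None)
    return lambda x, u: np.concatenate([x, u, [1.0]]) @ W

def linearize(model, x, u, eps=1e-5):
    # Central finite differences of the learned model around (x, u),
    # giving the local matrices A = df/dx and B = df/du.
    n, m = len(x), len(u)
    A, B = np.zeros((n, n)), np.zeros((n, m))
    for i in range(n):
        d = np.zeros(n)
        d[i] = eps
        A[:, i] = (model(x + d, u) - model(x - d, u)) / (2 * eps)
    for j in range(m):
        d = np.zeros(m)
        d[j] = eps
        B[:, j] = (model(x, u + d) - model(x, u - d)) / (2 * eps)
    return A, B

def lqr_gains(A, B, Q, R, Qf, T):
    # Backward Riccati recursion; returns time-ordered feedback gains K_t.
    P, gains = Qf, []
    for _ in range(T):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]

def learn_model(n_samples=200, seed=0):
    # Collect random transitions from the plant and fit the forward model.
    rng = np.random.default_rng(seed)
    X = rng.uniform(-2.0, 2.0, size=(n_samples, 2))
    U = rng.uniform(-5.0, 5.0, size=(n_samples, 1))
    X_next = np.array([true_step(x, u) for x, u in zip(X, U)])
    return fit_forward_model(X, U, X_next)

def reach(model, target=np.array([1.0, 0.0]), T=300):
    # Plan on the learned model, then execute the feedback law on the plant.
    A, B = linearize(model, target, np.zeros(1))
    Q, R = np.diag([1.0, 0.1]), 0.01 * np.eye(1)
    gains = lqr_gains(A, B, Q, R, Qf=100.0 * Q, T=T)
    x = np.zeros(2)
    for K in gains:
        u = -K @ (x - target)   # feedback on the deviation from the goal
        x = true_step(x, u)
    return x

model = learn_model()
final_state = reach(model)
```

In this sketch the controller never reads the plant equations directly; it only queries `model`, so refitting the model on fresh transitions after a change in the plant would change the gains without touching the planner.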

Author information

Authors and Affiliations

School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh, EH8 9AB, United Kingdom
Djordje Mitrovic, Stefan Klanke & Sethu Vijayakumar

Editor information

Editors and Affiliations

Institut des Systèmes Intelligents et de Robotique (CNRS UMR 7222), Université Pierre et Marie Curie Pyramide, Tour 55 Boîte courrier 173, 4 Place Jussieu, 75252, PARIS cedex 05, France
Olivier Sigaud
Dept. Schölkopf, Max-Planck Institute for Biological Cybernetics, Spemannstraße 38, Rm 223, 72076, Tübingen, Germany
Jan Peters

About this chapter

Cite this chapter

Mitrovic, D., Klanke, S., Vijayakumar, S. (2010). Adaptive Optimal Feedback Control with Learned Internal Dynamics Models. In: Sigaud, O., Peters, J. (eds) From Motor Learning to Interaction Learning in Robots. Studies in Computational Intelligence, vol 264. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05181-4_4

DOI: https://doi.org/10.1007/978-3-642-05181-4_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05180-7
Online ISBN: 978-3-642-05181-4
eBook Packages: Engineering, Engineering (R0)
