Efficient policy search in low-dimensional embedding spaces by generalizing motion primitives with a parameterized skill memory

Reinhart, René Felix; Steil, Jochen Jakob

doi:10.1007/s10514-014-9417-9

Published: 22 October 2014

Volume 38, pages 331–348, (2015)

Autonomous Robots

Abstract

Motion primitives are an established paradigm to generate complex motions from simpler building blocks. A much less addressed issue is at which level to encode and how to organize a library of motion primitives. Typically, the intrinsic variability of a skill is significantly lower-dimensional than the parameter space of motion primitive models. This paper therefore proposes a parameterized skill memory in a first step, which organizes a set of motion primitives in a low-dimensional, topology-preserving embedding space. The skill memory acts as a pivotal mechanism that links low-dimensional skill parametrization to motion primitive parameters and complete motion trajectories. The skill memory is implemented by means of a dynamical system which features continuous generalization of motion shapes and the multi-directional retrieval of motion primitive parameters from low-dimensional skill parametrizations. The skill parametrization can be predefined or automatically discovered, e.g. by unsupervised dimension reduction techniques. The paper shows that parameterized skill memories achieve excellent generalization of motion shapes from few training examples in several scenarios, including the bi-manual manipulation of a rod with the humanoid robot iCub. In a second step, the low-dimensional and topological skill parametrization is leveraged for efficient, gradient-based policy search. Policy search by generalizing motion shapes from low-dimensional parametrizations is compared to conventional policy search in the parameter space of a motion primitive model. It turns out that the reduced search space accessible through the skill memory significantly accelerates the policy improvement.
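
The benefit of searching in a low-dimensional skill parametrization can be illustrated with a toy sketch. It is not the paper's implementation: the skill memory described above is a dynamical system, whereas the hypothetical `make_toy_skill_memory` below is only a fixed linear decoder, and the quadratic `rollout_cost`, the dimensions (2 skill parameters, 50 primitive parameters), and the finite-difference gradient estimate are all assumptions chosen for illustration. The sketch only counts how many cost evaluations each search needs.

```python
import numpy as np

def make_toy_skill_memory(dim_params=50, dim_skill=2, seed=0):
    """Hypothetical stand-in for a skill memory: a fixed linear decoder
    from skill parameters to motion primitive parameters (assumption)."""
    rng = np.random.default_rng(seed)
    # Orthonormal columns keep the toy problem well conditioned.
    basis, _ = np.linalg.qr(rng.standard_normal((dim_params, dim_skill)))
    offset = rng.standard_normal(dim_params)
    return lambda s: basis @ np.asarray(s, dtype=float) + offset

class CountingCost:
    """Wraps a cost function and counts how often it is evaluated."""
    def __init__(self, fn):
        self.fn = fn
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        return self.fn(x)

def finite_difference_descent(cost, x0, step=0.1, eps=1e-4, iters=100):
    """Gradient descent with a central finite-difference gradient estimate."""
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        grad = np.zeros_like(x)
        for i in range(x.size):
            delta = np.zeros_like(x)
            delta[i] = eps
            grad[i] = (cost(x + delta) - cost(x - delta)) / (2.0 * eps)
        x = x - step * grad
    return x

decode = make_toy_skill_memory()
s_star = np.array([0.7, -0.3])   # assumed optimal skill parameters
target = decode(s_star)          # corresponding primitive parameters

def rollout_cost(theta):
    # Toy cost (assumption): squared distance to the target parameters.
    return float(np.sum((theta - target) ** 2))

cost_low = CountingCost(lambda s: rollout_cost(decode(s)))
cost_full = CountingCost(rollout_cost)

# Search in the 2-dimensional skill space versus the 50-dimensional
# primitive parameter space, starting from the same motion.
s_found = finite_difference_descent(cost_low, np.zeros(2))
theta_found = finite_difference_descent(cost_full, decode(np.zeros(2)))

print("cost evaluations, skill space:    ", cost_low.calls)
print("cost evaluations, parameter space:", cost_full.calls)
```

In this toy setting each gradient estimate takes 2 × 2 = 4 cost evaluations in the skill space and 2 × 50 = 100 in the full parameter space. On a robot every evaluation would be a rollout, which is why the dimension of the search space matters.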

Acknowledgments

The research leading to these results has received funding from the European Community’s 7th Framework Program FP7/2007–2013, Challenge 2 - Cognitive Systems, Interaction, Robotics - under Grant Agreement 248311 - AMARSi.

Author information

Authors and Affiliations

Research Institute for Cognition and Robotics (CoR-Lab), Bielefeld University, Universitätsstr. 25, 33615, Bielefeld, Germany
René Felix Reinhart & Jochen Jakob Steil

Corresponding author

Correspondence to René Felix Reinhart.

About this article

Cite this article

Reinhart, R.F., Steil, J.J. Efficient policy search in low-dimensional embedding spaces by generalizing motion primitives with a parameterized skill memory. Auton Robot 38, 331–348 (2015). https://doi.org/10.1007/s10514-014-9417-9

Received: 02 September 2013
Accepted: 08 October 2014
Published: 22 October 2014
Issue Date: April 2015
DOI: https://doi.org/10.1007/s10514-014-9417-9
