
Autonomous Robots, Volume 41, Issue 7, pp 1521–1537

Autonomous exploration of motor skills by skill babbling

  • René Felix Reinhart

Abstract

Autonomous exploration of motor skills is a key capability of learning robotic systems. Learning motor skills can be formulated as an inverse modeling problem, which aims at finding an inverse model that maps desired outcomes in some task space, e.g., via-points of a motion, to appropriate actions, e.g., motion control policy parameters. In this paper, autonomous exploration of motor skills is achieved by incrementally learning inverse models starting from an initial demonstration. The algorithm, referred to as skill babbling, features sample-efficient learning and scales to high-dimensional action spaces. Skill babbling extends ideas of goal-directed exploration, which organizes exploration in the space of goals. The proposed approach provides a modular framework for autonomous skill exploration by separating the learning of the inverse model from the exploration mechanism and from a model of achievable targets, i.e., the workspace. The effectiveness of skill babbling is demonstrated for a range of motor tasks comprising the autonomous bootstrapping of inverse kinematics and parameterized motion primitives.
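The goal-directed exploration scheme described in the abstract can be illustrated with a minimal sketch: starting from a single demonstration, an inverse model is fit, nearby targets are sampled from the set of visited outcomes (a crude stand-in for a workspace model), and noisy actions proposed by the current inverse model generate new outcome–action pairs for retraining. This is a toy illustration, not the paper's algorithm: the two-link arm forward model, the linear least-squares inverse model, and all noise scales are assumptions chosen for brevity.

```python
import numpy as np

def forward(q):
    """Toy forward model: 2-link planar arm with unit links.
    Action q = joint angles, outcome = end-effector (x, y)."""
    return np.array([np.cos(q[0]) + np.cos(q[0] + q[1]),
                     np.sin(q[0]) + np.sin(q[0] + q[1])])

rng = np.random.default_rng(0)

# Initial demonstration: one known action and its observed outcome.
actions = [np.array([0.4, 0.6])]
outcomes = [forward(actions[0])]

def fit_inverse(X, Y):
    # Linear least-squares inverse model outcome -> action (with bias term).
    Xb = np.hstack([X, np.ones((len(X), 1))])
    W, *_ = np.linalg.lstsq(Xb, Y, rcond=None)
    return W

def predict(W, x):
    return np.append(x, 1.0) @ W

for _ in range(300):
    X, Y = np.array(outcomes), np.array(actions)
    W = fit_inverse(X, Y)
    # Goal-directed exploration: perturb a previously visited outcome to
    # obtain a nearby target (the visited set approximates the workspace).
    target = X[rng.integers(len(X))] + rng.normal(scale=0.05, size=2)
    # Act with the current inverse model plus exploratory noise in action space.
    q = predict(W, target) + rng.normal(scale=0.02, size=2)
    actions.append(q)
    outcomes.append(forward(q))

# Evaluate reaching accuracy on a reachable goal near the demonstration.
goal = forward(np.array([0.5, 0.5]))
err = np.linalg.norm(forward(predict(fit_inverse(np.array(outcomes),
                                                 np.array(actions)), goal)) - goal)
```

The separation the paper emphasizes is visible even in this sketch: `fit_inverse` (inverse model learning), the target-perturbation step (exploration mechanism), and the set of visited outcomes (workspace model) are independent components that could each be swapped out.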

Keywords

Autonomous exploration · Goal babbling · Motion primitives · Parameterized motor skills · Inverse models

Notes

Acknowledgements

This research and development project is funded by the German Federal Ministry of Education and Research (BMBF) within the Leading-Edge Cluster Competition and managed by the Project Management Agency Karlsruhe (PTKA).


Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Research Institute for Cognition and Robotics – CoR-Lab, Bielefeld University, Bielefeld, Germany
