Learning New Basic Movements for Robotics
Acquiring novel skills is one of the most important problems in robotics, and machine learning is a promising approach to the automatic and autonomous acquisition of movement policies. However, this requires both an appropriate policy representation and suitable learning algorithms. Employing the most recent form of the dynamical systems motor primitives originally introduced by Ijspeert et al., we show how both discrete and rhythmic tasks can be learned using a combination of imitation and reinforcement learning, and present our current best-performing learning algorithms. Finally, we show that it is possible to include a start-up phase in rhythmic primitives. We apply our approach to two elementary movements, Ball-in-a-Cup and Ball-Paddling, which can be learned on a real Barrett WAM robot arm at a pace similar to human learning.
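To make the primitive representation concrete, the following is a minimal sketch of a single discrete dynamical-systems motor primitive in the Ijspeert-style formulation: a spring-damper transformation system pulled toward the goal, modulated by a learnable forcing term driven by a decaying canonical phase. The function name, gain values, and basis-width heuristic are illustrative assumptions, not the implementation used in the paper; in imitation learning the weights `w` would be fitted to a demonstration, and reinforcement learning would then refine them.

```python
import numpy as np

def dmp_rollout(w, y0, g, tau=1.0, dt=0.001,
                alpha_z=25.0, beta_z=6.25, alpha_x=8.0):
    """Integrate one discrete motor primitive and return the trajectory y(t).

    w  : weights of the Gaussian basis functions shaping the forcing term
    y0 : start position, g : goal position, tau : movement duration
    """
    n_basis = len(w)
    # Basis centers spread over the canonical phase x in (0, 1].
    c = np.exp(-alpha_x * np.linspace(0, 1, n_basis))
    h = n_basis ** 1.5 / c  # widths; a common heuristic choice
    x, y, z = 1.0, y0, 0.0  # phase, position, scaled velocity
    traj = [y]
    for _ in range(int(tau / dt)):
        psi = np.exp(-h * (x - c) ** 2)
        # Forcing term: normalized basis activation, scaled by phase
        # and movement amplitude so it vanishes as x -> 0.
        f = (psi @ w) / (psi.sum() + 1e-10) * x * (g - y0)
        # Transformation system: critically damped spring toward g.
        z += dt / tau * (alpha_z * (beta_z * (g - y) - z) + f)
        y += dt / tau * z
        # Canonical system: phase decays from 1 toward 0.
        x += dt / tau * (-alpha_x * x)
        traj.append(y)
    return np.array(traj)
```

With zero weights the forcing term vanishes and the primitive reduces to a point attractor, which is why such primitives are guaranteed to converge to the goal regardless of the learned shape parameters.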
References
- 1. A. J. Ijspeert, J. Nakanishi, and S. Schaal, “Learning attractor landscapes for learning motor primitives,” in Advances in Neural Information Processing Systems (NIPS), 2003.
- 3. S. Schaal, J. Peters, J. Nakanishi, and A. J. Ijspeert, “Learning movement primitives,” in International Symposium on Robotics Research (ISRR), ser. Springer Tracts in Advanced Robotics, 2004, pp. 561–572.
- 4. J. Peters and S. Schaal, “Policy gradient methods for robotics,” in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2006, pp. 2219–2225.
- 5. S. Schaal, J. Peters, J. Nakanishi, and A. J. Ijspeert, “Control, planning, learning, and imitation with dynamic movement primitives,” in Proceedings of the Workshop on Bilateral Paradigms on Humans and Humanoids, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2003.
- 6. F. Guenter, M. Hersch, S. Calinon, and A. Billard, “Reinforcement learning for imitating constrained reaching movements,” Advanced Robotics, Special Issue on Imitative Robots, vol. 21, no. 13, pp. 1521–1544, 2007.
- 7. H. Urbanek, A. Albu-Schäffer, and P. van der Smagt, “Learning from demonstration repetitive movements for autonomous service robotics,” in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2004, pp. 3495–3500.
- 8. R. E. Bellman, Dynamic Programming. Princeton University Press, 1957.
- 10. R. Sutton and A. Barto, Reinforcement Learning: An Introduction. MIT Press, 1998.
- 11. J. Peters and S. Schaal, “Reinforcement learning for operational space control,” in Proceedings of the International Conference on Robotics and Automation (ICRA), 2007.
- 13. C. G. Atkeson, “Using local trajectory optimizers to speed up global optimization in dynamic programming,” in Advances in Neural Information Processing Systems (NIPS), 1994.
- 15. J. Kober and J. Peters, “Policy search for motor primitives in robotics,” in Advances in Neural Information Processing Systems (NIPS), 2008.
- 16. T. Rückstieß, M. Felder, and J. Schmidhuber, “State-dependent exploration for policy gradient methods,” in Proceedings of the European Conference on Machine Learning (ECML), 2008, pp. 234–249.
- 17. Wikipedia, “Ball-in-a-cup,” January 2009. [Online]. Available: http://en.wikipedia.org/wiki/Ball_in_a_cup