Autonomous Robots

Volume 27, Issue 2, pp 105–121

A novel method for learning policies from variable constraint data

Authors

  • Matthew Howard
    • Institute of Perception Action and Behaviour, University of Edinburgh
  • Stefan Klanke
    • Institute of Perception Action and Behaviour, University of Edinburgh
  • Michael Gienger
    • Honda Research Institute Europe GmbH
  • Christian Goerick
    • Honda Research Institute Europe GmbH
  • Sethu Vijayakumar
    • Institute of Perception Action and Behaviour, University of Edinburgh
Article

DOI: 10.1007/s10514-009-9129-8

Cite this article as:
Howard, M., Klanke, S., Gienger, M. et al. Auton Robot (2009) 27: 105. doi:10.1007/s10514-009-9129-8

Abstract

Many everyday human skills can be framed in terms of performing some task subject to constraints imposed by the environment. Constraints are usually unobservable and frequently change between contexts. In this paper, we present a novel approach for learning (unconstrained) control policies from movement data, where observations come from movements under different constraints. As a key ingredient, we introduce a small but highly effective modification to the standard risk functional, allowing us to make a meaningful comparison between the estimated policy and constrained observations. We demonstrate our approach on systems of varying complexity, including kinematic data from the ASIMO humanoid robot with 27 degrees of freedom, and present results for learning from human demonstration.
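The setup the abstract describes can be illustrated with a toy sketch (not the paper's implementation; the constraint matrix, policy, and variable names below are illustrative assumptions): an unconstrained policy u(x) is only ever observed after projection into the nullspace of an unobserved constraint A(x)u = 0, so naive regression on the observations is biased, whereas comparing the *projected* estimate against the observations remains consistent.

```python
import numpy as np

def nullspace_projection(A):
    """Projection onto the nullspace of a constraint matrix A,
    N = I - pinv(A) @ A, so that A @ (N @ u) = 0 for any u."""
    return np.eye(A.shape[1]) - np.linalg.pinv(A) @ A

# Toy 2-D policy: move towards the origin (illustrative, not from the paper).
policy = lambda x: -x

x = np.array([1.0, 2.0])
A = np.array([[1.0, 0.0]])   # hypothetical constraint: blocks motion along axis 0
N = nullspace_projection(A)

u_true = policy(x)           # what the demonstrator intends
u_obs = N @ u_true           # what is actually observed under the constraint

print(u_true)                # [-1. -2.]
print(u_obs)                 # [ 0. -2.]  -- observation != true policy output

# Comparing the projected estimate with the observation,
# || u_obs - N @ u_est ||^2, is zero for the true policy even though
# u_est != u_obs, which is the intuition behind the modified risk functional.
u_est = policy(x)
print(np.linalg.norm(u_obs - N @ u_est))   # 0.0
```

A direct least-squares fit to `u_obs` would instead recover the projected (context-specific) behaviour, not the underlying policy.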

Keywords

Direct policy learning · Constrained motion · Imitation · Nullspace control

Supplementary material

  • Video 1: Learning to reach for a ball. (23.5 MB)
  • Video 2: Learning to wash a car. (37.3 MB)

Copyright information

© Springer Science+Business Media, LLC 2009