Journal of Intelligent & Robotic Systems, Volume 83, Issue 3, pp 393–408

Contextual Policy Search for Linear and Nonlinear Generalization of a Humanoid Walking Controller

  • Abbas Abdolmaleki
  • Nuno Lau
  • Luis Paulo Reis
  • Jan Peters
  • Gerhard Neumann

DOI: 10.1007/s10846-016-0347-y

Cite this article as:
Abdolmaleki, A., Lau, N., Reis, L.P. et al. J Intell Robot Syst (2016) 83: 393. doi:10.1007/s10846-016-0347-y

Abstract

We investigate the learning of flexible robot locomotion controllers, i.e., controllers that are applicable to multiple contexts, such as different walking speeds, various terrain slopes, or other physical properties of the robot. In our experiments, the context is the desired linear walking speed of the gait. Current approaches for learning the control parameters of biped locomotion controllers are typically applicable only to a single context, for example to learn a gait with the highest speed, the lowest energy consumption, or a combination of both. The question of our research is: how can we obtain a flexible walking controller that controls the robot (near) optimally for many different contexts? We achieve the desired flexibility by applying the recently developed contextual relative entropy policy search (REPS) method, which generalizes the robot walking controller to different contexts, where a context is described by a real-valued vector. In this paper we also extend the contextual REPS algorithm to learn a nonlinear policy over the contexts instead of a linear one; we call this extension RBF-REPS, as it uses radial basis functions. To validate our method, we perform three simulation experiments, including a walking experiment using a simulated NAO humanoid robot. The robot learns a policy to choose the controller parameters for a continuous set of forward walking speeds.
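
The difference between a linear and a nonlinear policy over contexts can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy target, the RBF centers and bandwidth, and the ridge term are all assumptions made for illustration. The sample weights, which contextual REPS would compute from the observed returns, are simply set to one here. The sketch fits only the mean of the policy, once with linear features of the context and once with radial basis function features.

```python
import numpy as np

def linear_features(s):
    """phi(s) = [1, s]: the policy mean is linear in the context."""
    s = np.asarray(s, dtype=float).reshape(len(s), -1)
    return np.hstack([np.ones((s.shape[0], 1)), s])

def rbf_features(s, centers, bandwidth):
    """phi(s) = [1, exp(-||s - c_k||^2 / (2 * bandwidth^2))] for each center c_k."""
    s = np.asarray(s, dtype=float).reshape(len(s), -1)
    sq_dist = ((s[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.hstack([np.ones((s.shape[0], 1)), np.exp(-0.5 * sq_dist / bandwidth**2)])

def fit_policy_mean(phi, theta, weights, ridge=1e-6):
    """Weighted ridge regression from context features to controller parameters."""
    weighted_phi_t = phi.T * weights  # scales each sample (column) by its weight
    A = weighted_phi_t @ phi + ridge * np.eye(phi.shape[1])
    return np.linalg.solve(A, weighted_phi_t @ theta)

def rmse(phi, W, theta):
    """Root mean squared error of the fitted mean on the samples."""
    return float(np.sqrt(np.mean((phi @ W - theta) ** 2)))

# Hypothetical data: one controller parameter that depends nonlinearly
# on the context (here a normalized desired walking speed in [0, 1]).
contexts = np.linspace(0.0, 1.0, 20)
theta = np.sin(3.0 * contexts)[:, None]
weights = np.ones(len(contexts))  # assumption: stands in for the REPS sample weights

centers = np.linspace(0.0, 1.0, 5)[:, None]  # assumed RBF centers
phi_lin = linear_features(contexts)
phi_rbf = rbf_features(contexts, centers, bandwidth=0.25)

err_linear = rmse(phi_lin, fit_policy_mean(phi_lin, theta, weights), theta)
err_rbf = rmse(phi_rbf, fit_policy_mean(phi_rbf, theta, weights), theta)
print(err_linear, err_rbf)
```

On this toy target the RBF features fit the parameter more closely than the linear features, which is the kind of situation in which a nonlinear policy over contexts is useful.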

Keywords

  • Learning humanoids robot locomotions
  • Generalizing robot skills
  • Stochastic search
  • Contextual relative entropy policy search
  • Nonlinear policies
  • Nao robot

Copyright information

© Springer Science+Business Media Dordrecht 2016

Authors and Affiliations

  • Abbas Abdolmaleki (1, 2, 3)
  • Nuno Lau (1)
  • Luis Paulo Reis (2, 3)
  • Jan Peters (4, 5)
  • Gerhard Neumann (6)

  1. DETI / IEETA, University of Aveiro, Aveiro, Portugal
  2. DSI, University of Minho, Guimarães, Portugal
  3. LIACC, University of Porto, Porto, Portugal
  4. IAS, TU Darmstadt, Darmstadt, Germany
  5. MPI for Intelligent Systems, Stuttgart, Germany
  6. CLAS, TU Darmstadt, Darmstadt, Germany
