Abstract
This chapter presents an overview of learning approaches for the acquisition of controllers and movement skills in humanoid robots. The term learning control refers to the process of acquiring a control strategy to achieve a task. Although this definition is sometimes restricted to trial-and-error learning, we present learning control here from a broader perspective, focusing on the representation of the skills to be acquired and on the different learning strategies that can contribute to the acquisition of robust and adaptive controllers for humanoids.
Copyright information
© 2019 Springer Nature B.V.
About this entry
Cite this entry
Calinon, S., Lee, D. (2019). Learning Control. In: Goswami, A., Vadakkepat, P. (eds) Humanoid Robotics: A Reference. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-6046-2_68
DOI: https://doi.org/10.1007/978-94-007-6046-2_68
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-6045-5
Online ISBN: 978-94-007-6046-2
eBook Packages: Intelligent Technologies and Robotics; Reference Module Computer Science and Engineering