Autonomous Robots, Volume 40, Issue 5, pp 903–927

A modular approach to learning manipulation strategies from human demonstration

  • Bidan Huang
  • Miao Li
  • Ravin Luis De Souza
  • Joanna J. Bryson
  • Aude Billard


Object manipulation is a challenging task for robotics, as the physics involved in object interaction is complex and hard to express analytically. Here we introduce a modular approach for learning a manipulation strategy from human demonstration. First, we record a human performing a task that requires an adaptive control strategy under different conditions, i.e. different task contexts. We then perform a modular decomposition of the control strategy, using phases of the recorded actions to guide segmentation. Each module represents a part of the strategy, encoded as a pair of forward and inverse models. All modules contribute to the final control policy; their recommendations are integrated through a weighting scheme based on each module's own estimated error in the current task context. We validate our approach both in simulation, for clarity, and on a real robot platform, to demonstrate robustness and the capacity to generalise. The robot task is opening bottle caps. We show that our approach can modularise an adaptive control strategy and generate appropriate motor commands for the robot to accomplish the complete task, even for novel bottles.
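The error-based weighting described above can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so everything specific here is an assumption made for illustration: the function name `blend_modules`, the Gaussian-likelihood form of the weights, and the parameter `sigma` are hypothetical and are not taken from the paper. Each module supplies a forward-model prediction of the next observation and an inverse-model command; modules whose predictions match the observation better receive a larger share of the final command.

```python
import math


def blend_modules(predictions, commands, observed, sigma=1.0):
    """Blend module commands by forward-model prediction error.

    predictions: one predicted observation (list of floats) per module
    commands:    one proposed motor command (list of floats) per module
    observed:    the observation actually measured (list of floats)
    sigma:       assumed noise scale; smaller values make the weighting sharper

    Returns (weights, blended_command). The Gaussian-likelihood weighting
    is an illustrative assumption, not the formula used in the paper.
    """
    # Squared prediction error of each module's forward model.
    sq_errors = [
        sum((p - o) ** 2 for p, o in zip(pred, observed))
        for pred in predictions
    ]
    # Subtract the smallest error before exponentiating for numerical stability.
    best = min(sq_errors)
    raw = [math.exp(-(e - best) / (2.0 * sigma ** 2)) for e in sq_errors]
    total = sum(raw)
    weights = [r / total for r in raw]

    # Weighted sum of the inverse-model commands, dimension by dimension.
    dim = len(commands[0])
    blended = [
        sum(w * cmd[i] for w, cmd in zip(weights, commands))
        for i in range(dim)
    ]
    return weights, blended


if __name__ == "__main__":
    # Two hypothetical modules: the first predicts the observation exactly.
    w, u = blend_modules(
        predictions=[[0.0], [1.0]],
        commands=[[1.0], [3.0]],
        observed=[0.0],
    )
    print("weights:", w)
    print("blended command:", u)
```

Because the weights are normalised, the blended command always lies between the individual module commands, and it shifts smoothly from one module to another as the task context changes rather than switching abruptly.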


Learning by demonstration · Manipulation · Modular approach



This work was funded primarily by the Swiss National Foundation through the National Center of Competence in Research (NCCR) in Robotics. Ravin de Souza was also supported by a doctoral grant (SFRH/BD/51071/2010) from the Portuguese Fundação para a Ciência e a Tecnologia, and Miao Li was supported by the European Union Seventh Framework Programme FP7/2007–2013 under grant agreement no. 288533 ROBOHOW.COG. Bidan Huang was also supported by a studentship from the University of Bath. The authors would like to thank Sahar El-Khoury for her valuable comments.

Supplementary material

10514_2015_9501_MOESM1_ESM.wmv (17.4 mb)
Supplementary material 1 (wmv 17807 KB)


Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Bidan Huang (1)
  • Miao Li (3)
  • Ravin Luis De Souza (3)
  • Joanna J. Bryson (2)
  • Aude Billard (3)

  1. The Hamlyn Centre, Imperial College London, London, United Kingdom
  2. Intelligent Systems Group (IS), Computer Science Department, University of Bath, North East Somerset, United Kingdom
  3. Learning Algorithms and Systems Laboratory (LASA), Swiss Federal Institute of Technology Lausanne (EPFL), Lausanne, Switzerland
