Simultaneous learning of hierarchy and primitives for complex robot tasks

Abstract

We present a new interaction paradigm for robot learning from demonstration, called simultaneous learning of hierarchy and primitives (SLHAP), in which information about hierarchy and primitives is naturally interleaved in a single, coherent demonstration session. A key innovation in the new paradigm is the human demonstrator’s narration of primitives as they are executed, which allows the system to identify the boundaries between primitives. Hierarchy is represented using hierarchical task networks; motion planning constraints on the primitives are represented using task space regions. We implemented SLHAP on an autonomous robot and produced an interaction video illustrating its effectiveness in learning a complex task with five levels of hierarchy and eight types of primitives. The underlying algorithms that make SLHAP possible are described and evaluated.
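As a rough illustration of the two representations named above, the following Python sketch models a hierarchical task network whose leaves are primitives, each optionally carrying a pose constraint. It is a minimal sketch under stated assumptions, not the authors' implementation: the class names, the task names (`RotateTires`, `RemoveTire`, `Unscrew`, `Unhang`), and the numeric bounds are all hypothetical, and the constraint class keeps only per-dimension bounds, whereas a full task space region also carries reference transforms.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class TaskSpaceRegion:
    """Simplified pose constraint: one (low, high) bound per pose dimension.

    Assumption: a full TSR also carries reference transforms,
    which this sketch omits.
    """
    bounds: List[Tuple[float, float]]  # order: x, y, z, roll, pitch, yaw

    def contains(self, pose: List[float]) -> bool:
        """Return True if every pose coordinate lies within its bound."""
        return len(pose) == len(self.bounds) and all(
            low <= value <= high
            for value, (low, high) in zip(pose, self.bounds)
        )


@dataclass
class Task:
    """A node in a hierarchical task network; leaves are primitives."""
    name: str
    subtasks: List["Task"] = field(default_factory=list)
    constraint: Optional[TaskSpaceRegion] = None  # meaningful for primitives only

    @property
    def is_primitive(self) -> bool:
        return not self.subtasks


def depth(task: Task) -> int:
    """Number of levels in the hierarchy rooted at `task`."""
    return 1 + max((depth(sub) for sub in task.subtasks), default=0)


def primitive_types(task: Task) -> Set[str]:
    """Distinct primitive names reachable from `task`."""
    if task.is_primitive:
        return {task.name}
    found: Set[str] = set()
    for sub in task.subtasks:
        found |= primitive_types(sub)
    return found


# Hypothetical tire-rotation fragment (names and bounds are illustrative only).
upright = TaskSpaceRegion(
    bounds=[(-1.0, 1.0)] * 3 + [(-0.1, 0.1)] * 2 + [(-3.14, 3.14)]
)
unscrew = Task("Unscrew", constraint=upright)
unhang = Task("Unhang")
remove_tire = Task("RemoveTire", subtasks=[unscrew, unhang])
rotate_tires = Task("RotateTires", subtasks=[remove_tire, Task("Unscrew")])
```

Calling `depth(rotate_tires)` and `primitive_types(rotate_tires)` on this fragment counts hierarchy levels and distinct primitive types, the two quantities the abstract uses to characterize the learned task.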

Notes

  1. Niekum et al. (2015) learn both low-level motion trajectories and high-level tasks (as state machines) from demonstration, but the high-level tasks are not hierarchical.

  2. The human’s head pose is only used to control where the robot “looks”; it is not part of the task learning process.

  3. Our speech recognition and understanding is not general-purpose; we use a push-to-talk button operated offscreen and a predefined grammar for the human utterances. Solutions to these limitations are beyond the scope of this work.

  4. The inputs of a task are the target and reference objects. The output of a task is any object whose properties, such as location, are changed by the task.

  5. Reusable by the human; retargeting the primitive for the robot is addressed by the TSR constraint learning subcomponent.

  6. This solution clearly will not work for all possible manipulation primitives and therefore needs further investigation. In learning theory, this relates to the issue of automatic feature selection. Our algorithm for identifying the primitives only targets tasks with one target and one reference object. In future work, we plan to extend it to learn tasks with multiple target and reference objects.

  7. We also recorded motion data for a cup retrieval task to specifically evaluate the pose constraint learning; see Li and Berenson (2016).

References

  1. Akgun, B., Cakmak, M., Jiang, K., & Thomaz, A. L. (2012). Keyframe-based learning from demonstration. International Journal of Social Robotics, 4(4), 343–355.


  2. Argall, B. D., Chernova, S., Veloso, M., & Browning, B. (2009). A survey of robot learning from demonstration. Robotics and Autonomous Systems, 57(5), 469–483.


  3. Baisero, A., Mollard, Y., Lopes, M., Toussaint, M., & Lutkebohle, I. (2015). Temporal segmentation of pair-wise interaction phases in sequential manipulation demonstrations. In IROS.

  4. Berenson, D., Srinivasa, S. S., & Kuffner, J. (2011). Task space regions: A framework for pose-constrained manipulation planning. The International Journal of Robotics Research, 30, 1435–1460.


  5. Cakmak, M., Chao, C., & Thomaz, A. L. (2010). Designing interactions for robot active learners. IEEE Transactions on Autonomous Mental Development, 2(2), 108–118.


  6. Cakmak, M., & Thomaz, A. L. (2012). Designing robot learners that ask good questions. In ACM/IEEE international conference on human–robot interaction (pp. 17–24). ACM.

  7. Calinon, S., Guenter, F., & Billard, A. (2007). On learning, representing, and generalizing a task in a humanoid robot. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 37(2), 286–298.


  8. Chernova, S., & Thomaz, A. L. (2014). Robot learning from human teachers. Synthesis Lectures on Artificial Intelligence and Machine Learning, 8(3), 1–121.


  9. Chiappa, S., & Peters, J. R. (2010). Movement extraction by detecting dynamics switches and repetitions. In Advances in neural information processing systems (pp. 388–396).

  10. Erol, K., Hendler, J., & Nau, D. S. (1994). HTN planning: Complexity and expressivity. AAAI, 94, 1123–1128.


  11. Garland, A., Ryall, K., & Rich, C. (2001). Learning hierarchical task models by defining and refining examples. In International conference on knowledge capture (pp. 44–51).

  12. Hayes, B., & Scassellati, B. (2014). Discovering task constraints through observation and active learning. In IEEE/RSJ international conference on intelligent robots and systems.

  13. Hsu, D., Jiang, T., Reif, J., & Sun, Z. (2003). The bridge test for sampling narrow passages with probabilistic roadmap planners. In ICRA.

  14. Huffman, S. B., & Laird, J. E. (1995). Flexibly instructable agents. Journal of Artificial Intelligence Research, 3, 271–324.


  15. Konidaris, G. (2016). Constructing abstraction hierarchies using a skill-symbol loop. In IJCAI: Proceedings of the conference (p. 1648), NIH Public Access.

  16. Kulic, D., Lee, D., Ott, C., & Nakamura, Y. (2008). Incremental learning of full body motion primitives for humanoid robots. In 8th IEEE-RAS international conference on humanoid robots (Humanoids 2008) (pp. 326–332). IEEE.

  17. Levy-Leduc, C., & Harchaoui, Z. (2008). Catching change-points with lasso. In Advances in neural information processing systems (pp. 617–624).

  18. Li, C., & Berenson, D. (2016). Learning object orientation constraints and guiding constraints for narrow passages from one demonstration. In International symposium on experimental robotics.

  19. Minnen, D., Starner, T., Essa, I. A., & Isbell, C. L., Jr. (2007). Improving activity discovery with automatic neighborhood estimation. IJCAI, 7, 2814–2819.


  20. Mohammad, Y., & Nishida, T. (2015). Exact multi-length scale and mean invariant motif discovery. Applied Intelligence, 44, 322–339.


  21. Mohan, S., & Laird, J. E. (2011). Towards situated, interactive, instructable agents in a cognitive architecture. In AAAI Fall symposium series.

  22. Mohseni-Kabir, A., Chernova, S., & Rich, C. (2014). Collaborative learning of hierarchical task networks from demonstration and instruction. In Workshop on human–robot collaboration for industrial manufacturing, robotics science and systems, Berkeley, CA.

  23. Mohseni-Kabir, A., Rich, C., Chernova, S., Sidner, C. L., & Miller, D. (2015). Interactive hierarchical task learning from a single demonstration. In Proceedings of the tenth annual ACM/IEEE international conference on human–robot interaction (pp. 205–212). ACM.

  24. Mohseni-Kabir, A., Wu, V., Chernova, S., & Rich, C. (2016). What’s in a primitive? Identifying reusable motion trajectories in narrated demonstrations. In IEEE international symposium on robot and human interactive communication (ROMAN).

  25. Mollard, Y., Munzer, T., Baisero, A., Toussaint, M., & Lopes, M. (2015). Robot programming from demonstration, feedback and transfer. In IROS.

  26. Niekum, S., Osentoski, S., Konidaris, G., Chitta, S., Marthi, B., & Barto, A. G. (2015). Learning grounded finite-state representations from unstructured demonstrations. The International Journal of Robotics Research, 34(2), 131–157.


  27. Oates, T. (2002). Peruse: An unsupervised algorithm for finding recurring patterns in time series. In 2002 IEEE international conference on data mining (ICDM 2002), proceedings (pp. 330–337). IEEE.

  28. Pardowitz, M., Knoop, S., Dillmann, R., & Zollner, R. (2007). Incremental learning of tasks from user demonstrations, past experiences, and vocal comments. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 37(2), 322–332.


  29. Phillips, M., Hwang, V., Chitta, S., & Likhachev, M. (2016). Learning to plan for constrained manipulation from demonstrations. Autonomous Robots, 40(1), 109–124.


  30. Rich, C. (2009). Building task-based user interfaces with ANSI/CEA-2018. Computer, 42(8), 20–27.


  31. Rich, C., & Sidner, C. (2012). Using collaborative discourse theory to partially automate dialogue tree authoring. In Intelligent virtual agents (pp. 327–340). Springer.

  32. Rudin, L. I., Osher, S., & Fatemi, E. (1992). Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena, 60(1), 259–268.


  33. Rybski, P. E., Yoon, K., Stolarz, J., & Veloso, M. M. (2007). Interactive robot task training through dialog and demonstration. In ACM/IEEE international conference on human–robot interaction (pp. 49–56).

  34. Senin, P., Lin, J., Wang, X., Oates, T., Gandhi, S., Boedihardjo, A. P., et al. (2014). Grammarviz 2.0: A tool for grammar-based pattern discovery in time series. In Machine learning and knowledge discovery in databases (pp. 468–472). Springer.

  35. Ye, G., & Alterovitz, R. (2011). Demonstration-guided motion planning. In ISRR.

Author information

Corresponding author

Correspondence to Sonia Chernova.

Additional information

This work is supported in part by the Office of Naval Research under Grant N00014-13-1-0735.

About this article

Cite this article

Mohseni-Kabir, A., Li, C., Wu, V. et al. Simultaneous learning of hierarchy and primitives for complex robot tasks. Auton Robot 43, 859–874 (2019). https://doi.org/10.1007/s10514-018-9749-y

Keywords

  • Learning from demonstration
  • Hierarchical task network
  • Task space region
  • Motion planning
  • Tire rotation