Learning Probabilistic Decision Making by a Service Robot with Generalization of User Demonstrations and Interactive Refinement

  • Sven R. Schmidt-Rohr
  • Fabian Romahn
  • Pascal Meissner
  • Rainer Jäkel
  • Rüdiger Dillmann
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 194)


When a multi-modal service robot learns abstract probabilistic decision-making models from human demonstrations, the human teachers may miss alternative courses of events. We present an active model-space exploration approach that generalizes observed knowledge about action effects and interactively requests new demonstrations to verify those generalizations.

First, the robot observes several demonstrations by interacting humans, recording dialog, object poses, and human body movement. Discretization and analysis then yield a symbolic-causal model of the demonstrated task in the form of a preliminary partially observable Markov decision process (POMDP). From the transition model generated from the demonstrations, new hypotheses about unobserved action effects, called generalized transitions, are derived together with a generalization confidence estimate. To validate generalized transitions that strongly affect the decision policy, a request generator proposes further demonstrations to the human teachers, which in turn serve to implicitly verify the hypotheses.
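The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the state encoding (tuples of discrete features), the feature-overlap confidence measure, the `min_confidence` threshold, and all function names are assumptions made for illustration. The sketch estimates a transition model from demonstrated (state, action, next state) triples, hypothesizes the same action effect in states where the action was never demonstrated, and ranks those hypotheses as candidate demonstration requests. The policy-impact weighting mentioned in the abstract is omitted here.

```python
from collections import Counter, defaultdict

def estimate_transitions(demos):
    """Relative-frequency transition model from (state, action, next_state) triples."""
    counts = defaultdict(Counter)
    for state, action, next_state in demos:
        counts[(state, action)][next_state] += 1
    model = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        model[key] = {s2: n / total for s2, n in counter.items()}
    return model

def similarity(a, b):
    """Fraction of matching features; a hypothetical stand-in for a confidence estimate."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def generalize(model, all_states, min_confidence=0.5):
    """Hypothesize effects for (state, action) pairs that were never demonstrated."""
    hypotheses = []
    for (state, action), dist in model.items():
        likely_next = max(dist, key=dist.get)
        # The observed effect: which features changed, and to what value.
        effect = {i: v for i, (u, v) in enumerate(zip(state, likely_next)) if u != v}
        for other in all_states:
            if (other, action) in model:
                continue  # already covered by a demonstration
            conf = similarity(state, other)
            if conf >= min_confidence:
                predicted = tuple(effect.get(i, v) for i, v in enumerate(other))
                hypotheses.append((other, action, predicted, conf))
    return hypotheses

def demonstration_requests(hypotheses, limit=3):
    """Ask first about the least certain hypotheses (assumed ranking rule)."""
    ranked = sorted(hypotheses, key=lambda h: h[3])
    return [(state, action) for state, action, _, _ in ranked[:limit]]

# Toy data: pouring was demonstrated only in the kitchen.
full_k, empty_k = ("kitchen", "cup_full"), ("kitchen", "cup_empty")
demos = [(full_k, "pour", empty_k), (full_k, "pour", empty_k), (full_k, "pour", full_k)]
all_states = [full_k, ("living_room", "cup_full"), ("living_room", "cup_empty")]

model = estimate_transitions(demos)
hypotheses = generalize(model, all_states)
requests = demonstration_requests(hypotheses)
```

In this toy run the only generalized transition is that pouring in the living room also empties the cup, with confidence 0.5, so the sketch would request a demonstration of `pour` in that state. A fuller version would weight each hypothesis by how much it changes the resulting policy before issuing a request.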

The system has been evaluated on a multi-modal service robot with realistic tasks, including furniture manipulation and humans interacting at execution time.





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Sven R. Schmidt-Rohr (1)
  • Fabian Romahn (1)
  • Pascal Meissner (1)
  • Rainer Jäkel (1)
  • Rüdiger Dillmann (1)
  1. Institute for Anthropomatics (IFA), Karlsruhe Institute of Technology, Karlsruhe, Germany
