Confidence-Based Multi-Robot Learning from Demonstration

International Journal of Social Robotics

Abstract

Learning from demonstration algorithms enable a robot to learn a new policy based on demonstrations provided by a teacher. In this article, we explore a novel research direction, multi-robot learning from demonstration, which extends demonstration-based learning methods to collaborative multi-robot domains. Specifically, we study the problem of enabling a single person to teach individual policies to multiple robots at the same time. We present flexMLfD, a task- and platform-independent multi-robot demonstration learning framework that supports both independent and collaborative multi-robot behaviors. Building upon this framework, we contribute three approaches to teaching collaborative multi-robot behaviors based on different information-sharing strategies, and evaluate these approaches by teaching two Sony QRIO humanoid robots to perform three collaborative ball sorting tasks. We then present a scalability analysis of flexMLfD using up to seven Sony AIBO robots. We conclude the article by proposing a formalization for a broader multi-robot learning from demonstration research area.

Author information

Corresponding author

Correspondence to Sonia Chernova.

About this article

Cite this article

Chernova, S., Veloso, M. Confidence-Based Multi-Robot Learning from Demonstration. Int J of Soc Robotics 2, 195–215 (2010). https://doi.org/10.1007/s12369-010-0060-0

  • DOI: https://doi.org/10.1007/s12369-010-0060-0
