Abstract
Learning from demonstration algorithms enable a robot to learn a new policy based on demonstrations provided by a teacher. In this article, we explore a novel research direction, multi-robot learning from demonstration, which extends demonstration based learning methods to collaborative multi-robot domains. Specifically, we study the problem of enabling a single person to teach individual policies to multiple robots at the same time. We present flexMLfD, a task and platform independent multi-robot demonstration learning framework that supports both independent and collaborative multi-robot behaviors. Building upon this framework, we contribute three approaches to teaching collaborative multi-robot behaviors based on different information sharing strategies, and evaluate these approaches by teaching two Sony QRIO humanoid robots to perform three collaborative ball sorting tasks. We then present scalability analysis of flexMLfD using up to seven Sony AIBO robots. We conclude the article by proposing a formalization for a broader multi-robot learning from demonstration research area.
Similar content being viewed by others
References
Alissandrakis A, Nehaniv CL, Dautenhahn K (2002) Do as i do: Correspondences across different robotic embodiments. In: Kim J, Polani D, Martinetz T (eds) Fifth German workshop on artificial life (GWAL5), pp 143–152
Argall B, Chernova S, Veloso M, Browning B (2009) A survey of robot learning from demonstration. Robot Auton Syst 57(5):469–483
Atkeson CG, Schaal S (1997) Robot learning from demonstration. In: Fisher DH Jr (ed) Machine learning: proceedings of the fourteenth international conference (ICML’97). San Francisco, California, pp 12–20
Balch T, Arkin RC (1994) Communication in reactive multiagent robotic systems. Auton Robots 1(1):27–52
Bentivegna DC, Ude A, Atkeson CG, Cheng G (2004) Learning to act from observation and practice. Int J Humanoid Robot 1(4)
Breazeal C, Hoffman G, Lockerd A (2004) Teaching and working with robots as a collaboration. In: AAMAS ’04: Proceedings of the third international joint conference on autonomous agents and multiagent systems. IEEE Computer Society, Washington, DC, pp 1030–1037
Browning B, Xu L, Veloso M (2004) Skill acquisition and use for a dynamically-balancing soccer robot. In: Proceedings of nineteenth national conference on artificial intelligence (AAAI’04)
Calinon S, Billard A (2007) Incremental learning of gestures by imitation in a humanoid robot. In: Second annual conference on human-robot interactions (HRI’07). Arlington, Virginia, March 2007
Chaimowicz L, Campos MFM, Kumar V (2002) Dynamic role assignment for cooperative robots. In: Proc. of the IEEE intl. conf. on robotics and automation (ICRA), pp 293–298
Chernova S (2009) Confidence-based robot policy learning from demonstration. PhD thesis, Computer Science Dept., Carnegie Mellon University, Advisor-Manuela Veloso
Chernova S, Veloso M (2008) Multi-thresholded approach to demonstration selection for interactive robot learning. In: Proceedings of 3rd ACM/IEEE international conference on human-robot interaction (HRI’08), March 2008
Chernova S, Veloso M (2008) Teaching multi-robot coordination using demonstration of communication and state sharing (short paper). In: Proceedings of the international conference on autonomous agents and multiagent systems (AMMAS ’08), May 2008
Chernova S, Veloso M (2009) Interactive policy learning through confidence-based autonomy. J Artif Intell Res 34(1):1–25
Clouse JA (1996) On integrating apprentice learning and reinforcement learning. PhD thesis, University of Massachusetts, Department of Computer Science. Director-Paul E Utgoff
Crandall JW, Goodrich MA, Olsen DR Jr, Nielsen, CW (2005) Validating human-robot interaction schemes in multitasking environments. IEEE Trans Syst Man Cybern A 35(4):438–449
Dias MB, Zlot R, Kalra N, Stentz A (2006) Market-based multirobot coordination: A survey and analysis. Proc IEEE 94(7):1257–1270
Endsley MR, Garland DJ (2000) Situation awareness: analysis and measurement. Lawrence Erlbaum Associates
Farinelli A, Farinelli R, Iocchi L, Nardi D (2004) Multi-robot systems: A classification focused on coordination. IEEE Trans Syst Man Cybern B 34:2015–2028
Fong TW, Thorpe C, Baur C (2003) Robot, asker of questions. In: Robotics and autonomous systems
Gerkey BP, Mataric MJ (2000) Principled communication for dynamic multi-robot task allocation. In: Experimental robotics VII. LNCIS, vol 271. Springer, Berlin, pp 353–362
Goodrich MA, Schultz AC (2007) Human-robot interaction: a survey. Found Trends Hum Comput Interact 1(3):203–275
Goodrich MA, Olsen DR Jr (2003) Seven principles of efficient human robot interaction. In: Proc IEEE Int Conf Syst, Man and Cybernetics, vol 4, pp 3942–3948
Grollman SH, Jenkins OC (2007) Dogged learning for robots. In: Proceedings of the IEEE international conference on robotics and automation (ICRA’07), Roma, Italy
Guenter F, Hersch M, Calinon S, Billard A (2007) Reinforcement learning for imitating constrained reaching movements. RSJ Adv Robot 21(13):1521–1544 (Special issue on imitative robots)
Hersch M, Guenter F, Calinon S, Billard A (2008) Dynamical system modulation for robot learning via kinesthetic demonstrations. IEEE Trans Robot 24(6):1463–1467
Jan’t Hoen P, Tuyls K, Panait L, Luke S, La Poutré JA (2005) An overview of cooperative and competitive multiagent learning. In: LAMAS, pp 1–46
Jones C, Shell D, Matarić M, Gerkey B (2004) Principled approaches to the design of multi-robot systems. In: IEEE/RSJ intl conf on intelligent robots and systems, workshop on networked robotics
Kube RC, Zhang H (1997) Task modelling in collective robotics. Auton Robots 4(1):53–72
Lee JD, See KA (2004) Trust in automation: designing for appropriate reliance. Hum Factors 46:50–80
Likert R (1932) A technique for the measurement of attitudes. In: Archives of psychology, pp 1–55
Lockerd A, Breazeal C (2004) Tutelage and socially guided robot learning. In: IEEE/RSJ international conference on intelligent robots and systems
Mataric MJ (2002) Sensory-motor primitives as a basis for learning by imitation: Linking perception to action and biology to robotics. In: Dautenhahn K, Nehaniv C (eds) Imitation in animals and artifacts. MIT Press, Cambridge, pp 392–422
Mayo M (2003) Symbol grounding and its implications for artificial intelligence. In: Oudshoorn MJ (ed) Twenty-sixth australasian computer science conference (ACSC2003), CRPIT, vol 16. Adelaide, Australia, ACS, pp 55–60
Nielsen CW, Few DA, Athey DS (2008) Using mixed-initiative human-robot interaction to bound performance in a search task. In: international conference on intelligent sensors, sensor networks and information processing. ISSNIP 2008, pp 195–200
Oliveira E, Nunes L (2004) Learning by exchanging Advice. Springer, Berlin
Ossowski S, Menezes R (2006) On coordination and its significance to distributed and multi-agent systems: Research articles. Concurr Comput Pract Exper 18(4):359–370
Pagello E, D’Angelo A, Montesello F, Garelli F, Ferrari C (1999) Cooperative behaviors in multi-robot systems through implicit communication. Robot Auton Syst 29(1):65–77
Peters J, Vijayakumar S, Schaal S (2003) Reinforcement learning for humanoid robotics. In: IEEE-RAS international conference on humanoid robots, pp 1–20
Pollard N, Hodgins JK (2002) Generalizing demonstrated manipulation tasks. In Workshop on the algorithmic foundations of robotics, December 2002
Price B, Boutilier C (2003) Accelerating reinforcement learning through implicit imitation. J Artif Intell Res 19:569–629
Roth M, Vail D, Veloso M (2003) A real-time world model for multi-robot teams with high-latency communication. In: IEEE/RSJ international conference on intelligent robots and systems, vol 3. pp 2494–2499
Rybski PE, Yoon K, Stolarz J, Veloso MM (2007) Interactive robot task training through dialog and demonstration. In: HRI’07: Proceedings of the ACM/IEEE international conference on human-robot interaction. ACM Press, New York, pp 49–56
Saunders J, Nehaniv CL, Dautenhahn K (2006) Teaching robots by moulding behavior and scaffolding the environment. In: HRI ’06: proceeding of the 1st ACM SIGCHI/SIGART conference on human-robot interaction. ACM Press, New York, pp 118–125
Schaal S, Ijspeert A, Billard A (2003) Computational approaches to motor learning by imitation. Philos Trans R Soc Lond, B, Biol Sci 358:537–547
Scholtz J, Antonishek B, Young J (2004) Evaluation of a human-robot interface: Development of a situational awareness methodology. In: HICSS ’04: Proceedings of the 37th annual Hawaii international conference on system sciences (HICSS’04)—Track 5, IEEE Computer Society, Washington, DC p 50130.3
Smart WD, Kaelbling LP (2002) Effective reinforcement learning for mobile robots. In: IEEE international conference on robotics and automation
Steinfeld A (2004) Interface lessons for fully and semi-autonomous mobile robots. In: IEEE international conference on robotics and automation
Steinfeld A, Fong T, Kaber D, Lewis M, Scholtz J, Schultz A, Goodrich M (2006) Common metrics for human-robot interaction. In: 1st annual conference on human-robot interaction, Salt Lake City, Utah
Stone P, Veloso M (2000) Multiagent systems: A survey from a machine learning perspective. Auton Robots 8(3):345–383
Wang J, Lewis M (2007) Human control for cooperating robot teams. In HRI ’07: Proceedings of the ACM/IEEE international conference on human-robot interaction, New York, NY, USA, pp 9–16
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Chernova, S., Veloso, M. Confidence-Based Multi-Robot Learning from Demonstration. Int J of Soc Robotics 2, 195–215 (2010). https://doi.org/10.1007/s12369-010-0060-0
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s12369-010-0060-0