Abstract
A limited understanding of the surrounding environment still restricts the capabilities of robotic systems in real-world applications. In particular, knowledge about the environment is typically acquired through perception alone, which requires intensive ad hoc training and is not sufficiently reliable in general settings. In this paper, we integrate new acquisition devices, such as tangible user interfaces, speech technologies, and vision-based systems, with established AI methodologies to present a novel and effective knowledge acquisition approach. We present a natural interaction paradigm in which humans move through the environment with the robot and easily acquire information by selecting relevant spots, objects, or other landmarks. The synergy between novel interaction technologies and semantic knowledge leverages humans' cognitive skills to support robots in acquiring and grounding knowledge about the environment; this richer representation can then be exploited to realize autonomous robot skills for task accomplishment.
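The core idea of the paradigm, grounding human-selected landmarks into a semantic representation the robot can later query, can be illustrated with a minimal sketch. This is not code from the paper; the class and method names (`SemanticMap`, `tag`, `find`) and the flat landmark structure are illustrative assumptions, standing in for whatever semantic-mapping machinery the system actually uses.

```python
from dataclasses import dataclass, field

@dataclass
class Landmark:
    label: str              # symbol grounded by the human, e.g. "kitchen door"
    category: str           # semantic class, e.g. "door", "room", "object"
    position: tuple         # metric pose in the robot's map frame (x, y)

@dataclass
class SemanticMap:
    landmarks: list = field(default_factory=list)

    def tag(self, label, category, position):
        """Record a landmark the user selected while moving with the robot
        (e.g. via a pointing device or a spoken label)."""
        lm = Landmark(label, category, position)
        self.landmarks.append(lm)
        return lm

    def find(self, category):
        """Ground a symbolic query (e.g. 'door') to metric positions,
        so the robot can use it during task execution."""
        return [lm for lm in self.landmarks if lm.category == category]

smap = SemanticMap()
smap.tag("kitchen door", "door", (2.0, 3.5))
smap.tag("printer", "object", (5.1, 0.8))
doors = smap.find("door")
print(doors[0].label, doors[0].position)  # kitchen door (2.0, 3.5)
```

The point of the sketch is the division of labor the abstract describes: the human supplies the symbol and the selection, while the robot contributes the metric pose, so later symbolic queries resolve to actionable locations.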
Notes
Material about this experiment is provided at http://www.dis.uniroma1.it/~randelli/index.php?id=10.
Videos showing examples of semantic mapping and knowledge acquisition from these environments are available from https://sites.google.com/site/isrsubmission/.
Cite this article
Randelli, G., Bonanni, T.M., Iocchi, L. et al. Knowledge acquisition through human–robot multimodal interaction. Intel Serv Robotics 6, 19–31 (2013). https://doi.org/10.1007/s11370-012-0123-1