Knowledge acquisition through human–robot multimodal interaction

  • Special Issue
  • Intelligent Service Robotics

Abstract

Limited understanding of the surrounding environment still restricts the capabilities of robotic systems in real-world applications. Specifically, the acquisition of knowledge about the environment typically relies on perception alone, which requires intensive ad hoc training and is not sufficiently reliable in general settings. In this paper, we integrate new acquisition devices, such as tangible user interfaces, speech technologies, and vision-based systems, with established AI methodologies into a novel and effective knowledge acquisition approach. We propose a natural interaction paradigm in which humans move through the environment with the robot and easily acquire information by selecting relevant spots, objects, or other landmarks. The synergy between novel interaction technologies and semantic knowledge leverages human cognitive skills to support robots in acquiring and grounding knowledge about the environment; this richer representation can then be exploited to realize autonomous robot skills for task accomplishment.
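To make the grounding step of this paradigm concrete, the following is a minimal Python sketch, under stated assumptions: a pointing gesture or tangible-device selection yields a 3D position, a speech recognizer yields a label, and the pair is stored as a grounded symbol in a semantic map. All class and function names (LandmarkEntry, SemanticMap, ground_selection) are hypothetical stand-ins for illustration, not the authors' implementation.

from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class LandmarkEntry:
    # A grounded symbol: a spoken label anchored to map coordinates.
    label: str                             # e.g. "door", "window"
    position: Tuple[float, float, float]   # (x, y, z) in the map frame


@dataclass
class SemanticMap:
    # Stores the symbols grounded during the interaction.
    entries: List[LandmarkEntry] = field(default_factory=list)

    def add(self, label: str, position: Tuple[float, float, float]) -> None:
        self.entries.append(LandmarkEntry(label, position))


def ground_selection(speech_label: str,
                     pointed_position: Tuple[float, float, float],
                     semantic_map: SemanticMap) -> None:
    # Fuse one speech event with one selection event: the human points
    # at (or tags) a spot and names it; the (label, position) pair
    # becomes a grounded symbol the robot can reuse in later tasks.
    semantic_map.add(speech_label, pointed_position)


if __name__ == "__main__":
    smap = SemanticMap()
    # e.g. the user selects the point (2.1, 0.4, 0.9) and says "window"
    ground_selection("window", (2.1, 0.4, 0.9), smap)
    print(smap.entries)

In the full system described in the paper, the position would come from the vision-based or tangible pointing devices and the label from the speech front end; the sketch only captures the data flow of binding the two into the semantic representation.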

Author information

Corresponding author

Correspondence to Gabriele Randelli.


About this article

Cite this article

Randelli, G., Bonanni, T.M., Iocchi, L. et al. Knowledge acquisition through human–robot multimodal interaction. Intel Serv Robotics 6, 19–31 (2013). https://doi.org/10.1007/s11370-012-0123-1
