Abstract
The speech technologies currently available on mobile devices achieve effective performance in terms of both reliability and the range of language they are able to capture. The availability of performant speech recognition engines may also support the deployment of vocal interfaces in consumer robots. However, the design and implementation of such interfaces still require significant work. The language processing chain and the domain knowledge must be built for the specific features of the robotic platform, the deployment environment and the tasks to be performed. Hence, such interfaces are currently built in a completely ad hoc way. In this paper, we present a design methodology, together with a support tool, that aims to streamline and improve the implementation of dedicated vocal interfaces for robots. This work was developed within an experimental project called Speaky for Robots. We extend the existing vocal interface development framework to target robotic applications. The proposed solution is built using a bottom-up approach, refining the language processing chain through the development of vocal interfaces for different robotic platforms and domains. The proposed approach is validated both in experiments involving several research prototypes and in tests involving end-users.
Notes
Available at http://sag.art.uniroma2.it/huric
Cite this article
Bastianelli, E., Nardi, D., Aiello, L.C. et al. Speaky for robots: the development of vocal interfaces for robotic applications. Appl Intell 44, 43–66 (2016). https://doi.org/10.1007/s10489-015-0695-5
DOI: https://doi.org/10.1007/s10489-015-0695-5