Autonomous Robots, Volume 42, Issue 5, pp 1067–1085

Learning proactive behavior for interactive social robots

  • Phoebe Liu
  • Dylan F. Glas
  • Takayuki Kanda
  • Hiroshi Ishiguro
Part of the following topical collections:
  1. Special Issue: Learning for Human-Robot Collaboration


Learning human–robot interaction logic from example interaction data has the potential to leverage “big data” to reduce the effort and time spent on designing interaction logic or crafting interaction content. Previous work has demonstrated techniques by which a robot can learn motion and speech behaviors from non-annotated human–human interaction data, but these techniques only enable a robot to respond to human-initiated inputs, and do not enable the robot to proactively initiate interaction. In this work, we propose a method for learning both human-initiated and robot-initiated behavior for a social robot from human–human example interactions, which we demonstrate for a shopkeeper interacting with a customer in a camera shop scenario. This was achieved by extending an existing technique by (1) introducing a concept of a customer yield action, (2) incorporating interaction history, represented by sequences of discretized actions, as inputs for training and generating robot behavior, and (3) using an “attention mechanism” in our learning system for training robot behaviors, which learns which parts of the interaction history are more important for generating robot behaviors. The proposed method trains a robot to generate multimodal actions, consisting of speech and locomotion behaviors. We compared the proposed technique with the previous technique in two ways. Cross-validation on the training data showed higher social appropriateness of predicted behaviors using the proposed technique, and a user study of live interaction with a robot showed that participants perceived the proposed technique to produce behaviors that were more proactive, more socially appropriate, and better in overall quality.
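As a rough illustration of item (3), the sketch below shows a generic soft attention step over an interaction history: each past action receives a weight, and the weighted sum becomes a context vector for predicting the next robot behavior. This is not the authors' network. The one-hot action codes, the fixed `query` vector (which a real system would learn during training), and the dot-product scoring are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(
    history: Sequence[Sequence[float]], query: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Weight each past action vector by its dot-product relevance to `query`.

    Returns (weights, context): one weight per history step, summing to 1,
    and the weighted sum of the history vectors.
    """
    if not history:
        raise ValueError("history must contain at least one action vector")
    scores = [sum(h * q for h, q in zip(step, query)) for step in history]
    weights = softmax(scores)
    dim = len(history[0])
    context = [
        sum(w * step[d] for w, step in zip(weights, history)) for d in range(dim)
    ]
    return weights, context


# Hypothetical one-hot codes for three discretized actions (illustrative only).
history = [
    [1.0, 0.0, 0.0],  # customer browses a camera
    [0.0, 1.0, 0.0],  # shopkeeper greets
    [0.0, 0.0, 1.0],  # customer yield action
]
# Assumed query vector; in a trained model this would be learned, not fixed.
query = [0.0, 0.0, 2.0]

weights, context = attend(history, query)
```

With this assumed query, the most recent yield action receives the largest weight, so it dominates the context vector that a downstream behavior predictor would consume.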


Keywords: Human–robot interaction, Data-driven learning, Learning by imitation, Social robotics, Service robots, Proactive behaviors



This work was supported in part by the JST ERATO Ishiguro Symbiotic Human-Robot Interaction Project, Grant Number JPMJER1401, and in part by JSPS KAKENHI Grant Number 25240042.

Compliance with ethical standards

Ethical approval

This research was conducted in compliance with the standards and regulations of our company’s ethical review board, which requires each experiment to be subject to a review and approval procedure according to strict ethical guidelines.

Supplementary material

Supplementary material 1: 10514_2017_9671_MOESM1_ESM.mp4 (MP4, 59198 KB)



Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. ERATO Ishiguro Symbiotic Human-Robot Interaction Project, ATR-HIL, Keihanna Science City, Japan
  2. ATR-IRC, Keihanna Science City, Japan
  3. ERATO Ishiguro Symbiotic Human-Robot Interaction Project, IRL, Osaka University, Toyonaka, Japan
