Autonomous Robots, Volume 39, Issue 3, pp 347–362

Recovering from failure by asking for help

  • Ross A. Knepper
  • Stefanie Tellex
  • Adrian Li
  • Nicholas Roy
  • Daniela Rus


Robots inevitably fail, often without the ability to recover autonomously. We demonstrate an approach for enabling a robot to recover from failures by communicating its need for specific help to a human partner using natural language. Our approach automatically detects failures, then generates targeted spoken-language requests for help such as “Please give me the white table leg that is on the black table.” Once the human partner has repaired the failure condition, the system resumes full autonomy. We present a novel inverse semantics algorithm for generating effective help requests. In contrast to forward semantic models that interpret natural language in terms of robot actions and perception, our inverse semantics algorithm generates requests by emulating the human’s ability to interpret a request using the Generalized Grounding Graph (\(\hbox {G}^{3}\)) framework. To assess the effectiveness of our approach, we present a corpus-based online evaluation, as well as an end-to-end user study, demonstrating that our approach increases the effectiveness of human interventions compared to static requests for help.
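The idea of generating a request by emulating how the listener would interpret it can be illustrated with a small sketch. This is not the authors' implementation and does not use the \(\hbox {G}^{3}\) framework: the toy world, the attribute names, and the functions `listener_prob`, `candidate_descriptions`, `render`, and `best_request` are all hypothetical, and the listener is assumed to pick uniformly at random among the objects that match a description. Under those assumptions, the speaker chooses the request that maximizes the chance the listener selects the intended object.

```python
from itertools import combinations

# Hypothetical toy world (an assumption for illustration only).
WORLD = [
    {"color": "white", "type": "table leg", "location": "on the black table"},
    {"color": "white", "type": "table leg", "location": "on the floor"},
    {"color": "black", "type": "table leg", "location": "on the black table"},
    {"color": "white", "type": "table top", "location": "on the floor"},
]


def listener_prob(description, target_index, world):
    """Toy listener model: picks uniformly among objects matching the description."""
    matches = [i for i, obj in enumerate(world)
               if all(obj[k] == v for k, v in description.items())]
    if target_index not in matches:
        return 0.0
    return 1.0 / len(matches)


def candidate_descriptions(target):
    """Yield every subset of the target's attributes that includes its type."""
    optional = [k for k in target if k != "type"]
    for r in range(len(optional) + 1):
        for keys in combinations(optional, r):
            desc = {"type": target["type"]}
            desc.update({k: target[k] for k in keys})
            yield desc


def render(description):
    """Turn an attribute dictionary into a request sentence."""
    words = []
    if "color" in description:
        words.append(description["color"])
    words.append(description["type"])
    phrase = " ".join(words)
    if "location" in description:
        phrase += " that is " + description["location"]
    return "Please give me the " + phrase + "."


def best_request(target_index, world):
    """Pick the description the toy listener is most likely to resolve correctly.

    Ties are broken in favor of descriptions with fewer attributes.
    """
    target = world[target_index]
    best = max(candidate_descriptions(target),
               key=lambda d: (listener_prob(d, target_index, world), -len(d)))
    return render(best), listener_prob(best, target_index, world)


if __name__ == "__main__":
    sentence, prob = best_request(0, WORLD)
    print(sentence, prob)
```

In this toy world, naming only the object type, or the type plus a single attribute, leaves the listener with two or three matching objects, so the sketch selects the request that mentions both the color and the location. A static request such as "Help me" carries no such disambiguating information, which is the contrast the abstract draws.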


Natural language generation, Failure detection, Failure handling, Assembly, Human–robot interaction

List of Symbols

\(\lambda \in \varLambda \)

Set of language variables (words or short phrases)

\(\gamma \in \varGamma \)

Set of grounding variables (concepts in the real world)

\(\phi \in \varPhi \)

Set of correspondence variables


Environmental context model


Target symbolic action

\(\gamma _{a}^{*}\)

Target action grounding variable



This research was done at CSAIL-MIT. This work was supported in part by the Boeing Company and in part by the U.S. Army Research Laboratory under the Robotics Collaborative Technology Alliance. The authors thank Dishaan Ahuja and Andrew Spielberg for their assistance in conducting the experiments.



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Department of Computer Science, Cornell University, Ithaca, USA
  2. Computer Science Department, Brown University, Providence, USA
  3. Department of Engineering, University of Cambridge, Cambridge, UK
  4. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, USA
