Abstract
Teaching via natural language is an intuitive way for end users to add functionality to a virtual assistant, enabling them to personalize their assistant with new commands without requiring the intervention of the system developer, who cannot possibly anticipate all of an end user’s needs. In this paper we introduce our Learning by Instruction Agent (LIA), the first virtual assistant for an email domain that is capable of learning how to perform new commands taught by end users in natural language. LIA grounds the semantics of each command in terms of primitive executable procedures. When a user gives LIA a command that it does not understand, it prompts the user to explain the command through a sequence of natural language steps. From this input, LIA learns the meaning of the new command and how to generalize it to novel situations. For example, having been taught how to “forward an email to Alice”, it can correctly understand “forward this email to Bob”. We show that users who were assigned to interact with LIA completed the task more quickly than users assigned to interact with a non-learning agent. These results demonstrate the potential of natural language teaching to improve the capabilities of intelligent personal assistants. We annotated 4759 natural language statements with their associated computer-readable execution commands (logical forms) to form a dataset, which we make publicly available, and we report the performance of several semantic parsing methods on this dataset.
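To make the teaching loop described above concrete, the following is a minimal illustrative sketch in Python. It is not LIA’s actual implementation (LIA uses a semantic parser to ground commands in logical forms); the class and method names (`TeachableAgent`, `teach`, `run`) and the string-template matching are hypothetical simplifications.

```python
# Illustrative sketch only: a toy "learning by instruction" loop in the
# spirit of the abstract. LIA itself grounds commands with a semantic
# parser, not string templates; everything named here is hypothetical.

class TeachableAgent:
    def __init__(self):
        # Primitive executable procedures the agent ships with.
        # Each appends a record of its effect to a shared environment list.
        self.primitives = {
            "read next email": lambda env, arg=None: env.append("read"),
            "forward to": lambda env, arg=None: env.append(f"forward:{arg}"),
        }
        # Learned commands: template -> list of (primitive name, uses_slot) steps.
        self.learned = {}

    def teach(self, template, steps):
        """Store a new command as a sequence of primitive steps.

        `template` uses "X" as the argument slot, e.g. "forward email to X",
        so one taught example generalizes to other arguments.
        """
        self.learned[template] = steps

    def run(self, utterance, env):
        for template, steps in self.learned.items():
            arg = self._match(template, utterance)
            if arg is not None:
                for prim, uses_slot in steps:
                    self.primitives[prim](env, arg if uses_slot else None)
                return True
        return False  # unknown command: a real agent would prompt the user to teach it

    @staticmethod
    def _match(template, utterance):
        # Naive slot matching: "forward email to X" matches
        # "forward email to bob", binding X -> "bob".
        prefix = template.split("X")[0]
        if utterance.startswith(prefix):
            return utterance[len(prefix):] or None
        return None


env = []
agent = TeachableAgent()
# The user teaches the command once, with one example argument...
agent.teach("forward email to X",
            [("read next email", False), ("forward to", True)])
# ...and the agent applies it to a new argument it was never taught.
agent.run("forward email to bob", env)
print(env)  # -> ['read', 'forward:bob']
```

The key idea the sketch preserves is that a taught command is stored as a sequence of grounded primitive steps with an argument slot, so a single demonstration ("Alice") generalizes to new entities ("Bob").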
References
Abbeel, P., & Ng, A. Y. (2004). Apprenticeship learning via inverse reinforcement learning. In Proceedings of the twenty-first international conference on Machine learning (p. 1). ACM.
Allen, J., Chambers, N., Ferguson, G., Galescu, L., Jung, H., Swift, M., & Taysom, W. (2007). PLOW: A collaborative task learning agent. In Proceedings of the national conference on artificial intelligence (Vol. 22, p. 1514). AAAI Press.
Amir, O., & Gal, Y. K. (2013). Plan recognition and visualization in exploratory learning environments. ACM Transactions on Interactive Intelligent Systems (TiiS), 3(3), 16.
Argall, B. D., Chernova, S., Veloso, M., & Browning, B. (2009). A survey of robot learning from demonstration. Robotics and Autonomous Systems, 57(5), 469–483.
Artzi, Y., Lee, K., & Zettlemoyer, L. (2015). Broad-coverage CCG semantic parsing with AMR. In EMNLP (pp. 1699–1710).
Azaria, A., Gal, Y., Kraus, S., & Goldman, C. V. (2016). Strategic advice provision in repeated human-agent interactions. Autonomous Agents and Multi-agent Systems, 30(1), 4–29.
Azaria, A., & Hong, J. (2016). Recommender system with personality. In RecSys (pp. 207–210).
Azaria, A., Krishnamurthy, J., & Mitchell, T. M. (2016). Instructable intelligent personal agent. In Thirtieth AAAI conference on artificial intelligence.
Azaria, A., Richardson, A., & Kraus, S. (2015). An agent for deception detection in discussion based environments. In Proceedings of the 18th ACM conference on computer supported cooperative work and social computing (pp. 218–227). ACM.
Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. In Proceedings of the international conference on learning representations (ICLR).
Berant, J., Chou, A., Frostig, R., & Liang, P. (2013). Semantic parsing on Freebase from question-answer pairs. In EMNLP (Vol. 2, p. 6).
Biermann, A. W. (1983). Natural language programming. Berlin: Springer.
Billard, A., Calinon, S., Dillmann, R., & Schaal, S. (2008). Robot programming by demonstration. In B. Siciliano & O. Khatib (Eds.), Springer handbook of robotics (pp. 1371–1394). Berlin: Springer.
Branavan, S. R., Chen, H., Zettlemoyer, L. S., & Barzilay, R. (2009). Reinforcement learning for mapping instructions to actions. In Proceedings of the joint conference of the 47th annual meeting of the ACL and the 4th international joint conference on natural language processing of the AFNLP: Volume 1-volume 1 (pp. 82–90). Association for Computational Linguistics.
Brewer, R., Morris, M. R., & Piper, A. M. (2016). Why would anybody do this?: Understanding older adults’ motivations and challenges in crowd work. In Proceedings of the 2016 CHI conference on human factors in computing systems (pp. 2246–2257). ACM.
Buntain, C., Azaria, A., & Kraus, S. (2014). Leveraging fee-based, imperfect advisors in human-agent games of trust. In Twenty-eighth AAAI conference on artificial intelligence.
Calinon, S., Guenter, F., & Billard, A. (2007). On learning, representing, and generalizing a task in a humanoid robot. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 37(2), 286–298.
Cantrell, R., Benton, J., Talamadupula, K., Kambhampati, S., Schermerhorn, P., & Scheutz, M. (2012). Tell me when and why to do it! run-time planner model updates via natural language instruction. In 2012 7th ACM/IEEE international conference on human–robot interaction (HRI) (pp. 471–478). IEEE.
Chalamish, M., Sarne, D., & Kraus, S. (2007). Mass programmed agents for simulating human strategies in large scale systems. In Proceedings of the 6th international joint conference on autonomous agents and multiagent systems (p. 135). ACM.
Chen, D. L., & Mooney, R. J. (2008). Learning to sportscast: A test of grounded language acquisition. In Proceedings of the 25th international conference on machine learning (pp. 128–135). ACM.
Chen, D. L., & Mooney, R. J. (2011). Learning to interpret natural language navigation instructions from observations. In AAAI (pp. 859–865). San Francisco, CA.
Chkroun, M., & Azaria, A. (2019). LIA: A virtual assistant that can be taught new commands by speech. International Journal of Human-Computer Interaction, 35(17), 1596–1607.
Clancey, W. J. (1983). The epistemology of a rule-based expert system: A framework for explanation. Artificial Intelligence, 20(3), 215–251.
Clarke, J., Goldwasser, D., Chang, M.-W., & Roth, D. (2010). Driving semantic parsing from the world’s response. In CoNLL (pp. 18–27). Association for Computational Linguistics.
Druck, G., Mann, G., & McCallum, A. (2008). Learning from labeled features using generalized expectation criteria. In Proceedings of the 31st annual international ACM SIGIR conference on research and development in information retrieval (pp. 595–602). ACM.
Haas, N., & Hendrix, G. G. (1980). An approach to acquiring and applying knowledge. In AAAI (pp. 235–239).
Hara, K., Adams, A., Milland, K., Savage, S., Callison-Burch, C., & Bigham, J. P. (2018). A data-driven analysis of workers’ earnings on amazon mechanical turk. In Proceedings of the 2018 CHI conference on human factors in computing systems (p. 449). ACM.
Harnad, S. (1990). The symbol grounding problem. Physica D: Nonlinear Phenomena, 42(1), 335–346.
Hinkin, T. R. (1998). A brief tutorial on the development of measures for use in survey questionnaires. Organizational Research Methods, 1(1), 104–121.
Knox, W. B., & Stone, P. (2009). Interactively shaping agents via human reinforcement: The TAMER framework. In Proceedings of the fifth international conference on knowledge capture (pp. 9–16). ACM.
Koedinger, K. R., Aleven, V., Heffernan, N., McLaren, B., & Hockenberry, M. (2004). Opening the door to non-programmers: Authoring intelligent tutor behavior by demonstration. In J. C. Lester, R. M. Vicari, & F. Paraguaçu (Eds.), Intelligent tutoring systems (pp. 162–174). Berlin: Springer.
Koller, D., & Friedman, N. (2009). Probabilistic graphical models: Principles and techniques. Cambridge: MIT Press.
Krishnamurthy, J., & Kollar, T. (2013). Jointly learning to parse and perceive: Connecting natural language to the physical world. Transactions of the Association for Computational Linguistics, 1, 193–206.
Krishnamurthy, J., & Mitchell, T. M. (2012). Weakly supervised training of semantic parsers. In EMNLP-CoNLL (pp. 754–765).
Kuklinski, K., Fischer, K., Marhenke, I., Kirstein, F., Solvason, D., Kruger, N., Savarimuthu, T. R., et al. (2014). Teleoperation for learning by demonstration: Data glove versus object manipulation for intuitive robot control. In 2014 6th international congress on Ultra modern telecommunications and control systems and workshops (ICUMT) (pp. 346–351). IEEE.
Kwiatkowski, T., Choi, E., Artzi, Y., & Zettlemoyer, L. S. (2013). Scaling semantic parsers with on-the-fly ontology matching. In ACL.
Kwiatkowski, T., Zettlemoyer, L., Goldwater, S., & Steedman, M. (2010). Inducing probabilistic CCG grammars from logical form with higher-order unification. In EMNLP (pp. 1223–1233).
Le, V., Gulwani, S., & Su, Z. (2013). SmartSynth: Synthesizing smartphone automation scripts from natural language. In Proceedings of the 11th annual international conference on mobile systems, applications, and services (pp. 193–206). ACM.
Li, T. J.-J., Azaria, A., & Myers, B. A. (2017). Sugilite: Creating multimodal smartphone automation by demonstration. In Proceedings of the 2017 CHI conference on human factors in computing systems (pp. 6038–6049). ACM.
Manning, C., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S., & McClosky, D. (2014). The Stanford CoreNLP natural language processing toolkit. In Proceedings of 52nd annual meeting of the association for computational linguistics: System demonstrations (pp. 55–60).
Matuszek, C., FitzGerald, N., Zettlemoyer, L., Bo, L., & Fox, D. (2012). A joint model of language and perception for grounded attribute learning. arXiv preprint arXiv:1206.6423.
Myers, B. A., Cypher, A., Maulsby, D., Smith, D. C., & Shneiderman, B. (1991). Demonstrational interfaces: Coming soon? In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 393–396). ACM.
Nakaoka, S., Nakazawa, A., Kanehiro, F., Kaneko, K., Morisawa, M., Hirukawa, H., et al. (2007). Learning from observation paradigm: Leg task models for enabling a biped humanoid robot to imitate human dances. The International Journal of Robotics Research, 26(8), 829–844.
Pasupat, P., & Liang, P. (2015). Compositional semantic parsing on semi-structured tables. arXiv preprint arXiv:1508.00305.
Price, B., & Boutilier, C. (2003). Accelerating reinforcement learning through implicit imitation. Journal of Artificial Intelligence Research, 19, 569–629.
Quirk, C., Mooney, R., & Galley, M. (2015). Language to code: Learning semantic parsers for if-this-then-that recipes. In Proceedings of the 53rd annual meeting of the association for computational linguistics (ACL-15) (pp. 878–888). Beijing, China.
Reed, A. (2010). Creating interactive fiction with inform 7. Boston: Cengage Learning.
Rosenfeld, A., Cohen, M., Taylor, M. E., & Kraus, S. (2018). Leveraging human knowledge in tabular reinforcement learning: A study of human subjects. The Knowledge Engineering Review, 33, e14.
Rosenfeld, A., & Kraus, S. (2018). Predicting human decision-making: From prediction to action. Synthesis Lectures on Artificial Intelligence and Machine Learning, 12(1), 1–150.
Rosenfeld, A., Zuckerman, I., Segal-Halevi, E., Drein, O., & Kraus, S. (2016). Negochat-a: A chat-based negotiation agent with bounded rationality. Autonomous Agents and Multi-agent Systems, 30(1), 60–81.
Sarafian, E., Tamar, A., & Kraus, S. (2019). Safe policy learning from observations. In The multi-disciplinary conference on reinforcement learning and decision making.
Segal, A., David, Y. B., Williams, J. J., Gal, K., & Shalom, Y. (2018). Combining difficulty ranking with multi-armed bandits to sequence educational content. In International conference on artificial intelligence in education (pp. 317–321). Springer.
She, L., Yang, S., Cheng, Y., Jia, Y., Chai, J., & Xi, N. (2014). Back to the blocks world: Learning new actions through situated human-robot dialogue. In Proceedings of the 15th annual meeting of the special interest group on discourse and dialogue (SIGDIAL) (pp. 89–97).
Spielberg, Y., & Azaria, A. (2019). The concept of criticality in reinforcement learning. In International conference on tools with artificial intelligence.
Srivastava, S., Azaria, A., & Mitchell, T. (2017). Parsing natural language conversations using contextual cues. In IJCAI 2017.
Steedman, M., & Baldridge, J. (2011). Combinatory categorial grammar. In R. D. Borsley & K. Börjars (Eds.), Non-transformational syntax: Formal and explicit models of grammar (pp. 181–224). Wiley Online Library.
Thomason, J., Zhang, S., Mooney, R., & Stone, P. (2015). Learning to interpret natural language commands through human-robot dialog. In Proceedings of the twenty-fourth international joint conference on artificial intelligence (IJCAI).
Wang, Y., Berant, J., & Liang, P. (2015). Building a semantic parser overnight. In ACL.
Winograd, T. (1973). A procedural model of language understanding. In R. Schank & K. Colby (Eds.), Computer models of thought and language (pp. 152–186). New York: WH Freeman.
Wong, Y. W., & Mooney, R. J. (2007). Learning synchronous grammars for semantic parsing with lambda calculus. In ACL (Vol. 45, p. 960).
Zettlemoyer, L. S., & Collins, M. (2005). Learning to map sentences to logical form: Structured classification with probabilistic categorial grammars. In Proceedings of the 21st conference on uncertainty in artificial intelligence (UAI).
Zettlemoyer, L. S., & Collins, M. (2007). Online learning of relaxed CCG grammars for parsing to logical form. In EMNLP-CoNLL (pp. 678–687).
Acknowledgements
This work was supported in part by Samsung GRO, by Verizon (Yahoo!) through CMU’s InMind project [7], by the Ministry of Science, Technology & Space, Israel, and by DARPA under Contract No. FA8750-13-2-0005.
Appendix
The following is a sample of a real user interacting with LIA:
User: go
Agent: Sorry, but I don’t understand. Would you like to teach me what to do in this case (either say yes or simply ignore this question)?
User: yes
Agent: Great! When you say, for example: “go”, what shall I do first?
User: move to next email and read it
Agent: Set to next incoming email successfully. sender: caseyousoon8@myworkplace.com subject: Another email body: I felt like sending you another email. Please ignore this email too and read the next one. What shall I do next (when executing: “go”)?
User: end
Agent: I’m currently learning the new command (“go”). I’m trying to generalize to other similar commands, this takes several seconds...
Agent: I now know what to do when you say (for example): “go”!
...
User: go
Agent: Set to next incoming email successfully. recipient list: you@myworkplace.com sender: caseyousoon8@myworkplace.com subject: Charlie body: I asked Charlie to do what you said, but I see that it must come from you. Please forward this email to Charlie.
User: create contact Charlie
Agent: Instance “charlie” (of concept “contact”) was created. “contact” fields are: email.
User: set Charlie’s email to charlieisasleep4@myworkplace.com
Agent: The “email” field in “charlie” was set to: “charlieisasleep4@myworkplace.com”.
...
User: go
Agent: Set to next incoming email successfully. sender: alextimetowork@myworkplace.com subject: Work before parting body: We will all have to work very hard next Monday, Tuesday and Wednesday. Please forward this email to Charlie.
User: forward charlie
Agent: Sorry, but I don’t understand. Would you like to teach me what to do in this case?
User: yes
Agent: Great! When you say, for example: “forward charlie”, what shall I do first?
User: set recipient list to charlie’s email, set subject to current email’s subject, set body to current email’s
Agent: I see that there is no email being composed. Do you want to compose a new email?
User: create new email, set recipient list to charlie’s email, set subject to current email’s subject, set body to current email’s body, send email
Agent: Composing new email. The “recipient list” field in “outgoing email” was set to: “charlieisasleep4@myworkplace.com”. The “subject” field in “outgoing email” was set to: “Work before parting”. The “body” field in “outgoing email” was set to: “We will all have to work very hard next Monday, Tuesday and Wednesday. Please forward this email to Charlie.”. Email sent successfully. What shall I do next (when executing: “forward charlie”)?
User: end
Agent: I now know what to do when you say (for example): “forward charlie”!
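The transcript also exercises LIA’s concept/instance mechanism (“create contact Charlie”, “set Charlie’s email to …”). The sketch below illustrates the kind of store such dialogue acts could update; the class and method names (`ConceptStore`, `create_instance`, `set_field`) are hypothetical and are not LIA’s actual internals.

```python
# Hypothetical sketch of a concept/instance store like the one the
# transcript exercises. Names are illustrative, not LIA's internals.

class ConceptStore:
    def __init__(self):
        # Concept name -> list of field names (as in: 'contact' fields are: email).
        self.concepts = {"contact": ["email"]}
        # Instance name -> (concept name, {field: value}).
        self.instances = {}

    def create_instance(self, concept, name):
        # "create contact charlie" creates an instance with empty fields.
        if concept not in self.concepts:
            raise KeyError(f"unknown concept: {concept}")
        self.instances[name] = (concept, {})
        return f'Instance "{name}" (of concept "{concept}") was created.'

    def set_field(self, name, field, value):
        # "set charlie's email to ..." fills in a declared field.
        concept, fields = self.instances[name]
        if field not in self.concepts[concept]:
            raise KeyError(f'"{concept}" has no field "{field}"')
        fields[field] = value
        return f'The "{field}" field in "{name}" was set to: "{value}".'

    def get_field(self, name, field):
        return self.instances[name][1][field]


store = ConceptStore()
print(store.create_instance("contact", "charlie"))
print(store.set_field("charlie", "email", "charlieisasleep4@myworkplace.com"))
```

Once the instance exists, a taught command such as “forward charlie” can resolve “charlie’s email” against this store, which is how the agent fills the recipient list in the transcript above.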
Cite this article
Azaria, A., Srivastava, S., Krishnamurthy, J. et al. An agent for learning new natural language commands. Auton Agent Multi-Agent Syst 34, 6 (2020). https://doi.org/10.1007/s10458-019-09425-x