Machine learning from casual conversation

Machine Learning

Abstract

Human social learning is an effective process that has inspired many existing machine learning approaches, such as learning from observation and learning by demonstration. In this paper, we introduce another form of social learning, learning from a casual conversation (LCC), a machine learning approach in which an artificially intelligent agent learns new information through an extended natural language dialog with a human. Our system enables the agent to add or change information in its knowledge base as a result of the human’s conversational text inputs. LCC seeks to close an important gap in the state of the art, which has focused on teaching computer agents how to perform specific tasks. Furthermore, LCC could also provide an efficient way to enhance the knowledge base of certain types of systems without requiring the involvement of a programmer. LCC does not require the user to enter specific information; instead, the user can converse naturally with the agent. As part of its learning process, LCC identifies the text inputs from the conversing human that contain information worth learning, and then determines whether each input is heretofore unknown, in which case LCC learns it; in agreement with what it already “knows”, in which case LCC ignores it; or in conflict with what it “knows”, in which case LCC must resolve the conflict. LCC’s architecture consists of multiple sub-systems combined to perform the above tasks. Its learning component can add new information to the knowledge base, confirm existing information, and/or update existing information found to be related to the user input. The LCC system functionality was rigorously assessed with test statements spanning various difficulty levels. Furthermore, its acceptance by human users was evaluated by two separate groups of human test subjects: one group that interacted with the system, and a second group that evaluated the logs of the interactions of the first group. The collected results were all found to be acceptable and within the range of our expectations.
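The three-way decision described in the abstract (learn what is unknown, ignore what agrees, resolve what conflicts) can be illustrated with a toy sketch. This is not the LCC implementation, whose code has not been published. The dictionary-based knowledge base, the `(subject, attribute)` keys, the exact-match comparison, and the `prefer_new` conflict policy are all assumptions made only for illustration.

```python
from enum import Enum


class Decision(Enum):
    LEARN = "learn"      # information is new to the knowledge base
    IGNORE = "ignore"    # information agrees with what is already stored
    RESOLVE = "resolve"  # information conflicts with what is already stored


def classify_input(knowledge_base, subject, attribute, value):
    """Decide how a candidate fact relates to the knowledge base.

    knowledge_base is a hypothetical dict mapping (subject, attribute)
    pairs to a stored value; the real LCC representation is not
    described in the abstract.
    """
    key = (subject.lower(), attribute.lower())
    if key not in knowledge_base:
        return Decision.LEARN
    if knowledge_base[key].lower() == value.lower():
        return Decision.IGNORE
    return Decision.RESOLVE


def apply_input(knowledge_base, subject, attribute, value, prefer_new=True):
    """Update the knowledge base and return the decision taken.

    prefer_new is an assumed, simplistic conflict policy: overwrite the
    stored value with the user's value when True, keep the old one otherwise.
    """
    decision = classify_input(knowledge_base, subject, attribute, value)
    key = (subject.lower(), attribute.lower())
    if decision is Decision.LEARN or (decision is Decision.RESOLVE and prefer_new):
        knowledge_base[key] = value
    return decision


kb = {("paris", "country"): "France"}
print(apply_input(kb, "Orlando", "state", "Florida"))  # Decision.LEARN
print(apply_input(kb, "Paris", "country", "france"))   # Decision.IGNORE
print(apply_input(kb, "Paris", "country", "Germany"))  # Decision.RESOLVE
```

A system working on free-form conversational text would first have to extract candidate facts from each utterance and would need approximate rather than exact matching; the sketch skips both steps.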

Availability of data and materials

All data used can be found in the appendices of the first author's full dissertation, which can be accessed from https://stars.library.ucf.edu

Code availability

Not applicable at the moment, but the authors are planning to publish the code soon.

Notes

  1. Behavioral cloning is a type of imitation learning in which the agent receives the states and actions of an expert demonstrator as training data; the learning agent then uses a supervised machine learning approach, such as a classifier, to replicate the demonstrator's policy (Torabi et al., 2018). It is very similar in nature to LfO and LfD.

  2. TF-IDF stands for Term Frequency-Inverse Document Frequency, a numerical measure that reflects the importance of a word in a document or corpus.

  3. SemEval is an ongoing series of evaluations of computational semantic analysis systems. The evaluation involves exploring the natural meaning of the language. This task is not as intuitive for machines as it is for humans (International workshop on semantic evaluation, 2015).

  4. http://paraphrase.org
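To make the TF-IDF measure in note 2 concrete, here is a minimal sketch of one common textbook variant, with tf taken as the term count divided by document length and idf as log(N / df). The function name `tf_idf`, the whitespace tokenizer, and the three sample sentences are assumptions for illustration; the notes do not say which TF-IDF variant or library the paper uses.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Uses tf = count / document length and idf = log(N / df), one common
    textbook variant; libraries often add smoothing or normalization.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


docs = ["the cat sat", "the dog barked", "the cat purred"]
scores = tf_idf(docs)
# "the" occurs in every document, so its weight is 0;
# "sat" occurs in only one document, so it gets the largest
# weight among the terms of the first document.
print(scores[0])
```

Terms that appear everywhere carry no discriminating information and are weighted down, while terms concentrated in few documents are weighted up.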

References

  • Anderson, J. R. (1996). ACT: A simple theory of complex cognition. American Psychologist, 51, 355.

  • Apté, C., Damerau, F., & Weiss, S. M. (1994). Automated learning of decision rules for text categorization. ACM Transactions on Information Systems (TOIS), 12, 233–251.

  • Chacón, A., Marco-Sola, S., Espinosa, A., Ribeca, P., & Moure, J. C. (2014). Thread-cooperative, bit-parallel computation of Levenshtein distance on GPU. In Proceedings of the 28th international conference on supercomputing (pp. 103–112).

  • Chang, Y.-W., Hsieh, C.-J., Chang, K.-W., Ringgaard, M., & Lin, C.-J. (2010). Training and testing low-degree polynomial data mappings via linear SVM. Journal of Machine Learning Research, 11, 1471.

  • ChatterBot: Machine learning, conversational dialog engine. (2019). Retrieved from https://chatterbot.readthedocs.io/en/stable/

  • Chieu, H. L., & Ng, H. T. (2002). A maximum entropy approach to information extraction from semi-structured and free text. In Proceedings of the association for the advancement of artificial intelligence (AAAI), (vol. 2002, pp. 786–791).

  • Clark, H., & Schaefer, E. (1989). Contributing to discourse. Cognitive Science, 13(2), 259–294.

  • Cox, G. (2017). chatterbot.corpus.english.greetings. Retrieved from https://github.com/gunthercox/chatterbot-corpus/blob/master/chatterbot_corpus/data/english/greetings.yml

  • Cox, G. (2019). chatterbot.corpus.english.conversations. Retrieved from https://github.com/gunthercox/chatterbot-corpus/blob/master/chatterbot_corpus/data/english/greetings.yml

  • Dai, W., Xue, G.-R., Yang, Q., & Yu, Y. (2007). Transferring Naïve Bayes classifiers for text classification. In Proceedings of the association for the advancement of artificial intelligence (AAAI) (pp. 540–545).

  • Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805

  • Dietterich, T. G. (2002). Ensemble learning. The handbook of brain theory and neural networks (vol. 2, pp. 110–125).

  • Dunford, R., Su, Q., & Tamang, E. (2014). The Pareto principle. The Plymouth Student Scientist, 7, 140–148.

  • Eggins, S., & Slade, D. (2004). Analysing casual conversation. Equinox Publishing Ltd. Cassell.

  • Feldman, A. (1959). Mannerisms of speech and gestures in everyday life. New York, NY: International Universities Press.

  • Finkel, J. R., Grenager, T., & Manning, C. (2005). Incorporating non-local information into information extraction systems by Gibbs sampling. In Proceedings of the 43rd annual meeting on association for computational linguistics (pp. 363–370).

  • fuzzywuzzy. Retrieved from https://github.com/seatgeek/fuzzywuzzy

  • Ganesan, K., Zhai, C., & Han, J. (2010). Opinosis: A graph-based approach to abstractive summarization of highly redundant opinions. In Proceedings of the 23rd international conference on computational linguistics (pp. 340–348).

  • Garfinkel, H. (1967). Studies in ethnomethodology. Prentice Hall.

  • Gilmartin, E., Saam, C., Vogel, C., Campbell, N., & Wade, V. (2018). Just talking-modelling casual conversation. In Proceedings of the 19th annual SIGdial meeting on discourse and dialogue (pp. 51–59).

  • Goldberg, Y., & Levy, O. (2014). word2vec Explained: Deriving Mikolov et al.’s negative-sampling word-embedding method. arXiv:1402.3722

  • Goldwasser, D., & Roth, D. (2011). Learning from natural instructions. In Proceedings of international joint conference on artificial intelligence (IJCAI).

  • International workshop on semantic evaluation (SemEval-2015). http://alt.qcri.org/semeval2015

  • Jaccard similarity measure. Retrieved from https://scikit-learn.org/stable/modules/generated/sklearn.metrics.jaccard_score.html

  • Joseph, V. R., & Vakayil, A. (2021). SPlit: An optimal method for data splitting. Technometrics, 64, 166.

  • Kuhlmann, G., Stone, P., Mooney, R., & Shavlik, J. (2004). Guiding a reinforcement learner with natural language advice: Initial results in RoboCup soccer. In Proceedings of the association for the advancement of artificial intelligence (AAAI) workshop on supervisory control of learning and adaptive systems.

  • Li, J., Miller, A. H., Chopra, S., Ranzato, M., & Weston, J. (2016). Learning through dialogue interactions. arXiv:1612.04936

  • Liu, B., & Mazumder, S. (2021). Lifelong and continual learning dialogue systems: Learning during conversation. In Proceedings of AAAI.

  • Luong, M.-T., Pham, H., & Manning, C. D. (2015). Effective approaches to attention-based neural machine translation. arXiv:1508.04025

  • Mihalcea, R., Corley, C., & Strapparava, C. (2006). Corpus-based and knowledge-based measures of text semantic similarity. In Proceedings of the association for the advancement of artificial intelligence (AAAI) (pp. 775–780).

  • Mohammed, A. A. (2019). Machine learning from casual conversation. Doctoral Dissertation, Department of Computer Science, University of Central Florida. Electronic Theses and Dissertations, 6297. Retrieved from https://stars.library.ucf.edu/etd/6297

  • Naïve Bayes text classification. Retrieved from https://nlp.stanford.edu/IR-book/html/htmledition/naive-bayes-text-classification-1.html

  • Nigam, K., Lafferty, J., & McCallum, A. (1999). Using maximum entropy for text classification. In Proceedings of international joint conference on artificial intelligence IJCAI-99 workshop on machine learning for information filtering (pp. 61–67).

  • Provost, F. J., & Fawcett, T. (1997). Analysis and visualization of classifier performance: Comparison under imprecise class and cost distributions. Knowledge Discovery and Data Mining (KDD) (pp. 43–48).

  • Random facts. Retrieved from https://www.factslides.com

  • Rybski, P. E., Yoon, K., Stolarz, J., & Veloso, M. M. (2007). Interactive robot task training through dialog and demonstration. In Proceedings of the ACM/IEEE international conference on Human-robot interaction (pp. 49–56).

  • Sacks, H., Schegloff, E. A., & Jefferson, G. (1978). Studies in the organization of conversational interaction (pp. 696–735). Elsevier.

  • Sultan, M. A., Bethard, S., & Sumner, T. (2015). DLS@CU: Sentence similarity from word alignment and semantic vector composition. In Proceedings of the 9th international workshop on semantic evaluation (SemEval 2015) (pp. 148–153).

  • Sultan, M. A., Bethard, S., & Sumner, T. (2014). Back to basics for monolingual alignment: Exploiting word similarity and contextual evidence. Transactions of the Association for Computational Linguistics, 2, 219–230.

  • Torabi, F., Warnell, G., & Stone, P. (2018). Behavioral cloning from observation. arXiv:1805.01954

  • Torrey, L., Walker, T., Shavlik, J., & Maclin, R. (2005). Using advice to transfer knowledge acquired in one reinforcement learning task to another. In The European conference on machine learning and principles and practice of knowledge discovery in databases (ECML-PKDD) (pp. 412–424).

  • Traum, D. R., & Hinkelman, E. A. (1992). Conversation acts in task-oriented spoken dialogue. Computational Intelligence, 8, 575–599.

  • Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. In Advances in neural information processing systems (pp. 5998–6008).

  • Ventola, E. (1979). The structure of casual conversation in English. Journal of Pragmatics, 3, 267–298.

  • Weston, J. E. (2016). Dialog-based language learning. In Advances in neural information processing systems (pp. 829–837).

  • WordNet. Retrieved from https://wordnet.princeton.edu/

  • Yujian, L., & Bo, L. (2007). A normalized Levenshtein distance metric. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29, 1091–1095.

  • Zhang, H., Yu, H., & Xu, W. (2017). Listen, interact and talk: Learning to speak via interaction. In NIPS workshop on visually-grounded interaction and language.

Funding

Work was supported indirectly through a teaching assistantship from the University of Central Florida for the first author.

Author information

Contributions

AMA: conceptualization; methodology; investigation; software development; data acquisition and curation; writing and editing. AJG: conceptualization; project administration; supervision; manuscript editing.

Corresponding author

Correspondence to Awrad E. Mohammed Ali.

Ethics declarations

Conflict of interest

There are no conflicts of interest for any of the authors.

Ethical approval

The use of human test subjects and surveys was approved by the Institutional Review Board at the University of Central Florida, SBE-18-1418, dated 8/14/2018. The approval letter can be found in Appendix H of the first author's dissertation, which can be accessed from https://stars.library.ucf.edu/etd/6297/. The authors confirm that the submitted work is original and has not been published or submitted elsewhere.

Consent to participate

Not applicable as the authors did not use any identification data related to the participants in this research.

Consent for publication

Not applicable.

Additional information

Editor: Derek Greene.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mohammed Ali, A.E., Gonzalez, A.J. Machine learning from casual conversation. Mach Learn 112, 4789–4836 (2023). https://doi.org/10.1007/s10994-023-06383-0

  • DOI: https://doi.org/10.1007/s10994-023-06383-0
