
Effective Crowdsourced Generation of Training Data for Chatbots Natural Language Understanding

  • Rucha Bapat
  • Pavel Kucherbaev
  • Alessandro Bozzon
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10845)

Abstract

Chatbots are text-based conversational agents. Natural Language Understanding (NLU) models are used to extract meaning and intention from user messages sent to chatbots. The user experience of a chatbot largely depends on the performance of its NLU model, which in turn depends largely on the initial dataset the model is trained with. The training data should cover the diversity of real user requests the chatbot will receive; obtaining such data is challenging even for large corporations. We introduce a generic approach to generating training data with the help of crowd workers, discuss the workflow of the approach, and describe the design of crowdsourcing tasks that assure high quality. We evaluate the approach in an experiment collecting data for 9 different intents, use the collected data to train an NLU model, and analyse the model's performance under different training set sizes for each intent. Finally, we provide recommendations on selecting an optimal confidence threshold for intent prediction, based on a cost model of incorrect and unknown predictions.
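
To make the final recommendation concrete, below is a minimal sketch (not the authors' implementation) of cost-based threshold selection: given held-out intent predictions with confidence scores, it sweeps candidate thresholds and returns the one minimising the expected cost per message. The data layout and the cost parameters cost_incorrect and cost_unknown are illustrative assumptions, not values from the paper.

    # Minimal sketch of cost-based confidence-threshold selection (Python).
    # Assumptions, not from the paper: each held-out prediction is a
    # (confidence, is_correct) pair; predictions below the threshold are
    # treated as "unknown" (the chatbot abstains); costs are illustrative.

    def expected_cost(predictions, threshold, cost_incorrect=5.0, cost_unknown=1.0):
        """Average cost per message at a given confidence threshold."""
        total = 0.0
        for confidence, is_correct in predictions:
            if confidence < threshold:
                total += cost_unknown        # model abstains: "unknown" prediction
            elif not is_correct:
                total += cost_incorrect      # confident but wrong intent
            # confident and correct predictions add no cost
        return total / len(predictions)

    def best_threshold(predictions, candidates=None, **costs):
        """Return the candidate threshold with the lowest expected cost."""
        if candidates is None:
            candidates = [i / 100 for i in range(101)]
        return min(candidates, key=lambda t: expected_cost(predictions, t, **costs))

    # Example: (confidence, was the predicted intent correct?)
    held_out = [(0.95, True), (0.80, True), (0.65, False), (0.40, True), (0.30, False)]
    print(best_threshold(held_out, cost_incorrect=5.0, cost_unknown=1.0))

Under this illustrative model, a higher cost for incorrect predictions pushes the optimal threshold up (the model abstains more often), while a higher cost for unknown predictions pushes it down.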

Keywords

Conversational agents · Natural Language Understanding · Crowdsourcing


Acknowledgments

This research has been supported in part by the Amsterdam Institute for Advanced Metropolitan Solutions with the AMS Social Bot grant, and by the Dutch national e-infrastructure with the support of SURF Cooperative (grant e-infra170237).


Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Rucha Bapat¹
  • Pavel Kucherbaev¹
  • Alessandro Bozzon¹

  1. Delft University of Technology, Delft, Netherlands
