
The Second Conversational Intelligence Challenge (ConvAI2)

  • Emily Dinan (corresponding author)
  • Varvara Logacheva
  • Valentin Malykh
  • Alexander Miller
  • Kurt Shuster
  • Jack Urbanek
  • Douwe Kiela
  • Arthur Szlam
  • Iulian Serban
  • Ryan Lowe
  • Shrimai Prabhumoye
  • Alan W. Black
  • Alexander Rudnicky
  • Jason Williams
  • Joelle Pineau
  • Mikhail Burtsev
  • Jason Weston
Conference paper
Part of The Springer Series on Challenges in Machine Learning book series (SSCML)

Abstract

We describe the setting and results of the ConvAI2 NeurIPS competition, which aims to further the state of the art in open-domain chatbots. Key takeaways from the competition are: (1) pretrained Transformer variants are currently the best-performing models on this task, and (2) to improve performance on multi-turn conversations with humans, future systems must go beyond single-word metrics like perplexity and measure performance across sequences of utterances (conversations) in terms of repetition, consistency, and balance of dialogue acts (e.g. how many questions are asked vs. answered).

Notes

Acknowledgements

We thank all the competitors for taking part and making this a successful competition. We especially thank the competition’s sponsors, Facebook Academics and Amazon Web Services. The participation of Mikhail Burtsev, Varvara Logacheva, and Valentin Malykh was supported by the National Technology Initiative and PAO Sberbank, project ID 0000000007417F630002.


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Emily Dinan (1), corresponding author
  • Varvara Logacheva (2)
  • Valentin Malykh (2)
  • Alexander Miller (1)
  • Kurt Shuster (1)
  • Jack Urbanek (1)
  • Douwe Kiela (1)
  • Arthur Szlam (1)
  • Iulian Serban (3)
  • Ryan Lowe (4, 1)
  • Shrimai Prabhumoye (5)
  • Alan W. Black (5)
  • Alexander Rudnicky (5)
  • Jason Williams (6)
  • Joelle Pineau (1, 4)
  • Mikhail Burtsev (2)
  • Jason Weston (1)

  1. Facebook AI Research, New York, USA
  2. Moscow Institute of Physics and Technology, Moscow, Russia
  3. University of Montreal, Montreal, Canada
  4. McGill University, Montreal, Canada
  5. Carnegie Mellon University, Pittsburgh, USA
  6. Microsoft Research, Redmond, USA
