Abstract
Collecting a large amount of real human-computer interaction data in various domains is a cornerstone in the development of better data-driven spoken dialog systems. The DialPort project is creating a portal to collect a constant stream of real user conversational data on a variety of topics. In order to keep real users attracted to DialPort, it is crucial to develop a robust evaluation framework to monitor and maintain high performance. Unlike earlier spoken dialog systems, DialPort gathers a heterogeneous set of spoken dialog systems under one outward-looking agent. To assess this new structure, we have identified the unique challenges DialPort must meet in order to appeal to real users, and we have created a novel evaluation scheme that quantitatively measures system performance in these situations. We consider assessment from the point of view of the system developer as well as that of the end user.
Copyright information
© 2019 Springer International Publishing AG, part of Springer Nature
Cite this chapter
Lee, K. et al. (2019). An Assessment Framework for DialPort. In: Eskenazi, M., Devillers, L., Mariani, J. (eds) Advanced Social Interaction with Agents. Lecture Notes in Electrical Engineering, vol 510. Springer, Cham. https://doi.org/10.1007/978-3-319-92108-2_10
Print ISBN: 978-3-319-92107-5
Online ISBN: 978-3-319-92108-2