A framework for improving error detection and correction in spoken dialog systems

Griol, David; Molina, José Manuel

doi:10.1007/s00500-016-2290-z

Focus
Published: 29 July 2016

Soft Computing, Volume 20, pages 4229–4241 (2016)

Abstract

Despite recent improvements in the performance and reliability of the different components of dialog systems, it is still crucial to devise strategies that prevent errors from propagating from one component to another. In this paper, we contribute a framework for improved error detection and correction in spoken conversational interfaces. The framework combines user behavior and error modeling to estimate the probability that the user utterance contains errors. This estimate is forwarded to the dialog manager, which uses it to decide whether possible errors need to be corrected. We have designed a strategy that differentiates between the main misunderstanding and non-understanding scenarios, so that the dialog manager can provide an acceptable, tailored response when entering the error correction state. As a proof of concept, we have applied our proposal to a customer support dialog system. Our results show the appropriateness of our technique to correctly detect and react to errors, enhancing system performance and user satisfaction.
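The decision flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the user model and the error model are combined, so the linear interpolation, the threshold, and every name below (TurnEstimate, error_probability, choose_action, weight, threshold) are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"  # no correction needed, carry on with the dialog
    CONFIRM = "confirm"    # possible misunderstanding: verify the interpretation
    REPROMPT = "reprompt"  # non-understanding: ask the user to repeat or rephrase


@dataclass
class TurnEstimate:
    # Hypothetical per-turn inputs, each probability in [0, 1].
    p_error_model: float      # error-model estimate that the utterance contains an error
    p_user_model: float       # user-model probability of the observed user act
    has_interpretation: bool  # False when no interpretation was obtained at all


def error_probability(est: TurnEstimate, weight: float = 0.5) -> float:
    """Combine both models into one error estimate (assumed linear interpolation)."""
    unexpectedness = 1.0 - est.p_user_model  # unlikely user acts are more suspicious
    return weight * est.p_error_model + (1.0 - weight) * unexpectedness


def choose_action(est: TurnEstimate, threshold: float = 0.5,
                  weight: float = 0.5) -> Action:
    """Pick a dialog-manager response for the current user turn."""
    if not est.has_interpretation:
        # Non-understanding scenario: there is nothing to confirm.
        return Action.REPROMPT
    if error_probability(est, weight) >= threshold:
        # Misunderstanding scenario: an interpretation exists but looks unreliable.
        return Action.CONFIRM
    return Action.CONTINUE
```

For example, choose_action(TurnEstimate(0.9, 0.1, True)) selects a confirmation, while a turn with has_interpretation=False always triggers a reprompt. In a real system the threshold and weight would need to be tuned on dialog data rather than fixed at 0.5.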

Notes

http://www.colips.org/workshop/dstc4/.
https://framenet.icsi.berkeley.edu/fndrupal/.
The degrees of freedom that SPSS employs for t tests are \(N-1\) when the compared groups have the same number of samples (\(N\)), and \(N_1+N_2-1\) when they differ in the number of samples (\(N_1\) and \(N_2\)).

Acknowledgments

This work was supported in part by Projects MINECO TEC2012-37832-C02-01, CICYT TEC2011-28626-C02-02, CAM CONTEXTS (S2009/TIC-1485).

Author information

Authors and Affiliations

Group of Applied Artificial Intelligence (GIAA), Computer Science Department, Carlos III University of Madrid, Avda. de la Universidad, 30, 28911, Leganés, Spain
David Griol & José Manuel Molina

Corresponding author

Correspondence to David Griol.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Additional information

Communicated by A. Herrero.

About this article

Cite this article

Griol, D., Molina, J.M. A framework for improving error detection and correction in spoken dialog systems. Soft Comput 20, 4229–4241 (2016). https://doi.org/10.1007/s00500-016-2290-z

Published: 29 July 2016
Issue Date: November 2016
DOI: https://doi.org/10.1007/s00500-016-2290-z
