
A framework for improving error detection and correction in spoken dialog systems

  • Focus
Soft Computing

Abstract

Despite recent improvements in the performance and reliability of the different components of dialog systems, it is still crucial to devise strategies that prevent errors from propagating from one component to another. In this paper, we contribute a framework for improved error detection and correction in spoken conversational interfaces. The framework combines user behavior and error modeling to estimate the probability that the user utterance contains errors. This estimate is forwarded to the dialog manager, which uses it to decide whether possible errors need to be corrected. We have designed a strategy that differentiates between the main misunderstanding and non-understanding scenarios, so that the dialog manager can provide an acceptable, tailored response when entering the error correction state. As a proof of concept, we have applied our proposal to a customer support dialog system. Our results show that the technique correctly detects and reacts to errors, enhancing system performance and user satisfaction.
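The decision flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the user behavior model and the error model are combined, so the linear interpolation, the `weight` and `threshold` parameters, and all names (`TurnEvidence`, `error_probability`, `choose_action`) are assumptions made only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"          # no correction needed
    CONFIRM = "confirm"            # possible misunderstanding: ask the user to confirm
    REPHRASE = "ask_to_rephrase"   # non-understanding: ask the user to repeat or rephrase


@dataclass
class TurnEvidence:
    # Hypothetical inputs; the abstract does not name concrete features.
    user_model_agreement: float  # in [0, 1]: how well the utterance matches predicted user behavior
    error_model_prob: float      # in [0, 1]: error probability reported by an error model
    has_interpretation: bool     # False when no usable interpretation was produced


def error_probability(ev: TurnEvidence, weight: float = 0.5) -> float:
    """Combine both sources into one estimate (assumed linear interpolation)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    behavior_surprise = 1.0 - ev.user_model_agreement
    return weight * behavior_surprise + (1.0 - weight) * ev.error_model_prob


def choose_action(ev: TurnEvidence, threshold: float = 0.5) -> Action:
    """Decide whether the dialog manager should enter an error correction state."""
    if not ev.has_interpretation:
        # Non-understanding scenario: there is nothing to confirm.
        return Action.REPHRASE
    if error_probability(ev) >= threshold:
        # Misunderstanding scenario: an interpretation exists but is likely wrong.
        return Action.CONFIRM
    return Action.CONTINUE
```

For example, `choose_action(TurnEvidence(0.2, 0.7, True))` yields `Action.CONFIRM` under these assumed defaults, because the combined estimate (0.75) exceeds the threshold. In a real system the threshold would presumably be tuned on dialog data rather than fixed by hand.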


Notes

  1. http://www.colips.org/workshop/dstc4/.

  2. https://framenet.icsi.berkeley.edu/fndrupal/.

  3. The degrees of freedom that SPSS employs for t tests are \(N-1\) when the compared groups have the same number of samples (\(N\)), and \(N_1+N_2-1\) when they differ in the number of samples (\(N_1\) and \(N_2\)).


Acknowledgments

This work was supported in part by Projects MINECO TEC2012-37832-C02-01, CICYT TEC2011-28626-C02-02, CAM CONTEXTS (S2009/TIC-1485).

Author information

Corresponding author

Correspondence to David Griol.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Additional information

Communicated by A. Herrero.

About this article

Cite this article

Griol, D., Molina, J.M. A framework for improving error detection and correction in spoken dialog systems. Soft Comput 20, 4229–4241 (2016). https://doi.org/10.1007/s00500-016-2290-z

  • DOI: https://doi.org/10.1007/s00500-016-2290-z
