Evaluation of Statistical POMDP-Based Dialogue Systems in Noisy Environments

Situated Dialog in Speech-Based Human-Computer Interaction

Abstract

Compared to conventional hand-crafted rule-based dialogue management systems, statistical POMDP-based dialogue managers offer the promise of increased robustness, reduced development and maintenance costs, and scalability to large open domains. As a consequence, there has been considerable research activity in approaches to statistical spoken dialogue systems over recent years. However, building and deploying a real-time spoken dialogue system is expensive, and even when one is operational, it is hard to recruit sufficient users to obtain statistically significant results. Instead, researchers have tended to evaluate using user simulators or by reprocessing existing corpora, both of which are unconvincing predictors of actual real-world performance. This paper describes the deployment of a real-world restaurant information system and its evaluation both in a motor car, using subjects recruited locally, and by remote users recruited through Amazon Mechanical Turk. The paper explores three key questions: are statistical dialogue systems more robust than conventional hand-crafted systems; how does the performance of a system evaluated on a user simulator compare to its performance with real users; and can the performance of a system tested over the telephone network be used to predict performance in more hostile environments such as a motor car? The results show that the statistical approach is indeed more robust, but that results from a simulator significantly over-estimate performance in both absolute and relative terms. Finally, when word error rates (WER) are matched, performance results obtained over the telephone can provide useful predictors of performance in noisier environments such as the motor car, but again they tend to over-estimate performance.

Notes

  1. http://www.gumtree.com.

  2. As well as being used to train the POMDP-based system, the user simulator was used to tune the rules in the conventional hand-crafted system.

Author information

Corresponding author

Correspondence to Steve Young.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Young, S. et al. (2016). Evaluation of Statistical POMDP-Based Dialogue Systems in Noisy Environments. In: Rudnicky, A., Raux, A., Lane, I., Misu, T. (eds) Situated Dialog in Speech-Based Human-Computer Interaction. Signals and Communication Technology. Springer, Cham. https://doi.org/10.1007/978-3-319-21834-2_1

  • DOI: https://doi.org/10.1007/978-3-319-21834-2_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-21833-5

  • Online ISBN: 978-3-319-21834-2

  • eBook Packages: Engineering (R0)
