A Theoretical Framework for a User-Centered Spoken Dialog Manager
Dialog strategies have long been handcrafted by dialog experts. Only within the last decade has research moved to data-driven methods that yield statistical models. Still, most dialog systems rely solely on the spoken words and their semantics, although speech signals reveal much more about the speaker, e.g. age, gender, and emotional state. Using this speaker state information along with the semantics can be a promising way of moving dialog systems towards better performance while making them more natural at the same time. Partially Observable Markov Decision Processes (POMDPs), a state-of-the-art statistical modeling method, offer an easy and unified way of integrating speaker state information into dialog systems. In this contribution we present our ongoing research on combining a POMDP-based dialog manager with speaker state information.