Evaluating Interactions with Spoken Dialogue Telephone Services
The quality experienced during the interaction with telephone-based spoken dialogue services results from a perception and judgement process. As a consequence, quality has to be measured in a subjective way, with the help of human test persons. To complement subjective quality judgements, parameters can be logged which quantify the flow of the interaction, the behaviour of the user and the system, and the performance of individual system modules during the interaction. Although such parameters are not directly linked to the quality perceived by the user, they provide useful information for system development, optimisation, and maintenance. This chapter presents standardised methods for both measurement approaches. Firstly, a brief overview of subjective evaluation experiments is provided, following Recommendation P.851 issued by the International Telecommunication Union. Secondly, a collection of parameters is presented which has proven to be useful for system design. An initial evaluation study is described which shows that the parameters correlate only weakly with subjective judgements; thus, the two types of evaluation provide complementary information. Linear regression models may be used to predict subjective judgements from interaction parameters, but their prediction accuracy is still limited.
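As a minimal sketch of the last two points, the following Python snippet computes a Pearson correlation between one interaction parameter and a subjective rating, and fits a one-predictor linear regression. The parameter (dialogue duration in seconds), the 1-to-5 rating scale, and all numbers are hypothetical toy values chosen for illustration; they are not data from the study, and the models discussed in the chapter may use several parameters at once.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    a = my - b * mx
    return a, b


# Hypothetical toy data (assumption, not from the chapter):
# one logged interaction parameter and one subjective judgement per dialogue.
durations = [60, 90, 120, 150, 180]   # dialogue duration in seconds
ratings = [4.0, 4.5, 3.0, 3.5, 2.0]   # overall quality rating on a 1-5 scale

r = pearson(durations, ratings)
a, b = fit_line(durations, ratings)


def predict(duration):
    """Predicted rating for a given dialogue duration under the fitted line."""
    return a + b * duration


print(f"correlation r = {r:.3f}")
print(f"rating = {a:.2f} + {b:.4f} * duration")
print(f"predicted rating for a 100 s dialogue: {predict(100):.2f}")
```

With real logged parameters and questionnaire ratings, the same two quantities indicate how much of the subjective judgement a parameter can explain and how far a regression prediction can be trusted.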
Keywords: Spoken dialogue system, subjective evaluation, interaction parameters, quality prediction