Evaluating the Performance of a Speech Recognition Based System
Speech-based solutions have taken center stage with the growth of the services industry, where there is a need to cater to a very large number of people from all strata of society. While natural-language speech interfaces attract most of the attention in the research community, in practice menu-based speech solutions thrive. Typically, in a menu-based speech solution, the user is required to respond by speaking a word from a closed set when prompted by the system. A sequence of spoken responses to the IVR (interactive voice response) prompts results in the completion of a transaction. A transaction is deemed successful if the speech solution correctly recognizes every utterance the user speaks in response to the system's prompts. The usual mechanism for evaluating the performance of a speech solution is to test it extensively with actual users and then analyze the logs for successful transactions. This kind of evaluation can leave test users dissatisfied, especially if poor system performance results in a low transaction completion rate. To counter this, the Wizard of Oz approach is adopted when evaluating a speech system. Overall, such evaluations are expensive in terms of both time and cost. In this paper, we propose a method to evaluate the performance of a speech solution without actually deploying it to users. We first describe the methodology and then show experimentally that it can be used to identify the performance bottlenecks of the speech solution even before the system is put into use, thus saving evaluation time and expense.
Keywords: Speech solution evaluation; Speech recognition; Pre-launch recognition performance measure
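The success criterion stated in the abstract (a transaction succeeds only when every prompted utterance is recognized correctly) can be illustrated with a minimal sketch of the log-based measurement. The log layout, the transaction identifiers, and the function names `transaction_succeeded` and `completion_rate` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import Dict, List


def transaction_succeeded(utterance_results: List[bool]) -> bool:
    """Return True only if the transaction has at least one utterance
    and every utterance in it was recognized correctly."""
    return len(utterance_results) > 0 and all(utterance_results)


def completion_rate(transactions: Dict[str, List[bool]]) -> float:
    """Fraction of logged transactions in which all utterances were recognized."""
    if not transactions:
        return 0.0
    successes = sum(
        1 for results in transactions.values() if transaction_succeeded(results)
    )
    return successes / len(transactions)


# Hypothetical logs: one entry per transaction, one flag per prompted utterance
# (True means the recognizer output matched what the user actually said).
logs = {
    "txn-001": [True, True, True],
    "txn-002": [True, False, True],  # one misrecognition fails the transaction
    "txn-003": [True, True],
    "txn-004": [False],
}

rate = completion_rate(logs)
print(f"Transaction completion rate: {rate:.2f}")
```

Because a single misrecognized utterance fails the whole transaction under this criterion, the completion rate can be noticeably lower than the per-utterance recognition accuracy when a transaction involves many prompts.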