IVR Usability Engineering Using Guidelines and Analyses of End-to-End Calls

  • Bernhard Suhm
Part of the Signals and Communication Technology book series (SCT)

Abstract

While speech offers unique advantages and opportunities as an interface modality, the known limitations of speech recognition technology and cognitive limitations of spoken interaction amplify the importance of usability in the development of speech applications. The competitive business environment, on the other hand, requires sound business justification for any investment in speech technology and proof of its usability and effectiveness. This chapter presents design principles and usability engineering methods that empower practitioners to optimize both usability and ROI of telephone speech applications, frequently also referred to as telephone Voice User Interface (VUI) or Interactive Voice Response (IVR) systems. The first section discusses limitations of speech user interfaces and their repercussions on design. From a survey of research and industry know-how, a short list of guidelines for IVR design is derived. Examples illustrate how to apply these guidelines during the design phase of a telephone speech application. The second section presents a data-driven methodology for optimizing usability and effectiveness of IVRs. The methodology is grounded in the analysis of live, end-to-end calls, the ultimate field data for telephone speech applications. We will describe how to capture end-to-end call data from deployed systems and how to mine this data to measure usability and identify problems. Leveraging end-to-end call data empowers practitioners to build solid business cases, optimize ROI, and justify the cost of IVR usability engineering. Case studies from the consulting practice at BBN Technologies illustrate how these methods were applied in some of the largest US deployments of automated telephone applications.

Keywords

telephone speech application; speech user interface; design principles; best practices; usability engineering; end-to-end call; ROI

Copyright information

© Springer Science + Business Media, LLC 2008

Authors and Affiliations

  • Bernhard Suhm, BBN Technologies, Cambridge, USA
