Turn-taking as a design principle for barge-in in Spoken Language Systems
It is widely acknowledged that users of Spoken Language Systems (SLS) want the ability to truncate system prompts by using a barge-in capability (e.g., Basson et al., 1995; Yankelovich et al., 1995). However, little has been published on how barge-in is used or whether it adversely affects Automatic Speech Recognition (ASR) and interface usability. Typically, user requests for barge-in are assumed to stem from the desire to make system interactions faster and therefore more similar to interactions with touch-tone systems. We believe instead that requests for a barge-in capability are rooted in the notion of discourse as a turn-taking event. If barge-in is viewed in this way, we believe SLS can be enhanced to provide speech interfaces that users deem more natural and to increase system performance. This study addressed several of these issues. We found that users new to the system did not need to be informed about the barge-in capability before they attempted barge-in, that they used barge-in during almost half of their interactions with the system, and that they had identifiable patterns of barge-in use consistent with the turn-taking model. Results are presented, and consequences for speech interface design as well as algorithm enhancement are discussed.
Keywords: spoken language systems, automatic speech recognition, barge-in, telephone interface, user interface design
- Aust, H., Oerder, M., Seide, F., and Steinbiss, V. (1994). Experience with the Philips automatic train timetable information system. Proc. IEEE Workshop on Interactive Voice Technology for Telecommunications Applications. New York: IEEE Press, pp. 67–72.
- Basson, S., Kalyanswamy, A., Man, E., Springer, S., and Yashchin, D. (1995). Establishing speech technology requirements: The money talks field trial. Proc. Annual International Voice Technologies Applications. San Jose, California: American Voice Input/Output Society, pp. 131–136.
- Franzke, M., Marx, A.N., Roberts, T.L., and Engelbeck, G.E. (1993). Is Speech Recognition Usable? An exploration of the usability of a speech-based voice mail interface. SIGCHI Bulletin, 25:49–51. New York: Association for Computing Machinery Inc.
- Marx, M. and Phillips, M. (1995). Against “Shoehorning:” Rethinking IVR architectures for speech recognition. Proc. Annual International Voice Technologies Applications. San Jose, California: American Voice Input/Output Society, pp. 187–195.
- Rudnicky, A.I. and Hauptmann, A.G. (1988). Talking to computers: An empirical investigation. International Journal of Man-Machine Studies, 28:583–604.
- Sacks, H., Schegloff, E., and Jefferson, G. (1975). A simplest systematics for the organization of turn-taking for conversation. Language, 50:696–735. Washington, D.C.: Linguistic Society of America.
- Stuart, R., Desurvire, H., and Dews, S. (1991). The truncation of prompts in phone based interfaces: Using TOTT in evaluations. Proc. of the Human Factors Society 35th Annual Meeting. Santa Monica, CA: Human Factors Society, pp. 230–234.
- Yankelovich, N., Levow, G., and Marx, M. (1995). Designing speech acts: Issues in speech user interfaces. SIGCHI, Human Factors in Computing System Proc., Annual Conference Series. New York: Association for Computing Machinery Inc., pp. 369–376.