VoiceMarks: restructuring hierarchical voice menus for improving navigation
Interactive Voice Response (IVR) systems, or touch-tone telephony interfaces, are now a common medium of interaction between organizations or companies and their customers, allowing users to access or enter specific company-based information. These telephony interfaces typically involve hierarchically structured voice menus, through which a user must navigate to locate a desired menu item. This navigation process is often inefficient and time-consuming, at times leaving users frustrated and annoyed. In this paper, we describe the foundation of VoiceMarks, a system designed to improve the ease and efficiency of navigation in menu-based voice interfaces. The system features personalized menus built from voicemarks, in a process similar to bookmarking but adapted to voice interfaces. Voicemarks are essentially bookmarked nodes in the voice menu hierarchy, which are stored for each user in a directly accessible personal menu. We developed and tested VoiceMarks interfaces for two applications: a bus schedule information system and a cinema ticket purchase system. A comparative study of VoiceMarks and the traditional interfaces of these applications showed that VoiceMarks can significantly improve the interaction between users and systems, in terms of the time and number of keystrokes needed to locate a menu item, as well as user satisfaction. In general, users responded very positively to the VoiceMarks interface. In addition, the study pointed to some useful modifications of VoiceMarks, which should be considered before employing the system in a commercial setting.
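The core idea, bookmarking a node deep in the menu hierarchy so it becomes reachable from a directly accessible personal menu, can be sketched as follows. This is a minimal illustration only; the class and method names are hypothetical, and the paper does not specify any particular implementation or API.

```python
class MenuNode:
    """One item in a hierarchical voice menu (illustrative)."""
    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []

    def navigate(self, *keys):
        """Follow a sequence of keypresses down the hierarchy."""
        node = self
        for k in keys:
            node = node.children[k]
        return node


class VoiceMarkStore:
    """Per-user personal menu of bookmarked menu nodes (illustrative)."""
    def __init__(self):
        self._marks = {}  # user id -> list of bookmarked nodes

    def add(self, user, node):
        self._marks.setdefault(user, []).append(node)

    def personal_menu(self, user):
        return self._marks.get(user, [])


# A small bus-schedule hierarchy: main menu -> route -> direction -> stop.
root = MenuNode("Main menu", [
    MenuNode("Route 5", [
        MenuNode("Northbound", [
            MenuNode("Stop: Main St."),
        ]),
    ]),
])

store = VoiceMarkStore()

# Reaching the stop normally takes three keypresses; bookmark the result.
stop = root.navigate(0, 0, 0)
store.add("alice", stop)

# On a later call, the same item is one selection away in the personal menu.
print(store.personal_menu("alice")[0].label)  # prints "Stop: Main St."
```

The sketch captures why voicemarks reduce keystrokes: a bookmarked leaf that previously required one keypress per hierarchy level is reached with a single selection from the personal menu.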
Keywords: Voice user interfaces · Personalized touch-tone menus · Telephony bookmarks · Touch-tone interface navigation