Voice Messaging User Interface

  • Harry E. Blanchard
  • Steven H. Lewis
Part of the Signals and Communication Technology book series (SCT)


Voice mail is one of the most common interactive voice response (IVR) applications in telephony. Designing these systems to be easy to use overlaps considerably with general IVR system design, but voice mail systems also have significant special characteristics of their own. This chapter discusses designing voice mail systems for usability, with particular emphasis on the special characteristics, demands, conventions, and standards associated with voice mail. Touch-tone interfaces are still the norm in messaging, but this chapter also surveys emerging innovations: novel prompt and menu structures, automatic speech recognition, unified messaging, and multimedia mail.


Keywords: voice mail, voice messaging, unified messaging, universal mailbox





Copyright information

© Springer Science + Business Media, LLC 2008

Authors and Affiliations

  • Harry E. Blanchard (1)
  • Steven H. Lewis (1)
  1. AT&T Labs, Middletown, USA
