Gaze and Speech: Pointing Device and Text Entry Modality

Chapter in: Current Trends in Eye Tracking Research

Abstract

The performance of eye gaze and speech as a pointing device was tested using the International Organization for Standardization (ISO) multidirectional tapping task. Targets were selected with eye gaze and speech alone, with the addition of a gravitational well, and in conjunction with magnification, and each condition was compared with the mouse. The mouse was far superior for target selection, although the gravitational well did improve the performance of eye gaze and speech. However, given the nature of eye gaze and speech and the ease with which a gravitational well captures the cursor, this technique also increased the number of incorrect clicks. Magnification did not improve gaze and speech as a pointing device and resulted in more target re-entries. Eye gaze and speech were also used as a text entry modality in a popular word processor application: users focused on a key of an on-screen keyboard and then uttered a verbal command to type that letter into the document. Speed and accuracy were measured for the entry of randomly selected phrases. The standard keyboard was both faster and more accurate than the combination of eye gaze and speech, and neither the size of the on-screen buttons nor the spacing between them affected the speed or accuracy of text entry.
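
A gravitational well, as evaluated above, pulls the cursor onto a target once the gaze point enters a capture zone around that target; a spoken command then completes the selection. The following sketch illustrates the snapping logic. It is not the authors' implementation: the function name, the target representation, and the 40-pixel capture radius are illustrative assumptions.

    import math

    def apply_gravitational_well(gaze, targets, capture_radius=40.0):
        """Snap the cursor onto the nearest target centre when the gaze
        point falls inside that target's capture radius (its "well").
        gaze is an (x, y) sample in pixels; targets is a list of (x, y)
        target centres. Names and the default radius are assumptions,
        not values taken from the chapter."""
        if not targets:
            return gaze
        gx, gy = gaze
        # Find the target centre closest to the current gaze sample.
        nearest = min(targets, key=lambda t: math.hypot(gx - t[0], gy - t[1]))
        if math.hypot(gx - nearest[0], gy - nearest[1]) <= capture_radius:
            return nearest  # inside the well: the cursor snaps to the target
        return gaze         # outside every well: the cursor follows raw gaze

The abstract's observation that the well increased incorrect clicks follows from this logic: the well of a neighbouring target can capture the cursor before the intended one does, so the spoken click lands on the wrong target. For the text-entry experiment the abstract does not name an accuracy metric, but the citations to Levenshtein (1965) and MacKenzie and Soukoreff (2002) below suggest the character-level minimum string distance (MSD) error rate; the sketch that follows assumes that metric.

    def levenshtein(a, b):
        """Minimum string distance (Levenshtein, 1965): the fewest
        character insertions, deletions, and substitutions turning a
        into b, computed with the standard dynamic-programming
        recurrence over a rolling row."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,                # delete ca
                               cur[j - 1] + 1,             # insert cb
                               prev[j - 1] + (ca != cb)))  # substitute
            prev = cur
        return prev[-1]

    def msd_error_rate(presented, transcribed):
        """Character-level error rate (MacKenzie & Soukoreff, 2002):
        MSD over the length of the longer phrase, as a percentage."""
        return 100.0 * levenshtein(presented, transcribed) / max(
            len(presented), len(transcribed))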

References

  • Anderson, T. (2009). Pro Office 2007 development with VSTO. Berkeley: Apress.

  • Beelders, T. R., & Blignaut, P. J. (2012). Using eye gaze and speech to simulate a pointing device. In Proceedings of the Symposium on Eye Tracking Research and Applications (ETRA), Santa Barbara, California, United States of America.

  • Bernhaupt, R., Palanque, P., Winckler, M., & Navarre, D. (2007). Usability study of multi-modal interfaces using eye-tracking. In Proceedings of INTERACT 2007, 412–424.

  • Bolt, R. (1981). Gaze-orchestrated dynamic windows. Computer Graphics, 15(3), 109–119.

  • Carroll, J. M. (2003). HCI models, theories, and frameworks: Toward a multidisciplinary science. San Francisco: Morgan Kaufmann.

  • Coutaz, J., & Caelen, J. (1991). A taxonomy for multimedia and multimodal user interfaces. In Proceedings of the Second East-West HCI Conference, St Petersburg, Russia, 229–240.

  • Dragon NaturallySpeaking. (n.d.). History of speech and voice recognition and transcription software. Retrieved 13 Feb 2009 from http://www.nuance.com

  • Drewes, H., & Schmidt, A. (2009). The MAGIC touch: Combining MAGIC-pointing with a touch-sensitive mouse. In Human-Computer Interaction—INTERACT 2009: 12th IFIP TC 13 International Conference, Part II, Uppsala, Sweden, 415–428.

  • Edwards, A. L. (1951). Balanced Latin-square designs in psychological research. The American Journal of Psychology, 64(4), 598–603.

  • Foley, J. D., Van Dam, A., Feiner, S. K., & Hughes, J. F. (1990). Computer graphics: Principles and practice. Reading, Massachusetts: Addison-Wesley.

  • Griffin, Z. M., & Bock, K. (2000). What the eyes say about speaking. Psychological Science, 11, 274–279.

  • Hansen, J. P., Hansen, D. W., & Johansen, A. S. (2001). Bringing gaze-based interaction back to basics. In C. Stephanidis (Ed.), Universal Access in HCI (UAHCI): Towards an Information Society for All—Proceedings of the 9th International Conference on Human-Computer Interaction (HCII '01), 325–328. Mahwah: Lawrence Erlbaum Associates.

  • Haro, A., Essa, I., & Flickner, M. (2000). A non-invasive computer vision system for reliable eye tracking. In Proceedings of CHI '00, The Hague, Netherlands, 167–168.

  • Hatfield, F., & Jenkins, E. A. (1997). An interface integrating eye gaze and voice recognition for hands-free computer access. In Proceedings of the CSUN 1997 Conference, 1–7.

  • Hwang, F., Keates, S., Langdon, P., & Clarkson, J. (2004). Mouse movements of motion-impaired users: A submovement analysis. In Proceedings of ASSETS '04, Atlanta, Georgia, United States of America, 102–109.

  • Istance, H. O., Spinner, C., & Howarth, P. A. (1996). Providing motor impaired users with access to standard Graphical User Interface (GUI) software via eye-based interaction. In Proceedings of the 1st European Conference on Disability, Virtual Reality and Associated Technology, Maidenhead, United Kingdom, 109–116.

  • ISO. (2000). ISO 9241-9: Ergonomic requirements for office work with visual display terminals (VDTs)—Part 9: Requirements for non-keyboard input devices. International Organization for Standardization.

  • Jacob, R. J. K. (1991). The use of eye movements in human-computer interaction techniques: What you look at is what you get. ACM Transactions on Information Systems, 9(2), 152–169.

  • Jacob, R. J. K. (1993). Eye movement-based human-computer interaction techniques: Toward non-command interfaces. In H. R. Hartson & D. Hix (Eds.), Advances in human-computer interaction, 4, 151–190. Norwood, New Jersey: Ablex Publishing.

  • Jacob, R. J. K. (1995). Eye tracking in advanced interface design. In W. Barfield & T. A. Furness (Eds.), Virtual environments and advanced interface design (pp. 258–288). New York: Oxford University Press.

  • Jacob, R. J. K., & Karn, K. S. (2003). Eye tracking in human-computer interaction and usability research: Ready to deliver the promises (section commentary). In J. Hyönä, R. Radach & H. Deubel (Eds.), The mind's eye: Cognitive and applied aspects of eye movement research (pp. 573–605). Amsterdam: Elsevier Science.

  • Jaimes, A., & Sebe, N. (2005). Multimodal human computer interaction: A survey. In IEEE Workshop on Human Computer Interaction, Las Vegas, Nevada, United States of America, 15–21.

  • Just, M. A., & Carpenter, P. A. (1976). Eye fixations and cognitive processes. Cognitive Psychology, 8, 441–480.

  • Kammerer, Y., Scheiter, K., & Beinhauer, W. (2008). Looking my way through the menu: The impact of menu design and multimodal input on gaze-based menu selection. In Proceedings of the Symposium on Eye Tracking Research and Applications (ETRA), Savannah, Georgia, United States of America, 213–220.

  • Kaur, M., Tremaine, M., Huang, N., Wilder, J., Gacovski, Z., Flippo, F., & Mantravadi, S. (2003). Where is "it"? Event synchronization in gaze-speech input systems. In Proceedings of ICMI '03, Vancouver, Canada, 151–158.

  • Keates, S., Hwang, F., Langdon, P., Clarkson, P. J., & Robinson, P. (2002). Cursor movements for motion-impaired computer users. In Proceedings of ASSETS '02, Edinburgh, Scotland, 135–142.

  • Keates, S., & Trewin, S. (2005). Effect of age and Parkinson's disease on cursor positioning using a mouse. In Proceedings of ASSETS '05, Baltimore, Maryland, United States of America, 68–75.

  • Klarlund, N. (2003). Editing by voice and the role of sequential symbol systems for improved human-to-computer information rates. In Proceedings of ICASSP, Hong Kong, 553–556.

  • Land, M. F., & Tatler, B. W. (2009). Looking and acting: Vision and eye movements in natural behaviour. Oxford: Oxford University Press.

  • Levenshtein, V. I. (1965). Binary codes capable of correcting deletions, insertions, and reversals. Doklady Akademii Nauk SSSR, 163(4), 845–848.

  • Liu, Y., Chai, J. Y., & Jin, R. (2007). Automated vocabulary acquisition and interpretation in multimodal conversational systems. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics.

  • MacKenzie, I. S., Kauppinen, T., & Silfverberg, M. (2001). Accuracy measures for evaluating computer pointing devices. In Proceedings of SIGCHI '01, Seattle, Washington, United States of America, 9–16.

  • MacKenzie, I. S., & Soukoreff, R. W. (2002). A character-level error analysis technique for evaluating text entry methods. In Proceedings of NordiCHI 2002, Aarhus, Denmark, 243–246.

  • MacKenzie, I. S., & Soukoreff, R. W. (2003). Phrase sets for evaluating text entry techniques. In Extended Abstracts of the ACM Conference on Human Factors in Computing Systems—CHI 2003, Fort Lauderdale, Florida, United States of America, 754–755.

  • Maglio, P. P., Matlock, T., Campbell, C. S., Zhai, S., & Smith, B. A. (2000). Gaze and speech in attentive user interfaces. In Proceedings of the Third International Conference on Advances in Multimodal Interfaces, Beijing, China, 1–7.

  • Majaranta, P. (2009). Text entry by eye gaze. Dissertations in Interactive Technology, number 11, University of Tampere.

  • Man, D. W. K., & Wong, M.-S. L. (2007). Evaluation of computer-access solutions for students with quadriplegic athetoid cerebral palsy. American Journal of Occupational Therapy, 61, 355–364.

  • Miniotas, D., Špakov, O., & Evreinov, G. (2003). Symbol Creator: An alternative eye-based text entry technique with low demand for screen space. In Proceedings of Human-Computer Interaction—INTERACT '03, Zurich, Switzerland, 137–143.

  • Miniotas, D., Špakov, O., Tugoy, I., & MacKenzie, I. S. (2006). Speech-augmented eye gaze interaction with small closely spaced targets. In Proceedings of the 2006 Symposium on Eye Tracking Research and Applications (ETRA), 67–72.

  • Morimoto, C. H., & Amir, A. (2010). Context switching for fast key selection in text entry applications. In Proceedings of the 2010 Symposium on Eye Tracking Research and Applications (ETRA), 271–274.

  • Oviatt, S. (1999). Mutual disambiguation of recognition errors in a multimodal architecture. In Proceedings of CHI '99, Pittsburgh, Pennsylvania, United States of America, 576–583.

  • Oviatt, S., & Cohen, P. (2000). Multimodal interfaces that process what comes naturally. Communications of the ACM, 43(2), 45–53.

  • Pireddu, A. (2007). Multimodal interaction: An integrated speech and gaze approach. Thesis, Politecnico di Torino.

  • Prasov, Z., Chai, J. Y., & Jeong, H. (2007). Eye gaze for attention prediction in multimodal human-machine conversation. In Proceedings of the AAAI Spring Symposium on Interaction Challenges for Intelligent Assistants.

  • Read, J. (2005). On the application of text input metrics to handwritten text input. Text Input Workshop, Dagstuhl, Germany.

  • Read, J., MacFarlane, S., & Casey, C. (2001). Measuring the usability of text input methods for children. In Proceedings of Human-Computer Interaction (HCI) 2001, New Orleans, United States of America, 559–572.

  • Tan, Y. K., Sherkat, N., & Allen, T. (2003a). Eye gaze and speech for data entry: A comparison of different data entry methods. In Proceedings of the International Conference on Multimedia and Expo, Baltimore, Maryland, United States of America, 41–44.

  • Tan, Y. K., Sherkat, N., & Allen, T. (2003b). Error recovery in a blended style eye gaze and speech interface. In Proceedings of ICMI '03, Vancouver, Canada, 196–202.

  • Tanaka, K. (1999). A robust selection system using realtime multi-modal user-agent interactions. In Proceedings of IUI '99, 105–108.

  • Tanenhaus, M. K., Spivey-Knowlton, M., Eberhard, K., & Sedivy, J. (1995). Integration of visual and linguistic information during spoken language comprehension. Science, 268, 1632–1634.

  • Tobii. (2011). Tobii unveils the world's first eye-controlled laptop. Retrieved 14 March 2011 from http://www.tobii.com/en/eye-tracking-integration/global/news-and-events/press-releases/tobii-unveils-the-worlds-first-eye-controlled-laptop/

  • Van Dam, A. (2001). Post-WIMP user interfaces: The human connection. In R. Earnshaw, R. Guedj, A. van Dam & J. Vince (Eds.), Frontiers of human-centred computing, online communities and virtual environments (pp. 163–178). London: Springer-Verlag.

  • Velichkovsky, B. M., Sprenger, A., & Pomplun, M. (1997). Auf dem Weg zur Blickmaus: Die Beeinflussung der Fixationsdauer durch kognitive und kommunikative Aufgaben [Towards the gaze mouse: The influence of cognitive and communicative tasks on fixation duration]. In R. Liskowsky, B. M. Velichkovsky & W. Wünschmann (Eds.), Software-Ergonomie (pp. 317–327).

  • Vertanen, K., & MacKay, D. J. C. (2010). Speech Dasher: Fast writing using speech and gaze. In Proceedings of CHI 2010, Atlanta, Georgia, United States of America, 595–598.

  • Ward, D. J., Blackwell, A. F., & MacKay, D. J. C. (2000). Dasher—a data entry interface using continuous gestures and language models. In Proceedings of UIST 2000: The 13th Annual ACM Symposium on User Interface Software and Technology, San Diego, California, United States of America, 129–137.

  • Wobbrock, J. O., Rubinstein, J., Sawyer, M. W., & Duchowski, A. T. (2008). Longitudinal evaluation of discrete consecutive gaze gestures for text entry. In Proceedings of the Symposium on Eye Tracking Research and Applications (ETRA), Savannah, Georgia, United States of America, 11–18.

  • Zhai, S., Morimoto, C., & Ihde, S. (1999). Manual and gaze input cascaded (MAGIC) pointing. In Proceedings of CHI '99: ACM Conference on Human Factors in Computing Systems, Pittsburgh, Pennsylvania, United States of America, 246–253.

  • Zhang, X., & MacKenzie, I. S. (2007). Evaluating eye tracking with ISO 9241, Part 9. In J. Jacko (Ed.), Human-Computer Interaction (pp. 779–788).

Author information

Correspondence to T. R. Beelders.

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Beelders, T., Blignaut, P. (2014). Gaze and Speech: Pointing Device and Text Entry Modality. In: Horsley, M., Eliot, M., Knight, B., Reilly, R. (eds) Current Trends in Eye Tracking Research. Springer, Cham. https://doi.org/10.1007/978-3-319-02868-2_4
