Abstract
The performance of eye gaze combined with speech as a pointing device was tested using the International Organization for Standardization (ISO) multidirectional tapping task. Eye gaze and speech were used for target selection as-is, with a gravitational well, and in conjunction with magnification, and each condition was compared to the mouse. The mouse was far superior when selecting targets, although the gravitational well did improve the performance of eye gaze and speech. However, owing to the nature of eye gaze and speech, and the ease with which targets are selected within a gravitational well, this technique also increased the number of incorrect clicks. Magnification did not improve gaze and speech as a pointing device and resulted in more target re-entries. Eye gaze and speech were also used as a text entry modality in a popular word processor application: users focused on an on-screen keyboard and uttered a verbal command in order to type a letter into the document. Speed and accuracy measurements were captured for the input of randomly selected phrases. Results showed that the physical keyboard was both faster and more accurate than the combination of eye gaze and speech, and that neither the size of the on-screen buttons nor the spacing between them affected the speed or accuracy of text entry.
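The gravitational well mentioned above can be illustrated with a minimal Python sketch. This is an assumption-laden illustration, not the chapter's implementation: it assumes targets are given as pixel-coordinate centres and that a raw gaze sample is simply snapped to the nearest target centre lying within a fixed well radius, which is one common way such wells are realised.

```python
import math

def apply_gravity_well(gaze_x, gaze_y, targets, well_radius):
    """Snap a raw gaze point to the nearest target centre within
    `well_radius` pixels; otherwise return the point unchanged.
    (Hypothetical helper for illustration only.)"""
    best = None
    best_dist = well_radius
    for (tx, ty) in targets:
        d = math.hypot(gaze_x - tx, gaze_y - ty)
        if d <= best_dist:
            best, best_dist = (tx, ty), d
    return best if best is not None else (gaze_x, gaze_y)

# A gaze sample 30 px from a target with a 100-px well is pulled onto it;
# a sample outside every well is left where it is.
print(apply_gravity_well(130, 100, [(100, 100), (400, 400)], 100))  # (100, 100)
print(apply_gravity_well(250, 250, [(100, 100), (400, 400)], 100))  # (250, 250)
```

The ease of selection this snapping provides is also what inflates incorrect clicks: a gaze sample that drifts into the wrong well is silently captured by it.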
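The speed and accuracy measures used for text entry studies of this kind are conventionally words per minute and an error rate based on the minimum string distance between the presented and transcribed phrases. A minimal Python sketch, assuming the standard definitions (one word = five characters; error rate = edit distance over the longer string length):

```python
def levenshtein(a, b):
    """Minimum string distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def msd_error_rate(presented, transcribed):
    """Edit distance normalised by the longer of the two strings."""
    return levenshtein(presented, transcribed) / max(len(presented), len(transcribed))

def words_per_minute(transcribed, seconds):
    """Conventional WPM: (characters - 1) / 5 words, scaled to a minute."""
    return (len(transcribed) - 1) / 5 * (60.0 / seconds)
```

For example, transcribing a 19-character phrase in 30 s yields (19 - 1) / 5 * 2 = 7.2 WPM, and one substitution in a 7-character word gives an error rate of 1/7.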
Copyright information
© 2014 Springer International Publishing Switzerland
Cite this chapter
Beelders, T., Blignaut, P. (2014). Gaze and Speech: Pointing Device and Text Entry Modality. In: Horsley, M., Eliot, M., Knight, B., Reilly, R. (eds) Current Trends in Eye Tracking Research. Springer, Cham. https://doi.org/10.1007/978-3-319-02868-2_4
Print ISBN: 978-3-319-02867-5
Online ISBN: 978-3-319-02868-2