Abstract
Vision-based text entry systems aim to help disabled people communicate through eye movement. Most previous methods employ an existing eye tracker to predict gaze direction and build an input method on top of it; however, their tracking quality is easily affected by environmental factors, and calibration takes a long time. This paper presents a novel, efficient gaze-based text input method that is both low-cost and robust. Users type words by looking at an on-screen keyboard and blinking. Rather than estimating gaze angles directly to track the eyes, we divide the human gaze into nine directions, which effectively improves the accuracy of making selections by gaze and blinks. We built a convolutional neural network (CNN) model for 9-direction gaze estimation and, on top of it, adopted the nine-key T9 input method widely used on candy bar phones, which were popular worldwide for decades and cultivated strong user habits and language models. To train a robust gaze estimator, we created a large-scale dataset of eye images collected from 25 people. Our experimental results show that the CNN model accurately estimates different people's gaze under various lighting conditions. In consideration of disabled people's needs, we removed the complex calibration process. The input method can run in an on-screen mode and a portable off-screen mode. Moreover, the datasets used in our experiments are made available to the community to allow further research.
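As a rough illustration of the idea described above (not the authors' code), the sketch below shows how nine gaze directions can index a 3x3 T9 keypad and how an ambiguous key sequence is resolved against a word list. The direction names, layout, and vocabulary are illustrative assumptions; only the T9 letter groups follow the standard phone keypad.

```python
# Hypothetical sketch: nine gaze directions -> T9 keypad -> word candidates.

# The nine gaze directions index a 3x3 on-screen keypad, here laid out
# like a classic T9 phone: keys 1-9 read left-to-right, top-to-bottom.
GAZE_TO_KEY = {
    "up-left": 1, "up": 2, "up-right": 3,
    "left": 4, "center": 5, "right": 6,
    "down-left": 7, "down": 8, "down-right": 9,
}

# Standard T9 letter groups (key 1 carries punctuation on real phones).
T9_LETTERS = {
    2: "abc", 3: "def", 4: "ghi", 5: "jkl",
    6: "mno", 7: "pqrs", 8: "tuv", 9: "wxyz",
}

# Inverted index: letter -> key.
KEY_OF_LETTER = {ch: key for key, letters in T9_LETTERS.items()
                 for ch in letters}

def word_to_keys(word):
    """Key sequence a user would enter for a word, e.g. 'cab' -> [2, 2, 2]."""
    return [KEY_OF_LETTER[ch] for ch in word.lower()]

def candidates(keys, vocabulary):
    """Dictionary words whose T9 key sequence matches the input exactly."""
    return [w for w in vocabulary if word_to_keys(w) == list(keys)]

# Example: the sequence 4-6-6-3 is ambiguous across several English words,
# which is where a language model would rank the candidates.
vocab = ["gone", "good", "home", "hood", "cat"]
print(candidates([4, 6, 6, 3], vocab))  # ['gone', 'good', 'home', 'hood']
```

In the actual system, each key press would come from a blink-confirmed gaze direction classified by the CNN, and a language model trained on typing habits would rank the ambiguous candidates.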
Acknowledgments
This work is supported by the Fundamental Research Funds for the Central Universities (No. 2017XKQY075).
Cite this article
Zhang, C., Yao, R. & Cai, J. Efficient eye typing with 9-direction gaze estimation. Multimed Tools Appl 77, 19679–19696 (2018). https://doi.org/10.1007/s11042-017-5426-y