Abstract
In this paper, we address the comprehension difficulties caused by encountering unfamiliar words while reading. Most existing approaches decide whether a user needs a word translated based on user-initiated clicks, which is cumbersome and error-prone. We therefore propose an eye-tracking-based method for predicting word confusion. The method uses local perception of fixation to identify target words the user may be confused about, combines dynamic eye-movement behavior data with static text information to build a Hidden Markov Model that predicts whether the user is confused about these words, and finally selects the word with the highest confusion value and displays its meaning in the annotation area. Experimental results show that the method achieves 87.4% precision in word confusion prediction, and that the system's automatic annotation function improves users' reading efficiency and reading experience.
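The abstract's prediction step, a Hidden Markov Model decoded over a sequence of eye-movement observations, can be sketched as below. This is a minimal illustration only: the two hidden states, the discretized observation alphabet, and all probabilities are assumptions for the sake of the example, not the paper's trained parameters.

```python
import numpy as np

# Hidden states: 0 = "not confused", 1 = "confused" about the fixated word.
# Observations are discretized eye-movement events: 0 = short fixation,
# 1 = long fixation, 2 = regression. All probabilities are illustrative.
start_p = np.array([0.8, 0.2])
trans_p = np.array([[0.9, 0.1],
                    [0.3, 0.7]])
emit_p  = np.array([[0.7, 0.2, 0.1],    # emissions when not confused
                    [0.1, 0.5, 0.4]])   # emissions when confused

def viterbi(obs):
    """Most likely hidden-state sequence for a list of observation symbols."""
    T, N = len(obs), len(start_p)
    logp = np.log(start_p) + np.log(emit_p[:, obs[0]])
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        scores = logp[:, None] + np.log(trans_p)   # score of each transition
        back[t] = scores.argmax(axis=0)            # best predecessor per state
        logp = scores.max(axis=0) + np.log(emit_p[:, obs[t]])
    path = [int(logp.argmax())]
    for t in range(T - 1, 0, -1):                  # trace the best path back
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Long fixations followed by a regression flip the decoded state to "confused".
print(viterbi([0, 1, 1, 2, 1]))  # → [0, 1, 1, 1, 1]
```

In the full method the observations would be the feature vectors listed under Abbreviations rather than a three-symbol alphabet, but the decoding step is the same.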
Abbreviations
- x : The x-coordinate of the sampling point
- y : The y-coordinate of the sampling point
- d : Pupil diameter
- m : Eye-movement pattern: fixation, regression, saccade, or unclassified
- df : Distance from the last fixation point
- tf : Time since the last fixation point
- pf : Continuous fixation time
- ds : Distance of the last saccade
- ts : Duration of the last saccade
- ps : Continuous saccade time
- ti : Continuous regression time
- wx : The x-coordinate of the center of the text box nearest to the current sampling point
- wy : The y-coordinate of the center of the text box nearest to the current sampling point
- tw : Duration for which the sampling point remains in the current text box
- wt : Text content of the text box nearest to the current sampling point
- wl : Length of the text content in the text box nearest to the current sampling point
- ww : TF-IDF weight of the text content in the text box nearest to the current sampling point
- ax : Average x-coordinate
- ay : Average y-coordinate
- ad : Average pupil diameter
- vx : Variance of the x-coordinate
- vy : Variance of the y-coordinate
- vd : Variance of the pupil diameter
- gx : Gradient in the x-direction within the window
- gy : Gradient in the y-direction within the window
- gd : Gradient of the pupil diameter
- ex : Extreme difference (range) of the x-coordinate
- ey : Extreme difference (range) of the y-coordinate
- ed : Extreme difference (range) of the pupil diameter
- d_a : Average candidate sensing distance
- d_e : Extreme difference (range) of the candidate sensing distance
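The windowed statistics in the list above (averages ax/ay/ad, variances vx/vy/vd, gradients gx/gy/gd, and extreme differences ex/ey/ed) can be computed per sample window as in this sketch. The window contents and the reading of "gradient" as the mean per-sample difference are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def window_features(xs, ys, ds):
    """Windowed statistics over one window of gaze samples: averages,
    variances, gradients (mean first differences), and extreme
    differences (max minus min) of x, y, and pupil diameter."""
    feats = {}
    for name, v in (("x", np.asarray(xs, float)),
                    ("y", np.asarray(ys, float)),
                    ("d", np.asarray(ds, float))):
        feats["a" + name] = v.mean()              # average (ax, ay, ad)
        feats["v" + name] = v.var()               # variance (vx, vy, vd)
        feats["g" + name] = np.diff(v).mean()     # gradient (gx, gy, gd)
        feats["e" + name] = v.max() - v.min()     # extreme difference (ex, ey, ed)
    return feats

# A tiny three-sample window: gaze drifting right, pupil diameter fluctuating.
f = window_features(xs=[100, 104, 110], ys=[200, 200, 201], ds=[3.0, 3.2, 3.1])
print(f["ex"])  # → 10.0
```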
Acknowledgements
The authors would like to thank the volunteers who participated in this study. This work was supported in part by the National Natural Science Foundation of China under Grants 62172368 and 61772468, the Natural Science Foundation of Zhejiang Province under Grant LR22F020003, and the Zhejiang Provincial Key Research & Development Program under Grant No. 2023C01045.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Du, P., Guo, W. & Cheng, S. Using eye-tracking for real-time translation: a new approach to improving reading experience. CCF Trans. Pervasive Comp. Interact. (2024). https://doi.org/10.1007/s42486-024-00150-3