
Using eye-tracking for real-time translation: a new approach to improving reading experience

  • Regular Paper
  • Published in: CCF Transactions on Pervasive Computing and Interaction

Abstract

In this paper, we address the comprehension difficulties caused by unfamiliar words encountered during reading. Most existing approaches decide whether a user needs a word translated based on user-initiated clicks, which is cumbersome and error-prone. We therefore propose an eye-tracking-based method for predicting word confusion. The method uses local perception of fixations to identify target words the user may be confused about, then combines dynamic eye-movement behavior data with static text information to build a Hidden Markov Model that predicts whether the user is confused about these words, and finally selects the word with the highest confusion value and displays its meaning in an annotation area. Experimental results show that the method achieves 87.4% precision in word confusion prediction, and that the system's automatic annotation function improves users' reading efficiency and reading experience.
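The core idea described above — decoding a hidden "confused / not confused" state from a sequence of observable eye-movement events — can be sketched as a small two-state Hidden Markov Model solved with the Viterbi algorithm. This is a minimal illustration, not the authors' implementation: the state names, observation categories, and all probabilities below are illustrative placeholders, not the trained values from the paper.

```python
import math

# Hidden states and discretized eye-movement observations (illustrative).
STATES = ["not_confused", "confused"]
OBS = {"short_fixation": 0, "long_fixation": 1, "regression": 2}

start_p = [0.8, 0.2]                      # P(initial state)
trans_p = [[0.9, 0.1], [0.3, 0.7]]        # P(next state | current state)
emit_p = [[0.7, 0.2, 0.1],                # P(observation | not_confused)
          [0.1, 0.5, 0.4]]                # P(observation | confused)

def viterbi(obs_seq):
    """Most likely state sequence for a list of observation ids (log-space)."""
    delta = [math.log(start_p[s]) + math.log(emit_p[s][obs_seq[0]])
             for s in range(2)]
    backptr = []
    for o in obs_seq[1:]:
        step, new_delta = [], []
        for to in range(2):
            scores = [delta[fr] + math.log(trans_p[fr][to]) for fr in range(2)]
            best = max(range(2), key=lambda fr: scores[fr])
            step.append(best)
            new_delta.append(scores[best] + math.log(emit_p[to][o]))
        backptr.append(step)
        delta = new_delta
    # Backtrack from the most probable final state.
    path = [max(range(2), key=lambda s: delta[s])]
    for step in reversed(backptr):
        path.append(step[path[-1]])
    return [STATES[s] for s in reversed(path)]

# Long fixations followed by a regression are decoded as "confused".
seq = [OBS["short_fixation"], OBS["long_fixation"],
       OBS["long_fixation"], OBS["regression"]]
print(viterbi(seq))
# → ['not_confused', 'confused', 'confused', 'confused']
```

In the paper's setting, the observation alphabet would be built from the dynamic eye-movement features and static text features listed under Abbreviations, and the model parameters would be estimated from labeled reading sessions.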




Abbreviations

x : The x-coordinate of the sampling point
y : The y-coordinate of the sampling point
d : Pupil diameter
m : Eye-movement pattern: fixation, regression, saccade, or unclassified
df : Distance from the last fixation point
tf : Time since the last fixation point
pf : Continuous fixation time
ds : Distance of the last saccade
ts : Duration of the last saccade
ps : Continuous saccade time
ti : Continuous regression time
wx : The x-coordinate of the center of the text box nearest the current sampling point
wy : The y-coordinate of the center of the text box nearest the current sampling point
tw : Duration for which the sampling point remains in the current text box
wt : Text content of the text box nearest the current sampling point
wl : Length of the text content of the text box nearest the current sampling point
ww : TF-IDF weight of the text content of the text box nearest the current sampling point
ax : Average x-coordinate
ay : Average y-coordinate
ad : Average pupil diameter
vx : Variance of the x-coordinate
vy : Variance of the y-coordinate
vd : Variance of pupil diameter
gx : Gradient in the x-direction within the window
gy : Gradient in the y-direction within the window
gd : Gradient of pupil diameter
ex : Extreme difference (range) of the x-coordinate
ey : Extreme difference (range) of the y-coordinate
ed : Extreme difference (range) in pupil diameter
da : Average candidate sensing distance
de : Extreme difference (range) in candidate sensing distance
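The windowed statistics in the list above (average, variance, gradient, and "extreme difference", i.e. range) can be illustrated with a short helper. This is a hypothetical sketch, not the authors' code: `window_stats` is an assumed name, and it is shown for the x-coordinate only; the y-coordinate and pupil diameter would be treated identically.

```python
def window_stats(xs):
    """Return (average, variance, range, gradient) for one window of values,
    mirroring ax/vx/ex/gx from the symbol list (illustrative)."""
    n = len(xs)
    avg = sum(xs) / n                                   # ax: window average
    var = sum((x - avg) ** 2 for x in xs) / n           # vx: population variance
    rng = max(xs) - min(xs)                             # ex: "extreme difference"
    grad = (xs[-1] - xs[0]) / (n - 1) if n > 1 else 0.0 # gx: per-sample slope
    return avg, var, rng, grad

# Example: x-coordinates drifting rightward within one sampling window.
print(window_stats([100.0, 102.0, 101.0, 105.0]))
```

The gradient here is taken as the first-to-last slope across the window; a least-squares fit would be an equally plausible reading of the symbol list.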


Acknowledgements

The authors would like to thank the volunteers who participated in this study. This work was supported in part by the National Natural Science Foundation of China under Grants 62172368 and 61772468, the Natural Science Foundation of Zhejiang Province under Grant LR22F020003, and the Zhejiang Provincial Key Research & Development Program under Grant No. 2023C01045.

Author information


Corresponding author

Correspondence to Shiwei Cheng.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.


Cite this article

Du, P., Guo, W. & Cheng, S. Using eye-tracking for real-time translation: a new approach to improving reading experience. CCF Trans. Pervasive Comp. Interact. (2024). https://doi.org/10.1007/s42486-024-00150-3
