Abstract
The early eye tracking studies of Yarbus provided descriptive evidence that an observer’s task influences patterns of eye movements, leading to the tantalizing prospect that an observer’s intentions could be inferred from their saccade behavior. We investigate the predictive value of task and eye movement properties by creating a computational cognitive model of saccade selection based on instructed task and internal cognitive state using a Dynamic Bayesian Network (DBN). Understanding how humans generate saccades under different conditions and cognitive sets links recent work on salience models of low-level vision with higher-level cognitive goals. This model provides a Bayesian, cognitive approach to top-down transitions in attentional set in prefrontal areas along with vector-based saccade generation from the superior colliculus. Our approach is to begin with eye movement data that have previously been shown to differ across tasks. We first present an analysis of the extent to which individual saccadic features are diagnostic of an observer’s task. Second, we use those features to infer an underlying cognitive state that potentially differs from the instructed task. Finally, we demonstrate how changes in cognitive state over time can be incorporated into a generative model of eye movement vectors without resorting to an external decision homunculus. Internal cognitive state frees the model from the assumption that instructed task is the only factor influencing observers’ saccadic behavior. While the inclusion of hidden temporal state does not improve the classification accuracy of the model, it does allow accurate prediction of saccadic sequence results observed in search paradigms. Given its generative nature, the model is capable of saccadic simulation in real time, and we demonstrate that the properties of its generated saccadic vectors closely match those of human observers given a particular task and cognitive state.
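The architecture the abstract describes — a hidden cognitive state that transitions over time and emits saccade features — can be sketched as a minimal two-node DBN, i.e. a hidden Markov model over cognitive states with Gaussian emissions of saccade amplitude. All parameters below (the two state labels, transition probabilities, and emission means) are illustrative assumptions, not the paper’s fitted values.

```python
import math
import random

# Illustrative two-state DBN: a hidden cognitive state ("search" vs "memorize")
# self-transitions over time and emits saccade amplitudes from state-specific
# Gaussians. All numeric parameters are assumed for illustration.
STATES = ("search", "memorize")
PRIOR = {"search": 0.5, "memorize": 0.5}
TRANS = {  # P(next state | current state): states tend to persist
    "search":   {"search": 0.9, "memorize": 0.1},
    "memorize": {"search": 0.1, "memorize": 0.9},
}
EMIT = {  # (mean, sd) of saccade amplitude in degrees, per state
    "search":   (8.0, 2.0),   # search: larger, exploratory saccades
    "memorize": (3.0, 1.5),   # memorization: shorter, scrutinizing saccades
}

def gauss_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def infer_states(amplitudes):
    """Forward algorithm: posterior P(state_t | amplitudes_1..t)."""
    belief = dict(PRIOR)
    posteriors = []
    for a in amplitudes:
        # predict step: push the belief through the transition model
        pred = {s: sum(belief[p] * TRANS[p][s] for p in STATES) for s in STATES}
        # update step: weight by the emission likelihood of this saccade
        unnorm = {s: pred[s] * gauss_pdf(a, *EMIT[s]) for s in STATES}
        z = sum(unnorm.values())
        belief = {s: unnorm[s] / z for s in STATES}
        posteriors.append(belief)
    return posteriors

def generate_saccades(n, rng=random.Random(0)):
    """Sample a saccade-amplitude sequence from the generative model."""
    state = rng.choices(STATES, weights=[PRIOR[s] for s in STATES])[0]
    out = []
    for _ in range(n):
        mu, sd = EMIT[state]
        out.append((state, rng.gauss(mu, sd)))
        state = rng.choices(STATES, weights=[TRANS[state][s] for s in STATES])[0]
    return out

post = infer_states([8.5, 9.0, 7.5, 2.5, 3.0])
print(post[-1])  # short recent saccades shift the posterior toward "memorize"
```

Because the model is generative, the same parameterization supports both directions used in the paper: inferring a hidden state from observed saccade features and simulating new saccade sequences from an inferred state.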
Many current models of vision focus entirely on bottom-up salience to produce estimates of spatial “areas of interest” within a visual scene. While a few recent models do add top-down knowledge and task information, we believe our contribution is important in three key ways. First, we incorporate task as learned attentional sets that are capable of self-transition given only information available to the visual system. This matches influential theories of bias signals by Miller and Cohen (Annu Rev Neurosci 24:167–202, 2001) and implements selection of state without simply shifting the decision to an external homunculus. Second, our model is generative and capable of predicting sequence artifacts in saccade generation like those found in visual search. Third, our model generates relative saccadic vector information as opposed to absolute spatial coordinates. This more closely matches the internal saccadic representations as they are generated in the superior colliculus.
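The third point — relative saccadic vectors rather than absolute spatial coordinates — amounts to re-expressing each upcoming fixation in a frame centered on the current gaze position. A minimal sketch of that conversion (the coordinate and angle conventions here are assumptions for illustration, not the paper’s):

```python
import math

def to_relative_vectors(fixations):
    """Convert absolute fixation coordinates (x, y) into relative saccade
    vectors (amplitude, direction in degrees) anchored at the current gaze
    position, mirroring a retinotopic, collicular-style representation."""
    vectors = []
    for (x0, y0), (x1, y1) in zip(fixations, fixations[1:]):
        dx, dy = x1 - x0, y1 - y0
        amplitude = math.hypot(dx, dy)
        direction = math.degrees(math.atan2(dy, dx))  # 0 deg = rightward
        vectors.append((amplitude, direction))
    return vectors

# Three absolute fixations yield two relative saccade vectors.
print(to_relative_vectors([(0, 0), (3, 4), (3, 9)]))
```

Because each vector is defined relative to the preceding fixation, the representation is invariant to where on the screen a scanpath happens to start, which is what lets the generative model emit saccades without an absolute spatial map.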
References
İşcan Z, Özkaya Ö, Dokur Z. Classification of EEG in a steady state visual evoked potential based brain computer interface experiment. In: Proceedings of the 10th international conference on adaptive and natural computing algorithms, Part II. Springer-Verlag; 2011. p. 81–8.
Carlson TA, Schrater P, He S. Patterns of activity in the categorical representations of objects. J Cogn Neurosci. 2003;15(5):704–17.
Borji A, Itti L. Defending Yarbus: Eye movements reveal observers’ task. J Vis. 2014;14(3)
Cutsuridis V, Taylor JG. A cognitive control architecture for the perception–action cycle in robots and agents. Cogn Comput. 2013;5(3):383–95.
Schiller PH. The neural control of visually guided eye movements. In: Richards JE, editor. Cognitive neuroscience of attention. Mahwah, NJ: Erlbaum; 1998. p. 3–50.
Itti L, Koch C. Computational modelling of visual attention. Nat Rev Neurosci. 2001;2:194–203. https://doi.org/10.1038/35058500.
Wolfe JM, Horowitz TS. What attributes guide the deployment of visual attention and how do they do it? Nat Rev Neurosci. 2004;5:1–7.
Tatler BW, Hayhoe MM, Land MF, Ballard DH. Eye guidance in natural vision: reinterpreting salience. J Vis. 2011;11(5):5. https://doi.org/10.1167/11.5.5.
Mital PK, Smith TJ, Hill RL, Henderson JM. Clustering of gaze during dynamic scene viewing is predicted by motion. Cogn Comput. 2011;3(1):5–24.
Siebold A, van Zoest W, Donk M. Oculomotor evidence for top-down control following the initial saccade. PLoS One. 2011;6(9):e23552.
Tatler BW, Vincent BT. Systematic tendencies in scene viewing. J Eye Mov Res. 2008;2(2)
Clarke A, Tatler B. Deriving an appropriate baseline for describing fixation behavior. Vis Res. 2014;102:41–51, ISSN 0042-6989. https://doi.org/10.1016/j.visres.2014.06.016.
MacInnes W, Hunt A, Hilchey M, Klein R. Driving forces in free visual search: an ethology. Atten Percept Psychophys. 2014;76(2):280–95.
Smith TJ, Henderson JM. Does oculomotor inhibition of return influence fixation probability during scene search? Atten Percept Psychophys. 2011;73(8):2384–98.
Treisman AM, Gelade G. A feature-integration theory of attention. Cogn Psychol. 1980;12(1):97–136.
Hinton GE. Learning multiple layers of representation. Trends Cogn Sci. 2007;11(10):428–34.
Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. In: Advances in neural information processing systems; 2012. p. 1097–105.
Serre T, Oliva A, Poggio T. A feedforward architecture accounts for rapid categorization. Proc Natl Acad Sci. 2007;104(15):6424–9.
Tu Z, Abel A, Zhang L, Luo B, Hussain A. A new spatio-temporal saliency-based video object segmentation. Cogn Comput. 2016;8(4):629–47.
Pang Y, Ye L, Li X, Pan J. Moving object detection in video using saliency map and subspace learning. IEEE Trans Circuits Syst Video Technol. 2016. https://doi.org/10.1109/TCSVT.2016.2630731 (also arXiv:1509.09089).
Wischnewski M, Belardinelli A, Schneider WX, Steil JJ. Where to look next? Combining static and dynamic proto-objects in a TVA-based model of visual attention. Cogn Comput. 2010;2(4):326–43.
Liu H, Yu Y, Sun F, Gu J. Visual–tactile fusion for object recognition. IEEE Trans Autom Sci Eng. 2017;14(2):996–1008.
Poria S, Cambria E, Howard N, Huang GB, Hussain A. Fusing audio, visual and textual clues for sentiment analysis from multimodal content. Neurocomputing. 2016;174:50–9.
Golomb JD, Chun MM, Mazer JA. The native coordinate system of spatial attention is retinotopic. J Neurosci. 2008;28(42):10654–62.
Dorris MC, Pare M, Munoz DP. Neuronal activity in monkey superior colliculus related to the initiation of saccadic eye movements. J Neurosci. 1997;17(21):8566–79.
Aboudib A, Gripon V, Coppin G. A biologically inspired framework for visual information processing and an application on modeling bottom-up visual attention. Cogn Comput. 2016;8(6):1007–26.
Klein RM. Inhibition of return. Trends Cogn Sci. 2000;4(4):138–47.
Findlay JM, Brown V, Gilchrist ID. Saccade target selection in visual search: the effect of information from the previous fixation. Vis Res. 2001;41(1):87–95.
McPeek RM, Skavenski AA, Nakayama K. Concurrent processing of saccades in visual search. Vis Res. 2000;40(18):2499–516.
Klein RM, MacInnes WJ. Inhibition of return is a foraging facilitator in visual search. Psychol Sci. 1999;10(4):346–52.
Smith TJ, Henderson JM. Looking back at Waldo: oculomotor inhibition of return does not prevent return fixations. J Vis. 2011;11(1):3–3.
Fecteau JH, Bell AH, Munoz DP. Neural correlates of the automatic and goal-driven biases in orienting spatial attention. J Neurophysiol. 2004;92(3):1728–37.
Fecteau JH, Munoz DP. Salience, relevance, and spiking neurons: a priority map governs target selection. Trends Cogn Sci. 2006;10:382–90.
Folk CL, Remington RW, Johnston JC. Involuntary covert orienting is contingent on attentional control settings. J Exp Psychol Hum Percept Perform. 1992;18(4):1030–44.
Yarbus AL. Eye movements and vision. New York: Plenum Press; 1967. (Originally published in Russian, 1965).
DeAngelus M, Pelz J. Top-down control of eye movements: Yarbus revisited. Vis Cogn. 2009;17(6–7):790–811.
Ballard D, Hayhoe M, Pelz J. Memory representations in natural tasks. J Cogn Neurosci. 1995;7(1):66–80. https://doi.org/10.1162/jocn.1995.7.1.66.
Land M, Hayhoe M. In what ways do eye movements contribute to everyday activities? Vis Res. 2001;41(25–26):3559–65. https://doi.org/10.1016/S0042-6989(01)00102-X.
Castelhano MS, Mack ML, Henderson JM. Viewing task influences eye movement control during active scene perception. J Vis. 2009;9(3):6–6.
Mills M, Hollingworth A, Van der Stigchel S, Hoffman L, Dodd MD. Examining the influence of task set on eye movements and fixations. J Vis. 2011;11(8):17–17.
Kardan O, Henderson JM, Yourganov G, Berman MG. Observers’ cognitive states modulate how visual inputs relate to gaze control. J Exp Psychol Hum Percept Perform. 2016;42(9):1429–42.
Macdonald JSP, Mathan S, Yeung N. Trial-by-trial variations in subjective attentional state are reflected in ongoing prestimulus EEG alpha oscillations. Front Psychol. 2011;2:82.
Aston-Jones G, Cohen J. An integrative theory of locus coeruleus-norepinephrine function: adaptive gain and optimal performance. Annu Rev Neurosci. 2005;28:403–50.
Kotsiantis S, Zaharakis ID, Pintelas PE. Supervised machine learning: a review of classification techniques. Artif Intell Rev. 2007;26(3):159–90.
Greene M, Liu T, Wolfe J. Reconsidering Yarbus: a failure to predict observers’ task from eye movement patterns. Vis Res. 2012;62:1–8.
Henderson JM, Shinkareva SV, Wang J, Luke SG, Olejarczyk J. Predicting cognitive state from eye movements. PLoS One. 2013;8(5):e64937.
Marat S, Rahman A, Pellerin D, Guyader N, Houzet D. Improving visual saliency by adding ‘face feature map’and ‘center bias’. Cogn Comput. 2013;5(1):63–75.
Kootstra G, de Boer B, Schomaker LR. Predicting eye fixations on complex visual stimuli using local symmetry. Cogn Comput. 2011;3(1):223–40.
Dodd MD, Van der Stigchel S, Hollingworth A. Novelty is not always the best policy: inhibition of return and facilitation of return as a function of visual task. Psychol Sci. 2009;20:333–9.
Bahle B, Mills M, Dodd MD. Human classifier: observers can deduce task solely from eye movements. Atten Percept Psychophys. 2017:1–11.
Borji A, Lennartz A, Pomplun M. What do eyes reveal about the mind?: algorithmic inference of search targets from fixations. Neurocomputing. 2015;149:788–99.
Hess EH, Polt JM. Pupil size as related to interest value of visual stimuli. Science. 1960;132:349–50.
Beatty J, Kahneman D. Pupillary changes in two memory tasks. Psychon Sci. 1966;5:371–2.
Kahneman D. Attention and effort. Engelwood Cliffs, NJ: Prentice Hall; 1973.
Laeng B, Ørbo M, Holmlund T, Miozzo M. Pupillary stroop effects. Cogn Process. 2011;12:13–21.
Gabay S, Pertzov Y, Henik A. Orienting of attention, pupil size, and the norepinephrine system. Atten Percept Psychophys. 2011;73(1):123–9. https://doi.org/10.3758/s13414-010-0015-4.
Posner MI, Fan J. Attention as an organ system. In: Pomerantz JR, editor. Topics in integrative neuroscience: from cells to cognition. 1st ed. Cambridge: Cambridge University Press; 2008. p. 31–61.
Rajkowski J, Kubiak P, Aston-Jones G. Correlations between locus coeruleus (LC) neural activity, pupil diameter and behavior in monkey support a role of LC in attention. Soc Neurosci Abstr. 1993;19:974.
Rajkowski J, Majczynski H, Clayton E, Aston-Jones G. Activation of monkey locus coeruleus neurons varies with difficulty and performance in a target detection task. J Neurophysiol. 2004;92:361–71.
Dunn JC. A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. J Cybern. 1973;3(3):32–57. https://doi.org/10.1080/01969727308546046.
Jain AK. Data clustering: 50 years beyond K-means. Pattern Recogn Lett. 2010;31(8):651–66.
Vincent BT. Bayesian accounts of covert selective attention: a tutorial review. Atten Percept Psychophys. 2015;77(4):1013–32.
Druzdzel MJ. SMILE: structural modeling, inference, and learning engine and GeNIe: a development environment for graphical decision-theoretic models. In: Aaai/Iaai; 1999, July. p. 902–3.
Moon TK. The expectation-maximization algorithm. IEEE Signal Process Mag. 1996;13(6):47–60.
Kardan O, Berman MG, Yourganov G, Schmidt J, Henderson JM. Classifying mental states from eye movements during scene viewing. J Exp Psychol Hum Percept Perform. 2015;41(6):1502–14.
Fishel J, Loeb G. Bayesian exploration for intelligent identification of textures. Front Neurorobotics. 2012;18 https://doi.org/10.3389/fnbot.2012.00004.
Murphy KP. Dynamic bayesian networks: representation, inference and learning. University of California, Berkeley: Doctoral dissertation; 2002.
Bylinskii Z, Judd T, Borji A, Itti L, Durand F, Oliva A, Torralba A. MIT saliency benchmark; 2015.
Miller EK, Cohen JD. An integrative theory of prefrontal cortex function. Annu Rev Neurosci. 2001;24:167–202.
Akaike H. A new look at the statistical model identification. IEEE Trans Autom Control. 1974;19(6):716–23.
Posner M, Dehaene S. Attentional networks. Trends Neurosci. 1994;17:75–9.
Banich M, Milham M, Atchley R, Cohen N, Webb A, Wszalek T, et al. Prefrontal regions play a predominant role in imposing an attentional ‘set’: evidence from fMRI. Cogn Brain Res. 2000;10(1–2):1–9, ISSN 0926-6410. https://doi.org/10.1016/S0926-6410(00)00015-X.
Tanner J, Itti L. Goal relevance as a quantitative model of human task relevance. Psychol Rev. 2017;124(2):168–78.
Hanes DP, Wurtz RH. Interaction of the frontal eye field and superior colliculus for saccade generation. J Neurophysiol. 2001;85(2):804–15.
Bruce CJ, Goldberg ME. Primate frontal eye fields. I. Single neurons discharging before saccades. J Neurophysiol. 1985;53(3):603–35.
Trappenberg T, Dorris M, Munoz D, Klein R. A model of saccade initiation based on the competitive integration of exogenous and endogenous signals in the superior colliculus. J Cogn Neurosci. 2001;13(2):256–71.
Corbetta M, Shulman GL. Control of goal-directed and stimulus-driven attention in the brain. Nat Rev Neurosci. 2002;3:201–15.
Gilzenrat MS, Nieuwenhuis S, Jepma M, Cohen JD. Pupil diameter tracks changes in control state predicted by the adaptive gain theory of locus coeruleus function. Cogn Affect Behav Neurosci. 2010;10(2):252–69.
Joshi S, Li Y, Kalwani RM, Gold JI. Relationships between pupil diameter and neuronal activity in the locus coeruleus, colliculi, and cingulate cortex. Neuron. 2016;89(1):221–34.
Barack DL, Platt ML. Engaging and exploring: cortical circuits for adaptive foraging decisions. In: Impulsivity. Springer International Publishing; 2017. p. 163–99.
Funding
Amelia Hunt has received research grants from the James S. McDonnell Foundation. Michael Dodd has received funding from NIH/NEI grant 1R01EY022974.
Ethics declarations
Ethical Approval
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.
Informed Consent
Informed consent was obtained from all individual participants included in the study.
Conflict of Interest
All other authors declare that they have no conflict of interest.
Cite this article
MacInnes, W.J., Hunt, A.R., Clarke, A.D.F. et al. A Generative Model of Cognitive State from Task and Eye Movements. Cogn Comput 10, 703–717 (2018). https://doi.org/10.1007/s12559-018-9558-9