The only true voyage of discovery, (… would be) to possess other eyes, to behold the universe through the eyes of another, of a hundred others, to behold the hundred universes that each of them beholds, that each of them is. Marcel Proust, Remembrance of Things Past (or In Search of Lost Time)


The eyes are windows to the soul. This popular saying expresses the common belief that it is possible to deeply understand people’s minds just by how their eyes behave. This assumption is not that far from reality. By analyzing the eyes of subjects, researchers have answered questions about how people think, remember, pay attention, and recognize each other, along with many other theoretical and empirical questions. Recently, with the advancement of research in social and affective neuroscience, researchers have started to look at human interactions and at how individuals’ eyes relate to their behaviors and cognitive functions in social contexts. Individuals’ gaze is measured with equipment specialized in recording eye movements and changes in pupil diameter: a device known as an eye tracker.

Eye tracking as a research tool is more accessible than ever, and since it allows a range of inferences about mental functioning at a relatively low cost to research budgets, its popularity has grown rapidly in psychology and cognitive neuroscience laboratories. The eye-tracking device is a nonintrusive machine that normally emits infrared/near-infrared light to create a reflection on the cornea of the subject. This corneal reflection corresponds to the first Purkinje image (P1) obtained from the reflection of eye structures and is commonly known as a “glint.” This reflection and the center of the pupil are used to track eye movements. The corneal reflection is captured by a camera in the eye-tracking device, and a vector is calculated between the corneal reflection and the center of the pupil. This vector enables the software to calculate the gaze direction. For the software to be fully capable of capturing eye movements, a calibration procedure is required. It consists of presenting dots on the screen, which the subject normally follows with their eyes while the device registers the position of the eye (with the reflection) in order to calculate references of where the person is looking (Hansen & Ji, 2010).
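The calibration idea described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the relation between the pupil-glint vector and the screen position is affine and is fitted by least squares. Commercial eye trackers typically use richer models, and the function names and calibration values below are hypothetical rather than taken from any specific device.

```python
def solve_linear(a, b):
    """Solve a small square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        if abs(m[col][col]) < 1e-12:
            raise ValueError("calibration points are degenerate")
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_axis(vectors, targets):
    """Least-squares fit of target = c0 + c1*vx + c2*vy (normal equations)."""
    rows = [(1.0, vx, vy) for vx, vy in vectors]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
    return solve_linear(ata, atb)


def calibrate(vectors, screen_points):
    """Return a function mapping a pupil-glint vector to screen coordinates."""
    cx = fit_axis(vectors, [p[0] for p in screen_points])
    cy = fit_axis(vectors, [p[1] for p in screen_points])

    def gaze(vx, vy):
        return (cx[0] + cx[1] * vx + cx[2] * vy,
                cy[0] + cy[1] * vx + cy[2] * vy)

    return gaze


# Hypothetical five-point calibration: pupil-glint vectors (camera units)
# recorded while the subject looked at known dots on a 1920 x 1080 screen.
vectors = [(-8.0, -4.5), (8.0, -4.5), (0.0, 0.0), (-8.0, 4.5), (8.0, 4.5)]
dots = [(160, 90), (1760, 90), (960, 540), (160, 990), (1760, 990)]

gaze = calibrate(vectors, dots)
print(gaze(4.0, 0.0))  # estimated screen position for a new vector
```

After calibration, every new pupil-glint vector delivered by the camera can be passed to the fitted function to estimate where on the screen the person is looking.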

The eye-tracking equipment is able to record several helpful measures. Eye-tracking measurements can be divided into four large groups: (1) movement measures (how the eyes move through space and the properties of these movements), (2) position measures (where a participant has or has not been looking and the properties of these positions), (3) numerosity measures (the proportion or rate of any countable eye movement event), and (4) latency measures (how long these events take to start and finish). Thus, depending on the question asked by the researcher, several measures may provide an answer. For example, if the study shows different emotional faces to the participant, it is possible to verify how much and in which places the participant’s eyes move around to process those faces, the time it takes to complete these scanpaths, the parts of the face that the participant looks at, the number of times they check essential points of the faces (e.g., mouth, eyes), and even the time they spend processing any of these points. Each of these indices can answer different questions, and the indices may also be integrated so that we can make inferences about underlying cognitive processes.
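As a minimal sketch of how three of these groups can be computed, the example below summarizes a list of fixations per area of interest (AOI): dwell time as a position measure, fixation count as a numerosity measure, and time to first fixation as a latency measure. Movement measures are not covered. The `Fixation` class, the AOI rectangles, and the data are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    x: float          # horizontal position in pixels
    y: float          # vertical position in pixels
    start_ms: float   # onset relative to stimulus onset
    end_ms: float     # offset relative to stimulus onset

    @property
    def duration_ms(self):
        return self.end_ms - self.start_ms


def aoi_of(fixation, aois):
    """Return the name of the first AOI containing the fixation, or None."""
    for name, (left, top, right, bottom) in aois.items():
        if left <= fixation.x <= right and top <= fixation.y <= bottom:
            return name
    return None


def summarize(fixations, aois):
    """Per-AOI fixation count, dwell time, dwell proportion, first-fixation latency."""
    total_ms = sum(f.duration_ms for f in fixations)
    stats = {
        name: {"fixation_count": 0, "dwell_ms": 0.0,
               "dwell_proportion": 0.0, "first_fixation_latency_ms": None}
        for name in aois
    }
    for f in sorted(fixations, key=lambda f: f.start_ms):
        name = aoi_of(f, aois)
        if name is None:
            continue  # fixation outside every AOI
        s = stats[name]
        s["fixation_count"] += 1
        s["dwell_ms"] += f.duration_ms
        if s["first_fixation_latency_ms"] is None:
            s["first_fixation_latency_ms"] = f.start_ms
    for s in stats.values():
        s["dwell_proportion"] = s["dwell_ms"] / total_ms if total_ms else 0.0
    return stats


# Hypothetical AOIs as (left, top, right, bottom) rectangles on a face image.
aois = {"eyes": (100, 80, 300, 140), "mouth": (150, 220, 250, 270)}
fixations = [
    Fixation(200, 100, 0, 200),
    Fixation(200, 240, 250, 400),
    Fixation(210, 110, 450, 700),
    Fixation(400, 400, 750, 850),
]
for name, s in summarize(fixations, aois).items():
    print(name, s)
```

Here the dwell proportion is computed relative to total fixation time; a study could equally normalize by trial duration, depending on the research question.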

In relation to movements, eye-tracking devices can record saccades and fixations. Saccades are the movements that the eyes make when they are searching for stimuli in the environment, while fixations are brief moments when the eyes stop to look at something more carefully. During fixations, the visual resolution is optimal, and the visual system receives the retinal input; this is the moment when we process information and plan the next saccade to the objects of interest. In fact, the eyes are always in movement: even when a fixation takes place, the eyes perform very small jitters, but for classification purposes, eye movements are divided into those two categories (Liversedge et al., 2011). Another measure recorded by eye-tracking devices is pupil dilation, which is calculated from pupil diameter changes during task execution (Sirois & Brisson, 2014).
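The division of gaze samples into these two categories can be sketched with a simple velocity-threshold classifier. This is one common approach and not necessarily the one used by any particular device. The sketch assumes that gaze positions are already expressed in degrees of visual angle, and the threshold of 30 degrees per second is an illustrative assumption.

```python
import math


def classify_samples(samples, threshold_deg_per_s=30.0):
    """Label each (t_seconds, x_deg, y_deg) sample as 'fixation' or 'saccade'.

    A sample is a saccade sample when the point-to-point velocity from the
    previous sample exceeds the threshold.
    """
    labels = []
    for i, (t, x, y) in enumerate(samples):
        if i == 0:
            labels.append("fixation")  # provisional, corrected below
            continue
        t_prev, x_prev, y_prev = samples[i - 1]
        dt = t - t_prev
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        velocity = math.hypot(x - x_prev, y - y_prev) / dt
        labels.append("saccade" if velocity > threshold_deg_per_s else "fixation")
    if len(labels) > 1:
        labels[0] = labels[1]  # first sample has no velocity; copy its neighbor
    return labels


def _summarize_run(run):
    xs = [x for _, x, _ in run]
    ys = [y for _, _, y in run]
    return (run[0][0], run[-1][0], sum(xs) / len(xs), sum(ys) / len(ys))


def group_fixations(samples, labels):
    """Merge consecutive fixation samples into (start_s, end_s, mean_x, mean_y)."""
    fixations, run = [], []
    for sample, label in zip(samples, labels):
        if label == "fixation":
            run.append(sample)
        elif run:
            fixations.append(_summarize_run(run))
            run = []
    if run:
        fixations.append(_summarize_run(run))
    return fixations


# Hypothetical 100 Hz recording: a fixation, a fast shift, a second fixation.
samples = [
    (0.00, 1.00, 1.00), (0.01, 1.05, 1.00), (0.02, 1.00, 1.02),
    (0.03, 4.00, 1.00), (0.04, 7.00, 1.00),
    (0.05, 7.02, 1.00), (0.06, 7.00, 1.01),
]
labels = classify_samples(samples)
print(labels)
print(group_fixations(samples, labels))
```

The small jitter within each fixation stays below the threshold and is absorbed into the fixation, which mirrors the classification convention described above.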

Fixations are a great way to study emotion recognition based on facial expressions. For example, when a person is visually scanning a human face in order to recognize an emotion, they fixate approximately 88% of the time on facial regions including the eyes, nasion, nose, or upper lip (see Fig. 16.1). Faces expressing emotions such as fear, anger, sadness, and shame draw fixations predominantly to the eye region, while other emotions, such as joy and disgust, draw more attention toward the upper lip. This fixation pattern is related to optimizing the visual search for cues that are important for emotion identification. For example, the characteristic deformation of the upper lip during a smile is an important feature of joy. Moreover, the lower part of the nose seems to be a key region for differentiating between emotional faces. These results suggest that there are certain diagnostic regions in the face for emotion processing (Schurgin et al., 2014).

Fig. 16.1
Regions of interest in the face for emotion recognition: the eyes, nasion, nose, and upper lip. (This image and the regions of interest were based on the manuscript of Schurgin et al. (2014))

Besides looking at the emotion expressed by the face, another important aspect is face recognition per se. One important finding is that people are generally better at recognizing faces of their own race, a phenomenon called own-race bias (ORB). When trying to recognize faces of their own race rather than faces of another race, participants had a shorter response time. Studies with eye tracking have helped to explain this phenomenon. In a study with Caucasians trying to remember whether they had seen other Caucasian faces or Asian faces, a more complex scanning pattern emerged when Caucasians looked at own-race faces. In own-race faces, they performed more saccades and more fixations. Additionally, these fixations were shorter than in other-race faces. The distance of saccades did not differ between own-race and other-race faces (Wu et al., 2012).

When trying to recognize a face, a person looks at the eyes, nose, and mouth more than 70% of the time. When the face is an own-race face, participants spend more time looking at the region of the eyes and forehead and less time looking at the nose. This gaze pattern points to a different strategy of visual processing when trying to recognize faces of one’s own race compared to other races. The visual scanning of own-race faces is a more automatic, quick, and effortless process than the scanning of other-race faces (Alves & Bueno, 2017; Wu et al., 2012).

Since the effective evaluation of facial expressions and the correct inference of affective states are important for people’s social interaction, studies have looked at the strategies used for face processing. They showed that when people look at static faces, they tend to direct their gaze to the right side of the face, the so-called left (hemispace) gaze bias, and this preferential looking is already present in children (Gilbert & Bakan, 1973; Sackeim et al., 1978; Heller & Levy, 1981; Hsiao & Cottrell, 2008; Chiang et al., 2000; Taylor et al., 2012). Balas and Moulson (2011) recorded the eye gaze of children 5–10 years old while they looked at and judged the similarity of a probe face and a target face. They confirmed a left-side bias in 5-year-old children and showed that the left-side preference increased with age, but only when the children looked at human faces. No effect was found when children looked at and judged monkey faces. Together with other studies, these findings indicate that over the developmental trajectory, people improve their looking strategies as they acquire expertise in human face judgment. Indeed, looking at the left hemiface may be more informative. Several studies examined composite photographs of human and chimeric faces and asked whether the left-left composites were more informative than the right-right hemiface composites. In most cases, the left-left photographs were judged as more emotionally expressive (Moreno et al., 1990) and more trustworthy (Okubo et al., 2013) and had more muscle movements (Dimberg & Peterson, 2000). Nicholls and colleagues (2002) also found the left-gaze bias in faces turned 15° to the side, which raised the question of whether the same eye movement strategies are found in faces viewed from different angles and in natural dynamic settings.


Pupillometry is the measurement of changes in pupil diameter (i.e., pupil dilation) over time. In some of the earliest studies of pupil diameter, scientists took pictures of the participants’ eyes while they performed tasks and then compared them with a baseline, that is, a period when no task was done (Hess & Polt, 1960, 1964; Kahneman & Beatty, 1966). Since then, with the development of video-based eye trackers, the scientific interest in pupillometry has been growing.

The pupil dilates to allow more light to enter the eye and reach the retina, improving our vision in dim light conditions. However, the pupil diameter also increases in response to cognitive processing, such as performing a test or contemplating a photograph with strong emotional content. Numerous studies that used pupillometry as a complementary measure in the execution of cognitive tasks demonstrated that the magnitude of change is directly related to the tasks’ cognitive demands. The change in pupil diameter related to the use of cognitive resources is small compared to the change caused by luminosity: while the former tends to vary by less than 1 millimeter, the latter may reach up to 8 millimeters. This small but detectable change is used to infer the way participants are allocating mental resources to perform demanding tasks. It is well known that change in pupil diameter is an effective indicator of a person’s mental activity (Hess & Polt, 1964; Kahneman & Peavler, 1969). Pupillometric studies provide evidence that pupil dilation is related not only to processing emotional states but also to the increasing mental effort invested in a task (Eckstein et al., 2016; Hess & Polt, 1964; Kahneman & Peavler, 1969; Mathôt, 2018; Wierda et al., 2012).

Two other indices useful for hypotheses about mental effort are the pupil dilation peak and the eye blink rate. The peak of dilation, arguably as reliable as pupil dilation, can be related to the peak of effort during a task, since the dilation can stabilize after the beginning of a task (Beatty & Lucero-Wagoner, 2000; Hershaw & Ettenhofer, 2018). Eye blink rate is a complementary measure that can reflect cognitive engagement, usually before a highly demanding task begins (Siegle et al., 2008; Van Bochove et al., 2013). In view of that, more blinks represent more preparation for a hard task and, together with pupil dilation, can be used to indicate an effortful task (Fukuda et al., 2005; Ichikawa & Ohira, 2004).
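The comparison with a baseline and the peak of dilation can be sketched as follows. The sketch assumes a subtractive baseline correction (the mean diameter before task onset is subtracted from every task sample) and marks blink samples as missing values; both choices, as well as the function names and the data, are illustrative assumptions rather than a prescribed pipeline.

```python
from statistics import mean


def baseline_corrected(trace, task_onset_s=0.0):
    """Subtract the mean pre-onset diameter from every post-onset sample.

    trace: list of (t_seconds, diameter_mm), with None for samples lost to blinks.
    Returns a list of (t_seconds, dilation_mm) for valid task samples.
    """
    baseline = [d for t, d in trace if t < task_onset_s and d is not None]
    if not baseline:
        raise ValueError("no valid baseline samples before task onset")
    baseline_mm = mean(baseline)
    return [(t, d - baseline_mm)
            for t, d in trace if t >= task_onset_s and d is not None]


def summarize_dilation(corrected):
    """Mean dilation, peak dilation, and latency of the peak."""
    peak_t, peak_mm = max(corrected, key=lambda sample: sample[1])
    return {
        "mean_dilation_mm": mean(d for _, d in corrected),
        "peak_dilation_mm": peak_mm,
        "peak_latency_s": peak_t,
    }


# Hypothetical trace: negative times are the baseline period before the task.
trace = [
    (-1.5, 3.0), (-1.0, 3.0), (-0.5, 3.0),
    (0.0, 3.0), (0.5, 3.2), (1.0, None), (1.5, 3.4), (2.0, 3.1),
]
corrected = baseline_corrected(trace)
print(summarize_dilation(corrected))
```

Because task-evoked changes are much smaller than light-evoked ones, such a sketch is meaningful only when luminance is kept constant across the baseline and task periods.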

Cognitive Ethology: From the Real World to the Lab and from the Lab to Virtual Reality

A prominent research approach to eye tracking is cognitive ethology, which mostly studies everyday attention and social interactions (Kingstone, 2009; Smilek et al., 2006). The goal is to begin one’s research at the level of natural performance before moving it into the lab, where it can be recreated, controlled, and manipulated. Cognitive ethology thus offers an alternative way of studying attentional processes related to social interactions. By starting at the real-world level, the main focus is on what people really do in real life, and hence, one can determine what behaviors are, and are not, specific to the laboratory environment (Kingstone, 2009).

People have strong tendencies to follow gaze cues. With the help of an eye-tracking device, MacDonald and Tatler (2013) investigated whether social perceptions of a collaborator affect how people look at them and follow their gaze. Namely, they aimed to understand how social context can affect our gaze behavior during social interaction. In an experiment in which two participants worked together to perform a task (in their case, cooking), they found that social context influenced the way the participants interacted with their eyes, with participants focusing their attention depending on the action of the other. This result illustrates the use of eye tracking in social research and the attempt to carry out experiments in naturalistic environments to show how social attention works in natural social contexts.

Besides eye tracking, another technology that may help investigate social neuroscience is virtual reality (VR). VR is interesting to social neuroscience because it allows the creation of ecologically valid experiments that can be fully interactive and three-dimensional (Parsons et al., 2017). One study recreated the trolley dilemma in VR. The trolley dilemma is a well-known series of experiments on moral decision-making about whether to sacrifice one person to save a larger number of people by taking a certain action, such as diverting the incoming trolley onto a sidetrack. In the VR version of the task, participants had to choose between killing ten victims or one victim. The authors created three conditions for the experimental environment: the first with randomized women and men as possible victims, the second with possible victims of different ethnicities, and the third with one possible victim facing toward the participant and another facing away. Eye-tracking results indicated that participants spent more time gazing at the chosen victim, which was unexpected since they had been expected to avoid looking at the victim (Skulmowski et al., 2014).

In this experiment, the variation of the pupillary diameter was also examined. In all conditions, the pupil diameter increased after the moment of decision, indicating that the participants had an increased cognitive load at the moment of decision. In the different ethnicity condition, participants presented a higher pupil dilation, suggesting a higher cognitive load in an extreme social decision situation due to the controversial topic (Skulmowski et al., 2014). In the study on own-race and other-race faces described earlier, the pupillary diameter was also recorded, indicating that a person shows a larger pupil variation when trying to recognize other-race faces. This is consistent with the gaze pattern described above, indicating that a person expends more cognitive effort to recognize the face of another race, while the recognition of own-race faces is more automatic.

Eye Tracking in Clinical Populations

Another interesting use of eye tracking is in studies of clinical populations with impaired social interaction, such as individuals diagnosed with schizophrenia, autism, or social anxiety disorder. Since eye tracking can reveal a person’s underlying cognitive patterns, it can be a good tool to understand how different clinical populations understand and process different stimuli. For example, eye tracking can help us understand which part of a stimulus (e.g., a face) a person fixates on most, indicating the part of the face to which the person pays most attention. Thus, it is possible to infer different cognitive processes of a clinical population by comparing their eye gaze with that of a typical population.

In relation to schizophrenia, there have been a large number of eye-tracking studies, and a full account of them goes beyond the scope of this work. The findings point to different aspects of eye movement impairments in persons with schizophrenia, one of them being impaired smooth pursuit. Smooth pursuit happens when the eyes follow a moving object. In persons with schizophrenia, the smooth pursuit lags behind the moving object, and thus a series of saccades is made to catch up with the target (O’Driscoll & Callahan, 2008). It has long been known that this population performs worse in anti-saccade tasks (Fukushima et al., 1988), during which the participant must avoid looking at a suddenly appearing target and is supposed to look in the opposite direction. More recent experiments revealed a worse performance in the fixation task (Benson et al., 2012), in which participants were asked to visually fixate on a point while ignoring a cue appearing in the peripheral area. Furthermore, in free-viewing tasks, participants with schizophrenia tend to focus their gaze on a smaller area compared with typical participants (Sprenger et al., 2013).

Since people with schizophrenia present a different eye movement pattern compared with typical persons, there is some discussion regarding the use of eye gaze as a biomarker for schizophrenia. This is possible because eye movements are underpinned by different neurological mechanisms that can be altered in persons with schizophrenia. In this regard, Morita et al. (2020) reviewed the findings in this area. The results suggest that eye movements can be used to discriminate between persons with schizophrenia and typical subjects at a rate of ~75–90% (Morita et al., 2020).

The eye movements of persons with autism spectrum disorder (ASD) also seem to be different from those of typical persons. The diagnostic regions of the face may not be identified by persons with some types of developmental disorders. One meta-analysis reviewed studies on face processing and showed that children with ASD make significantly fewer fixations in the region of the eyes. Furthermore, diminished attention to the eyes negatively impacts social interaction because not looking at social cues may lead to worse interaction and emotion recognition (Papagiannopoulou et al., 2014). The same meta-analysis demonstrated that there were no significant differences in mouth region fixations between children with and without ASD. Another meta-analysis of 38 studies revealed that individuals with ASD present reduced social attention compared with typical individuals and that social attention in persons with ASD is influenced by social content (Chita-Tegmark, 2016). However, a comparison of the gaze pattern in different regions of interest, this time in a meta-analysis involving 122 studies, found differences of small and medium magnitudes (Frazier et al., 2017). In particular, participants with ASD had greater difficulty in selecting socially relevant or nonrelevant stimuli. The biggest difference was again found in the eyes and whole face regions of interest (Frazier et al., 2017).

Very promising results come from studies on social attention in toddlers (18–35 months old) with and without ASD. Specific signs of ASD may be indicated by subtle variation in the way the child follows another person’s look to the target of interest, an ability called joint attention. There are two principal kinds of joint attention: response joint attention, which requires the child to spot the change in the other person’s look and follow it to the new target, and initiation joint attention, which requires the child to look at a moving object and indicate this fact to another person with her/his own gaze. While at 24 months of age the eye-tracking pattern, especially in initiating joint attention, was different in toddlers with ASD compared to typically developing children, 6 months later this difference had disappeared. Due to natural maturation, the toddlers with ASD improved their ability to disengage from the face stimuli and explore the global aspect of the scene, bringing their eye movement pattern closer to the performance of typically developing children (Muratori et al., 2019).

Lastly, persons with social anxiety disorder (SAD) also present peculiarities in their eye gaze. A meta-analysis containing 13 studies demonstrated that participants with SAD presented a hypervigilance-avoidance effect in their eye gaze when looking at faces, compared to typical participants (Claudino et al., 2019). This effect consists of a large number of fixations on the face at first, followed by fewer fixations on the stimulus later. Claudino et al. (2019) also found that this effect was more prominent for faces presenting negative emotions, such as anger.


The measurement of eye movement and pupil dilation is a valid undertaking for studies in cognitive, social, and affective neurosciences. With this technique, it is possible to carry out an ecological evaluation that is relatively inexpensive and answers several important experimental questions. With typically developing or clinical populations of different age groups, and even with constant social interactions during the experiment, the device allows a series of inferences on cognitive processing based on objective, simple, and noninvasive physiological measures. The use of eye tracking by different behavioral disciplines depends more on the limit of what the researcher is willing to investigate than on the technique per se. To sum up, the possibilities of research questions that can be answered by participants’ eyes go much further than expected or, rather, go beyond what the eyes can see.