Eye tracking under dichoptic viewing conditions: a practical solution
In several research contexts it is important to obtain eye-tracking measures while presenting visual stimuli independently to each of the two eyes (dichoptic stimulation). However, the hardware that allows dichoptic viewing, such as mirrors, often interferes with high-quality eye tracking, especially when using a video-based eye tracker. Here we detail an approach to combining mirror-based dichoptic stimulation with video-based eye tracking, centered on the fact that some mirrors, although they reflect visible light, are selectively transparent to the infrared wavelength range in which eye trackers record their signal. Although the method we propose is straightforward, affordable (on the order of US$1,000) and easy to implement, for many purposes it makes for an improvement over existing methods, which tend to require specialized equipment and often compromise on the quality of the visual stimulus and/or the eye tracking signal. The proposed method is compatible with standard display screens and eye trackers, and poses no additional limitations on the quality or nature of the stimulus presented or the data obtained. We include an evaluation of the quality of eye tracking data obtained using our method, and a practical guide to building a specific version of the setup used in our laboratories.
Keywords: Eye tracking, Pupil, Gaze, Dichoptic viewing, Stereoscope, Mirror, Infrared
Researchers of human behavior and cognition record participants’ eye dynamics for many purposes. Gaze direction and pupil size reveal much about a person’s cognitive states and hidden intentions (Naber et al., 2013). Gaze direction, for instance, can inform about factors such as attention allocation (Deubel & Schneider, 1996; Pastukhov & Braun, 2010) and decision making (Reddi & Carpenter, 2000), while pupil size reveals aspects of visual processing (Barbur, 2003; Naber & Nakayama, 2013) and task engagement (Gilzenrat et al., 2010), among other things. Dichoptic stimulus presentation, the practice of showing two distinct images to a participant’s two eyes, also has a range of purposes, for instance in 3D entertainment and research on 3D vision (Barendregt et al., 2015; Julesz, 1971; Held et al., 2012), as well as for experimental paradigms that center on interocular conflict, such as binocular rivalry and continuous flash suppression (Blake & Logothetis, 2002; Tsuchiya & Koch, 2005). It can be important to combine dichoptic presentation and eye tracking within a single paradigm, for instance to evaluate eye movements in 3D scenes (Erkelens & Regan, 1986; Wismeijer et al., 2010) or ocular responses to visual input that remains unseen due to interocular conflict (Rothkirch et al., 2012; Spering et al., 2011). Nevertheless, combining the two techniques remains technically challenging and fraught with limitations, because the components used to achieve dichoptic stimulation, such as mirrors or prisms, often interfere with the recording of the eyes. This is particularly true for approaches that use a camera for recording the eyes (i.e., video-based eye tracking), as these rely on an unobstructed line of sight to the eyes. In this paper we present a method we recently developed that offers a convenient and simple way to record the eyes while presenting stimuli dichoptically.
Existing work that combines dichoptic stimulation (see Law et al., 2013 for an overview of methods) with eye tracking tends to fall into one of two categories. Work in the first category uses eye-tracking techniques that do not require a clear line of sight to the eyes. For instance, the scleral coil technique (Erkelens & Regan, 1986; Kalisvaart & Goossens, 2013; Robinson, 1963) relies on a type of contact lens fitted with a metal coil or coils and inserted into the eye of a participant seated inside a changing magnetic field. The position of the eye is then inferred from the voltage induced in the coil(s). Electro-oculography (Fox et al., 1975; Leopold et al., 1995; Zaretskaya et al., 2010), in turn, relies on a difference in electrical potential that exists between the front and the back of the eye. When a pair of skin electrodes is placed in close proximity to the eye, this difference allows information about eye position to be inferred from the potential difference between the electrodes. Work in the second category combines dichoptic stimulation with eye tracking, even video-based eye tracking, by separating the two eyes’ inputs without obstructing the camera’s view. This can be achieved, for instance, with commercially available goggle systems that have cameras integrated in the eye pieces (Frässle et al., 2014). Alternatively, the two eyes’ views can be separated spectrally by using anaglyph glasses (Hayashi & Tanifuji, 2012; Van Dam & Van Ee, 2006; Wismeijer & Erkelens, 2009), or temporally by combining polarized glasses with a dynamic polarization screen that transmits light of a different polarization angle on alternate monitor frames (Wismeijer et al., 2010). Each of these methods has its benefits but also its limitations. For instance, the scleral coil technique and electro-oculography are moderately invasive and do not allow pupillometry; approaches using anaglyph glasses limit the colors that can be used in the visual display; techniques that rely on light polarization do not work with certain (flatscreen) monitors and projectors, which emit light at restricted polarization angles; and currently available goggles have low spatial and temporal eye-tracking resolution.
In this paper we describe a technique for combining video-based eye tracking, an accurate approach to eye tracking and arguably the most popular today (Mele & Federici, 2012), with dichoptic presentation by means of a mirror stereoscope. This is an arrangement of mirrors that diverts the two eyes’ lines of sight to two different visual displays, thus posing no limitations on the nature of the display or the monitor type. Although a small number of papers in the literature report positioning a video-based eye tracker such that it has a view of the eye through the space that remains on the sides of the mirrors (e.g., Rothkirch et al., 2012; Spering et al., 2011), this is generally difficult because the mirrors leave little room for a clear line of sight. Indeed, the difficulty of getting a clear view of both eyes in this situation may be the reason why such work recorded samples from only one eye (Spering et al., 2011). To make full use of the potential of video-based eye trackers in a dichoptic stimulation setting, we recently pioneered an approach centered on the facts that video-based eye trackers generally operate in the infrared part of the light spectrum, and that some suppliers of optical equipment carry infrared-transparent mirrors. Thus, this approach replaces the mirrors in a basic stereoscope with ones that transmit infrared light, offering the camera an unobstructed view through the mirrors (Naber et al., 2011; see Carle et al., 2011 for a different implementation of this idea). Besides drawing attention to this general approach, in the present paper we will describe a version of this setup that we developed more recently with an emphasis on easy construction, and we will present an evaluation of the quality of eye data collected using this setup in combination with infrared eye trackers of two different makes.
To establish the utility of this setup for experimental purposes we assessed the quality of the eye data collected with infrared eye trackers through the mirrors. Because different types of eye trackers have different characteristics (e.g., their sensors might not be equally responsive or they may use different software algorithms to identify the pupil in a camera frame), we tested our setup with two different kinds of infrared eye trackers: a desktop-mounted EyeLink 1000 recording binocularly at 1,000 Hz (SR Research Ltd., Mississauga, Ontario, Canada; used in combination with the first set of mirrors mentioned above) and an Eye Tribe Tracker recording at 30 Hz (The Eye Tribe Aps, Copenhagen, Denmark; used with the second set of mirrors mentioned above). The product specifications suggest that both trackers should work well in this setup, as they both operate at wavelengths transmitted by the mirrors (Eyelink: 890–940 nm; Eye Tribe: around 850 nm). Each of the eye trackers is part of a different version of the setup, one at Utrecht University and one at Michigan State University. Three naïve participants were tested in each of the setups.
We established data quality in two different ways. First, with the mirrors in place each participant performed a calibration-validation procedure in which he fixated on-screen dots (0.1 dva) that appeared at different screen locations, one after the other (calibration), and then performed this same procedure a second time (validation). Our measure of data quality was the correspondence in inferred gaze angle between both repetitions. In the case of the Eyelink tracker this procedure was, in fact, the 13-point calibration-validation sequence that is part of the Eyelink software. Here the outer eight calibration points trace the edges of a large rectangle centered on the middle of the screen (in our case 16.5 dva wide by 9.3 dva high); four additional points are positioned at the corners of a rectangle half that size, and the remaining point (presented first in the sequence) is at the screen center. Inferred gaze angle for each point was compared between calibration and validation by the Eyelink software per eye. The average of both eyes is reported here. In the case of the Eye Tribe tracker the participant first performed the Eye Tribe’s built-in calibration routine, after which he followed with his eyes an onscreen dot that visited the same 13 points as in the Eyelink calibration procedure. In this case inferred gaze angle was taken to be the median angle recorded for each calibration point, averaged across eyes, and the comparison was between this inferred angle and the physical angle of each onscreen point.
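The comparison described for the Eye Tribe data can be illustrated with a minimal sketch. It is not the analysis code used for this paper. It assumes that the outer eight points are the corners and edge midpoints of the large rectangle, that "half that size" means half the width and half the height, and that the median is taken separately for the horizontal and vertical component. The function names `calibration_grid` and `mean_deviation` are hypothetical.

```python
from math import hypot
from statistics import mean, median


def calibration_grid(width=16.5, height=9.3):
    """Return 13 (x, y) targets in dva, relative to the screen center.

    Assumed layout: center point first, then the corners and edge
    midpoints of the large rectangle, then the corners of a rectangle
    with half the width and half the height.
    """
    hw, hh = width / 2, height / 2
    center = [(0.0, 0.0)]
    outer = [(x, y) for x in (-hw, 0.0, hw) for y in (-hh, 0.0, hh)
             if (x, y) != (0.0, 0.0)]
    inner = [(x, y) for x in (-hw / 2, hw / 2) for y in (-hh / 2, hh / 2)]
    return center + outer + inner


def mean_deviation(targets, samples_per_target):
    """Average distance (dva) between each target and the median gaze.

    samples_per_target[i] holds the (x, y) gaze samples, already
    averaged across eyes, recorded while target i was shown.
    """
    deviations = []
    for (tx, ty), samples in zip(targets, samples_per_target):
        gx = median(s[0] for s in samples)  # component-wise median (assumed)
        gy = median(s[1] for s in samples)
        deviations.append(hypot(gx - tx, gy - ty))
    return mean(deviations)


if __name__ == "__main__":
    grid = calibration_grid()
    # Synthetic gaze data: every sample offset by (0.3, 0.4) dva.
    fake = [[(x + 0.3, y + 0.4)] * 5 for x, y in grid]
    print(len(grid), round(mean_deviation(grid, fake), 3))
```

With the synthetic offset of (0.3, 0.4) dva used in the usage example, the mean deviation is 0.5 dva, the length of that offset vector.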
As a second indicator of data quality, we removed one of the two mirrors, thus providing the camera with a direct view of one eye and a view through the infrared transparent mirror of the other eye (counterbalancing across setups which mirror was removed). The participant then completed a set of three tasks involving basic ocular events, and we used the resulting data to compare the signal between the eyes. Specifically, three participants fixated a point while the screen’s brightness stepped repeatedly between 0 and 100 cd/m²; they made saccades to follow a dot that jumped between corners of an imaginary square, each corner located at 5° from fixation; and they used pursuit eye movements to follow a dot that slowly moved between these same four points at a speed of 3°/s. Without any data preprocessing, we then analyzed the number of lost samples per eye, the correlation in estimated pupil size between eyes for the first task, and the correlation in estimated gaze angle between eyes for the remaining tasks. Given that eye position and pupil size are typically yoked between the two eyes outside blink periods, any systematic deviations are indicative of a systematic influence of the mirror on the recorded signal.
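The two measures used in this comparison, the proportion of lost samples per eye and the between-eye Pearson correlation, can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the paper: lost samples are assumed to be coded as `None` (the actual coding differs per tracker), and sample pairs in which either eye is missing are assumed to be left out of the correlation. The names `lost_fraction` and `pearson` are hypothetical.

```python
from math import sqrt


def lost_fraction(samples):
    """Fraction of samples in one eye's trace that are missing (None)."""
    return sum(s is None for s in samples) / len(samples)


def pearson(left, right):
    """Pearson correlation between two eyes' traces.

    Pairs in which either eye has a missing sample are skipped
    (an assumption made for this sketch).
    """
    pairs = [(a, b) for a, b in zip(left, right)
             if a is not None and b is not None]
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    return cov / sqrt(var_a * var_b)


if __name__ == "__main__":
    # Synthetic pupil traces (arbitrary units) with one lost sample.
    eye_with_mirror = [3.1, 3.4, None, 4.2, 4.0, 3.3]
    eye_without_mirror = [3.0, 3.5, 3.9, 4.1, 4.0, 3.2]
    print(lost_fraction(eye_with_mirror))
    print(pearson(eye_with_mirror, eye_without_mirror))
```

A correlation near 1 for such paired traces indicates that the eye viewed through the mirror yields essentially the same signal as the eye viewed directly.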
The calibration-validation procedure was performed without problems with the mirrors in place. We performed three repetitions for each of the eye trackers, obtaining a valid recording of gaze angle for each of the 13 calibration points in all cases. For the three participants in the setup using an Eyelink tracker the average deviations in estimated gaze angle between calibration and validation were 0.52, 0.35, and 0.66° of visual angle (dva); for the Eye Tribe tracker the deviations were 2.15, 1.76, and 4.14 dva. We will comment on the larger deviation for the latter tracker in the Discussion section.
During the short experiments with only one mirror in place the trackers generally kept good track of both pupils, regardless of whether the line of sight passed through a mirror. The Eyelink missed no samples for either eye, and for the three participants the Eye Tribe missed 17.6, 7.1, and 12 % of the samples for the eye with mirror, versus 12.5, 10.3, and 8 % for the eye without mirror (not statistically different; paired t-test across three participants, t(2) = 0.76; p = 0.52). Figure 2 shows data for one representative participant in the Eyelink setup (left figure in each pair) and for one in the Eye Tribe setup (right figure in each pair). The correspondence between eyes was overall very good for both trackers, as quantified by between-eye Pearson correlations for each of the three experiments. Specifically, for the experiment with changing background luminance (Fig. 2a), the average correlation in pupil size was 0.99 for both trackers. For the saccade experiment (Fig. 2b) the average correlations in horizontal gaze angle as well as vertical gaze angle were 0.99 and 0.97 for the Eyelink and Eye Tribe, respectively. Finally, for the smooth pursuit experiment (Fig. 2c) the average correlations in horizontal gaze angle as well as vertical gaze angle were 0.99 and 0.96 for the Eyelink and Eye Tribe, respectively.
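The paired t-test on the Eye Tribe's lost-sample percentages can be retraced from the rounded values given in the text. The sketch below uses only the Python standard library. The closed-form two-tailed p-value is specific to 2 degrees of freedom, and the function names `paired_t` and `two_tailed_p_df2` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(a, b):
    """Return (t, degrees of freedom) for a paired t-test on a versus b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


def two_tailed_p_df2(t):
    """Two-tailed p-value for Student's t with exactly 2 df.

    Uses the closed-form CDF for 2 df: F(t) = 1/2 + t / (2 * sqrt(t^2 + 2)).
    """
    return 1 - abs(t) / sqrt(t * t + 2)


if __name__ == "__main__":
    with_mirror = [17.6, 7.1, 12.0]     # % lost samples, eye behind mirror
    without_mirror = [12.5, 10.3, 8.0]  # % lost samples, eye viewed directly
    t, df = paired_t(with_mirror, without_mirror)
    print(f"t({df}) = {t:.2f}, p = {two_tailed_p_df2(t):.2f}")
```

Run on the rounded percentages, this gives t(2) ≈ 0.76 and p ≈ 0.53, in line with the reported values; a small difference in the last digit of p is to be expected because the percentages in the text are rounded.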
We present a straightforward method that allows experimenters to simultaneously track the eyes binocularly and present visual stimuli dichoptically. The approach, which centers on the combination of an infrared eye tracker and infrared-transparent mirrors, can be of value for studies that focus on 3D vision or interocular suppression. The proposed eye-tracking method has several benefits over previous methods that involve dichoptic stimulation. It allows a full, unblocked view of both eyes and use of large, high quality visual stimuli. Furthermore, eye-tracking data quality is limited only by the tracker used (not by the dichoptic presentation system), and existing eye-tracker hardware needs no modifications for use in this setup.
Our approach does have limitations. One limitation is that the eye tracker’s infrared illuminator, whose range of emitted wavelengths extends into the visible range, can in some cases be seen through the mirrors, potentially contaminating the visual display. The severity of this concern depends on the particular experiment. While the illuminators are faintly visible when the screen is black, they cannot be seen when background luminance is higher. Moreover, in cases where the stimulus of interest covers a relatively small part of the screen, the illuminators can be moved so that they do not overlap with this part. A second limitation is the low-tech method by which alignment between screens is achieved. This method might not be sufficient, for instance, for experiments that require exactly controlled binocular disparity across large regions of the screen. However, the method has proven adequate in our experiments, which tend to rely on smaller stimuli near the screen center.
We note that in our calibration-validation procedure the deviation between the two was considerably larger for the data from the Eye Tribe than for those from the Eyelink. One potential reason for this is that the Eye Tribe signal seems to contain more noise in general, as is also apparent in Fig. 2. A second potential reason is that, whereas the calibration and validation were performed in immediate succession for the Eyelink, there was a larger time gap for the Eye Tribe (see Methods), potentially allowing more participant movement. Either way, the second experiment, with only one mirror in place, showed high correlations between the two eyes’ signals for both trackers, arguing against the idea that any calibration-validation difference was to a large degree due to the presence of the mirrors.
In closing, it is worth mentioning a recent movement toward the use of pupillary and oculomotor responses instead of subjective button press reports in certain behavioral tasks that involve dichoptic stimulation (Frässle et al., 2014; Naber et al., 2011; Tsuchiya et al., 2015). Specifically, to prevent potential confounds associated with manual responses in experiments that manipulate consciousness through interocular suppression, such “no report” designs rely on eye recordings to infer perception. The method presented here provides an ideal solution for those wishing to implement such designs.
The authors thank Pieter Schiphorst for his role in designing the setup and for providing the graphics of Fig. 1, and Steffen Klingenhoefer for his fruitful suggestions.
- Barbur, J. L. (2003). Learning from the pupil: Studies of basic mechanisms and clinical applications. The visual neurosciences. Cambridge: MIT Press.
- Beach, G., Cohen, C. J., Braun, J., & Moody, G. (1998). Eye tracker system for use with head mounted displays. IEEE International Conference on Systems, Man, and Cybernetics, 5, 4348–4352.
- Julesz, B. (1971). Foundations of cyclopean perception. Chicago: The University of Chicago Press.
- Leopold, D. A., Fitzgibbons, J. C., & Logothetis, N. K. (1995). The role of attention in binocular rivalry as revealed through optokinetic nystagmus. Artificial Intelligence Lab Publications. Available at: http://hdl.handle.net/1721.1/6649