Gaze3DFix: Detecting 3D fixations with an ellipsoidal bounding volume
Nowadays, the use of eyetracking to determine 2-D gaze positions is common practice, and several approaches to the detection of 2-D fixations exist, but ready-to-use algorithms to determine eye movements in three dimensions are still missing. Here we present a dispersion-based algorithm with an ellipsoidal bounding volume that estimates 3D fixations. To this end, 3D gaze points are obtained using a vector-based approach and are further processed with our algorithm. To evaluate the accuracy of our method, we performed experimental studies with real and virtual stimuli. We obtained good congruence between stimulus position and both the 3D gaze points and the 3D fixation locations within the tested range of 200–600 mm. The mean deviation of the 3D fixations from the stimulus positions was 17 mm for the real as well as for the virtual stimuli, with larger variances at increasing stimulus distances. The described algorithms are implemented in two dynamic linked libraries (Gaze3D.dll and Fixation3D.dll), and we provide a graphical user interface (Gaze3DFixGUI.exe) that is designed for importing 2-D binocular eyetracking data and calculating both 3D gaze points and 3D fixations using the libraries. The Gaze3DFix toolkit, including both libraries and the graphical user interface, is available as open-source software at https://github.com/applied-cognition-research/Gaze3DFix.
Keywords: Binocular 3D eye tracking · 3D gaze points · 3D fixations · Eye movement analysis · Methodology · Open-source software
Nowadays, eyetracking is widely used in basic and applied research on visual perception and cognition. However, the vast majority of eyetracking research has been done with two-dimensional stimuli. This might restrict the validity of some findings, since the visual system evolved in a 3D environment, which makes experiments employing 3D scenes indispensable (Collewijn, Steinman, Erkelens, Pizlo, & van der Steen, 1992; Essig, Pomplun, & Ritter, 2006; Land, 2006; Lappi, 2016). Furthermore, virtual reality applications with stereoscopic presentation are becoming more common, offering great possibilities for research, for example when investigating action and perception (Durgin & Li, 2010). In eyetracking research, it is common practice to determine gaze direction and/or the 2-D gaze position on a reference plane, mostly a monitor (Duchowski, 2007; Hammoud, 2008; Hansen & Ji, 2010). Although several methods for the computation of gaze points in 3D have already been reported (see below), no implementation is immediately available.
The analysis of eyetracking measurements usually requires detecting fixations and saccades from the recorded sample data. By now, various approaches to estimating 2-D fixations and saccades from sample data exist, and several software packages and toolboxes are available (see, e.g., Komogortsev, Gobert, Jayarathna, Koh, & Gowda, 2010). Furthermore, there have been several suggestions for detecting fixations in 3D scenes using monocular eyetrackers (Diaz, Cooper, Kit, & Hayhoe, 2013; Munn, Stefano, & Pelz, 2008; Pelz & Canosa, 2001; Reimer & Sodhi, 2006). However, those methods do not provide the position of a 3D fixation, but only its start and end times. For the detection of 3D fixations and saccades, only a few approaches have been suggested (Duchowski et al., 2002; Duchowski, Medlin, Gramopadhye, Melloy, & Nair, 2001), and none of them is readily usable by researchers. The aim of the present article is to close this gap by introducing an approach for the detection of 3D fixations. The proposed algorithm was successfully tested and implemented, and it is available as open-source software, ready to use for everyone interested in 3D eye movement research.
We start with an overview of existing approaches for the computation of 3D gaze points and the detection of 2-D and 3D fixations, followed by the section in which our algorithms are introduced. In the third section, two empirical studies into the accuracy of our method are reported. We conclude the article by discussing the advantages and requirements of our method as well as its applicability.
First, we summarize various attempts that have been made so far to compute 3D gaze points and describe current approaches for the detection of 2-D and 3D fixations.
Computation of 3D gaze points
Our literature review revealed four main approaches to compute gaze depth and 3D gaze points: The extended gaze disparity approach, a neural network approach, the ray-based approach, and the vector-based approach.
The gaze disparity approach represents the simplest way to extract gaze depth from binocular eyetracking data. To this end, the difference between the horizontal pupil positions of the two eyes is calculated. This has often been done in perceptual research (e.g., Blythe, Holliman, Jainta, Tbaily, & Liversedge, 2012; Grosjean, Rinkenauer, & Jainta, 2012; Pobuda & Erkelens, 1993; Rambold, Neumann, Sander, & Helmchen, 2006; Semmlow, Hung, & Ciuffreda, 1986; Wismeijer, van Ee, & Erkelens, 2008). This measure is suitable for many research questions but does not provide a 3D gaze point.
With the eye locations known, the x- and y-coordinates of the 3D gaze point can be computed. To increase accuracy, online and offline filtering procedures as well as a 3D calibration routine were implemented (Duchowski et al., 2011; Wang et al., 2012, 2014).
Essig et al. (2006) employed an artificial neural network, specifically a parametrized self-organizing map (PSOM), for the calculation of 3D gaze points. To this end, following a 2-D calibration of the eye tracker, a 3D calibration is performed using calibration points in a 3 × 3 × 3 calibration cube. The 3D coordinates of the calibration points (x, y, z) and the 2-D gaze positions measured by the eye tracker are used as input–output pairs to train the neural network. After the 3D calibration, the PSOM provides estimated 3D gaze points as a function of the 2-D binocular input. Empirical evaluation revealed estimates of higher accuracy for the PSOM method than for 3D gaze points derived from the vector-based approach with a fixed interpupillary distance (Essig et al., 2006; Pfeiffer, Latoschik, & Wachsmuth, 2009). However, as compared to the extended gaze disparity approach, the PSOM is less accurate, is more difficult to implement, and requires a longer 3D calibration (Wang et al., 2014).
If the geometry of the presented stimuli is known, 3D gaze points can be inferred by allocating gaze points to stimulus positions; this is known as the ray-based approach (Duchowski et al., 2002; Duchowski et al., 2001; Mansouryar, Steil, Sugano, & Bulling, 2016). Here, a gaze vector is estimated, and its point of intersection with the reconstructed 3D scene represents the 3D gaze point. However, a gaze point can only be determined unambiguously if the gaze vector intersects exactly one point in the 3D scene. If no point or several points in the 3D scene are hit, the selection is under- or overspecified, a problem common to all ray-based approaches (Pfeiffer et al., 2009). Furthermore, this approach requires a virtual reconstruction of the scene when real stimuli are presented.
We decided to use the vector-based approach, in which the gaze vectors for both eyes are calculated using their locations in space and the gaze positions on the measuring plane. Theoretically, the 3D gaze point would be the point at which both vectors intersect. Since this is seldom the case, Collewijn, Erkelens, and Steinman (1997) restricted all of their stimuli and the eye movements to a horizontal plane of regard, simplifying the computation to a two-dimensional triangle. To determine a real 3D gaze point, the shortest straight line between the two gaze vectors is computed, and the 3D gaze point is defined as the midpoint of this line (Epelboim et al., 1995; Hennessey & Lawrence, 2008; Pfeiffer et al., 2009; Wibirama & Hamamoto, 2012, 2014); see below for detailed formulas. In a study comparing the accuracy of 3D gaze points from vector intersection to 3D gaze points from gaze disparity, Duchowski et al. (2014) found that the depth estimates of both methods were highly correlated. However, the correlation decreased with increasing distance between the stimulus and observer. The authors concluded that vector-based gaze depths are more accurate, because depth estimates based on gaze disparity appeared to inflate the error magnitude. Furthermore, in contrast to the ray-based approach, a vector-based approach requires no knowledge of the geometry of the scene. In addition, no depth calibration is needed, as in the extended gaze disparity approach, the PSOM, or other machine-learning approaches (although an additional depth calibration has been proposed by Wibirama & Hamamoto, 2014). This is beneficial because it allows for free head movements as long as the positions of the eyes are known, and it enables the study of convergence errors (when users converge slightly in front of or behind the fixated object). An advantage of machine-learning approaches, on the other hand, is the possibility of adapting to the individual viewing characteristics of participants.
Due to the above-mentioned advantages, we think that a vector-based approach is the most appropriate method for the computation of 3D gaze points.
Methods for 2-D fixation detection
Several methods to detect fixations and saccades on a two-dimensional reference plane have been described. Three approaches can be distinguished: Dispersion-based algorithms, velocity-based algorithms, and area-based algorithms (Salvucci & Goldberg, 2000; for a recent review, see Lappi, 2015).
Dispersion-based algorithms are built on the assumption that the distances between single gaze points are smaller within a fixation than during a saccade. A fixation is detected when both the spatial dispersion threshold and the temporal threshold are met. Measures for spatial dispersion can be the pairwise distance (which is computationally costly), the maximum distance to the centroid, or the bounding box size (Salvucci & Goldberg, 2000). Salvucci and Goldberg suggested a dispersion threshold between 0.5° and 1° of visual angle. The temporal threshold is defined by the minimum fixation duration. Since fixations are usually longer than 100 ms, most dispersion-based algorithms use a minimum duration of 100–200 ms (Salvucci & Goldberg, 2000).
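As an illustration, the dispersion-based logic can be sketched as follows. The function name, the bounding-box dispersion measure, and the return format are our own choices for this sketch, not code from any particular published implementation:

```python
def idt_fixations(x, y, t, dispersion_thresh, min_duration):
    """Minimal dispersion-based (I-DT) sketch using the bounding-box
    measure: dispersion = (max x - min x) + (max y - min y).

    x, y: gaze coordinates; t: timestamps (same unit as min_duration).
    Returns (start_index, end_index, centroid_x, centroid_y) tuples.
    """
    fixations, i, n = [], 0, len(t)

    def disp(a, b):  # bounding-box dispersion over samples a..b (inclusive)
        return (max(x[a:b + 1]) - min(x[a:b + 1])
                + max(y[a:b + 1]) - min(y[a:b + 1]))

    while i < n:
        # Grow an initial window spanning at least the minimum duration.
        j = i
        while j < n and t[j] - t[i] < min_duration:
            j += 1
        if j >= n:
            break
        if disp(i, j) <= dispersion_thresh:
            # Extend the window while the dispersion stays below threshold.
            while j + 1 < n and disp(i, j + 1) <= dispersion_thresh:
                j += 1
            xs, ys = x[i:j + 1], y[i:j + 1]
            fixations.append((i, j, sum(xs) / len(xs), sum(ys) / len(ys)))
            i = j + 1
        else:
            i += 1  # no fixation starting at i; slide the window forward
    return fixations
```

With, for example, a 1° dispersion threshold and a 200-ms minimum duration, samples that drift slowly around a location are merged into one fixation, whereas a large jump (a saccade) terminates it.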
Velocity-based algorithms apply a threshold to gaze velocity. In the simplest form, the point-to-point velocity is computed, and a gaze point is considered as belonging to a fixation if the velocity is below the threshold and a defined minimum fixation duration is reached. More sophisticated velocity-based algorithms examine velocity peaks for the presence of a saccade and subsequently determine the beginning and end of the saccade on the basis of the velocity profile (Lappi, 2015). Instead of using a fixed velocity threshold, Nyström and Holmqvist (2010) suggested calculating the threshold algorithmically from the data. Since small saccades can be quite slow, the velocity ranges of saccades can overlap with those of other types of eye movements, such as smooth pursuit movements and the vestibulo-ocular reflex (Lappi, 2015). To circumvent this overlap, the use of acceleration as the criterion instead of velocity has been suggested (e.g., Behrens, MacKeben, & Schröder-Preikschat, 2010), and a combination of velocity and acceleration has been applied (e.g., SR Research, 2009).
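In code, the simplest point-to-point variant might look like the sketch below; the 30°/s default threshold and the function name are illustrative assumptions, not values taken from a specific system:

```python
def velocity_labels(angles_deg, t_s, vel_thresh=30.0):
    """Minimal velocity-based (I-VT) sketch: a sample is labeled as
    belonging to a fixation if the point-to-point angular velocity
    stays below the threshold (deg/s). A real implementation would
    additionally enforce a minimum fixation duration and merge
    adjacent fixation samples into fixation events.
    """
    labels = [True]  # first sample has no predecessor; treat as fixation
    for i in range(1, len(angles_deg)):
        velocity = abs(angles_deg[i] - angles_deg[i - 1]) / (t_s[i] - t_s[i - 1])
        labels.append(velocity < vel_thresh)
    return labels
```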
Area-based algorithms require predefined regions in the visual field called areas of interest (AOIs). Gaze points inside an AOI are defined as belonging to a fixation once a minimal fixation duration is reached. Points outside the AOIs are discarded, as are points lying within an AOI but not meeting the temporal threshold. Since area-based algorithms do not allow identifying eye movements within an AOI, they are not fixation detection algorithms per se; rather, they permit determining higher-order fixation groups (also called glances) belonging to visual targets (Salvucci & Goldberg, 2000).
Few attempts have been made to compare different fixation detection algorithms. For instance, Salvucci and Goldberg (2000) evaluated five exemplary algorithms. They concluded that incorporating a minimum fixation duration is advisable and that velocity-based and dispersion-based algorithms achieve similar performance, accuracy, and robustness of fixation identification. Area-based algorithms were found to be less preferable in general, but suitable for certain types of aggregation. Komogortsev et al. (2010) compared three velocity-based with two dispersion-based algorithms. Although rather simple stimulus material was used, they reported pronounced differences between the algorithms. In a similar vein, Shic, Scassellati, and Chawarska (2008) showed that different algorithms result in variations with regard to the number and duration of fixations. Accordingly, this can also lead to different interpretations of the same data.
Approaches from 2-D to 3D fixation detection
When moving from 2-D to 3D stimuli, other approaches are necessary. Diaz et al. (2013) suggested using a velocity-based algorithm to identify fixations in virtual environments presented in a head-mounted display. Further approaches for the identification of fixations in 3D scenes have been described by Munn et al. (2008), Pelz and Canosa (2001), and Reimer and Sodhi (2006). However, these approaches are intended for monocular eye trackers, and although they are suitable for identifying fixations in 3D environments, they provide only a fixation direction, not a 3D fixation position.
Theoretically, the known 2-D detection algorithms could be used to first identify the start and end times of fixations on the basis of the binocular data. Afterward, the fixation centers could be computed using the common 3D gaze point algorithms. However, using traditional algorithms for 2-D fixations on 3D eyetracking data leads to a component-wise analysis of the movements of the left and the right eye, thereby effectively ignoring depth (Duchowski et al., 2001). As a result, different fixation durations and variable start and end times of a fixation could be obtained for the right and the left eye. A similar approach is to apply algorithms for the detection of 2-D fixations to some kind of composite position of the eyes, such as the mean gaze angle of the left and right eyes (Wismeijer et al., 2008) or the conjugate eye movement signal ([left eye + right eye]/2; Grosjean et al., 2012). In line with Duchowski and colleagues (Duchowski et al., 2002; Duchowski et al., 2001), calculating gaze points in three-dimensional space before the identification of 3D fixations seems more appropriate to us, because the 3D gaze point provides a composite representation of the user’s gaze. We elaborate on this issue further in the supplement and provide a comparison between the two approaches: identifying 2-D fixations from 2-D samples before calculating the 3D fixation position, versus calculating 3D gaze points before identifying 3D fixations from the 3D gaze points.
Duchowski et al. (2001) described a simple velocity-based algorithm to detect 3D fixations in which the velocity is calculated using two successive three-dimensional gaze points. Their evaluation revealed that the algorithm underestimated fixation duration and the number of fixations as compared to the theoretically expected values. In their improved algorithm (Duchowski et al., 2002), the velocity calculation was based on five successive gaze points, and a velocity threshold of 130°/s was suggested. They also implemented an acceleration-based algorithm with an adaptive thresholding technique, which dynamically sets the threshold depending on the noise level. They furthermore added a minimum fixation duration, 150 ms by default. Their evaluation revealed that velocity-based algorithms are easier to implement and require fewer parameters, but are sensitive to noise. Further, they inferred that acceleration-based algorithms can be more robust, but that determining adequate parameters is difficult.
We decided to extend the dispersion-based approach to detect 3D fixations from 3D gaze points, since it has been shown to be accurate and robust, requires few parameters, and can also be used with eyetrackers that have a low sampling rate.
Proposed algorithms to compute 3D gaze points and 3D fixations
In the following, we describe the calculation steps for the computation of 3D gaze points, the essential procedure for the use of a dispersion-based algorithm for the detection of 3D fixations as well as the implementation of those algorithms in the toolkit Gaze3DFix.
Computation of 3D gaze points
Using the vector-based approach, 3D gaze points are defined as the midpoint of the shortest straight line between the two gaze vectors (see also Epelboim et al., 1995; Pfeiffer et al., 2009; Wibirama & Hamamoto, 2012, 2014).
Next, the point with the smallest distance to the two gaze vectors must be found. For this, the common perpendicular of both vectors is calculated.
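As an illustration, this computation can be sketched in a few lines. The sketch below is the generic closest-point construction for two skew lines; the function name and the guard against parallel gaze vectors are our own choices, not code from Gaze3D.dll:

```python
import numpy as np

def gaze_point_3d(left_eye, left_gaze, right_eye, right_gaze):
    """Midpoint of the shortest segment between the two gaze vectors.

    left_eye/right_eye: 3D eye positions; left_gaze/right_gaze: 3D gaze
    positions on the measuring plane (all array-like, length 3).
    """
    p1, p2 = np.asarray(left_eye, float), np.asarray(right_eye, float)
    d1 = np.asarray(left_gaze, float) - p1   # direction of the left gaze vector
    d2 = np.asarray(right_gaze, float) - p2  # direction of the right gaze vector
    # Find s, t minimizing |(p1 + s*d1) - (p2 + t*d2)|: the segment between
    # the two closest points is the common perpendicular of both vectors.
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    w = p1 - p2
    denom = a * c - b * b  # zero only if the gaze vectors are parallel
    if abs(denom) < 1e-12:
        raise ValueError("gaze vectors are (near-)parallel; no unique midpoint")
    s = (b * (d2 @ w) - c * (d1 @ w)) / denom
    t = (a * (d2 @ w) - b * (d1 @ w)) / denom
    return ((p1 + s * d1) + (p2 + t * d2)) / 2.0
```

For example, with eyes at (±32, 0, 0) mm and gaze positions on a measuring plane at z = 600 mm chosen so that both gaze vectors pass through (0, 0, 400), the function returns (0, 0, 400).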
Detection of 3D fixations
The midpoint between the two eyes, called the cyclopean eye position, is determined:
The distance between the cyclopean eye and the current fixation center is calculated:
The parameters of the ellipsoid are defined depending on the distance to the current fixation center and the threshold of acceptable gaze point deviation δ. The spatial extensions are calculated as follows:
Since the ellipsoid is not radially symmetrical, it has to be oriented toward the cyclopean eye. The yaw angle φ (rotation around the y-axis) and the pitch angle θ (rotation around the x-axis) between the cyclopean eye position and the current fixation center result from
The gaze point deviations are rotated according to the ellipsoidal bounding volume of the current fixation:
The algorithm tests whether the new 3D gaze point lies within or outside of the ellipsoidal bounding volume of the current fixation: if the rotated deviations, each divided by the corresponding spatial extension of the ellipsoid, have a sum of squares of at most 1, the new 3D gaze point lies within the ellipsoidal bounding volume of the current fixation.
If the new 3D gaze point lies within the ellipsoidal bounding volume, it is added to the current fixation, and the new fixation parameters (3D position of the center, fixation duration) are calculated.
With the next sample, the verification of the fixation hypothesis starts again with Step 1.
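For illustration, the steps above can be combined into a single containment test. The exact spatial extensions of the ellipsoid are given by the formulas referenced above; in this sketch we assume lateral semi-axes a = b = d · tan(δ) and a depth semi-axis elongated by an arbitrary factor, so the function name and these parameter choices are ours, not the toolkit's:

```python
import math

def inside_fixation_ellipsoid(gaze, fix_center, cyclopean_eye,
                              delta_deg, depth_factor=10.0):
    """Test whether a 3D gaze point falls inside the ellipsoidal bounding
    volume around the current fixation center.

    Assumed parameterization (illustrative only): lateral semi-axes
    a = b = d * tan(delta); depth semi-axis c = depth_factor * a.
    """
    dx = [g - f for g, f in zip(gaze, fix_center)]  # gaze point deviation
    ex, ey, ez = (f - e for f, e in zip(fix_center, cyclopean_eye))
    d = math.sqrt(ex * ex + ey * ey + ez * ez)      # eye-to-fixation distance
    yaw = math.atan2(ex, ez)                        # rotation around the y-axis
    pitch = math.atan2(ey, math.hypot(ex, ez))      # rotation around the x-axis
    # Rotate the deviation into the ellipsoid's frame (yaw first, then pitch).
    x1 = dx[0] * math.cos(yaw) - dx[2] * math.sin(yaw)
    z1 = dx[0] * math.sin(yaw) + dx[2] * math.cos(yaw)
    y2 = dx[1] * math.cos(pitch) - z1 * math.sin(pitch)
    z2 = dx[1] * math.sin(pitch) + z1 * math.cos(pitch)
    a = d * math.tan(math.radians(delta_deg))       # lateral semi-axes a = b
    c = depth_factor * a                            # elongated depth semi-axis
    return (x1 / a) ** 2 + (y2 / a) ** 2 + (z2 / c) ** 2 <= 1.0
```

The elongated depth axis reflects the fact that, for a given angular threshold δ, the depth of a vergence-based 3D gaze point is far less precise than its lateral position.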
The toolkit Gaze3DFix
We have implemented the described algorithms for the computation of 3D gaze points in the dynamic linked library Gaze3D.dll. The described calculation steps for the detection of 3D fixations by an ellipsoidal bounding volume could also be used in other dispersion-based fixation detection algorithms. Here, as an example, we build on the algorithm of LC Technologies, Inc. (2014) in Fixations3D.dll. In both libraries, the mathematical calculations are realized as functions that can be called from different programming environments and are usable online during the recording of eye movements as well as offline after the recording process. To facilitate the application, we provide Gaze3DFixGUI.exe, a simple graphical user interface (GUI) for importing binocular eyetracking data and calculating the related 3D gaze points and 3D fixations. A more in-depth description of the use of the libraries and Gaze3DFixGUI.exe can be found in the supplement of this article. The toolkit Gaze3DFix, including the two dynamic linked libraries (Gaze3D.dll and Fixations3D.dll) and the GUI (Gaze3DFixGUI.exe), is available as open-source software. Thus, researchers can use and modify the libraries or the user interface for their work. They are available along with a manual on GitHub at https://github.com/applied-cognition-research/Gaze3DFix.
Evaluation of the approach
Two experiments were conducted to test the accuracy of the suggested method for the computation of 3D gaze points and the detection of 3D fixations. In both experiments, participants had to fixate predefined stimuli while their eye movements were tracked. In Experiment 1, real stimuli were shown at different locations; in Experiment 2, the same setup was replicated with virtual stimuli. Because of their similarity, both experiments are reported together.
In Experiment 1, 12 participants (seven men, five women; 26 to 47 years of age, M = 33) took part, and in Experiment 2, 24 participants (12 men, 12 women; 25 to 49 years of age, M = 32) took part. All participants had normal or corrected-to-normal visual acuity. In Experiment 2, stereopsis was also assessed using the Titmus Fly Test. One participant failed the stereopsis test and was therefore excluded.
Apparatus and material
For Experiment 2, the standard 2-D eyetracker calibration (monoscopic) and stereoscopic stimulus presentation were done on a 3D monitor (ASUS VG236, 23-in., 1,920 × 1,080 pixels, 120 Hz). The monitor was synchronized with shutter glasses (NVIDIA® 3DVISION™) and positioned 600 mm in front of the participant. The distance to the monitor and the eyetracker was shorter than in Experiment 1 because the shutter glasses blocked some of the infrared light, so a shorter distance was required to achieve sufficient intensity. Replicating the distances of the real stimuli, spheres were presented at distances of 200, 300, 400, and 500 mm in front of the participants, in three columns and three rows (see Fig. 5c). Since replicating the exact positions of Experiment 1 led to stereoscopic images that did not fit on the monitor, we moved the stimuli closer together on the x- and y-axes, leading to a field of view of 24.60° horizontally and 13.79° vertically. The virtual spheres had a diameter of 7 mm, were marked with a fixation cross, and were rendered either dark in front of a light background or light in front of a dark background (see Fig. 5d). The stereoscopic stimuli were rendered with the framework Bildsprache Live Lab (BiLL; Wojdziak, Kammer, Franke, & Groh, 2011) individually for each participant’s interpupillary distance.
After a demographic questionnaire (and, in Experiment 2, the stereopsis screening test), participants were positioned and received the instructions. Before the experiment, a 9-point calibration was performed. The standard 2-D calibration targets spanned a field of view of 37.80° horizontally and 21.83° vertically in Experiment 1 and a field of view of 43.60° horizontally and 25.36° vertically in Experiment 2. If the average distance between the target position and the estimated gaze position was above 0.3°, the calibration procedure was repeated. After a test trial and a short break to clarify possible questions, the calibration procedure was repeated before the experimental session started with either 45 (Exp. 1) or 72 (Exp. 2) trials. In Experiment 2, calibration was done on a dark background, followed by a block of 36 trials with light target spheres on a dark background, and then calibration was done on a light background, followed by a block of 36 trials with dark target spheres in front of a light background; the order of the blocks was counterbalanced across participants. This additional independent variable in Experiment 2 was implemented because of results from research on binocular coordination in reading (Huckauf, Watrin, Yuras, & Koepsel, 2013; Liversedge, Rayner, White, Findlay, & McSorley, 2006; Nuthmann & Kliegl, 2009). Those results suggested that whether white-on-black or black-on-white stimuli are used might affect the convergence distance. A trial started either with the experimenter lowering one target sphere to the prescribed position (Exp. 1) or with one virtual target sphere moving from the monitor plane to the prescribed position (Exp. 2). The participant had to fixate the target stimulus in order to fuse the stereoscopic image. Participants pressed a button to terminate the trial in cases in which the stereoscopic image could not be fused. If participants were able to fixate the target stimulus appropriately, they confirmed this by pressing another button.
After the button press, the participants had to maintain the fixation for a duration of 1,000 ms. A subsequent acoustic signal indicated that the end of the required time interval was reached—that is, the end of the trial. Between trials participants could look around freely and rest their eyes while the sphere was either removed by the experimenter (Exp. 1) or blanked out (Exp. 2).
From the eyetracking data, only samples between the participant’s confirmatory button press and the acoustic signal were processed further. Samples with invalid values due to a blink or gaze locations outside the measuring range (the frustum between the participants’ eyes and the eyetracker measuring plane) were filtered out, as were trials in which the participants were not able to fuse the stereoscopic images. In Experiment 1 we excluded 6.15%, and in Experiment 2 8.75%, of the data. For the remaining binocular samples, the 3D gaze point was computed using the library described above. From the 3D gaze points, 3D fixations were detected and their positions computed using the algorithm described above. We set the minimum duration to 100 ms and manipulated the dispersion threshold as an independent variable with the levels 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, and 2.0 deg of visual angle.
Medians were calculated for each participant and each condition. We used medians instead of means because the distributions of the 3D deviations and fixation durations were expected to be skewed to the right (see, e.g., Helo, Pannasch, Sirri, & Rama, 2014). For a right-skewed distribution, the median represents a more reliable measure of central tendency than the mean. Repeated measures analyses of variance (ANOVAs) were applied to analyze deviations for the different stimulus distances and the different dispersion thresholds, respectively. Greenhouse–Geisser correction (ε) was applied when the assumption of sphericity was violated. As an indication of effect size, eta-squared (η²) is reported, as recommended by Levine and Hullett (2002).
First, the accuracy of the 3D gaze points depending on the stimulus distance was considered for both the real and the virtual stimuli. Second, the 3D fixation positions depending on the dispersion threshold used and the stimulus distance were compared.
3D gaze points
To test the accuracy of our proposed method, real and virtual stimuli were presented, and the 3D gaze points and 3D fixations were calculated. For the tested distances of 200 to 600 mm, we found good congruence between the stimulus position and the 3D gaze points as well as the 3D fixation locations for both real and virtual stimuli. In fact, the results concerning real and virtual stimuli were highly similar. Looking at different stimulus distances, we found higher 3D gaze deviations (and thus higher 3D deviations of the 3D fixations) for increased fixation distances. However, the z deviations did not vary with distance. This means that there was no systematic distortion, like a constant under- or overestimation, but simply a higher variance. This was probably due to the smaller vergence angle for farther distances. The smaller the vergence angle, the more the computed 3D gaze point is affected by measurement errors of the eyetracking system. The spatial threshold setting for the detection of fixations did influence the number of detected fixations, and consequently the fixation durations. The accuracy of the 3D fixation location, however, was not affected for real stimuli, and was only slightly affected for virtual stimuli. One reason for the obtained high accuracy might be that we used stimuli that were small and easy to fixate, leading to exact fixations by the participants. However, here we focused on accuracy; therefore, artificial fixations were generated by means of the instructions and the presentation of isolated stimuli. Hence, the present findings do not allow us to draw conclusions about the quality of the algorithms’ output for the detection of 3D fixations in a free-viewing paradigm with natural sequences of fixations and saccades.
Since the measurement error increases with fixation distance, our method is suitable for stimuli within reaching distance, which is the case for all approaches that build on convergent eye movements. A further requirement is that the users’ eye positions must be known, which means that the eye positions need to be measured and either kept stable or tracked continuously. This would, for example, be the case with the eyetracker we used in the present evaluation, since the EyeFollower™ provides the 3D positions of the eyes. However, in our experimental evaluation, head movements were restricted by using a chinrest, to avoid influences of head movements on the results, and we used a simple fixation task with stationary targets. Thus, the present results are limited, and we do not know how the method would cope with situations in which the targets or the observer moved, and thus in which slow eye movements, such as smooth pursuit movements or vestibulo-ocular reflex, were present. Further experiments will be needed to clarify in which situations our approach to detecting 3D fixations with an ellipsoidal bounding volume provides valid results, and in which it might not be sufficient.
In this article we have presented a new approach for the detection of 3D fixations using an ellipsoidal bounding volume. Both the proposed algorithm and an algorithm to calculate the necessary 3D gaze points using a vector-based approach are implemented as dynamic linked libraries and can be accessed through GUI-based software.
The proposed approach has the advantage of not being hardware- or application-specific, which allows for high flexibility in its use. Furthermore, no 3D calibration is required, since the normal 2-D calibration of the eyetracker is sufficient. With the use of the dynamic linked libraries, the algorithms are applicable online during the recording of eye movements. Offline analyses of binocular eye movements can be done using Gaze3DFixGUI.exe, whose GUI makes our algorithms available to researchers not well versed in programming.
A requirement of our method is that for both the computation of 3D gaze points and the detection of 3D fixations, the users’ eye positions must be known. This means that the eye positions need to be measured and either kept stable or tracked continuously. The present results are limited to a simple and artificial environment; future testing will need to examine the quality of the detection of 3D fixations in free-viewing paradigms.
The Gaze3DFix toolkit is available as open-source software on GitHub at https://github.com/applied-cognition-research/Gaze3DFix. We hope that the availability of ready-to-use algorithms for analyses of 3D eye movements will boost research in this field. Furthermore, using the same algorithms would allow for better comparisons between the results of future experiments.
S.W. was co-financed by the European Social Fund and the Free State of Saxony, Germany (SAB 100080340). R.S.S. was supported by a grant from the German Research Council (Deutsche Forschungsgemeinschaft, DFG; VE 192/17-1). The authors thank Dixon Cleveland (LC Technologies, Inc.) for technical support, and Dietrich Kammer (Chair of Media Design, TU-Dresden) for programming the stereoscopic visualization.
The mathematical relation is described in greater detail in the supplement.
- Blythe, H. I., Holliman, N. S., Jainta, S., Tbaily, L. W., & Liversedge, S. P. (2012). Binocular coordination in response to two-dimensional, three-dimensional and stereoscopic visual stimuli. Ophthalmic and Physiological Optics, 32, 397–411. https://doi.org/10.1111/j.1475-1313.2012.00926.x
- Collewijn, H., Steinman, R. M., Erkelens, C. J., Pizlo, Z., & van der Steen, J. (1992). Effect of freeing the head on eye movement characteristics during three-dimensional shifts of gaze and tracking. In A. Berthoz, W. Graf, & P. P. Vidal (Eds.), The head–neck sensory motor system (pp. 412–418). Oxford, UK: Oxford University Press.
- Duchowski, A. T. (2007). Eye tracking methodology: Theory and practice (Vol. 373). New York, NY: Springer Science & Business Media.
- Duchowski, A. T., House, D. H., Gestring, J., Congdon, R., Świrski, L., Dodgson, N. A., … Krejtz, I. (2014). Comparing estimated gaze depth in virtual and physical environments. In P. Qvarfordt & D. Witzner Hansen (Eds.), Proceedings of the Symposium on Eye Tracking Research and Applications (pp. 103–110). New York, NY: ACM Press.
- Duchowski, A. T., Medlin, E., Gramopadhye, A., Melloy, B., & Nair, S. (2001). Binocular eye tracking in VR for visual inspection training. In C. Shaw, W. Wang, & M. Green (Eds.), Proceedings of the ACM Symposium on Virtual Reality Software and Technology (pp. 1–8). New York, NY: ACM Press.
- Duchowski, A. T., Pelfrey, B., House, D. H., & Wang, R. I. (2011). Measuring gaze depth with an eye tracker during stereoscopic display. In S. N. Spencer (Ed.), Proceedings of the Symposium on Applied Perception in Graphics and Visualization (pp. 15–22). New York, NY: ACM Press.
- Huckauf, A., Watrin, L., Yuras, G., & Koepsel, A. (2013). Brightness and contrast effects on binocular coordination. Paper presented at the 55th Tagung experimentell arbeitender Psychologen, Vienna, Austria.
- LC Technologies. (2014). Eyegaze Edge analysis system: Programmer's manual. Fairfax, VA: LC Technologies, Inc.
- Levine, T. R., & Hullett, C. R. (2002). Eta squared, partial eta squared, and misreporting of effect size in communication research. Human Communication Research, 28, 612–625. https://doi.org/10.1111/j.1468-2958.2002.tb00828.x
- Mansouryar, M., Steil, J., Sugano, Y., & Bulling, A. (2016). 3D gaze estimation from 2D pupil positions on monocular head-mounted eye trackers. In Proceedings of the International Symposium on Eye Tracking Research and Applications (ETRA) (pp. 197–200).
- Munn, S. M., Stefano, L., & Pelz, J. B. (2008). Fixation-identification in dynamic scenes: Comparing an automated algorithm to manual coding. In S. Creem-Regehr & K. Myszkowski (Eds.), Proceedings of the 5th Symposium on Applied Perception in Graphics and Visualization (pp. 33–42). New York, NY: ACM Press.
- Pfeiffer, T., Latoschik, M. E., & Wachsmuth, I. (2009). Evaluation of binocular eye trackers and algorithms for 3D gaze interaction in virtual reality environments. Journal of Virtual Reality and Broadcasting, 5, 1660.Google Scholar
- Rambold, H., Neumann, G., Sander, T., & Helmchen, C. (2006). Age-related changes of vergence under natural viewing conditions. Neurobiology of Aging, 27, 163–172. https://doi.org/10.1016/j.neurobiolaging.2005.01.002 PubMedCrossRefGoogle Scholar
- Salvucci, D. D., & Goldberg, J. H. (2000). Identifying fixations and saccades in eye-tracking protocols. Proceedings of the 2000 Symposium on Eye Tracking Research and Applications (pp. 71–78). New York, NY, USA: ACM Press.Google Scholar
- Shic, F., Scassellati, B., & Chawarska, K. (2008). The incomplete fixation measure. Proceedings of the 2008 Symposium on Eye Tracking Research and Applications (pp. 111–114). New York, NY, USA: ACM.Google Scholar
- SR Research LTD. (2009). EyeLink® 1000 user manual—Version 1.5.0. Mississauga, Ontario, Canada.Google Scholar
- Wang, R. I., Pelfrey, B., Duchowski, A. T., & House, D. H. (2012). Online gaze disparity via bioncular eye tracking on stereoscopic displays. In Proceedings of the 2012 Second International Conference on 3D Imaging, Modeling, Processing, Visualization & Transmission (pp. 184–191). Washington, DC, USA: IEEE Computer Society.CrossRefGoogle Scholar
- Wibirama, S., & Hamamoto, K. (2012). A geometric model for measuring depth perception in immersive virtual environment. In Proceedings of the 10th Asia Pacific Conference on Computer Human Interaction (pp. 325–330). New York, NY: ACM Press.Google Scholar
- Wojdziak, J., Kammer, D., Franke, I. S., & Groh, R. (2011). BiLL: An interactive computer system for visual analytics. In Proceedings of the 3rd ACM SIGCHI Symposium on Engineering Interactive Computing Systems (p. 264). New York, NY: ACM Press.Google Scholar