We have described a method for obtaining measurements of end-to-end system latency in gaze-contingent systems. We showed that large discrepancies exist between the latencies contributed by different displays. The choice of monitor could mean the difference between an average latency at the center of the screen of 16 ms (the CRT) or 35 ms (the iMac).
A limitation of system latency measurements obtained in this way is that they only partially account for pixel response time—that is, the time for the screen elements in an LCD monitor to reach the target brightness level after receiving the signal to change. The monitor onset times that are recorded are based on the first change to monitor brightness that can be unambiguously detected in the high-speed video camera footage, rather than one of the standard criteria used to assess pixel response time, such as the time to transition between 10 % and 90 % of maximum luminance. If the precise level of luminance is critical for a gaze-contingent application, our LED method would underestimate the system latency. However, since the pixel response times of modern displays are usually below 10 ms (Elze & Tanner, 2012), if we assume that the refreshed pixels respond strongly enough to be detected in the recording at half brightness, the latency would be underestimated by no more than 5 ms.
This method requires a greater time investment to collect samples than the methods using data acquisition devices in combination with photodiodes and artificial saccades (Bernard et al., 2007; Bockisch & Miller, 1999), since it is necessary to locate the frames in the video where the key events happen. The photodiode-based methods are therefore viable alternatives for measuring system latency in labs with access to the required technical expertise. However, the ability to synchronize high-speed footage of the stimuli with the eyetracking data is an advantage of the present method, facilitating a good understanding of the stimulus as presented to the eye, particularly as it is affected by the monitor properties (see the supplemental high-speed videos of monitor performance).
What is an acceptable level of system latency?
The maximum acceptable system latency depends on the properties of the visual system, but also on the application. For example, for a multiresolution display, in which areas outside of central vision are represented at lower resolution, latencies up to 60 ms are not typically detected (Loschky & Wolverton, 2007). Latencies in this type of application are less critical, because only the effects on conscious awareness are relevant, not any effects outside of conscious awareness. Another factor is whether central vision or the periphery is being masked. When the periphery is masked, the area that is hidden postsaccade will largely overlap with the area that was hidden presaccade, so longer latencies do not result in the viewer obtaining much extra information that could affect the results. However, the delay in unmasking the target location should be taken into account when computing reaction times. On the other hand, when central vision is masked, the postsaccade target will not already be masked (unless it is close to the original fixation location), so there is the danger of a central-vision glimpse of the target when latencies are long. Other experimental paradigms that depend on low system latencies include gaze-contingent investigations into perception during saccades. Most saccadic suppression experiments have used eyetracking to retroactively discard trials in which the stimulus was not presented at the correct time during the saccade (e.g., Watson & Krekelberg, 2011). However, when the system latency can be characterized well, more precise and efficient experiments are possible, with the presentation of the stimulus being contingent on the time course of the saccade.
Broadly speaking, if the goal is to mask central vision, then given the results on rapid perception of scene gist and the experiments suggesting that visual suppression lessens within 5–25 ms of the end of a saccade (McConkie & Loschky, 2002; Ross, Morrone, Goldberg, & Burr, 2001; Shioiri, 1993; Volkmann, Riggs, White, & Moore, 1978), experimenters should aim to achieve an average system latency at screen midpoint of less than 25 ms. Of the displays that we tested, the Samsung LCD, the CRT, and the DLP display offered this latency level in combination with the other hardware and software that we used.
Reducing system latency
The choice of monitor is critical for controlling system latency. The monitor should be capable of refreshing at least 120 times per second and should have a small or nonexistent input lag, as with the CRT and the Samsung 2233RZ, the latter of which has been shown to have other desirable temporal properties (Wang & Nikolic, 2011). The presence of a video converter can add to the latency, as we discovered from the extra 3 ms contributed by conversion from a Mini DisplayPort signal to a VGA signal for use with the CRT monitor. The smallest latency that we measured, a 16-ms average at the center of the monitor, was obtained with a CRT connected to a PC with a video card that had native VGA output. No currently manufactured Apple computer has native VGA output, and VGA is being phased out of the PCs shipped by major manufacturers (Shah, 2012), so just as CRTs are becoming more difficult to obtain, so are computers that can drive analog displays without the delay introduced by conversion from a digital video signal.
What other steps can be taken? Many displays have features that, when active, perform additional processing on the input, increasing the input lag and therefore the system latency. For example, the Adaptive Contrast Management mode on the Acer GD235HZ increases the system latency by an average of 3 ms, whereas the Dynamic Contrast mode on the Samsung 2233RZ increases it by an average of 8 ms (both based on ten measurements). In addition to the added latency of these modes, inspection of the high-speed footage shows that they can cause unexpected time-varying changes to contrast and brightness. Researchers should consult their display documentation to find the settings that correspond to the least amount of input processing possible, which may mean deactivating features with names like Adaptive Contrast Management or Dynamic Contrast, or activating features with names like Video Game Mode that are intended to minimize input lag.
The settings of the eyetracker also affect the system latency, although to a lesser degree than the display does. The EyeLink 1000 can sample at 250 Hz, in which case the sampling stage alone would add an average of 2 ms to the latency (4 ms worst case, 0 ms best case). However, beyond 1000 Hz the returns appear to diminish; the change from 1000 to 2000 Hz only reduces the mean latency by 0.4 ms (SR Research, 2013). At those frequencies, the processing of the eye images is likely the bottleneck at the eyetracker, not the sample rate. EyeLink eyetrackers provide two levels of heuristic filtering (Stampe, 1993), which reduce noise in the eye position data in real time. However, each level of filtering increases the mean latency by 1 ms (1000-Hz sampling rate, fixed head, SR Research, 2013). Since the measurements reported here used one level of filtering, they overestimate the latency by 1 ms relative to what would be possible with no filtering (Link/Analog Filter set to “OFF”). The “Remote Option,” which enables tracking of the participant’s head so that a head-fixing chinrest is not required, also contributes 1.2 ms to the latency when in use (EyeLink 1000 user manual, version 1.5.2, p. 9). Therefore, to achieve the minimum possible latency contributed by the EyeLink 1000, experimenters should use at least a 1000-Hz sampling rate, no heuristic filtering, and fixed-head mode (with the viewer using a brow-and-chinrest). With these settings, this model of eyetracker should be responsible for an average of 1.8 ms of the system latency (EyeLink 1000 user manual, version 1.5.2, p. 9). However, heuristic filtering may be desirable in some cases, since it can use previously buffered samples to achieve more accurate estimation of eye position, at the cost of only a few milliseconds of added latency. For other models of eyetracker, equivalent latency measurements and information about settings that will affect latency should be obtained from the vendor.
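As a back-of-the-envelope aid, the contributions quoted above can be combined into a simple latency budget. The sketch below is illustrative only and assumes the contributions add linearly: the 1.8-ms baseline, 1 ms per heuristic-filter level, and 1.2 ms for the Remote Option are the figures cited above, and the sampling term is the average half-interval of the sample clock (0.5 ms at 1000 Hz, 2 ms at 250 Hz). The function name is ours, not part of any vendor API.

```python
def eyetracker_latency_ms(sample_rate_hz=1000, filter_levels=0, remote=False):
    """Rough average latency (ms) contributed by the eyetracker stage.

    Assumes the quoted contributions add linearly: a 1.8-ms baseline at
    1000 Hz with no filtering and a fixed head, plus the extra average
    wait for a sample at lower rates (half the sample interval), plus
    1 ms per heuristic-filter level, plus 1.2 ms for the Remote Option.
    """
    half_interval = 0.5 * 1000.0 / sample_rate_hz   # average sampling wait, ms
    baseline_half = 0.5                             # sampling wait at 1000 Hz, ms
    return (1.8 + (half_interval - baseline_half)
            + 1.0 * filter_levels
            + (1.2 if remote else 0.0))

# Minimum-latency configuration described above:
print(eyetracker_latency_ms())                      # 1.8
# 250-Hz sampling adds an average of 1.5 ms relative to 1000 Hz:
print(eyetracker_latency_ms(sample_rate_hz=250))    # 3.3
```

Such a budget is useful mainly for comparing configurations; the true figures for a given setup should be measured, or obtained from the vendor.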
Another approach to reducing latency is to change the scheduling of the eyetracker sampling and subsequent rendering. The measurements reported in our results used an algorithm that samples the eye position as soon as possible after the previous refresh, and then immediately renders the new image updated by the gaze location. After that, program execution is blocked until the current screen refresh finishes and the next one is ready to begin. During this time, many fresher eye position samples may become available, but are ignored. An alternative to this sample-then-wait scheme is described in Aguilar and Castet (2011): After the monitor refresh is triggered, rather than sampling the eyetracker again immediately, the program waits to sample until there is just enough time left to render the updated image before the refresh. The advantage of a wait-and-then-sample scheme is that the next monitor refresh is based on a more recent sample of the eye position. How much reduction in latency can be achieved? For the pixels at the top of the monitor, if the sample-and-then-wait scheme is used, the delay for the sample to reach the screen will always be at least T_refr ms (in which T_refr is the time between screen refreshes), whereas if the wait-and-then-sample scheme is used, the delay for the sample to reach the screen can be as little as the rendering time (as in Fig. 3A). On an iMac computer, which has a T_refr of 16.7 ms, rendering an alpha-blended simulated scotoma at a particular location takes approximately 5 ms. Thus, the system latency would be reduced by over 10 ms. However, the potential benefit of waiting to sample later in the refresh cycle will be considerably less when the refresh rate is higher—for example, only around 2 ms on the Samsung monitor at 120 Hz. Also, the time to wait must be based on an estimate of the rendering time, which should be conservative, in order to avoid missing the monitor refresh deadline. If this occurred, it would add T_refr to the system latency for that frame.
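The core of a wait-and-then-sample loop can be sketched as follows. This is an illustrative Python sketch, not the Aguilar and Castet implementation: get_gaze_sample and render_frame are hypothetical stand-ins for the tracker and drawing calls, and the render budget is a deliberately conservative estimate of worst-case render time.

```python
import time

RENDER_BUDGET_S = 0.005   # conservative estimate of worst-case render time (s)

def wait_then_sample(next_refresh_time, get_gaze_sample, render_frame):
    """Sleep until just enough time remains to render before the next
    refresh, then take the freshest gaze sample and render with it."""
    delay = (next_refresh_time - RENDER_BUDGET_S) - time.monotonic()
    if delay > 0:
        time.sleep(delay)        # idle rather than sampling too early
    gaze = get_gaze_sample()     # freshest sample available
    render_frame(gaze)           # must complete before the refresh deadline

# Minimal usage with stubbed tracker and renderer:
events = []
wait_then_sample(time.monotonic() + 0.02,
                 get_gaze_sample=lambda: events.append("sample") or (0.0, 0.0),
                 render_frame=lambda gaze: events.append("render"))
print(events)                    # ['sample', 'render']
```

In a real experiment, next_refresh_time would come from the vertical-blank timestamp reported by the presentation library, and the budget should be set from measured worst-case render times, since underestimating it means missing the refresh deadline.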
Decreasing the rendering time—for example, by preloading frames of a movie into memory—can be a strategy to reduce latency (O’Sullivan, Dingliana, & Howlett, 2002). This applies particularly when the wait-and-then-sample algorithm is used, in which render time is often the limiting factor. In the sample-and-then-wait algorithm, on the other hand, the duration of rendering does not make a difference, provided that it stays within the available time, which is equal to T_refr minus the time to retrieve a sample from the eyetracker. Only if this time were regularly exceeded would the rendering time need to be optimized.
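As a concrete illustration of preloading, the sketch below decodes every movie frame before the trial begins, so that the per-refresh work reduces to an index lookup; decode_frame is a hypothetical stand-in for whatever image-loading call an experiment uses.

```python
def preload_frames(decode_frame, n_frames):
    """Decode all frames up front (into RAM, or GPU textures), moving the
    expensive per-frame work out of the time-critical refresh loop."""
    return [decode_frame(i) for i in range(n_frames)]

# Stubbed decoder; in practice this would read and decode image data.
frames = preload_frames(lambda i: f"frame-{i}", n_frames=4)

# Inside the gaze-contingent loop, presenting a frame is now a cheap lookup:
print(frames[2])   # frame-2
```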
Latency can also be reduced slightly by forcing the monitor to begin using the newly generated image immediately, regardless of where it is in the refresh cycle (Loschky & McConkie, 2000; McConkie, Wolverton, & Zola, 1984). Items near the bottom of the screen would be updated more quickly on average using this option, with an average benefit of 0.5 * T_refr ms, but the drawback is that the new image could be applied partway through drawing an object of interest. If the object is moving, it might appear distorted—for example, by a “shearing” effect. These types of artifacts may be acceptable, depending on the application.
Finally, in a simulated-scotoma paradigm, the effective average latency will decrease as the scotoma size increases, since for shorter saccades the next fixation point may already be covered by the simulated scotoma at its previous location. This applies when only the point of fixation is important to obscure, and a match between the center of the visual field modification and the point of fixation is not necessary. The expected reduction in latency depends on the distribution of saccade distances relative to the scotoma size.
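The size of this reduction can be gauged with a quick simulation. The sketch below is illustrative only: the gamma-shaped saccade-amplitude distribution and the scotoma radii are arbitrary choices for demonstration, not values from our experiments.

```python
import random

def fraction_already_covered(scotoma_radius_deg, amplitudes_deg):
    """Fraction of saccades whose landing point falls inside the scotoma
    drawn at the previous fixation, so that the fixation point is already
    obscured and the update latency is effectively zero."""
    hits = sum(1 for a in amplitudes_deg if a <= scotoma_radius_deg)
    return hits / len(amplitudes_deg)

random.seed(1)
# Arbitrary illustrative saccade-amplitude distribution (degrees).
amps = [random.gammavariate(2.0, 2.5) for _ in range(10_000)]

for radius_deg in (1.0, 3.0, 6.0):
    print(radius_deg, fraction_already_covered(radius_deg, amps))
```

A larger scotoma covers a larger share of saccade landing points, so the average effective latency falls as the radius grows; with a measured amplitude distribution from pilot data, the same calculation gives a paradigm-specific estimate.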