Abstract
We have developed a new mono/binocular eye tracking system using an IEEE 1394b or USB 3.0 digital camera that provides high sensitivity, high resolution, a high frame rate, and no rolling-shutter distortion. Our goal is to provide a system that is friendly to researchers who conduct experiments. The system is non-invasive and inexpensive and can be used for mice, marmosets, monkeys, and humans. It uses infrared light to illuminate the eye(s). The reflected image of the infrared light on the cornea and the dark image of the pupil are captured by the camera. The center of the pupil and the center of the corneal reflection are calculated and tracked over time, and head movement is compensated for by using the corneal reflection. Since the high-resolution camera provides 2048 horizontal pixels, we can capture images of both eyes simultaneously and calculate the parameters of the two eyes in each frame. The gaze-position data can be read out on-line via a computer network and/or a DAC (digital-to-analog converter). The adoption of Windows 10 as the operating system makes this eye tracking system user-friendly. Because of the high frame rate of the digital camera, the sampling rate of the system can be as high as 700 Hz and the latency less than 4 ms.
1 Introduction
Eye tracking systems have been used for various purposes: psychological studies, understanding visual information processing, analyzing consumers’ interests, etc. [1, 2]. Recently, eye movement information has also attracted a great deal of attention because of its applicability to Human-Computer Interaction. However, in the eye tracking systems on the market, the hardware and software are incorporated as one manufactured product. As a result, the users (researchers) have to select or build their experimental protocols to fit the eye tracking system that they own, which can limit their experiments. We report here a new user (experimenter) friendly, non-invasive and inexpensive eye tracking system whose hardware can be selected to suit various experimental protocols. As shown in Fig. 1, there are two sets of software, iRecHS2 for monocular and iRecHS2b for binocular measurement, both of which run on Windows 10. The basic hardware requirement of the system is a digital camera (FLIR Integrated Imaging Solutions, Inc.) that is sensitive to infrared light and has a global shutter. The other required hardware (infrared light source, lens, PC, etc.) can be selected according to the application, and the system can be used for mice, marmosets, monkeys, and humans, thus expanding the applicability of eye movement information over a variety of fields.
2 General Methods
Various types of eye tracking technologies are currently available [1, 2]. In the present study, we selected a digital video-based combined pupil and corneal reflection method using a 3-D (three-dimensional) eyeball model, because it allows for passive calibration on untrained subjects (animals). By real-time image processing, the gaze angle, pupil size, blink (eye openness), and gaze-target position in 3-D space (binocular system) are estimated. The data can be read out on-line via TCP/IP and/or a DAC (digital-to-analog converter), and can also be stored for offline analysis.
2.1 Apparatus
The system consists of the following devices: (1) an infrared illuminator, (2) a visible-light cut filter, (3) a lens, (4) a high frame-rate, global-shutter digital camera, (5) a Windows 10 personal computer, (6) a DAC for data output and a DIO (digital I/O) for synchronizing with other devices, and (7) cables, a camera support member, etc. The configuration of devices (1)–(5) can be altered depending on the measurement subject and/or the distance between the camera and the subject. Details are summarized in Table 1.
2.2 3-D Eyeball Model for Gaze Detection
In order to obtain the gaze vector towards the camera, an eyeball model (Fig. 2) was adopted. In this model, the extension of the normal vector \((C_{\mathrm x}, C_{\mathrm y}, C_{\mathrm z})\) of the pupil plane, passing through the center of the pupil, passes through the corneal curvature center and the eyeball rotation center. The vector \((C_{\mathrm x}, C_{\mathrm y}, C_{\mathrm z})\) also represents the gaze vector in camera coordinates. We approximate the outline of the pupil in the captured image as an ellipse. We define the pupil center as \((X_{\mathrm p}, Y_{\mathrm p})\), the length of the minor axis as S, that of the major axis as L, and the slope of the minor axis as a. Since the cornea is also part of a sphere, we refer to the center of that sphere as the corneal curvature center \((X_{\mathrm c}, Y_{\mathrm c})\). If the distance between the light/camera and the eyes is sufficiently large compared with the head movement, the relationship between the center of the reflection and the corneal curvature center is constant. Thus, when the reflection point is \((X_{\mathrm r},Y_{\mathrm r})\), the corneal curvature center \((X_{\mathrm c}, Y_{\mathrm c})\) can be described as \((X_{\mathrm c}, Y_{\mathrm c})=(X_{\mathrm r}-Offset_{\mathrm x}, Y_{\mathrm r}-Offset_{\mathrm y})\). We define the eyeball rotation center in a captured image as \((X_{\mathrm o}, Y_{\mathrm o})\), the length between the pupil center and the eyeball rotation center in 3-D space as R, and the length between the pupil center and the corneal curvature center in 3-D space as \(R_{\mathrm {pc}}\). Using these parameters, we can express the gaze vector in two ways.
Head-fixed-method. Assess the gaze vector from the vector connecting the pupil center and the eyeball rotation center (Eq. (1)). The gaze vector \((C_{\mathrm x},C_{\mathrm y},C_{\mathrm z})\) can be calculated from Eq. (1) after measuring the pupil center \((X_{\mathrm p},Y_{\mathrm p})\) in the captured image of the eye. However, whenever the head moves, the eyeball rotation center \((X_{\mathrm o},Y_{\mathrm o})\) has to be recalculated. Thus, this method is applicable to animals whose heads are firmly fixed.
Head-free-method. Assess the gaze vector from the vector connecting the pupil center and the corneal curvature center (Eq. (2)). The gaze vector \((C_{\mathrm x},C_{\mathrm y},C_{\mathrm z})\) can be calculated from Eq. (2) after measuring the pupil center \((X_{\mathrm p},Y_{\mathrm p})\) and the gravity-center of the reflection \((X_{\mathrm r},Y_{\mathrm r})\) in the captured image of the eye. With this method, the gaze vector can be calculated even when the head moves within the picture frame. However, when the distance between the eye and the camera changes, the image of the eye goes out of focus; using a chin-rest to keep that distance constant is therefore preferable. Since the measurement can be taken only while the reflection is on the cornea, the range of measurable eye positions is narrower than with the head-fixed-method, and the measurement noise may be larger because both the pupil and the reflection images are used.
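Equations (1) and (2) are not reproduced in this text, but the two methods can be sketched as follows. This is an illustrative reading of the model geometry, not the paper’s exact formulas: the x and y components of the unit gaze vector are taken as the image-plane displacement of the pupil center from the eyeball rotation center (head-fixed) or the corneal curvature center (head-free), normalized by R or \(R_{\mathrm {pc}}\); the sign convention for the z component (negative towards the camera) is an assumption.

```python
import numpy as np

def gaze_head_fixed(xp, yp, xo, yo, r):
    """Head-fixed method: gaze from pupil center and eyeball rotation center.

    (xp, yp): pupil center in the image; (xo, yo): eyeball rotation center;
    r: pupil-to-rotation-center length in pixels (from calibration).
    """
    cx = (xp - xo) / r
    cy = (yp - yo) / r
    # Assumed convention: z component chosen so the vector has unit length,
    # negative when pointing towards the camera.
    cz = -np.sqrt(max(0.0, 1.0 - cx ** 2 - cy ** 2))
    return np.array([cx, cy, cz])

def gaze_head_free(xp, yp, xr, yr, off_x, off_y, r_pc):
    """Head-free method: gaze from pupil center and corneal curvature center.

    (xr, yr): gravity-center of the corneal reflection; the calibrated
    offsets map the reflection to the corneal curvature center (xc, yc).
    """
    xc, yc = xr - off_x, yr - off_y
    cx = (xp - xc) / r_pc
    cy = (yp - yc) / r_pc
    cz = -np.sqrt(max(0.0, 1.0 - cx ** 2 - cy ** 2))
    return np.array([cx, cy, cz])
```

When the pupil center coincides with the rotation center in the image, the sketch yields a gaze vector pointing straight at the camera, as the model predicts.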
2.3 Pupil Detecting Method
In this system, an infrared light source illuminates the eye of the subject. Using a high frame-rate digital camera, the image of the subject’s pupil is captured and approximated by an ellipse. To detect the pupil within the limited time available, we adopted the following procedure:
1. Capture an image from the camera.
2. Create a reduced image of the captured image.
3. Create a binary image of the reduced image.
4. Label and measure the objects in the binary image; detect the largest object as the pupil.
5. Approximate the reduced image of the pupil by an ellipse.
6. Fit the ellipse on the original picture and detect the edge of the pupil.
7. Remove reflections of irrelevant objects on the pupil edge (Fig. 3).
8. Approximate the outline of the pupil by an ellipse using the least-squares method.
9. Compare the vertical height of the ellipse with the distance between the top and bottom of the pupil image to detect eye openness.
10. Create a binary image of the reduced image by using the reflection threshold.
11. Use the reflected image of the light source nearest to the pupil center to calculate its gravity-center.
Steps (2)–(5) can be skipped, because the pupil shifts only slightly between frames at the high frame rate.
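The core of the pipeline can be sketched in a few dozen lines. The sketch below is a simplified stand-in, not the iRecHS2 implementation: it covers only binarization, largest-blob labeling, and a moment-based ellipse fit (steps (3)–(5)/(8)), uses a plain 4-connected flood fill instead of an optimized labeler, and omits the edge refinement and reflection handling of steps (6), (7), (10), and (11). The threshold value and the moment-to-axis scale factor are assumptions.

```python
import numpy as np
from collections import deque

def detect_pupil(img, thresh=50):
    """Threshold a grayscale image, take the largest dark blob as the pupil,
    and fit an ellipse to it via second-order image moments."""
    binary = img < thresh                        # binarize: pupil is dark
    labels = np.zeros(img.shape, dtype=int)      # connected-component labels
    blobs, next_label = {}, 1
    for seed in zip(*np.nonzero(binary)):
        if labels[seed]:
            continue
        q, pix = deque([seed]), []
        labels[seed] = next_label
        while q:                                 # 4-connected flood fill
            y, x = q.popleft()
            pix.append((y, x))
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < img.shape[0] and 0 <= nx < img.shape[1]
                        and binary[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = next_label
                    q.append((ny, nx))
        blobs[next_label] = np.array(pix, dtype=float)
        next_label += 1
    pupil = max(blobs.values(), key=len)         # largest object = pupil
    center = pupil.mean(axis=0)                  # (y, x) pupil center
    # Ellipse from the 2x2 covariance of the blob pixels: for a filled
    # ellipse, axis length = 4 * sqrt(eigenvalue) of the covariance.
    evals, evecs = np.linalg.eigh(np.cov(pupil.T))
    minor, major = 4.0 * np.sqrt(evals)          # axis lengths S, L
    angle = np.arctan2(evecs[0, 0], evecs[1, 0])  # slope of the minor axis
    return center, minor, major, angle
```

A production implementation would label on the reduced image and refine edges on the full-resolution frame, as the numbered steps describe.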
2.4 Passive Calibration
By applying the values from captured images of the freely moving eye, the eyeball rotation center \((X_{\mathrm o},Y_{\mathrm o})\) and eyeball rotation radius R are calculated for the head-fixed-method, and the offset between the reflection and the corneal curvature center (\(Offset_{\mathrm x}\), \(Offset_{\mathrm y}\)) and the length between the pupil center and the corneal curvature center \(R_{\mathrm {pc}}\) are calculated for the head-free-method. When the slope of the minor axis of the ellipse is defined as a, \((X_{\mathrm o},Y_{\mathrm o})\) is expressed by Eq. (3). As shown in Fig. 4, the intersections of the minor axes of the pupil-ellipses in multiple picture-frames are located at \((X_{\mathrm o},Y_{\mathrm o})\). \(Offset_{\mathrm x}\) and \(Offset_{\mathrm y}\) are calculated by Eq. (4) using data taken from two or more picture-frames (least-squares method). R is calculated by Eq. (5) with \((X_{\mathrm o},Y_{\mathrm o})\) and \((X_{\mathrm {p}i},Y_{\mathrm {p}i})\). \(R_{\mathrm {pc}}\) is calculated by Eq. (6) with \(Offset_{\mathrm x}\), \(Offset_{\mathrm y}\), \((X_{\mathrm {p}i},Y_{\mathrm {p}i})\), and \((X_{\mathrm {r}i},Y_{\mathrm {r}i})\). The calculation of R and \(R_{\mathrm {pc}}\) from the major axis L and minor axis S of the pupil is shown in Fig. 5. By acquiring these parameters, the gaze vector directed towards the camera can be calculated. This method is used for subjects that cannot perform active calibration, for example mice.
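The geometric idea behind locating \((X_{\mathrm o},Y_{\mathrm o})\) — the minor axes of the pupil ellipses from multiple frames intersect at the eyeball rotation center (Fig. 4) — can be illustrated as a least-squares line intersection. This is a sketch of the principle, not necessarily the exact form of Eq. (3):

```python
import numpy as np

def rotation_center(pupil_centers, minor_axis_angles):
    """Least-squares intersection of the pupil-ellipse minor axes.

    Each frame contributes a line through its pupil center along the minor
    axis; the common intersection approximates (Xo, Yo). Solves the normal
    equations for the point minimizing the summed squared perpendicular
    distance to all lines.
    """
    A = np.zeros((2, 2))
    b = np.zeros(2)
    for (px, py), a in zip(pupil_centers, minor_axis_angles):
        d = np.array([np.cos(a), np.sin(a)])   # minor-axis direction
        P = np.eye(2) - np.outer(d, d)         # projector off the line
        A += P
        b += P @ np.array([px, py])
    return np.linalg.solve(A, b)               # (Xo, Yo)
```

At least two frames with distinct minor-axis slopes are needed for A to be invertible, mirroring the two-or-more-frames requirement of the calibration.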
2.5 Active Calibration, Transformation into Object Coordinates
We calculate a gaze axis vector \((C_{\mathrm x}, C_{\mathrm y}, C_{\mathrm z})\) in camera coordinates from Eq. (1) or (2) and obtain a gaze axis vector \((T_{\mathrm x}, T_{\mathrm y}, T_{\mathrm z})\) in object coordinates from the target positions. The subject is instructed to fixate small targets located at 5 (at least 3) positions on the stimulus display to obtain \(Offset_{\mathrm x}\), \(Offset_{\mathrm y}\), \(R_{\mathrm {pc}}\), and a 3\(\,\times \,\)3 matrix M for coordinate transformation (Eq. (7)). Then we can calculate a gaze vector \((O_{\mathrm x},O_{\mathrm y},O_{\mathrm z})\) in object coordinates from Eq. (8).
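Since Eqs. (7) and (8) are not reproduced in this text, the following sketch shows one plausible way to estimate the 3\(\,\times \,\)3 matrix M by least squares from the calibration fixations and then apply it; the exact formulation in the paper may differ.

```python
import numpy as np

def fit_transform(camera_vecs, target_vecs):
    """Least-squares 3x3 matrix M mapping camera-coordinate gaze vectors
    to object-coordinate target vectors (cf. Eq. (7)).

    Rows of both N x 3 arrays are unit gaze vectors, one per calibration
    target; at least 3 non-coplanar fixations are needed.
    """
    C = np.asarray(camera_vecs)
    T = np.asarray(target_vecs)
    # M @ c_i ~= t_i for all i  <=>  C @ M.T ~= T, solved column-wise.
    Mt, *_ = np.linalg.lstsq(C, T, rcond=None)
    return Mt.T

def to_object_coords(M, c):
    """Cf. Eq. (8): transform a camera-coordinate gaze vector into object
    coordinates, renormalized to a unit vector."""
    o = M @ np.asarray(c)
    return o / np.linalg.norm(o)
```

With 5 fixation targets the system is overdetermined, which averages out per-fixation measurement noise in M.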
2.6 Case of Binocular Vision
We calculate gaze-target positions (x, y, z) in 3-D space from the object-coordinate vectors of the two eyes (left \((LO_{\mathrm x},LO_{\mathrm y},LO_{\mathrm z})\), right \((RO_{\mathrm x},RO_{\mathrm y},RO_{\mathrm z})\)) and the inter-pupillary distance (IPD) (Eq. (9)).
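Equation (9) is not reproduced in this text; one standard way to obtain the 3-D gaze-target position from the two eyes’ gaze vectors and the IPD is to take the midpoint of the shortest segment between the two (generally skew) gaze rays. Placing the eyes on the x-axis at \(\pm\)IPD/2 is an assumed coordinate convention:

```python
import numpy as np

def gaze_target_3d(left_dir, right_dir, ipd):
    """Triangulate the 3-D gaze target from the two eyes' object-coordinate
    gaze vectors. Eyes sit at x = -ipd/2 (left) and x = +ipd/2 (right);
    the target is the midpoint of the common perpendicular between the rays.
    """
    pl = np.array([-ipd / 2.0, 0.0, 0.0])
    pr = np.array([+ipd / 2.0, 0.0, 0.0])
    dl = np.asarray(left_dir, float) / np.linalg.norm(left_dir)
    dr = np.asarray(right_dir, float) / np.linalg.norm(right_dir)
    # Closest points pl + s*dl and pr + t*dr on the two rays.
    w = pl - pr
    a, b, c = dl @ dl, dl @ dr, dr @ dr
    d, e = dl @ w, dr @ w
    denom = a * c - b * b                     # ~0 for parallel gaze rays
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    return 0.5 * ((pl + s * dl) + (pr + t * dr))
```

Because real gaze rays rarely intersect exactly, the midpoint construction also gives a natural error measure: the length of the residual segment between the two closest points.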
3 Evaluation of the System’s Accuracy
3.1 Evaluation of the System’s Accuracy of Gaze Angle
We evaluated the system’s accuracy by using a synthetic-eye apparatus (Fig. 6).
Method. We measured the accuracy of the system in representing monocular gaze angles. First, we calibrated the system while the synthetic eye was gazing at 8 points, at angles of (horizontal, vertical) \(=\) \((-10, 10)\), (0, 10), (10, 10), \((-10, 0)\), (0, 0), (10, 0), \((-10, -10)\), \((10, -10)\) degrees. Then \((X_{\mathrm o},Y_{\mathrm o})\), R, and the transformation matrix \(M_{\mathrm {fixed}}\) were calculated for the head-fixed-method, and \(Offset_{\mathrm x}\), \(Offset_{\mathrm y}\), \(R_{\mathrm {pc}}\), and the transformation matrix \(M_{\mathrm {free}}\) were calculated for the head-free-method. We then set the synthetic eye to gaze at 25 points, with horizontal angles from \(-10 \text { to } +10^{\circ }\) and vertical angles from \(-10 \text { to }10^{\circ }\) at intervals of \(5^{\circ }\), and measured the eye’s gaze angle for 1 s at each setting. The camera was a GS3-U3-41C6NIR-C, and the captured image size was 320\(\,\times \,\)240 pixels.
Result. Figures 7 and 8 show the distributions of the gaze angles for the head-fixed-method (Fig. 7) and the head-free-method (Fig. 8). The gaze angles are plotted in camera coordinates (A) and in object coordinates (B), and superimposed in both coordinates (C). The distribution of the gaze positions in camera coordinates shrank slightly relative to that in object coordinates (C). The means and standard deviations of the errors at the 25 points (in object coordinates) are summarized in Table 2. Figure 9 shows the data at the center on a larger scale (A, B) and the same data after a median filter (C, D). The scatter of the errors with the head-free-method was larger than with the head-fixed-method.
Discussion. As shown in Figs. 7C and 8C, the distribution of the gaze positions in camera coordinates shrank slightly relative to that in object coordinates. This is because the pupil center shifts due to corneal refraction. It is necessary to consider this shift when the subject (such as a mouse) cannot perform active calibration. On the other hand, there is no practical issue for subjects who can perform the active calibration, because the transformation matrix created by the active calibration compensates for the shift. The larger scatter of the errors in the head-free-method (Fig. 9A, B) is due to the measurement error of the gravity-center of the reflection in addition to that of the pupil center. When the head is fixed perfectly, the head-fixed-method provides higher-accuracy and wider-range measurements. In practice, researchers can obtain stable measurements by applying a median filter or moving average filter (Fig. 9C, D).
3.2 Evaluation of the System’s Accuracy of Gaze Position
Method. We estimated the accuracy of the system in representing binocular gaze angles by using a pair of synthetic eyes. First, we calibrated the system while the synthetic eyes were gazing at 11 points, with horizontal angles of (right eye, left eye) \(=\) \((1, -1)\), \((1.5, -1.5)\), \((2.0, -2.0)\), \((2.5, -2.5)\), \((3.0, -3.0)\), \((3.5, -3.5)\), \((4.0, -4.0)\), \((5.0, -5.0)\), \((6.0, -6.0)\), \((8.0, -8.0)\), \((10.0, -10.0)\) degrees and a vertical angle of \(0^{\circ }\). The horizontal angles \((1, -1)\) indicate that the gaze is on the center of the visual field at a distance of 1719 mm, and \((10, -10)\) at a distance of 170 mm. We measured the left and right eyes’ gaze angles and calculated the 3-D position of their intersection for 1 s at each setting.
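The quoted distances follow from simple triangulation of symmetric vergence. Assuming a synthetic-eye separation (IPD) of 60 mm, which is consistent with the numbers in the text but not stated there:

```python
import math

def vergence_distance(half_angle_deg, ipd_mm=60.0):
    """Distance to a centered target from the symmetric vergence angle.

    Each eye rotates inward by half_angle_deg; with the eyes ipd_mm apart,
    the two gaze rays meet at depth (ipd / 2) / tan(half_angle).
    """
    return (ipd_mm / 2.0) / math.tan(math.radians(half_angle_deg))
```

Here vergence_distance(1) evaluates to about 1719 mm and vergence_distance(10) to about 170 mm, matching the distances given above.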
Result. Figure 10 shows the distribution of eye-position errors (A, B) and 3-D intersections (C) measured by the head-free-method, after applying a 10-point moving average filter.
Discussion. As shown in Fig. 10, as the distance between the subject and the target increases, the measurement errors of the intersection become more scattered. It is important to be aware of this characteristic of 3-D measurement.
4 Evaluation of the System’s Latency
We evaluated the system’s latency by using a synthetic-eye apparatus (Fig. 11).
4.1 Methods
By using the synthetic eye, the system was calibrated. The voltage outputs from the potentiometer and the eye tracking system were sampled at 10 kHz. The synthetic eye was rotated by hand, and the output voltages were recorded and translated into rotation angle and horizontal eye movements (Fig. 12). As shown by the enlarged views in Fig. 12B, D, the time-delay of the system can be observed. The latency of the system is calculated by Eq. (10), using the rotation-angle output of the potentiometer as p(t) and the output of the eye tracking system as h(t). The latency was obtained where the error RMS (root mean square) was minimal. We measured latencies at different picture-frame sampling frequencies.
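Equation (10) is not reproduced in this text; the procedure can be sketched as a search over candidate delays for the shift of h(t) relative to p(t) that minimizes the RMS error. This is an illustrative implementation of the described idea, not the paper’s exact equation:

```python
import numpy as np

def estimate_latency(p, h, dt):
    """Slide the eye-tracker output h(t) against the potentiometer signal
    p(t) and return the delay (in seconds) minimizing the RMS error.

    p, h: equal-length 1-D arrays sampled with period dt (10 kHz -> 1e-4 s).
    Only non-negative delays of h relative to p are searched.
    """
    n = len(p)
    best_shift, best_rms = 0, np.inf
    for shift in range(n // 2):               # candidate delays in samples
        err = p[: n - shift] - h[shift:]      # compare p(t) with h(t + shift)
        rms = np.sqrt(np.mean(err ** 2))
        if rms < best_rms:
            best_shift, best_rms = shift, rms
    return best_shift * dt, best_rms
```

A coarse-to-fine search or FFT cross-correlation would scale better for long recordings, but the exhaustive scan mirrors the minimum-RMS criterion directly.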
4.2 Results
The values calculated by Eq. (10) are shown in Fig. 13A for seven camera-frame sampling frequencies. Figure 13B shows the relationship between the minimum value and the sampling frequency, indicating that the system’s latency depended on the sampling frequency (= 1/period). The fitted line in Fig. 13B indicates that the latency is the sum of the sampling duration and the processing time. The latency was less than 4 ms at a 500 Hz sampling rate.
4.3 Discussion
The reliably measured, short latency of the system is one of its advantages. To obtain stable measurements, we have suggested applying a median filter or moving average filter. Since a 9-point median filter causes a 4.5-point time delay, the total time-delay at a sampling frequency of 500 Hz is estimated as 13 ms (9 ms of filter delay plus the 4 ms system latency). It is thus possible to calculate the gaze angle within the period of one frame and to display a new picture on the next frame of a 60 Hz video display (real-time eye-position feedback experiment).
5 Measuring of Human Gaze Traces in 3-D Space
5.1 Methods
The subject was an adult male wearing glasses. The subject’s head was stabilized by a chin rest. The system was calibrated by measuring eye positions while the subject fixated targets on the display. After the calibration, the subject transferred his gaze in 5 ways, between the center position (near) and one of 5 positions (far) (Fig. 14). The sampling frequency was 500 Hz.
5.2 Results
Figure 15 shows the gaze-transfer traces after applying a 10-point (20 ms) moving-average filter. It clearly shows that the eye-movement traces from near to far (divergence, red trace in Fig. 15) and from far to near (convergence, blue trace) differ in 3-D space. In both cases, the vergence eye movements (convergence and divergence) started earlier than the saccades and terminated after them.
5.3 Discussion
The gaze transfers between the near and far positions consisted of combinations of saccades and vergence eye movements. The subject transferred his gaze from the near to the far point (red dots) and from the far to the near point (blue dots) in different ways. In both cases, the vergence eye movements started earlier than the saccades. The results are consistent with the classical reports by Yarbus (1957) [3] and Enright (1984) [4].
6 Conclusion and General Discussion
We developed a new, widely applicable eye tracking system. Using synthetic eyes, we found that the system can measure eye movements with better than \(0.2^{\circ }\) accuracy and with a latency of less than 4 ms at 500 Hz. With this system, eye movements can be measured even when the subject (such as a mouse) cannot perform active calibration. Since the system captures the images of both eyes with one camera, the movements of the two eyes can be measured simultaneously. With this system, we succeeded in characterizing human vergence/saccade eye movements when ocular fixation shifted between two targets placed at different distances in 3-D space.
These programs can be accessed at https://staff.aist.go.jp/k.matsuda/iRecHS2/.
References
Duchowski, A.: Eye Tracking Methodology: Theory and Practice. Springer, London (2007)
Holmqvist, K., Nyström, M., Andersson, R., Dewhurst, R., Jarodzka, H., van de Weijer, J.: Eye Tracking: A Comprehensive Guide to Methods and Measures. Oxford University Press, Oxford (2011)
Yarbus, A.L.: Eye movements during changes of the stationary points of fixation. Biophysics 2, 679–683 (1957)
Enright, J.T.: Changes in vergence mediated by saccades. J. Physiol. 350, 9–31 (1984)
Acknowledgement
This paper is based on results obtained from a project commissioned by KAKENHI (24650105, 16H03297) and the New Energy and Industrial Technology Development Organization (NEDO).
© 2017 Springer International Publishing AG
Matsuda, K., Nagami, T., Sugase, Y., Takemura, A., Kawano, K. (2017). A Widely Applicable Real-Time Mono/Binocular Eye Tracking System Using a High Frame-Rate Digital Camera. In: Kurosu, M. (eds) Human-Computer Interaction. User Interface Design, Development and Multimodality. HCI 2017. Lecture Notes in Computer Science(), vol 10271. Springer, Cham. https://doi.org/10.1007/978-3-319-58071-5_45