# Fast and robust ellipse detection algorithm for head-mounted eye tracking systems


## Abstract

In head-mounted eye tracking systems, the correct detection of pupil position is a key factor in estimating gaze direction. However, this is a challenging issue when the videos are recorded in real-world conditions, due to the many sources of noise and artifacts that exist in these scenarios, such as rapid changes in illumination, reflections, occlusions and an elliptical appearance of the pupil. Thus, it is an indispensable prerequisite that a pupil detection algorithm is robust in these challenging conditions. In this work, we present a pupil center detection method based on searching for the point of maximum contribution to the radial symmetry of the image. Additionally, two different center refinement steps were incorporated with the aim of adapting the algorithm to images with highly elliptical pupil appearances. The performance of the proposed algorithm is evaluated using a dataset consisting of 225,569 annotated head-mounted eye images from publicly available sources. The results are compared with those of the best algorithm found in the literature, and our algorithm is shown to be superior.

## Keywords

Eye tracking, Head mounted, Pupil detection

## 1 Introduction

The first experiments using eye trackers began in early twentieth century [1]. At that time, gaining an understanding of eye movements was one of the main objectives of those evaluations [2]. Today, the technology has evolved, considerably widening the range of applications for which eye trackers can be employed. As the computational capacity of the existing equipment increases and as the price of the available technology decreases, more powerful and computationally expensive algorithms have been introduced for eye tracker devices. Thus, the range of applications using eye trackers has also become wider, including human–computer interaction and eye movement analysis.

Over the last few years, considerable efforts have been made to broaden the use of this technology to new application environments. Making this technology more robust and cheaper is key in order to apply this knowledge to conditions that are not completely controlled, i.e., outside the laboratory, such as in outdoor environments in which illumination cannot generally be controlled. Using eye trackers for driving experiments is one of the clearest examples, i.e., rapid light variations occur in an uncontrolled fashion, and most of the existing algorithms fail. Other cases are those carried out by users wearing head-mounted eye trackers in alternative environments such as shopping areas, and with individuals engaging in sports, work and other everyday activities. Moreover, the use of head-mounted devices also produces elliptical-shaped pupils with high eccentricity compared to those obtained when remote eye trackers are used. These “wilder” frameworks produce undesirable image artifacts, such as reflections, occlusions, blurring, and cases in which the pupil is cut by contact lenses or glasses or by problems caused by an eye mask.

To the best of our knowledge, gaze estimation methods use the center of the pupil to estimate the Point of Regard (PoR) or the Line of Sight (LoS), depending on the kind of experiment that is being carried out. Consequently, an accurate detection of the pupil center is key in obtaining a reliable measurement of gaze. Eye tracking is considered the algorithm that is employed to analyze the image captured by the camera, while gaze estimation refers to the procedure that is responsible for estimating gaze using the results of the eye tracking stage [3]. The present proposal contributes to the area of eye tracking. More specifically, this paper presents a novel algorithm for detecting the pupil center in non-controlled environments in a more robust and accurate manner. The algorithm shows outstanding results compared to other methods that were previously published using state-of-the-art challenging eye tracking databases [4].

*k*-means algorithm. The detected region is considered the pupil, and an ellipse-fitting procedure is carried out using the edge points calculated by the Canny operator. The SET algorithm [8] uses a semiautomatic procedure. First, a threshold is manually selected to obtain a binary image in which the pupil is contained. For each of the blobs obtained, a signature value is calculated using the values of the *x* and *y* components of the contour points with respect to the center of the blob as a function of the angle. Both distributions are approximated by a sinusoidal function. The blob for which the aspect ratio between the sinusoidal functions is closest to one, i.e., the most circular shape, is selected as the pupil. The PupilLabs algorithm was developed together with the open source code known as Pupil [9]. This algorithm is highly robust in the presence of glints that overlap the pupil. As in the algorithm suggested by Świrski, Haar features are employed. Afterward, a Canny operator is used, and the edge points having darker gray values are selected. The resulting segments are analyzed using specific connectivity rules and curvature criteria. Ellipse-fitting techniques are also applied in order to select the best candidate. The ExCuSe method is one of the most recent algorithms [10], and it uses different approaches depending on the presence of glints. On the one hand, when a glint is detected, the edge points are calculated using Canny. A thinning procedure is then applied to the calculated edges, and specific ad hoc rules are applied in order to select the segments that are potential candidates for being part of the pupil contour. As in the rest of the algorithms, an ellipse is calculated, and its center is estimated. On the other hand, when no glint is detected, the pupil is segmented using an automatic threshold calculated from the image information. Subsequently, the angular integral projection function (AIPF) is employed. This transform obtains the center of different projection angles using the binarized pupil information, with the projections weighted by the gray level. The data calculated as a result of the projections are used to estimate an approximate pupil center. This point is employed to crop the image, and the aforementioned edge-processing procedure is applied in order to refine the pupil contour detection. The ray-tracing algorithm proposed by Starburst is also applied. Finally, the ELSe algorithm [11] proposes the use of an edge-processing algorithm similar to the one used by ExCuSe. After the edge selection stage, an ellipse is fitted to each set of points that are potential candidates to be pupil contour points. Ellipses that do not match specific area, shape and gray level criteria are rejected. For the remaining ellipses, a goodness parameter is calculated using the gray level and the shape information, and the best of them is selected to be the pupil ellipse, provided that a goodness threshold is exceeded. In cases when no ellipse is detected, a convolution is performed using circular masks to obtain a probability map that is further post-processed to approximate the pupil center. From a completely different perspective, some works employ deep learning, i.e., convolutional neural networks (CNN), to estimate the pupil center. CNN have been demonstrated to be the best solution for many artificial vision problems. Valuable efforts have been made in eye tracking for low-resolution systems, i.e., for images captured with a webcam [12], for which the results are far from those obtained by high-resolution systems. Regarding the topic under study in this paper, we found the recent work in which a CNN-based method was applied to high-resolution images obtained in the “wild” [13].

This paper presents a novel algorithm, the fast robust ellipse detection algorithm (FREDA), that outperforms the existing algorithms in terms of robustness and accuracy. The proposed method is based on the fast radial symmetry transform (FRST) [14], which calculates the point presenting the highest radial symmetry in the image; this point is assumed to be the pupil center. The method was tested using the same framework as the five state-of-the-art methods mentioned above and showed outstanding results.

In Sect. 2, the algorithm is described in detail, together with the two center refinement stages; the set of images used to evaluate the algorithms is also presented in this section. Section 3 shows the performance of the algorithms on the presented datasets, as well as a comparison to the ELSe algorithm. Finally, Sect. 4 presents the conclusions of this paper.

## 2 Methods

The proposed approach uses the fast radial symmetry transform as the basis for detecting the pupil center. The FRST was also used in the algorithm presented by Skodras et al. [15] for remote eye tracking systems. Our contribution is directed at validating the use of the symmetry transform on head-mounted, gray-scale, high-resolution images. The algorithm bases the pupil center estimation on the detection of the highest radial symmetry point resulting from the fast radial symmetry transform [14]. This transform detects circularly shaped zones in an image; thus, it is particularly appropriate for detecting the pupil center, given that the pupil's appearance is typically circular. Nevertheless, in cases where the pupil's appearance is more elliptical, this method tends to mark the center closer to a focus of the apparent ellipse. To avoid this problem, the FREDA I and FREDA II variations are presented, which incorporate an additional center refinement stage. The presented methods were developed using MATLAB.

The stages of the FREDA algorithm are summarized as follows (see Fig. 1): first, a preprocessing stage is applied to the image in order to adapt it to the subsequent processes. Then, the radial transform is computed both on the negative of the preprocessed image, labeled \(\bar{I_{e}}\), and on the created pupil-enhanced image, labeled PupilMap. The two contributions are summed, and the center is defined by taking the coordinates of the maximum point of the resulting transformation, defined as \(S_\mathrm{TOT}\).

Both center refinement variants take the center, *c*, given by FREDA on the source image as a starting point (see Fig. 2). This center is chosen as the seed point for the successive region growths with which the pupil is intended to be fitted. Thus, the corrected pupil center is taken as the center of the ellipse best fitted to the region most closely matching the pupil shape. The difference between the two algorithms lies in the way in which the similarity between the pupil and the growing region is determined.

### 2.1 FREDA

#### 2.1.1 Image preprocessing

First, an image preprocessing stage is implemented in order to prepare the image for the subsequent processes. This stage comprises two operations: a low-pass filter and an adaptive histogram equalization (see Fig. 1). Because the image gradient is calculated in the subsequent radial symmetry transform, a low-pass filter is applied, resulting in the \(I_f\) image, in order to reduce the effect of noise on the border detection. A \(5\times 5\) Gaussian filter is used to implement the low-pass filter.

Adaptive image equalization is then performed, producing the output image \(I_e\), to increase the contrast between the pupil and the background, thus obtaining more defined pupil edges. This procedure equalizes the histogram over small patches of the image rather than the entire image. Assuming that the pupil size is approximately a 10th of the image's width, a subdivision of 10 columns and 10 rows is selected, to which the equalization is applied. The output histogram of each region approaches a uniform distribution. To eliminate block effects between adjacent regions, they are combined using bilinear interpolation. To prevent noise from increasing in uniform areas of the image, the contrast is limited to a threshold that is chosen empirically, having a value of 0.01.
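The low-pass filtering step can be sketched as follows. This is a minimal Python/NumPy illustration, not the authors' MATLAB implementation; the \(\sigma\) value of the \(5\times 5\) Gaussian kernel is an assumption made for illustration.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Normalized 2-D Gaussian kernel (sigma is an illustrative choice)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def filter2d(img, kernel):
    """Cross-correlate img with kernel, replicating the image borders."""
    H, W = img.shape
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros((H, W))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + H, j:j + W]
    return out

def preprocess_lowpass(img):
    """I_f: low-pass filtered image used by the later gradient computation."""
    return filter2d(img, gaussian_kernel(5, 1.0))
```

The adaptive equalization (10 by 10 tiles, clip limit 0.01, bilinear blending) corresponds to a CLAHE-style operation and is omitted here for brevity.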

#### 2.1.2 Pupil enhancement: PupilMap

In this stage, a pupil-enhanced image, PupilMap, is created using morphological operations. *B*1 and *B*2 are flat, circular structuring elements whose radii are \(R_{B1}\) = ImageWidth/20 and \(R_{B2}\) = \(R_{B1}\)/2, respectively. The use of circular structuring elements emphasizes round patterns in the image, thus increasing the radial symmetry of the pupil zone. We applied a parabolic gray-scale transformation *G* to the \(I_e\) input image (*G*(\(I_e\)) = \(I'\)). The *G* transformation brightens dark pixel areas that have gray levels below 0.2, approximating the negative transform, while the light parts, i.e., above 0.8, remain unchanged, approximating the identity transform. For normalized gray values between 0.2 and 0.8, the contrast is significantly reduced (see Fig. 4). Thus, when dividing the dilation of the transformed image \(I'\) by the erosion of the input \(I_e\), the brighter areas in \(I_e\) tend to cancel out while the pupil zone is enhanced. To avoid dividing by zero, a small factor \(\varepsilon\) is added to the denominator.
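A sketch of this stage follows, assuming normalized gray values in [0, 1]. The exact parabolic curve of *G* (Fig. 4) and the definition of \(\varepsilon\) are not reproduced here: \((2x-1)^2\) is a hypothetical parabola with the stated endpoint behavior (bright at 0, unchanged at 1, low contrast in the middle), and `eps` is a placeholder constant.

```python
import numpy as np

def disk(radius):
    """Flat, circular structuring element."""
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    return (xx ** 2 + yy ** 2) <= radius ** 2

def grey_morph(img, se, op):
    """Gray-scale dilation (op=np.maximum) or erosion (op=np.minimum)
    with a flat structuring element."""
    H, W = img.shape
    r = se.shape[0] // 2
    pad_val = img.min() if op is np.maximum else img.max()
    padded = np.pad(img, r, mode="constant", constant_values=pad_val)
    out = np.full((H, W), pad_val, dtype=float)
    for i in range(se.shape[0]):
        for j in range(se.shape[1]):
            if se[i, j]:
                out = op(out, padded[i:i + H, j:j + W])
    return out

def pupil_map(Ie, eps=1e-3):
    """PupilMap = dilate(G(Ie), B1) / (erode(Ie, B2) + eps)."""
    r1 = max(Ie.shape[1] // 20, 1)           # R_B1 = ImageWidth / 20
    B1, B2 = disk(r1), disk(max(r1 // 2, 1)) # R_B2 = R_B1 / 2
    Ip = (2.0 * Ie - 1.0) ** 2               # hypothetical parabolic G
    dil = grey_morph(Ip, B1, np.maximum)
    ero = grey_morph(Ie, B2, np.minimum)
    return dil / (ero + eps)
```

On a dark pupil over a bright background, the erosion keeps the pupil zone dark while the dilated *G*-transformed image stays bright, so the quotient peaks over the pupil.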

#### 2.1.3 Fast radial symmetry transform: FRST

As described previously, in our approach, pupil center estimation is based on the detection of the point presenting the highest radial symmetry in the image. The implemented method is a modification of the transform proposed by Loy et al. [14]. This radial symmetry transform is a highly efficient computational approach.

The transform is calculated for a set of radii, \(n\in N\), where the values in *N* are selected empirically, considering the even numbers between 20 and 34 pixels for the application proposed. A discontinuous range of integers is selected in order to improve the computing speed. This reduction does not affect the accuracy of the center estimation.

First, the gradient of the image is calculated with a \(3\times 3\) Sobel operator. Only significant gradient values are considered: a threshold is empirically chosen as 5% of the maximum gradient magnitude obtained in each image, and only gradient values greater than this threshold are kept, thereby reducing the number of pixels to be computed in the transform. Once the gradient values are calculated, the FRST is applied in order to detect the pupil center. Next, the FRST is summarized for clarity [14].
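The gradient stage can be sketched as follows; this is an illustrative NumPy version, with the 5% threshold taken from the text.

```python
import numpy as np

def significant_gradients(img, frac=0.05):
    """Sobel 3x3 gradients; keep only points whose magnitude exceeds
    frac (here 5%) of the maximum gradient magnitude in the image."""
    H, W = img.shape
    p = np.pad(img, 1, mode="edge")
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    gx = np.zeros((H, W))
    gy = np.zeros((H, W))
    for i in range(3):
        for j in range(3):
            gx += kx[i, j] * p[i:i + H, j:j + W]
            gy += ky[i, j] * p[i:i + H, j:j + W]
    mag = np.hypot(gx, gy)
    mask = mag > frac * mag.max()
    return gx, gy, mag, mask
```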

For each point *p* of the gradient, the affected pixel, \(p_\mathrm{af}\), is defined as the point located at a distance *n* from *p* to which the gradient vector at *p* points:

\[ p_\mathrm{af}(p) = p + \mathrm{round}\left(\frac{g(p)}{\Vert g(p)\Vert }\,n\right) \]

where \(g(p)\) denotes the gradient at *p*. For each radius *n*, an orientation projection image \(O_n\) and a magnitude projection image \(M_n\) are created using the affected pixels \(p_\mathrm{af}\) in the following way:

\[ O_n(p_\mathrm{af}(p)) = O_n(p_\mathrm{af}(p)) + 1, \qquad M_n(p_\mathrm{af}(p)) = M_n(p_\mathrm{af}(p)) + \Vert g(p)\Vert \]

accumulated over all the significant gradient points for the radius *n*. The contribution of the radial symmetry of the radius *n* is obtained by combining both matrices and convolving them with a Gaussian smoothing mask, \(A_n\), with a mean, \(\mu\), equal to 2*n*, in the following way:

\[ S_n = (O_n \cdot M_n) * A_n \]

Thus, for each radius *n*, a smoothed voting map, \(S_n\), is obtained, the values of which represent the contribution of each point to the local radial symmetry for a radius *n*. The final map is calculated by averaging all the voting maps:

\[ S = \frac{1}{|N|}\sum_{n\in N} S_n \]

In addition, the algorithm records the radius *n* for which \(S_n\) has the highest peak value, making that \(S_n\) the final transformation and the coordinates of its maximum, *c*, the estimated pupil center.
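The voting scheme described above can be sketched compactly. This is a simplified illustration, not the authors' implementation: it replaces the Gaussian mask \(A_n\) with a small box filter, and it votes along the positive gradient direction, which concentrates votes at the center of a bright blob (hence, for a dark pupil, it would be applied to the negative image, as in the FREDA pipeline).

```python
import numpy as np

def _smooth(img, r=2):
    """Box smoothing as a simple stand-in for convolving with A_n."""
    H, W = img.shape
    p = np.pad(img, r, mode="constant")
    out = np.zeros((H, W))
    for i in range(2 * r + 1):
        for j in range(2 * r + 1):
            out += p[i:i + H, j:j + W]
    return out / (2 * r + 1) ** 2

def frst_center(img, radii, frac=0.05):
    """Return (row, col) of the strongest radial-symmetry vote."""
    H, W = img.shape
    p = np.pad(img, 1, mode="edge")
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    gx = np.zeros((H, W)); gy = np.zeros((H, W))
    for i in range(3):
        for j in range(3):
            gx += kx[i, j] * p[i:i + H, j:j + W]
            gy += kx.T[i, j] * p[i:i + H, j:j + W]
    mag = np.hypot(gx, gy)
    keep = mag > frac * mag.max()          # 5% significance threshold
    ys, xs = np.nonzero(keep)
    ux = gx[keep] / mag[keep]
    uy = gy[keep] / mag[keep]
    best_peak, best_map = -np.inf, None
    for n in radii:
        # affected pixels: p_af = p + round(n * g/|g|)
        ay = np.clip(np.round(ys + n * uy).astype(int), 0, H - 1)
        ax = np.clip(np.round(xs + n * ux).astype(int), 0, W - 1)
        O = np.zeros((H, W)); M = np.zeros((H, W))
        np.add.at(O, (ay, ax), 1.0)        # orientation projection O_n
        np.add.at(M, (ay, ax), mag[keep])  # magnitude projection M_n
        Sn = _smooth(O * M)                # smoothed voting map S_n
        if Sn.max() > best_peak:           # keep the radius with highest peak
            best_peak, best_map = Sn.max(), Sn
    cy, cx = np.unravel_index(np.argmax(best_map), best_map.shape)
    return cy, cx
```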

### 2.2 FREDA I

As previously described, the FREDA I algorithm is an adaptation of the FREDA that aims to refine the estimated pupil center in pupil images with elliptical appearances. This refinement stage consists of various additional steps that are incorporated after the end of the FREDA. The flow diagram of this method is illustrated in Fig. 5.

The source image *I* is cropped, resulting in the image \(I_c\), which focuses the region of interest on the center *c* obtained by the FREDA algorithm. The size of the rectangular cutout is chosen adaptively according to the radius *n* for which the maximum response is obtained in the FRST. Then, in order to cleanse the image of artifacts due to reflections or eyelashes, a morphological opening is applied (step 1.2). The structuring element used is a flat disk with a radius of *n*/2. Subsequently, an iterative procedure is carried out in which the best candidate for the pupil center is sought.

For each iteration *i*, a region growing operation (step 1.3) is performed as follows: starting from the seed point *c*, a region *R* is generated by appending one new pixel at a time, namely the 8-connected neighbor whose intensity difference with the mean of *R* is the minimum. The growth is stopped when this intensity difference exceeds a threshold \(T_i\). The initial value of \(T_i\) is 5 gray values, assuming 8-bit images, and in each iteration of the loop it is augmented by a factor *k* as \(T_{i+1}=T_i \cdot k\), where *k* is set to \(k=1.3\). This increasing factor causes *R* to be larger in each iteration, thus enabling the most accurate approximation of the pupil area to be found. Greater values of *k* lead to more rapid growth but may cause inaccurate approximations. In contrast, lower values permit more accurate pupil fitting, but more iterations are needed. The selected value is a balanced choice between precision and growth speed.
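The growth rule above can be sketched as follows, assuming a gray-scale image stored as a NumPy array; the threshold schedule follows the text (\(T_1 = 5\), \(k = 1.3\)).

```python
import numpy as np

def region_grow(img, seed, T):
    """Grow an 8-connected region from seed, repeatedly appending the frontier
    pixel closest in intensity to the current region mean, until the smallest
    such difference exceeds the threshold T."""
    H, W = img.shape
    region = {seed}
    total = float(img[seed])
    frontier = {}

    def push_neighbors(p):
        y, x = p
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                q = (y + dy, x + dx)
                if q != p and 0 <= q[0] < H and 0 <= q[1] < W and q not in region:
                    frontier[q] = float(img[q])

    push_neighbors(seed)
    while frontier:
        mean = total / len(region)
        q = min(frontier, key=lambda s: abs(frontier[s] - mean))
        if abs(frontier[q] - mean) > T:
            break                      # no close-enough neighbor: growth stops
        frontier.pop(q)
        region.add(q)
        total += float(img[q])
        push_neighbors(q)
    return region

def iterate_thresholds(T1=5.0, k=1.3, iters=10):
    """Threshold schedule T_{i+1} = T_i * k used across the refinement loop."""
    T, out = T1, []
    for _ in range(iters):
        out.append(T)
        T *= k
    return out
```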

After each growth, an ellipse, *e*, is fitted using the obtained region contour points (step 1.4). The fitted ellipse has the same normalized second central moments as the region. When the ellipse is calculated, it is checked whether it extends beyond \(I_c\). If it does, the loop is interrupted, assuming that *R* has grown out of the pupil (step 1.7), and the last saved center is considered the new pupil center. Otherwise (step 1.5), a normalized difference area parameter, \(\varDelta\), between the ellipse and the region is computed. If \(\varDelta\) is lower than the previously stored value, the center of *e* is considered a better estimate for the pupil center, and its coordinates are saved (step 1.6). Finally, after a maximum of 10 iterations, the loop is finished, and the center that was saved last is considered the new pupil center. This stopping criterion avoids unnecessary iterations, assuming that with a threshold \(T_{10}=5\times 1.3 ^{10}\approx 70\) in the 10th repetition, the region *R* would contain all pixels belonging to the pupil area. A stopping criterion based on the convergence of \(\varDelta\) was tested with no satisfactory results: as the growth of *R* is not completely regular, \(\varDelta\) is not a monotonically decreasing function, thus preventing its use as a stop condition. The described steps are graphically depicted in Fig. 6. Moreover, Fig. 7 shows the initially estimated center, as well as the one obtained after the refinement process.
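The "same normalized second central moments" ellipse fit can be sketched directly from the region mask. The formulas below mirror the standard regionprops-style moment fit; the 1/12 term accounts for the unit extent of each pixel. This is an illustrative sketch, not the authors' code.

```python
import numpy as np

def ellipse_from_moments(mask):
    """Center, axis lengths and orientation of the ellipse having the same
    normalized second central moments as the binary region."""
    ys, xs = np.nonzero(mask)
    cy, cx = ys.mean(), xs.mean()
    uxx = ((xs - cx) ** 2).mean() + 1.0 / 12.0
    uyy = ((ys - cy) ** 2).mean() + 1.0 / 12.0
    uxy = ((xs - cx) * (ys - cy)).mean()
    common = np.sqrt((uxx - uyy) ** 2 + 4.0 * uxy ** 2)
    major = 2.0 * np.sqrt(2.0) * np.sqrt(uxx + uyy + common)
    minor = 2.0 * np.sqrt(2.0) * np.sqrt(uxx + uyy - common)
    theta = 0.5 * np.arctan2(2.0 * uxy, uxx - uyy)  # major-axis orientation
    return (cy, cx), major, minor, theta
```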

### 2.3 FREDA II

In a similar manner to that of the FREDA I, the FREDA II algorithm is constructed by adding a center-refining stage to the FREDA as an additional alternative to improve the accuracy in the detection of the center in pupils with elliptical appearance. Figure 8 shows the flow diagram of the proposed method.

The edge image, *C*, is used in a subsequent step to find the best center candidate. Once the opening is performed, an iterative procedure is carried out in which, after successive region growths, the best candidate for the pupil center is determined. For each iteration, a region growing operation is executed using the same configuration parameters as in the FREDA I. After the region growth ends, yielding *R*, an ellipse *e* is fitted to the region contour points, taking the one that has the same second central moments as the region (step 2.5). It is verified that this ellipse does not extend beyond the image cutout, \(I_c\). If it does, the loop is terminated (step 2.9), and the last saved center is given as the result. Otherwise, a binary image, *E*, is created by applying a morphological dilation to the obtained ellipse with a square \(3\times 3\) structuring element (step 2.6). Then, the number *m* of pixels in the intersection between *E* and the edge image, *C*, is obtained as *m* = \(\Vert E\cap C\Vert \) (step 2.7). The previous dilation facilitates the matching between the two binary images. The result is compared with the previously stored value of *m*. If the value obtained is greater, the current value is saved, and the center of *e* is saved as the best estimation of the pupil center (step 2.8). The idea behind this method is that, in a perfect adjustment of *R* to the pupil area, the fitted ellipse will obtain the maximum number of matching pixels with the edge image; in other words, *m* will reach its maximum value. Finally, if 10 iterations are completed, the ellipse center that was stored last is considered the corrected new pupil center (step 2.9). As in the FREDA I, it has been shown that in fewer than 10 iterations, the region *R* either extends beyond \(I_{c}\) or practically covers the pupil zone. The described steps are graphically represented by an example in Fig. 9, and both the center obtained by the FRST and the one obtained after the refinement process are shown in Fig. 10.
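The matching criterion *m* = \(\Vert E\cap C\Vert\) can be illustrated directly on binary masks; the \(3\times 3\) dilation tolerates roughly one pixel of misalignment between the fitted ellipse outline and the edge map. The `circle_outline` helper is a hypothetical stand-in used only to build test masks.

```python
import numpy as np

def overlap_score(ellipse_outline, edges):
    """m = number of edge pixels hit by the 3x3-dilated ellipse outline."""
    H, W = ellipse_outline.shape
    p = np.pad(ellipse_outline, 1, mode="constant")
    dilated = np.zeros((H, W), dtype=bool)
    for i in range(3):
        for j in range(3):
            dilated |= p[i:i + H, j:j + W]
    return int(np.count_nonzero(dilated & edges))

def circle_outline(shape, center, r, width=1.0):
    """Helper: boundary pixels of a circle (stand-in for an ellipse outline)."""
    yy, xx = np.ogrid[:shape[0], :shape[1]]
    d = np.sqrt((yy - center[0]) ** 2 + (xx - center[1]) ** 2)
    return np.abs(d - r) <= width
```

A well-fitted candidate scores higher than a misplaced one, which is exactly the comparison performed in step 2.8.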

### 2.4 Evaluation images

For the evaluation of the algorithms, three collections of public databases containing eye images were used, totaling 225,569 images. Each collection includes the image coordinates of the pupil center, which are used as references when evaluating the accuracy of the algorithms.

#### 2.4.1 Tübingen collection

#### 2.4.2 Świrski collection

#### 2.4.3 Labeled pupils in the wild (LPW) collection

The set of images called “Labeled Pupils in the Wild”, or LPW, published by Tonsen et al. [18] comes from a total of 66 high-quality videos from 22 different subjects. Each video contains approximately 2000 frames of \(640\times 480\) pixels, obtained at a frequency of 95 FPS, resulting in a total of 130,856 eye images. The collection covers a wide range of situations during both outdoor and indoor events. Each user was recorded in two indoor locations and one outdoor location. The change in the lighting conditions drastically affected the eye aperture, which exhibited a wide range of pupil sizes. An added difficulty is the high pupil eccentricity exhibited by certain images. All images were manually labeled. Figure 13 shows two example images from each user.

## 3 Results

We compared the precision of the pupil center estimation of each of the three proposed algorithms, namely FREDA, FREDA I and FREDA II, on the previously described datasets. For a performance comparison with current approaches, the ELSe algorithm [11] was chosen as the reference method based on the analysis of state-of-the-art algorithms presented by Fuhl et al. [4], wherein the same image sets were used for testing. The detection error was measured as the Euclidean distance between the center estimated by the algorithm and the labeled center. To normalize error rates among images of different sizes, those from LPW and Świrski (\(640\times 480\) pixels) were previously down-sampled to Tübingen's resolution (\(384\times 288\) pixels).
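The evaluation metric can be sketched as follows, assuming centers are given as (x, y) pixel coordinates; the rescaling factors and the 5-pixel tolerance used in Table 1 are the ones stated in the text.

```python
import numpy as np

def detection_rate(est, gt, src=(640, 480), ref=(384, 288), tol=5.0):
    """Fraction of frames whose Euclidean center error, after rescaling the
    (x, y) coordinates from the src to the ref resolution, is within tol px."""
    scale = np.array([ref[0] / src[0], ref[1] / src[1]])
    est = np.asarray(est, dtype=float) * scale
    gt = np.asarray(gt, dtype=float) * scale
    err = np.sqrt(((est - gt) ** 2).sum(axis=1))
    return float(np.mean(err <= tol))
```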

In Fig. 15 the detection rates of the four algorithms are shown, divided according to the collections. There were notable differences in the results obtained in each collection for a given algorithm. The FREDA was the most precise for the Tübingen images, but its performance decayed drastically when it was evaluated on the Świrski and LPW datasets. Similarly, the FREDA II was the best suited algorithm for the LPW and Świrski sets. In contrast, the results of the FREDA I and the ELSe algorithms were more balanced across the datasets, with the FREDA I being superior to the ELSe for the Tübingen and LPW collections, which, in practice, account for almost all of the images.

Table 1 shows the percentages of correctly determined pupil centers by each algorithm for each subset of the three collections. Because the error in the labeling of the Tübingen images can be up to 5 pixels [11], a center was considered correctly estimated if the error was less than or equal to 5 pixels. The highest percentage obtained in each subset is marked in bold. Consistent with the previous figures, the FREDA algorithm was the most robust for the challenging images, being superior on 12 of the 24 subsets of the Tübingen collection. This result is clearly shown in Table 2, where the percentages of successfully determined centers by each algorithm are shown for each of the three collections. In the Tübingen collection, the FREDA correctly detected 67.17% of the pupil centers, compared to the 65.50% reached by the FREDA I, the 60.60% by ELSe and the 49.78% obtained by FREDA II.

In contrast, FREDA II was superior to its competitors for 14 of the 22 subsets of the LPW collection and for the Świrski images. Therefore, it can be argued that it is the most precise in high-quality images and in the presence of pupils with elliptical appearances. In addition, as can be seen in Table 2, the FREDA II obtained 76.84 and 86.83% of correctly estimated centers in LPW and Świrski collections, respectively, in contrast to the 65.86% and the 81.17% reached by its closest competitor, i.e., ELSe.

Percentage of correctly detected centers of the four algorithms for each set of images

| Set | FREDA | FREDA I | FREDA II | ELSe |
|---|---|---|---|---|
| **Tübingen** | | | | |
| I | | 85.70 | 69.58 | 85.95 |
| II | 74.45 | | 52.87 | 63.76 |
| III | | 80.30 | 55.05 | 65.31 |
| IV | 92.54 | | 88.14 | 83.05 |
| V | | 95.60 | 93.30 | 84.73 |
| VI | | 87.95 | 86.34 | 77.27 |
| VII | | 76.89 | 72.47 | 59.61 |
| VIII | 71.58 | | 51.43 | 67.30 |
| IX | | 89.33 | 68.63 | 86.72 |
| X | | 87.50 | 88.33 | 78.93 |
| XI | | 72.37 | 72.37 | 75.27 |
| XII | 73.47 | 86.26 | | 79.01 |
| XIII | 73.31 | | 67.62 | 73.73 |
| XIV | 91.68 | | 61.62 | 84.22 |
| XV | | 73.00 | 62.53 | 57.30 |
| XVI | | 86.48 | 75.26 | 59.95 |
| XVII | 94.77 | | 75.75 | 89.18 |
| XVIII | 50.63 | 48.70 | 24.09 | 52.99 |
| XIX | 33.47 | 32.17 | 25.14 | 35.41 |
| XX | 61.68 | 59.45 | 37.94 | |
| XXI | | 67.81 | 60.93 | 41.32 |
| XXII | | 60.44 | 30.01 | 56.61 |
| XXIII | 91.19 | | 97.33 | 93.40 |
| XXIV | 43.60 | | 47.55 | 51.20 |
| **LPW** | | | | |
| 1 | 63.70 | | 95.88 | 88.23 |
| 2 | 74.45 | | 93.15 | 50.15 |
| 3 | 55.72 | | 61.00 | 50.20 |
| 4 | 18.25 | 35.53 | 52.20 | 34.25 |
| 5 | 14.03 | 19.72 | 21.02 | 31.28 |
| 6 | 47.41 | 84.85 | | 63.09 |
| 7 | 56.48 | 85.30 | | 70.07 |
| 8 | 54.43 | 89.27 | | 84.17 |
| 9 | 32.48 | 66.52 | 70.05 | |
| 10 | 27.72 | 67.90 | | 65.80 |
| 11 | 20.28 | 53.43 | | 56.18 |
| 12 | 49.25 | 85.13 | 86.68 | |
| 13 | 28.44 | 55.94 | | 52.31 |
| 14 | 31.33 | 55.52 | 65.60 | |
| 15 | 15.47 | 60.18 | | 65.93 |
| 16 | 66.86 | 85.73 | | 87.47 |
| 17 | 21.83 | 72.82 | | 67.05 |
| 18 | 58.12 | 88.63 | | 83.33 |
| 19 | 32.03 | 52.98 | | 41.13 |
| 20 | 53.83 | 86.18 | | 23.05 |
| 21 | 24.57 | 78.27 | | 62.10 |
| 22 | 15.12 | 30.52 | 35.63 | |
| **Świrski** | 21.67 | 75.17 | 86.83 | 81.17 |

Percentage of correctly detected centers of the four algorithms for each collection of images

| Collection | FREDA | FREDA I | FREDA II | ELSe |
|---|---|---|---|---|
| Tübingen | 67.17 | 65.50 | 49.78 | 60.60 |
| LPW | 43.15 | 69.88 | 76.84 | 65.86 |
| Świrski | 21.67 | 75.17 | 86.83 | 81.17 |

Percentage of correctly detected centers of the four algorithms for the entire dataset

| | FREDA | FREDA I | FREDA II | ELSe |
|---|---|---|---|---|
| Total | 54.76 | | 67.29 | 64.69 |

Considering the individual sets, great differences in the success rate were observed among them. While the success rate exceeded 90% in numerous sets, the notably low rates in sets XVIII and XIX of the Tübingen collection, as well as in sets 4 and 5 of the LPW collection, stand out. In the first two, the ELSe algorithm obtained the best results, with only 52.99 and 35.41% of correctly estimated centers, closely followed by the FREDA, which obtained 50.63 and 33.47%. As shown in Fig. 16, in which three examples of each of the two sets are presented, pupil occlusions due to reflections, eyelids or blurring of the image caused the detection of the center to be particularly difficult in these two cases.

With respect to the images of users 4 and 5 in the LPW collection, the success rates of the FREDA II were 52.20 and 21.02%, respectively, compared with 34.25 and 31.28% for ELSe. As can be seen in Fig. 17, for user 4 the pupil may become indiscernible, as it was occluded by the eyelid and even partially cut by the image border. In case 5, however, it was the frame of the spectacles worn by the subject that totally or partially concealed the pupil. The effect of the lens is also noticeable in the blurring of the image.

The algorithm presents several parameters that need to be tuned for the method to work. Some of the parameters are highly dependent on the working conditions and are not easily standardized, e.g., the values of the radii used when calculating the FRST should be in accordance with the average size of the pupil in the camera image, while others, such as those involved in the preprocessing stage, are more difficult to select. To measure the robustness of the FREDA with respect to the specific values of the parameters, slight changes of \(\pm \, 10\%\) were made to the size of the filters and to the limits of the contrast stretching transform. The overall result did not change, and the conclusions remain valid. The large number of images involved compensated for possible biases, and on average, the result remained the same. Moreover, the datasets involved presented different types of images, and the parameter values were valid across datasets, demonstrating the robustness of the method and its lack of sensitivity to the involved parameters. In fact, the only stage that presented problems, the response of the FRST to the highly elliptical pupils of the Świrski collection, did not depend on any of these parameters.

### 3.1 Computing time

### 3.2 Comparison to CNN

CNN have emerged as an effective solution for several artificial vision problems, such as object detection and scene recognition. The work by Fuhl et al. [13] presents a comparison among several CNN-based methods using an extended version of the Tübingen dataset employed in this paper. They trained the network using a random set consisting of 50% of the images obtained from the alternative datasets forming the database. The test was carried out on the other half of the images.

In Fig. 18, we compare the best results they obtained over the 50% of testing images with our results for the whole database using the FREDA and FREDA I; it would not be fair to include the results over the training images in the comparison. From the figure, it can be deduced that our results are slightly better for any error value, except for errors over 12 pixels, for which the CNN obtains a somewhat better, but still comparable, rate. In the case of the CNN, 79% of the pupil centers were estimated within an error of 15 pixels, while this value was 78% for our approach. Regarding gaze estimation, errors of this magnitude do not allow a reliable estimation of gaze, i.e., these images would have to be rejected, or an additional refinement stage would be required in order to obtain a more accurate estimation.

## 4 Conclusions

A new algorithm, the FREDA, with two additional center-refining steps (FREDA I and FREDA II) has been developed for eye center detection in head-mounted systems, based on the calculation of the radial symmetry of the pupil. The FREDA algorithm is publicly available.^{1} After evaluating their performance on a large set of images obtained under a wide variety of conditions, the FREDA I showed greater precision in the detection of the pupil center, surpassing the ELSe algorithm, which has been used as a reference among the published algorithms to date. In addition, it showed better results than other works using completely different perspectives, such as CNN.

A fast radial symmetry transform was chosen as the basis for the pupil center estimation in order to develop a robust method for difficult images that have been taken in real scenarios. Although it was shown to be an effective method for circularly appearing pupils, there was a lack of precision when the pupils possessed an elliptical shape. Thus, two approaches with additional center refinement steps (FREDA I and FREDA II) were developed to solve this inconvenience and the results showed that the FREDA II was the best suited for elliptical pupil images. However, its precision decayed in response to challenging images where the pupil is not well defined due to strong reflections, blurring, partial occlusions by eyelids or eyelashes, etc. In these cases, the center refinement stage of the FREDA I was more reliable, reaching higher detection rates than ELSe. Therefore, it can be concluded that the FREDA I algorithm is a robust and efficient approach for eye tracking systems, as it is able to obtain a high rate of detection in a great number of challenging situations that are common in those systems.

## Notes

### Acknowledgements

We would like to acknowledge the Spanish Ministry of Economy, Industry and Competitiveness for their support under Contract TIN2014-52897-R in the framework of the National Plan of I+D+i.

## References

- 1. Duchowski, A.T.: Eye Tracking Methodology: Theory and Practice. Springer-Verlag New York, Inc., Secaucus (2007)
- 2. Huey, E.B.: The Psychology and Pedagogy of Reading, with a Review of the History of Reading and Writing and of Methods, Texts, and Hygiene in Reading, p. 16. Macmillan, New York (1908)
- 3. Majaranta, P., Aoki, H., Donegan, M., Hansen, D.W., Hansen, J.P.: Gaze Interaction and Applications of Eye Tracking: Advances in Assistive Technologies, 1st edn. Information Science Reference—Imprint of: IGI Publishing, Hershey (2011)
- 4. Fuhl, W., Tonsen, M., Bulling, A., Kasneci, E.: Pupil detection for head-mounted eye tracking in the wild: an evaluation of the state of the art. Mach. Vis. Appl. 27(8), 1275–1288 (2016)
- 5. Hansen, D.W., Ji, Q.: In the eye of the beholder: a survey of models for eyes and gaze. IEEE Trans. Pattern Anal. Mach. Intell. 32, 478–500 (2010)
- 6. Li, D., Winfield, D., Parkhurst, D.J.: Starburst: a hybrid algorithm for video-based eye tracking combining feature-based and model-based approaches. In: Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05)—Workshops, vol. 3, p. 79. IEEE Computer Society, Washington, DC (2005)
- 7. Świrski, L., Bulling, A., Dodgson, N.: Robust real-time pupil tracking in highly off-axis images. In: Proceedings of the Symposium on Eye Tracking Research and Applications, ETRA '12, pp. 173–176. ACM, New York (2012)
- 8. Javadi, A.-H., Hakimi, Z., Barati, M., Walsh, V., Tcheang, L.: SET: a pupil detection method using sinusoidal approximation. Front. Neuroeng. 8, 4 (2015)
- 9. Kassner, M., Patera, W., Bulling, A.: Pupil: an open source platform for pervasive eye tracking and mobile gaze-based interaction. CoRR abs/1405.0006 (2014)
- 10. Fuhl, W., Kübler, T., Sippel, K., Rosenstiel, W., Kasneci, E.: ExCuSe: Robust Pupil Detection in Real-World Scenarios, pp. 39–51. Springer, Cham (2015)
- 11. Fuhl, W., Santini, T.C., Kübler, T.C., Kasneci, E.: ElSe: ellipse selection for robust pupil detection in real-world environments. CoRR abs/1511.06575 (2015)
- 12. Krafka, K., Khosla, A., Kellnhofer, P., Kannan, H., Bhandarkar, S., Matusik, W., Torralba, A.: Eye tracking for everyone. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
- 13. Fuhl, W., Santini, T., Kasneci, G., Kasneci, E.: PupilNet: convolutional neural networks for robust pupil detection. Preprint abs/1601.04902 (2016)
- 14. Loy, G., Zelinsky, A.: Fast radial symmetry for detecting points of interest. IEEE Trans. Pattern Anal. Mach. Intell. 25, 959–973 (2003)
- 15. Skodras, E., Fakotakis, N.: Precise localization of eye centers in low resolution color images. Image Vis. Comput. 36, 51–60 (2015)
- 16. Kasneci, E., Sippel, K., Aehling, K., Heister, M., Rosenstiel, W., Schiefer, U., Papageorgiou, E.: Driving with binocular visual field loss? A study on a supervised on-road parcours with simultaneous eye and head tracking. PLoS ONE 9, 1–13 (2014)
- 17. Sippel, K., Kasneci, E., Aehling, K., Heister, M., Rosenstiel, W., Schiefer, U., Papageorgiou, E.: Binocular glaucomatous visual field loss and its impact on visual exploration—a supermarket study. PLoS ONE 9, 1–7 (2014)
- 18. Tonsen, M., Zhang, X., Sugano, Y., Bulling, A.: Labeled pupils in the wild: a dataset for studying pupil detection in unconstrained environments. CoRR abs/1511.05768 (2015)

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.