Abstract
In this paper, we address the problem of automatic gaze estimation using a depth sensor under unconstrained head pose motion and large user-sensor distances. To achieve robustness, we formulate the problem as a regression problem and solve it with a regression forest, chosen for its strong generalization ability when trained on large data sets. We train the trees on a large synthetic training set generated from a statistical model of the human face with integrated parametric 3D eyeballs. Unlike previous works, which learn the mapping function using only RGB cues in the form of eye image appearance, we integrate the depth information around the face into the input vector. Our experiments show that the approach can handle real data scenarios with strong head pose changes even though it is trained only on synthetic data. We also illustrate the importance of depth information for estimation accuracy, especially in unconstrained scenarios.
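The idea of regressing gaze from an input vector that combines eye appearance with depth can be illustrated with a toy sketch. This is not the authors' implementation: it uses scikit-learn's generic `RandomForestRegressor` rather than a patch-voting forest, and the patch size, the `synth_sample` generator, and the way head yaw tilts the depth patch are all hypothetical stand-ins for rendering a statistical face model. Both training and test data here are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

H, W = 8, 12  # toy patch size (assumption, not from the paper)

def build_input_vector(eye_patch, depth_patch):
    """Concatenate flattened appearance and depth patches into one vector."""
    return np.concatenate([eye_patch.ravel(), depth_patch.ravel()])

def synth_sample(rng):
    """Hypothetical generator standing in for a rendered face model."""
    head_yaw = rng.uniform(-30.0, 30.0)                   # degrees
    eye_yaw, eye_pitch = rng.uniform(-20.0, 20.0, size=2)  # eye-in-head angles
    ys, xs = np.mgrid[0:H, 0:W]
    # Appearance patch: a bright blob whose centre shifts with eye rotation.
    cx, cy = W / 2 + eye_yaw / 10.0, H / 2 + eye_pitch / 10.0
    eye = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / 4.0)
    eye = eye + rng.normal(0.0, 0.02, size=(H, W))
    # Depth patch: a plane at about 1.5 m whose tilt follows head yaw.
    depth = 1.5 + 0.001 * head_yaw * (xs - W / 2)
    depth = depth + rng.normal(0.0, 0.002, size=(H, W))
    # Gaze in the sensor frame combines head and eye rotation.
    gaze = np.array([head_yaw + eye_yaw, eye_pitch])
    return build_input_vector(eye, depth), gaze

def make_dataset(n, seed):
    rng = np.random.default_rng(seed)
    samples = [synth_sample(rng) for _ in range(n)]
    X = np.stack([s[0] for s in samples])
    y = np.stack([s[1] for s in samples])
    return X, y

X_train, y_train = make_dataset(1500, seed=0)
X_test, y_test = make_dataset(300, seed=1)

def fit_and_score(cols):
    """Train a forest on the selected feature columns; return predictions and MAE."""
    forest = RandomForestRegressor(n_estimators=30, random_state=0)
    forest.fit(X_train[:, cols], y_train)
    pred = forest.predict(X_test[:, cols])
    return pred, float(np.mean(np.abs(pred - y_test)))

appearance_only = np.arange(H * W)           # first half of the vector
appearance_and_depth = np.arange(2 * H * W)  # full vector
pred_rgb, err_rgb = fit_and_score(appearance_only)
pred_rgbd, err_rgbd = fit_and_score(appearance_and_depth)
print(f"mean abs error, appearance only:    {err_rgb:.2f} deg")
print(f"mean abs error, appearance + depth: {err_rgbd:.2f} deg")
```

In this toy setup the appearance patch encodes only the eye-in-head rotation, so a forest without the depth half of the vector has no way to recover head yaw. It is a deliberately simplified analogue of the abstract's argument for adding depth under unconstrained head pose.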
Electronic supplementary material
Supplementary material 1 (wmv 6443 KB)
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Kacete, A., Séguier, R., Collobert, M., Royan, J. (2017). Unconstrained Gaze Estimation Using Random Forest Regression Voting. In: Lai, SH., Lepetit, V., Nishino, K., Sato, Y. (eds) Computer Vision – ACCV 2016. ACCV 2016. Lecture Notes in Computer Science, vol 10113. Springer, Cham. https://doi.org/10.1007/978-3-319-54187-7_28
DOI: https://doi.org/10.1007/978-3-319-54187-7_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-54186-0
Online ISBN: 978-3-319-54187-7
eBook Packages: Computer Science, Computer Science (R0)