Sensing and Controlling Human Gaze in Daily Living Space for Human-Harmonized Information Environments

  • Yoichi Sato
  • Yusuke Sugano
  • Akihiro Sugimoto
  • Yoshinori Kuno
  • Hideki Koike


This chapter introduces new techniques we developed for sensing and guiding human gaze non-invasively in daily living spaces. Such technologies are key to realizing human-harmonized information systems that can provide various kinds of support effectively without disrupting our activities. Toward the goal of non-invasive gaze sensing, we developed gaze estimation techniques that require very little or no calibration effort by exploiting cues such as the spontaneous attraction of visual attention to visual stimuli. To shift gaze to desired locations in a natural, non-disturbing way, we took two approaches to gaze control: subtle modulation of visual stimuli based on visual saliency models, and non-verbal gestures in human-robot interaction.


Keywords: Appearance-based gaze sensing, Calibration-free gaze estimation, Visual saliency, Gaze guidance



The work presented in this chapter was supported by CREST, JST.



Copyright information

© Springer Japan 2016

Authors and Affiliations

  • Yoichi Sato (corresponding author): Institute of Industrial Science, The University of Tokyo, Tokyo, Japan
  • Yusuke Sugano: Max Planck Institute for Informatics, Saarbrücken, Germany
  • Akihiro Sugimoto: National Institute of Informatics, Tokyo, Japan
  • Yoshinori Kuno: Saitama University, Saitama, Japan
  • Hideki Koike: Tokyo Institute of Technology, Tokyo, Japan
