Eye Contact Detection via Deep Neural Networks

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 713)


With ubiquitous devices now part of daily life, effectively capturing and managing user attention has become a critical device requirement. While gaze tracking is typically employed to determine the user’s focus of attention, gaze-lock detection, which senses eye contact with a device, was proposed in [16]. This work proposes eye contact detection using deep neural networks and makes the following contributions: (1) with a convolutional neural network (CNN) architecture, we achieve superior eye contact detection performance compared to [16] with minimal data pre-processing, and we further validate our algorithm on multiple datasets; (2) we improve gaze-lock detection by combining head pose and eye-gaze information, consistent with the social attention literature; and (3) we demonstrate gaze-locking on an Android mobile platform via CNN model compression.
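
The abstract does not say how the CNN was compressed for the mobile platform. Since the reference list includes Hinton et al. [6], knowledge distillation is one plausible ingredient. The following is a minimal sketch of a distillation loss for a two-class (eye contact vs. no eye contact) problem under that assumption. It is not the authors' implementation, and the function names, the temperature and the weighting `alpha` are all illustrative choices.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    soft term: cross-entropy between the teacher's and the student's
               temperature-softened class distributions
    hard term: ordinary cross-entropy against the ground-truth label
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    soft_ce = -sum(t * math.log(s)
                   for t, s in zip(p_teacher, p_student_soft))

    p_student = softmax(student_logits)
    hard_ce = -math.log(p_student[label])

    # The T^2 factor keeps the soft-target term on a scale comparable
    # to the hard-label term, as suggested in [6].
    return alpha * temperature ** 2 * soft_ce + (1.0 - alpha) * hard_ce


# Hypothetical logits for one image: index 0 = eye contact, 1 = none.
teacher = [2.0, -1.0]
agreeing_student = [2.0, -1.0]
disagreeing_student = [-1.0, 2.0]
loss_agree = distillation_loss(agreeing_student, teacher, label=0)
loss_disagree = distillation_loss(disagreeing_student, teacher, label=0)
```

In this sketch a small student network would be trained to minimise `distillation_loss` against a larger, already trained teacher, so the student that agrees with the teacher incurs the lower loss.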


Eye contact detection, Human-Computer Interaction, Convolutional neural networks


  1. Baltrusaitis, T., Robinson, P., Morency, L.P.: Constrained local neural fields for robust facial landmark detection in the wild. In: International Conference on Computer Vision Workshops, pp. 354–361 (2013)
  2. Collobert, R., Kavukcuoglu, K., Farabet, C.: Torch7: a Matlab-like environment for machine learning. In: BigLearn, NIPS Workshop (2011)
  3. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR, pp. 886–893. IEEE Computer Society, Washington, DC (2005)
  4. Funes Mora, K.A., Monay, F., Odobez, J.M.: EYEDIAP: a database for the development and evaluation of gaze estimation algorithms from RGB and RGB-D cameras. In: Eye Tracking Research and Applications, pp. 255–258. ACM, New York (2014)
  5. Hains, S.M., Muir, D.W.: Infant sensitivity to adult eye direction. Child Dev. 67, 1940–1951 (1996)
  6. Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. CoRR, March 2015
  7. Holzman, P.S., Proctor, L.R., Levy, D.L., Yasillo, N.J., Meltzer, H.Y., Hurt, S.W.: Eye-tracking dysfunctions in schizophrenic patients and their relatives. Arch. Gen. Psychiatry 31(2), 143–151 (1974)
  8. Krafka, K., Khosla, A., Kellnhofer, P., Kannan, H., Bhandarkar, S., Matusik, W., Torralba, A.: Eye tracking for everyone. In: CVPR (2016)
  9. Langton, S.R.: Do the eyes have it? Cues to the direction of social attention. Trends Cogn. Sci. 4(2), 50–59 (2000)
  10. Li, R., Shi, P., Haake, A.R.: Image understanding from experts’ eyes by modeling perceptual skill of diagnostic reasoning processes. In: CVPR, pp. 2187–2194 (2013)
  11. Majaranta, P., Bulling, A.: Eye tracking and eye-based human–computer interaction. In: Fairclough, S.H., Gilleade, K. (eds.) Advances in Physiological Computing. HIS, pp. 39–65. Springer, London (2014). doi: 10.1007/978-1-4471-6392-3_3
  12. Morimoto, C.H., Mimica, M.R.: Eye gaze tracking techniques for interactive applications. CVIU 98(1), 4–24 (2005)
  13. Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: ICML, pp. 807–814 (2010)
  14. Rayner, K.: Eye movements in reading and information processing: 20 years of research. Psychol. Bull. 124, 372–422 (1998)
  15. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. CoRR abs/1409.1556 (2014)
  16. Smith, B.A., Yin, Q., Feiner, S.K., Nayar, S.K.: Gaze locking: passive eye contact detection for human-object interaction. In: User Interface Software and Technology, pp. 271–280. ACM (2013)
  17. Subramanian, R., Staiano, J., Kalimeri, K., Sebe, N., Pianesi, F.: Putting the pieces together: multimodal analysis of social attention in meetings. In: ACM International Conference on Multimedia, pp. 659–662. ACM (2010)
  18. Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In: CVPR, vol. 1, pp. I-511–I-518. IEEE (2001)
  19. Volokitin, A., Gygli, M., Boix, X.: Predicting when saliency maps are accurate and eye fixations consistent. In: CVPR, pp. 544–552 (2016)
  20. Vrânceanu, R., Florea, C., Florea, L., Vertan, C.: NLP EAC recognition by component separation in the eye region. In: Wilson, R., Hancock, E., Bors, A., Smith, W. (eds.) CAIP 2013. LNCS, vol. 8048, pp. 225–232. Springer, Heidelberg (2013). doi: 10.1007/978-3-642-40246-3_28
  21. Zhang, X., Sugano, Y., Fritz, M., Bulling, A.: Appearance-based gaze estimation in the wild. In: CVPR, pp. 4511–4520. IEEE Computer Society (2015)

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. International Institute of Information Technology, Hyderabad, India
