Stage-by-Stage Based Design Paradigm of Two-Pathway Model for Gaze Following

  • Conference paper
  • Published in: Pattern Recognition and Computer Vision (PRCV 2019)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11858)

Abstract

Gaze, an important non-verbal cue in interactions between human beings, can be used to estimate a person's point of regard and to infer his or her intention. Gaze following is the task of estimating the visual attention of people in a single image. To tackle this challenging problem, earlier state-of-the-art work combines information from image saliency with the gaze directions of people in a deep-learning-based two-pathway model. However, previous work pays little attention to why such a two-pathway model works well. In this paper, we therefore divide the two-pathway model into three stages and compare different mechanisms within those stages to better understand how each stage influences model performance. Finally, we identify the best combination of mechanisms across the three stages and evaluate the resulting model on the GazeFollow benchmark.
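The two-pathway idea the abstract refers to can be illustrated with a minimal NumPy sketch: one pathway produces a scene-saliency map, the other turns a head position and gaze direction into a directional mask, and a fusion stage combines the two into a normalized attention heatmap. The function names, the cone-shaped mask, and the elementwise-product fusion below are illustrative assumptions for exposition, not the paper's actual architecture (which uses learned deep networks in each pathway).

```python
import numpy as np

def gaze_mask(head_pos, direction, size=64, sigma=0.3):
    # Gaze-pathway stand-in: a cone-shaped mask that is large at grid
    # locations lying along the gaze direction from the head position
    # (coordinates are normalized to [0, 1]) and decays off-axis.
    ys, xs = np.mgrid[0:size, 0:size] / (size - 1)
    vecs = np.stack([xs - head_pos[0], ys - head_pos[1]], axis=-1)
    norms = np.linalg.norm(vecs, axis=-1) + 1e-8
    d = np.asarray(direction) / (np.linalg.norm(direction) + 1e-8)
    cos = (vecs @ d) / norms          # cosine between gaze and offset vectors
    return np.exp((cos - 1.0) / sigma)  # 1 along the gaze ray, ~0 behind the head

def fuse(saliency, mask):
    # Fusion-stage stand-in: elementwise product of the two pathway outputs,
    # normalized into a probability map over gaze-target locations.
    heat = saliency * mask
    return heat / heat.sum()
```

With a uniform saliency map, the fused heatmap concentrates its mass along the gaze ray, so locations in front of the head score far higher than those behind it; a non-uniform saliency map then reweights that ray toward salient objects.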

The first author is a student.



Acknowledgements

This work was supported by the National Natural Science Foundation of P.R. China Under Grant Nos. 61772574 and 61375080.

Author information

Correspondence to Guoli Wang.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Cao, Z., Wang, G., Guo, X. (2019). Stage-by-Stage Based Design Paradigm of Two-Pathway Model for Gaze Following. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2019. Lecture Notes in Computer Science, vol. 11858. Springer, Cham. https://doi.org/10.1007/978-3-030-31723-2_55


  • DOI: https://doi.org/10.1007/978-3-030-31723-2_55

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-31722-5

  • Online ISBN: 978-3-030-31723-2

  • eBook Packages: Computer Science; Computer Science (R0)
