Towards Context-Dependence Eye Movements Prediction in Smart Meeting Rooms

  • Redwan Abdo A. Mohammed
  • Lars Schwabe
  • Oliver Staadt
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8681)


Being able to predict gaze locations, rather than only measuring them, is desirable in many applications, such as the design of web pages and commercials, adaptive user interfaces, interactive visualization, and attention management systems. However, accurately predicting eye movements remains a challenging problem. In this paper, we present the results of an experimental study aimed at improving the prediction of saliency maps in smart meeting rooms. More specifically, we investigate the context-dependent saliency of meeting scenarios based on different image features. We recorded the center of gaze of users in meeting rooms in different scenarios (giving a talk, listening). We then used a data-driven approach to find out which features are important in each scenario. We found that the predictions differ according to the type of features selected. Most interestingly, in the giving-a-talk scenario, models trained on face features perform better than models trained on other features, whereas in the listening scenario, models trained on the competing saliency features from Itti and Koch perform better than models trained on other features. This finding suggests that including context information about the scene and situation in the computation of saliency maps is an important step towards developing models of eye movements that operate well under natural conditions, such as those encountered in ubiquitous computing settings.
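
The data-driven comparison of feature types described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear SVM trained by plain subgradient descent, uses purely synthetic data, and the names `set_a`, `set_b`, `make_toy_samples`, and `compare_feature_sets` are hypothetical. One model is trained per feature set on samples labeled fixated (+1) or not fixated (-1), and the feature sets are then ranked by held-out accuracy.

```python
import random


def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Fit a linear SVM (hinge loss + L2 penalty) by batch subgradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:  # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def accuracy(w, b, X, y):
    """Fraction of samples whose predicted sign matches the label."""
    hits = 0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        hits += (1 if score >= 0 else -1) == yi
    return hits / len(X)


def make_toy_samples(rng, n):
    """Synthetic stand-in for gaze data (assumption, not real recordings).

    Label +1 means "fixated", -1 means "not fixated".
    """
    labels, set_a, set_b = [], [], []
    for _ in range(n):
        yi = rng.choice([-1, 1])
        labels.append(yi)
        set_a.append([yi + rng.gauss(0, 0.5)])  # informative by construction
        set_b.append([rng.gauss(0, 1.0)])       # pure noise by construction
    return labels, {"set_a": set_a, "set_b": set_b}


def compare_feature_sets(seed=0, n_train=400, n_test=200):
    """Train one SVM per feature set and return held-out accuracy per set."""
    rng = random.Random(seed)
    y_train, feats_train = make_toy_samples(rng, n_train)
    y_test, feats_test = make_toy_samples(rng, n_test)
    scores = {}
    for name in feats_train:
        w, b = train_linear_svm(feats_train[name], y_train)
        scores[name] = accuracy(w, b, feats_test[name], y_test)
    return scores


print(compare_feature_sets())
```

In an actual study, each feature set would hold image features (for example, face-detector responses or Itti and Koch saliency channels) sampled at fixated and non-fixated locations, and the comparison would be repeated separately for each scenario.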


Saliency Model, Steerable Pyramid, Talk Scenario, Smart Meeting Room, SVMs Model
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




References

  1. Roda, C.: Human Attention in Digital Environments. Cambridge University Press, Cambridge (2011)
  2. Wellner, P., Flynn, M., Guillemot, M.: Browsing recorded meetings with ferret. In: Bengio, S., Bourlard, H. (eds.) MLMI 2004. LNCS, vol. 3361, pp. 12–21. Springer, Heidelberg (2005)
  3. Ellis, C.S., Barthelmess, P.: The neem dream. In: Proceedings of the 2003 Conference on Diversity in Computing, TAPIA 2003, pp. 23–29. ACM, New York (2003)
  4. Kleinbauer, T., Becker, S., Becker, T.: Combining multiple information layers for the automatic generation of indicative meeting abstracts. In: Proc. of ENLG 2007 (2007)
  5. McCowan, L., Gatica-Perez, D., Bengio, S., Lathoud, G., Barnard, M., Zhang, D.: Automatic analysis of multimodal group actions in meetings. IEEE Transactions on Pattern Analysis and Machine Intelligence 27(3), 305–317 (2005)
  6. Favre, S., Salamin, H., Vinciarelli, A., Hakkani Tür, D., Garg, N.P.: Role recognition for meeting participants: an approach based on lexical information and social network analysis. In: ACM International Conference on Multimedia (October 2008)
  7. Itti, L., Koch, C., Niebur, E.: A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 20(11), 1254–1259 (1998)
  8. Mahadevan, V., Vasconcelos, N.: Spatiotemporal saliency in dynamic scenes. IEEE Trans. Pattern Anal. Mach. Intell. 32(1), 171–177 (2010)
  9. Gao, D., Vasconcelos, N.: Discriminant saliency for visual recognition from cluttered scenes. In: NIPS (2004)
  10. Judd, T., Ehinger, K., Durand, F., Torralba, A.: Learning to predict where humans look. In: ICCV (2009)
  11. Yarbus, A.: Eye-movements and vision. Plenum Press, New York (1967)
  12. Simoncelli, E.P., Freeman, W.T.: The steerable pyramid: A flexible architecture for multi-scale derivative computation. In: IEEE Intl Conf. on Image Processing, pp. 444–447. IEEE Signal Processing Society (1995)
  13. Torralba, A.: Modeling global scene factors in attention. JOSA - A 20, 1407–1418 (2003)
  14. Viola, P., Jones, M.J.: Robust real-time face detection. Int. J. Comput. Vision 57(2), 137–154 (2004)

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Redwan Abdo A. Mohammed (1)
  • Lars Schwabe (1)
  • Oliver Staadt (1)
  1. Institute of Computer Science, University of Rostock, Rostock, Germany
