Towards Context-Dependence Eye Movements Prediction in Smart Meeting Rooms
Being able to predict gaze locations, as opposed to only measuring them, is desirable in many applications, such as the design of web pages and commercials, adaptive user interfaces, interactive visualization, and attention management systems. However, accurately predicting eye movements remains a challenging problem. In this paper, we present the results of an experimental study aimed at improving the prediction of saliency maps in smart meeting rooms. More specifically, we investigate the context-dependent saliency of meeting scenarios based on different image features. We recorded the center of gaze of users in meeting rooms in different scenarios (giving a talk, listening). We then used a data-driven approach to determine which features are important in each scenario. We found that the predictions differ according to the type of features selected. Most interestingly, models trained on face features perform better than models trained on other features in the giving-a-talk scenario, whereas in the listening scenario models trained on the competing saliency features from Itti and Koch perform better than models trained on other features. This finding suggests that including context information about the scene and situation in the computation of saliency maps is an important step towards developing models of eye movements that operate well under natural conditions, such as those encountered in ubiquitous computing settings.
Keywords: Saliency Model; Steerable Pyramid; Talk Scenario; Smart Meeting Room; SVMs Model