How to Find Interesting Locations in Video: A Spatiotemporal Interest Point Detector Learned from Human Eye Movements

  • Wolf Kienzle
  • Bernhard Schölkopf
  • Felix A. Wichmann
  • Matthias O. Franz
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4713)


Interest point detection in still images is a well-studied topic in computer vision. In the spatiotemporal domain, however, it is still unclear which features indicate useful interest points. In this paper we approach the problem by learning a detector from examples: we record eye movements of human subjects watching video sequences and train a neural network to predict which locations are likely to become eye movement targets. We show that our detector outperforms current spatiotemporal interest point architectures on a standard classification dataset.


Saliency Function; Interest Point; Interest Operator; Interest Point Detector; Interesting Location
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Wolf Kienzle (1)
  • Bernhard Schölkopf (1)
  • Felix A. Wichmann (2, 3)
  • Matthias O. Franz (1)

  1. Max-Planck Institut für biologische Kybernetik, Abteilung Empirische Inferenz, Spemannstr. 38, 72076 Tübingen
  2. Technische Universität Berlin, Fakultät IV, FB Modellierung Kognitiver Prozesse, Sekr. FR 6-4, Franklinstr. 28/29, 10587 Berlin
  3. Bernstein Center for Computational Neuroscience, Philippstr. 13 Haus 6, 10115 Berlin
