Attentive Object Detection Using an Information Theoretic Saliency Measure

  • Gerald Fritz
  • Christin Seifert
  • Lucas Paletta
  • Horst Bischof
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3368)


A major goal of selective attention is to focus processing on relevant information to enable rapid and robust task performance. For the example of attentive visual object recognition, we investigate the impact of top-down information on multi-stage processing, instead of integrating generic visual feature extraction into object-specific interpretation. We distinguish between generic and specific task-based filters that select task-relevant information of different scope and specificity within a processing chain. Attention is applied through early features tuned to respond selectively to generic task-related visual features, i.e., to information that is in general locally relevant for any kind of object search. The mapping from appearances to discriminative regions is then modeled using decision trees to accelerate processing. Focusing attention on discriminative patterns enables efficient recognition of specific objects by means of a sparse object representation that supports selective, task-relevant, and rapid object-specific responses. In the experiments, recognition performance from single appearance patterns increased dramatically when only discriminative patterns were considered, and evaluation of complete image analysis under various degrees of partial occlusion and image noise yielded highly robust recognition, even in the presence of severe occlusion and noise. In addition, we present a performance evaluation on our publicly available reference object database (TSG-20).
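The abstract does not state the saliency formula, so the following is only a minimal sketch under stated assumptions: that each local appearance pattern comes with an estimated class posterior, that its saliency is the Shannon entropy of that posterior (low entropy meaning discriminative), and that the object decision is a majority vote over the selected patterns. The function names, the toy posteriors, and the threshold `max_entropy` are hypothetical and are not taken from the paper; the decision-tree mapping mentioned above is not modeled here.

```python
import math
from collections import Counter


def posterior_entropy(posterior):
    """Shannon entropy (in bits) of a class posterior P(object | pattern)."""
    return -sum(p * math.log2(p) for p in posterior if p > 0.0)


def discriminative_patterns(posteriors, max_entropy):
    """Indices of patterns whose posterior entropy is at most max_entropy.

    The threshold is an assumed tuning parameter, not a value from the paper.
    """
    return [i for i, p in enumerate(posteriors)
            if posterior_entropy(p) <= max_entropy]


def majority_vote(posteriors, max_entropy):
    """Vote with the most probable label of each low-entropy pattern.

    Returns None when no pattern is discriminative enough to vote.
    """
    selected = discriminative_patterns(posteriors, max_entropy)
    if not selected:
        return None
    votes = Counter(
        max(range(len(posteriors[i])), key=posteriors[i].__getitem__)
        for i in selected
    )
    return votes.most_common(1)[0][0]


# Toy posteriors over three object classes (illustrative values only).
patterns = [
    [0.90, 0.05, 0.05],  # fairly certain: class 0
    [0.34, 0.33, 0.33],  # nearly uniform: uninformative
    [0.10, 0.80, 0.10],  # fairly certain: class 1
    [0.85, 0.10, 0.05],  # fairly certain: class 0
]

selected = discriminative_patterns(patterns, max_entropy=1.0)
winner = majority_vote(patterns, max_entropy=1.0)
```

In this toy setting the nearly uniform pattern is excluded from the vote, which illustrates why restricting recognition to discriminative patterns can make the decision both sparser and less sensitive to uninformative image regions.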


Keywords (added by machine, not by the authors): Object Recognition, Recognition Rate, Majority Vote, Partial Occlusion, Interest Operator





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Gerald Fritz (1)
  • Christin Seifert (1)
  • Lucas Paletta (1)
  • Horst Bischof (2)
  1. Institute of Digital Image Processing, JOANNEUM RESEARCH, Forschungsgesellschaft mbH, Graz, Austria
  2. Institute for Computer Graphics and Vision, Graz University of Technology, Graz, Austria
