Machine Vision and Applications, Volume 16, Issue 1, pp 64–73

Integrating context-free and context-dependent attentional mechanisms for gestural object reference

  • Gunther Heidemann
  • Robert Rae
  • Holger Bekel
  • Ingo Bax
  • Helge Ritter
Special issue on ICVS 2003


We present a vision system for human-machine interaction based on a small wearable camera mounted on glasses. The camera views the area in front of the user, especially the hands. To evaluate hand movements for pointing gestures and to recognise object references, we introduce an approach that integrates bottom-up generated feature maps with top-down propagated recognition results. Modules for context-free focus of attention work in parallel with the hand gesture recognition. In contrast to other approaches, the two branches are fused at the sub-symbolic level. This method facilitates both the integration of different modalities and the generation of auditory feedback.


  • Human-machine interaction
  • Gesture recognition
  • Neural networks
  • Focus of attention
  • Auditory feedback




Copyright information

© Springer-Verlag Berlin/Heidelberg 2004

Authors and Affiliations

  • Gunther Heidemann
    • 1
  • Robert Rae
    • 1
  • Holger Bekel
    • 1
  • Ingo Bax
    • 1
  • Helge Ritter
    • 1
  1. Neuroinformatics Group, Faculty of Technology, Bielefeld University, Bielefeld, Germany
