Machine Vision and Applications, Volume 16, Issue 1, pp 59–63

Efficient pose estimation using view-based object representations

  • Gabriele Peters
Special issue on ICVS 2003


We present an efficient method for estimating the pose of a three-dimensional object. The implementation is embedded in a computer vision system that is motivated by and based on cognitive principles of the visual perception of three-dimensional objects. Viewpoint-invariant object recognition has long been a subject of controversy, and a central point of debate is the nature of internal object representations. Behavioral studies with primates, which are summarized in this article, support the model of view-based object representations. We designed our computer vision system according to these findings and demonstrate that very precise pose estimates for real-world objects are possible even when only a small number of sample views of an object is available. The system can be used for a variety of applications.


Pose estimation, 3D object recognition, tracking, cognitive modeling





Copyright information

© Springer-Verlag Berlin/Heidelberg 2004

Authors and Affiliations

  1. Informatik VII, Universität Dortmund, Dortmund, Germany
