Object Recognition Using Local Affine Frames on Maximally Stable Extremal Regions

  • Štěpán Obdržálek
  • Jiří Matas
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4170)


Methods based on distinguished regions (transformation covariant detectable regions) have achieved considerable success in object recognition, retrieval and matching problems in both still images and videos. The chapter focuses on a method exploiting local coordinate systems (local affine frames) established on maximally stable extremal regions. We provide a taxonomy of affine-covariant constructions of local coordinate systems, prove their affine covariance and present algorithmic details on their computation. Exploiting processes proposed for computation of affine-invariant local frames of reference, tentative region-to-region correspondences are established. Object recognition is formulated as a problem of finding a maximal set of geometrically consistent matches.
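To make the notion of an affine-covariant local coordinate system concrete, here is a minimal sketch. It assumes one particular construction, normalisation by the centroid and covariance matrix of a region's points; the abstract does not say which constructions the chapter uses. The function names (`centroid_and_covariance`, `normalise`, `apply_affine`) and the sample polygon are hypothetical and serve illustration only. Under this assumption the normalised shape is determined up to an unknown rotation (or reflection), so a further constraint would be needed to fix a complete frame.

```python
import math


def centroid_and_covariance(points):
    """Return the centroid and the 2x2 covariance (sxx, sxy, syy) of 2D points."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n
    return (cx, cy), (sxx, sxy, syy)


def normalise(points):
    """Map points so that their centroid is the origin and covariance is identity.

    Uses the Cholesky factor L of the covariance (cov = L L^T) and solves
    L p = (x - centroid) for each point by forward substitution.
    Assumes the points are not collinear, so the covariance is positive definite.
    """
    (cx, cy), (sxx, sxy, syy) = centroid_and_covariance(points)
    l11 = math.sqrt(sxx)
    l21 = sxy / l11
    l22 = math.sqrt(syy - l21 ** 2)
    normalised = []
    for x, y in points:
        u = (x - cx) / l11
        v = ((y - cy) - l21 * u) / l22
        normalised.append((u, v))
    return normalised


def apply_affine(points, a, t):
    """Apply x' = A x + t, with A given as ((a11, a12), (a21, a22))."""
    (a11, a12), (a21, a22) = a
    return [(a11 * x + a12 * y + t[0], a21 * x + a22 * y + t[1])
            for x, y in points]


if __name__ == "__main__":
    # Hypothetical region outline and an arbitrary non-singular affine map.
    region = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (1.0, 3.0), (0.0, 2.0)]
    warped = apply_affine(region, ((1.5, 0.7), (-0.3, 0.9)), (10.0, -5.0))
    for p, q in zip(normalise(region), normalise(warped)):
        # The two normalised shapes differ only by an orthogonal transform,
        # so each point keeps its distance from the origin.
        print(round(math.hypot(*p), 6), round(math.hypot(*q), 6))
```

Running the script prints matching distance pairs for the original and the warped region, which is the sense in which the normalisation absorbs the affine change. Resolving the remaining rotation is what turns such a normalisation into a full local frame.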

State-of-the-art results are reported on standard, publicly available object recognition tests (COIL-100, ZuBuD, FOCUS). Changes of scale and illumination conditions, out-of-plane rotation, occlusion, locally anisotropic scale change and 3D translation of the viewpoint are all present in the test problems.


Keywords: Object Recognition; Discrete Cosine Transformation; Query Image; Epipolar Geometry; Object Recognition Test
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Štěpán Obdržálek (1)
  • Jiří Matas (1)
  1. Center for Machine Perception, Czech Technical University, Prague
