A Comparison of Affine Region Detectors
Abstract
The paper gives a snapshot of the state of the art in affine covariant region detectors, and compares their performance on a set of test images under varying imaging conditions. Six types of detectors are included: detectors based on affine normalization around Harris (Mikolajczyk and Schmid, 2002; Schaffalitzky and Zisserman, 2002) and Hessian points (Mikolajczyk and Schmid, 2002); a detector of ‘maximally stable extremal regions’, proposed by Matas et al. (2002); an edge-based region detector (Tuytelaars and Van Gool, 1999); a detector based on intensity extrema (Tuytelaars and Van Gool, 2000); and a detector of ‘salient regions’, proposed by Kadir, Zisserman and Brady (2004). The performance is measured against changes in viewpoint, scale, illumination, defocus and image compression.
The objective of this paper is also to establish a reference test set of images and performance software, so that future detectors can be evaluated in the same framework.
Keywords
affine region detectors, invariant image description, local features, performance evaluation
References
- Baumberg, A. 2000. Reliable feature matching across widely separated views. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Hilton Head Island, South Carolina, USA, pp. 774–781.
- Brown, M. and Lowe, D. 2003. Recognizing panoramas. In Proceedings of the International Conference on Computer Vision, Nice, France, pp. 1218–1225.
- Canny, J. 1986. A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8: 679–698.
- Csurka, G., Dance, C., Bray, C., and Fan, L. 2004. Visual categorization with bags of keypoints. In Proceedings Workshop on Statistical Learning in Computer Vision.
- Dorko, G. and Schmid, C. 2003. Selection of scale invariant neighborhoods for object class recognition. In Proceedings International Conference on Computer Vision, Nice, France, pp. 634–640.
- Fergus, R., Perona, P., and Zisserman, A. 2003. Object class recognition by unsupervised scale-invariant learning. In Proceedings IEEE Conference on Computer Vision and Pattern Recognition, Madison, Wisconsin, USA.
- Ferrari, V., Tuytelaars, T., and Van Gool, L. 2001. Simultaneous object recognition and segmentation by image exploration. In Proceedings European Conference on Computer Vision, Prague, Czech Republic, pp. 40–54.
- Ferrari, V., Tuytelaars, T., and Van Gool, L. 2005. Simultaneous object recognition and segmentation from single or multiple model views. International Journal of Computer Vision, to appear.
- Goedeme, T., Tuytelaars, T., and Van Gool, L. 2004. Fast wide baseline matching for visual navigation. In Proceedings IEEE Conference on Computer Vision and Pattern Recognition, Washington, DC, USA, pp. 24–29.
- Harris, C. and Stephens, M. 1988. A combined corner and edge detector. In Alvey Vision Conference, pp. 147–151.
- Hartley, R.I. and Zisserman, A. 2004. Multiple View Geometry in Computer Vision, 2nd edition, Cambridge University Press, ISBN: 0521540518.
- Kadir, T., Zisserman, A., and Brady, M. 2004. An affine invariant salient region detector. In Proceedings of the 8th European Conference on Computer Vision, Prague, Czech Republic, pp. 345–457.
- Lazebnik, S., Schmid, C., and Ponce, J. 2003a. A sparse texture representation using affine-invariant regions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Madison, Wisconsin, USA, pp. 319–324.
- Lazebnik, S., Schmid, C., and Ponce, J. 2003b. Affine-invariant local descriptors and neighborhood statistics for texture recognition. In Proceedings of the International Conference on Computer Vision, Nice, France, pp. 649–655.
- Lazebnik, S., Schmid, C., and Ponce, J. 2005. A sparse texture representation using local affine regions. IEEE Transactions on Pattern Analysis and Machine Intelligence 27(8):1265–1278.
- Lindeberg, T. and Gårding, J. 1997. Shape-adapted smoothing in estimation of 3-D shape cues from affine deformations of local 2-D brightness structure. Image and Vision Computing 15(6):415–434.
- Lindeberg, T. 1998. Feature detection with automatic scale selection. International Journal of Computer Vision 30(2):79–116.
- Lowe, D. 1999. Object recognition from local scale-invariant features. In Proceedings of the 7th International Conference on Computer Vision, Kerkyra, Greece, pp. 1150–1157.
- Lowe, D. 2004. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60(2):91–110.
- Matas, J., Burianek, J., and Kittler, J. 2000. Object Recognition using the Invariant Pixel-Set Signature. In Proceedings of the British Machine Vision Conference, London, UK, pp. 606–615.
- Matas, J., Chum, O., Urban, M., and Pajdla, T. 2002. Robust wide-baseline stereo from maximally stable extremal regions. In Proceedings of the British Machine Vision Conference, Cardiff, UK, pp. 384–393.
- Matas, J., Chum, O., Urban, M., and Pajdla, T. 2004. Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing 22(10):761–767.
- Mikolajczyk, K. and Schmid, C. 2001. Indexing based on scale invariant interest points. In Proceedings of the 8th International Conference on Computer Vision, Vancouver, Canada.
- Mikolajczyk, K. and Schmid, C. 2002. An affine invariant interest point detector. In Proceedings of the 7th European Conference on Computer Vision, Copenhagen, Denmark.
- Mikolajczyk, K. and Schmid, C. 2003. A performance evaluation of local descriptors. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Madison, Wisconsin, USA.
- Mikolajczyk, K., Zisserman, A., and Schmid, C. 2003. Shape recognition with edge-based features. In Proceedings of the British Machine Vision Conference, Norwich, UK.
- Mikolajczyk, K. and Schmid, C. 2004. Scale & affine invariant interest point detectors. International Journal of Computer Vision 60(1):63–86.
- Mikolajczyk, K. and Schmid, C. 2005. A performance evaluation of local descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence 27(10):1615–1630.
- Obdržálek, Š. and Matas, J. 2002. Object recognition using local affine frames on distinguished regions. In Proceedings of the British Machine Vision Conference, Cardiff, UK, pp. 113–122.
- Opelt, A., Fussenegger, M., Pinz, A., and Auer, P. 2004. Weak hypotheses and boosting for generic object detection and recognition. In Proceedings of European Conference on Computer Vision, Prague, Czech Republic, pp. 71–84.
- Pritchett, P. and Zisserman, A. 1998. Wide baseline stereo matching. In Proceedings of the 6th International Conference on Computer Vision, Bombay, India, pp. 754–760.
- Rothganger, F., Lazebnik, S., Schmid, C., and Ponce, J. 2003. 3D object modeling and recognition using affine-invariant patches and multi-view spatial constraints. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Madison, Wisconsin, USA, pp. 272–277.
- Rothganger, F., Lazebnik, S., Schmid, C., and Ponce, J. 2005. Object modeling and recognition using local affine-invariant image descriptors and multi-view spatial constraints. International Journal of Computer Vision, to appear.
- Schaffalitzky, F., and Zisserman, A. 2002. Multi-view matching for unordered image sets, or “How do I organize my holiday snaps?”. In Proceedings of the 7th European Conference on Computer Vision, Copenhagen, Denmark, pp. 414–431.
- Schaffalitzky, F. and Zisserman, A. 2003. Automated Location matching in movies. Computer Vision and Image Understanding, 92(2):236–264.
- Schmid, C. and Mohr, R. 1997. Local grayvalue invariants for image retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence 19(5):530–535.
- Se, S., Lowe, D., and Little, J. 2002. Mobile robot localization and mapping with uncertainty using scale-invariant visual landmarks. International Journal of Robotics Research 21(8):735–758.
- Sedgewick, R. 1988. Algorithms, 2nd edition. Addison-Wesley.
- Sivic, J., and Zisserman, A. 2003. Video Google: A text retrieval approach to object matching in videos. In Proceedings of the International Conference on Computer Vision, Nice, France.
- Sivic, J., Schaffalitzky, F., and Zisserman, A. 2004. Object level grouping for video shots. In Proceedings of the 8th European Conference on Computer Vision, Prague, Czech Republic, pp. 724–734.
- Sivic, J., and Zisserman, A. 2004. Video data mining using configurations of viewpoint invariant regions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Washington, DC, USA, pp. 488–495.
- Tell, D. and Carlsson, S. 2000. Wide baseline point matching using affine invariants computed from intensity profiles. In Proceedings of the 6th European Conference on Computer Vision, Dublin, Ireland, pp. 814–828.
- Tell, D. and Carlsson, S. 2002. Combining appearance and topology for wide baseline matching. In Proceedings of the 7th European Conference on Computer Vision, Copenhagen, Denmark, pp. 68–81.
- Turina, A., Tuytelaars, T., and Van Gool, L. 2001. Efficient Grouping under perspective skew. In Proceedings IEEE Conference on Computer Vision and Pattern Recognition, Hawaii, USA, pp. 247–254.
- Tuytelaars, T. and Van Gool, L. 1999. Content-based image retrieval based on local affinely invariant regions. In Int. Conf. on Visual Information Systems, pp. 493–500.
- Tuytelaars, T., Van Gool, L., D'haene, L., and Koch, R. 1999. Matching of affinely invariant regions for visual servoing. In Int. Conference Robotics and Automation ICRA 99.
- Tuytelaars, T. and Van Gool, L. 2000. Wide baseline stereo matching based on local, affinely invariant regions. In Proceedings of the 11th British Machine Vision Conference, Bristol, UK, pp. 412–425.
- Tuytelaars, T. and Van Gool, L. 2004. Matching Widely Separated Views based on Affine Invariant Regions. International Journal of Computer Vision 59(1):61–85.