Distinctive Image Features from Scale-Invariant Keypoints

Abstract

This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
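
The matching and verification stages named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. Brute-force search stands in for the fast nearest-neighbor algorithm, the Hough clustering stage is omitted, and an affine map is assumed as the pose model. All function names are hypothetical.

```python
import math

def nearest_neighbor_matches(query, database):
    """For each query descriptor, return the index of the closest database
    descriptor by Euclidean distance (brute force, for illustration only)."""
    return [
        min(range(len(database)), key=lambda j: math.dist(q, database[j]))
        for q in query
    ]

def solve_3x3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial
    pivoting. Fails with ZeroDivisionError if the system is singular."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

def fit_affine(model_pts, image_pts):
    """Least-squares affine pose (assumed model):
        u = m1*x + m2*y + tx,   v = m3*x + m4*y + ty
    Returns ([m1, m2, tx], [m3, m4, ty]). Needs at least three
    non-collinear correspondences."""
    ata = [[0.0] * 3 for _ in range(3)]  # A^T A, shared by both coordinates
    atu = [0.0] * 3                      # A^T u
    atv = [0.0] * 3                      # A^T v
    for (x, y), (u, v) in zip(model_pts, image_pts):
        row = (x, y, 1.0)
        for i in range(3):
            atu[i] += row[i] * u
            atv[i] += row[i] * v
            for j in range(3):
                ata[i][j] += row[i] * row[j]
    return solve_3x3(ata, atu), solve_3x3(ata, atv)

def residuals(params, model_pts, image_pts):
    """Distance between each projected model point and its image match."""
    (m1, m2, tx), (m3, m4, ty) = params
    return [
        math.hypot(m1 * x + m2 * y + tx - u, m3 * x + m4 * y + ty - v)
        for (x, y), (u, v) in zip(model_pts, image_pts)
    ]
```

In use, the indices returned by `nearest_neighbor_matches` would pair image keypoints with model keypoints, `fit_affine` would estimate the pose from those pairs, and matches with large `residuals` could be rejected as inconsistent with that pose.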

Lowe, D.G. Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision 60, 91–110 (2004). https://doi.org/10.1023/B:VISI.0000029664.99615.94

Keywords

  • invariant features
  • object recognition
  • scale invariance
  • image matching