Machine Vision and Applications, Volume 26, Issue 6, pp 819–836

Better than SIFT?

Original Paper

Abstract

Independent evaluation of the performance of feature descriptors is an important part of the process of developing better computer vision systems. In this paper, we compare the performance of several state-of-the-art image descriptors, including recent binary descriptors. We test the descriptors on an image recognition application and a feature matching application. Our study includes several recently proposed methods and, despite claims to the contrary, we find that SIFT is still the most accurate performer in both application settings. We also find that general-purpose binary descriptors are not ideal for image recognition applications but perform adequately in a feature matching application.
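The matching pipeline compared in the paper can be sketched in a few lines of numpy. This is an illustrative re-implementation, not the authors' code: float descriptors such as SIFT are typically compared with the L2 norm and accepted via Lowe's ratio test, while binary descriptors such as BRIEF, ORB or BRISK are compared with the Hamming distance.

```python
import numpy as np

def match_descriptors(query, train, ratio=0.75):
    """Brute-force nearest-neighbour matching with Lowe's ratio test.

    query, train: (N, D) and (M, D) arrays of float descriptors (e.g. SIFT).
    Returns (query_idx, train_idx) pairs whose nearest neighbour is
    clearly closer than the second-nearest one.
    """
    matches = []
    for i, q in enumerate(query):
        d = np.linalg.norm(train - q, axis=1)   # L2 distance to every train descriptor
        nn, nn2 = np.argsort(d)[:2]             # indices of the two nearest neighbours
        if d[nn] < ratio * d[nn2]:              # Lowe's ratio test: keep distinctive matches
            matches.append((i, int(nn)))
    return matches

def hamming(a, b):
    """Hamming distance between two bit-packed binary descriptors
    (uint8 arrays, the usual storage for BRIEF/ORB/BRISK)."""
    return int(np.count_nonzero(np.unpackbits(a ^ b)))
```

The Hamming distance reduces to an XOR followed by a population count, which is why matching binary descriptors is so much cheaper than matching float descriptors, one of the main attractions weighed against accuracy in this comparison.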

Keywords

Image recognition · Feature matching · Binary descriptors


Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

1. University of Otago, Dunedin, New Zealand
