Are Performance Differences of Interest Operators Statistically Significant?

  • Nadia Kanwal
  • Shoaib Ehsan
  • Adrian F. Clark
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6855)


The differences in performance of a range of interest operators are examined in a null-hypothesis framework, using McNemar's test on a widely-used database of images, to ascertain whether the apparent differences are statistically significant. Some performance differences are indeed found to be statistically significant, though most only at a fairly low level of confidence, i.e. with about a 1-in-20 chance that the results are due to peculiarities of the evaluation database. A new evaluation measure, accurate homography estimation, is used to characterize the performance of feature extraction algorithms. The results suggest that operators employing longer descriptors are more reliable.
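As a minimal sketch of the kind of comparison the abstract describes, the continuity-corrected McNemar statistic for two algorithms tested on the same images can be computed from the two discordant counts (the function name and the example counts below are illustrative, not taken from the paper):

```python
import math

def mcnemar_z(n_sf: int, n_fs: int) -> float:
    """Continuity-corrected McNemar Z statistic.

    n_sf: images on which algorithm A succeeds and algorithm B fails.
    n_fs: images on which algorithm A fails and algorithm B succeeds.
    Images where both succeed or both fail carry no information
    about the difference and are ignored.
    """
    if n_sf + n_fs == 0:
        return 0.0
    return (abs(n_sf - n_fs) - 1) / math.sqrt(n_sf + n_fs)

# |Z| > 1.96 rejects the null hypothesis of equal performance at
# roughly the 5% (two-tailed) level -- the "1-in-20" confidence
# level mentioned in the abstract.
z = mcnemar_z(40, 20)
```

Under the null hypothesis the discordant outcomes are equally likely in each direction, so a large imbalance between the two counts signals a genuine performance difference.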


Keywords: Feature Extraction · Homography · McNemar's Test





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Nadia Kanwal 1, 2
  • Shoaib Ehsan 1
  • Adrian F. Clark 1

  1. VASE Laboratory, Computer Science & Electronic Engineering, University of Essex, Colchester, UK
  2. Lahore College for Women University, Pakistan
