Information Retrieval, Volume 11, Issue 2, pp 77–107

Features for image retrieval: an experimental comparison

Abstract

This paper presents an experimental comparison of a large number of image descriptors for content-based image retrieval. Many papers that introduce new techniques and descriptors for content-based image retrieval describe their proposed methods as most appropriate without giving an in-depth comparison with methods proposed earlier. In this paper, we first give an overview of a large variety of features for content-based image retrieval and then compare them quantitatively on four different tasks: stock photo retrieval, personal photo collection retrieval, building retrieval, and medical image retrieval. For the experiments, five different, publicly available image databases are used, and the retrieval performance of the features is analyzed in detail. This allows a direct comparison of all features considered in this work and will also allow newly proposed features to be compared against them in the future. Additionally, the correlation of the features is analyzed, which opens the way for a simple and intuitive method to find an initial set of suitable features for a new task. The article concludes with recommendations on which features perform well for which type of data. Interestingly, the often used but very simple color histogram performs well in the comparison and can thus be recommended as a simple baseline for many applications.

Keywords

Image retrieval, Features, Image classification, Quantitative comparison

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  1. Human Language Technology and Pattern Recognition, Computer Science Department, RWTH Aachen University, Aachen, Germany
  2. Image Understanding and Pattern Recognition, German Research Center for Artificial Intelligence (DFKI), Kaiserslautern, Germany
