Signal, Image and Video Processing, Volume 2, Issue 3, pp 241–250

Feature selection for content-based image retrieval

  • Esin Guldogan
  • Moncef Gabbouj
Open Access
Original Paper


In this article, we propose a novel system for feature selection, which is one of the key problems in content-based image indexing and retrieval, as well as in other research fields such as pattern classification and genomic data analysis. The proposed system aims to enhance semantic image retrieval results, reduce the complexity of the retrieval process, and improve overall usability for end-users of multimedia search engines. The system comprises three feature selection criteria and a decision method. Two of the criteria are novel and are based on inner-cluster and intercluster relations. A majority voting-based method is adapted for efficient selection of features and feature combinations. The performance of the proposed criteria is assessed on a large image database with a number of features, and is compared against competing techniques from the literature. Experiments show that the proposed feature selection system improves semantic retrieval performance in image retrieval systems.


Keywords: Feature selection, Mutual information, Intercluster analysis, Inner-cluster analysis, Majority voting, Content-based indexing and retrieval

List of symbols

  • Probability density functions
  • p(x, y): Joint probability density function
  • I(X; Y): Mutual information
  • Shannon’s entropy
  • Correlation measure for evaluating the discrimination power of a feature
  • Number of classes
  • Correlation between clusters
  • ith item in cluster x
  • Mean of cluster x
  • Standard deviation of cluster x
  • Cardinality of cluster x
  • Eigenvector corresponding to the largest eigenvalue of the covariance matrix
  • Best representative feature vector
  • Set of feature vectors
  • Feature vector corresponding to the ith item in the cluster
  • jth element of the feature vector corresponding to the ith item of the cluster
  • Mean vector
  • Elements of M (mean values)
  • Distance between π and M
  • Number of elements in the vectors π and M
  • Euclidean distance between cluster members
  • Compactness measurements
  • Covering radius: distance from the center to the farthest item in the cluster
  • Normalized numerical results from the mutual information criterion
  • Normalized numerical results from the inner-cluster relation criterion
  • Normalized numerical results from the Pearson’s product-moment correlation criterion
  • Votes for each feature
  • Number of features in the FSRL list
  • Weights of the features in retrieval
  • Rank of the ith feature in the FSRL list
  • Weight of item i in the SPFL list
  • Weight of item j in the FL list
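The symbols above include normalized results from three criteria (mutual information, inner-cluster relation, Pearson's product-moment correlation) and a vote count per feature. The following is a minimal sketch of how a majority vote over three such score lists could work. It is an illustration only, not the authors' decision method: the function name `majority_vote`, the `top_k` parameter, the rule that each criterion votes for its `top_k` highest-scoring features, the assumption that a higher normalized score is better, and the sample scores are all hypothetical.

```python
from typing import Dict, List


def majority_vote(scores_by_criterion: Dict[str, List[float]], top_k: int) -> List[int]:
    """Return indices of features that a strict majority of criteria place in their top_k.

    Assumption (not from the article): a higher normalized score means a better feature.
    """
    n_features = len(next(iter(scores_by_criterion.values())))
    votes = [0] * n_features
    for scores in scores_by_criterion.values():
        # Each criterion casts one vote for each of its top_k highest-scoring features.
        ranked = sorted(range(n_features), key=lambda i: scores[i], reverse=True)
        for i in ranked[:top_k]:
            votes[i] += 1
    needed = len(scores_by_criterion) // 2 + 1  # strict majority of the criteria
    return [i for i, v in enumerate(votes) if v >= needed]


# Hypothetical normalized scores for four features under three criteria.
scores = {
    "mutual_information": [0.9, 0.2, 0.6, 0.1],
    "inner_cluster": [0.8, 0.7, 0.3, 0.2],
    "pearson": [0.4, 0.1, 0.9, 0.5],
}
selected = majority_vote(scores, top_k=2)
print(selected)
```

With these sample scores, features 0 and 2 each receive two of the three votes and are selected, while features 1 and 3 receive one vote each and are dropped.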


Copyright information

© The Author(s) 2008

Authors and Affiliations

  1. Institute of Signal Processing, Tampere University of Technology, Tampere, Finland
