Diversity in Ensembles of Codebooks for Visual Concept Detection

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8157)


Visual codebooks generated by quantizing local descriptors allow effective feature vectors to be built for image archives. Codebooks are usually constructed by clustering a subset of the descriptors extracted from a set of training images. In this paper we investigate the effect of combining an ensemble of different codebooks, each created with a different pseudo-random technique for subsampling the set of local descriptors. Despite claims in the literature about the gain attained by combining different codebook representations, the reported results on different visual detection tasks show that the diversity is quite small. This allows only a modest improvement in performance with respect to the standard random subsampling procedure, and calls for further investigation into the use of ensemble approaches in this context.
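The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain uniform random subsampling with a different seed per codebook (the paper compares several pseudo-random subsampling techniques that the abstract does not detail), a small hand-written k-means, and synthetic 8-dimensional descriptors in place of real local descriptors such as SIFT. All function names and parameter values are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centers, d):
    """Index of the codeword (cluster center) closest to descriptor d."""
    return min(range(len(centers)), key=lambda i: sq_dist(centers[i], d))

def kmeans(points, k, rng, iters=10):
    """Basic Lloyd's k-means; an empty cluster keeps its previous center."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centers, p)].append(p)
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centers)
        ]
    return centers

def build_codebook(descriptors, sample_size, k, seed):
    """Subsample the descriptor pool (assumed: uniform sampling), then cluster."""
    rng = random.Random(seed)
    subset = rng.sample(descriptors, sample_size)
    return kmeans(subset, k, rng)

def bow_histogram(codebook, image_descriptors):
    """Normalized bag-of-words histogram of one image under one codebook."""
    hist = [0.0] * len(codebook)
    for d in image_descriptors:
        hist[nearest(codebook, d)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def ensemble_features(codebooks, image_descriptors):
    """One histogram per codebook; each would feed its own classifier."""
    return [bow_histogram(cb, image_descriptors) for cb in codebooks]

# Synthetic stand-in for descriptors pooled from the training images.
data_rng = random.Random(0)
pool = [[data_rng.random() for _ in range(8)] for _ in range(500)]

# Three codebooks that differ only in the seed used for subsampling.
codebooks = [build_codebook(pool, sample_size=100, k=5, seed=s) for s in range(3)]

# Describe one "image" (here, the first 40 descriptors) with every codebook.
features = ensemble_features(codebooks, pool[:40])
```

In a full system, each histogram would be passed to a separate classifier (for example an SVM) and the classifier outputs would then be combined. That fusion step, and any measure of diversity among the ensemble members, is left out of this sketch.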


Keywords: bag of words, clustering, SVM



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Department of Electrical and Electronic Engineering, University of Cagliari, Italy
