Sampling Strategies for Bag-of-Features Image Classification

  • Eric Nowak
  • Frédéric Jurie
  • Bill Triggs
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3954)


Bag-of-features representations have recently become popular for content-based image classification owing to their simplicity and good performance. They evolved from texton methods in texture analysis. The basic idea is to treat images as loose collections of independent patches, sampling a representative set of patches from the image, evaluating a visual descriptor vector for each patch independently, and using the resulting distribution of samples in descriptor space as a characterization of the image. The four main implementation choices are thus how to sample patches, how to describe them, how to characterize the resulting distributions, and how to classify images based on the result. We concentrate on the first issue, showing experimentally that, for a representative selection of commonly used test databases and for moderate to large numbers of samples, random sampling gives equal or better classifiers than the sophisticated multiscale interest operators that are in common use. Although interest operators work well for small numbers of samples, the single most important factor governing performance is the number of patches sampled from the test image, and ultimately interest operators cannot provide enough patches to compete. We also study the influence of other factors, including codebook size and creation method, histogram normalization method, and minimum scale for feature extraction.
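The pipeline the abstract describes — sample patches at random, describe each one, assign descriptors to a visual codebook, and summarize the image as a normalized histogram — can be sketched as follows. This is an illustrative toy, not the authors' implementation: the descriptor here is just a contrast-normalized raw patch and the codebook is random, whereas the paper uses SIFT descriptors and learned codebooks.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_random_patches(image, n_patches=200, patch_size=16):
    """Sample patch locations uniformly at random -- the simple strategy
    the paper finds competitive with interest-point detectors."""
    h, w = image.shape
    ys = rng.integers(0, h - patch_size, n_patches)
    xs = rng.integers(0, w - patch_size, n_patches)
    return [image[y:y + patch_size, x:x + patch_size]
            for y, x in zip(ys, xs)]

def describe(patch):
    """Toy descriptor: the flattened, contrast-normalized patch.
    (A stand-in for the SIFT descriptors used in the paper.)"""
    v = patch.astype(float).ravel()
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def bag_of_features(image, codebook, n_patches=200):
    """L1-normalized histogram of nearest-codeword assignments
    over randomly sampled patches."""
    patches = sample_random_patches(image, n_patches)
    descs = np.stack([describe(p) for p in patches])
    # assign each descriptor to its nearest codebook centre
    d2 = ((descs[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

# usage: a random grayscale image and a random codebook of 50 "visual words"
img = rng.integers(0, 256, (128, 128))
codebook = rng.standard_normal((50, 16 * 16))
h = bag_of_features(img, codebook)
```

The resulting fixed-length histogram `h` is what gets fed to the classifier (e.g. an SVM); increasing `n_patches` is the knob the paper identifies as the dominant factor in classification performance.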





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Eric Nowak (1, 2)
  • Frédéric Jurie (1)
  • Bill Triggs (1)
  1. GRAVIR-CNRS-INRIA, Montbonnot, France
  2. Bertin Technologie, Aix en Provence, France
