Object Categorization by an Augmented Bag-of-Visual-Words Approach

  • Shuang Bai
  • Noboru Ohnishi
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 128)


The bag-of-visual-words approach has demonstrated promising performance in object categorization. One drawback, however, is that it completely discards the spatial information of object parts. This paper proposes incorporating spatial information into the bag-of-visual-words framework. First, a set of flexible, category-specific key point patterns is selected from training images. These patterns are then used to filter the key points in an image, and the object position is estimated from the coordinates of the filtered key points. Next, a set of windows is placed on the image based on the estimated object position. A histogram is created for each window, and the histograms are concatenated to form the final image representation. An SVM is used for classification. In experiments on the Caltech 101 dataset, the proposed method achieved a measurable improvement.
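
The window-and-histogram step described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify the estimator or the window layout, so several details here are assumptions made only for illustration: the object position is taken as the mean coordinate of the filtered key points, the windows form a 2 x 2 grid centred on that position, and the names `estimate_center`, `window_histograms`, `half_size`, and `vocab_size` are hypothetical. The pattern-based filtering itself is not reproduced; its result is represented by a boolean mask.

```python
import numpy as np

def estimate_center(points, keep_mask):
    """Assumed estimator: mean coordinate of the key points that pass the filter."""
    kept = points[keep_mask]
    if len(kept) == 0:  # fall back to all key points if the filter removes everything
        kept = points
    return kept.mean(axis=0)

def window_histograms(points, words, center, half_size, vocab_size, grid=2):
    """Concatenate one visual-word histogram per window.

    Assumed layout: a grid x grid arrangement of square windows
    centred on the estimated object position.
    """
    x0, y0 = center - half_size      # top-left corner of the window layout
    cell = 2.0 * half_size / grid    # side length of one window
    feats = []
    for row in range(grid):
        for col in range(grid):
            in_x = (points[:, 0] >= x0 + col * cell) & (points[:, 0] < x0 + (col + 1) * cell)
            in_y = (points[:, 1] >= y0 + row * cell) & (points[:, 1] < y0 + (row + 1) * cell)
            # Count how often each visual word occurs inside this window.
            hist = np.bincount(words[in_x & in_y], minlength=vocab_size)
            feats.append(hist)
    return np.concatenate(feats).astype(float)

# Toy data: five key points (x, y), their visual-word indices, and a filter mask.
points = np.array([[10, 10], [30, 10], [10, 30], [30, 30], [90, 90]], dtype=float)
words = np.array([0, 1, 2, 1, 3])
keep = np.array([True, True, True, True, False])  # stand-in for the pattern filter

center = estimate_center(points, keep)
feature = window_histograms(points, words, center, half_size=20.0, vocab_size=4)
```

In this toy case the outlying key point at (90, 90) is excluded by the mask and falls outside every window, so it contributes to neither the position estimate nor the final vector. The concatenated `feature` vector (here 4 windows x 4 words = 16 dimensions) is the kind of representation that would then be passed to an SVM classifier.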


Target Object, Visual Word, Training Image, Regular Grid, Image Representation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag GmbH Berlin Heidelberg 2012

Authors and Affiliations

  1. Nagoya University, Nagoya, Japan
