Encoding Spatial Arrangements of Visual Words for Rotation-Invariant Image Classification

  • Hafeez Anwar
  • Sebastian Zambanini
  • Martin Kampel
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8753)


Incorporating the spatial information of visual words enhances the performance of the well-known bag-of-visual-words (BoVWs) model for problems such as object category recognition. However, object images can undergo various in-plane rotations, so the spatial information must be added to the BoVWs model in a rotation-invariant manner. We present a novel approach that integrates spatial information into the BoVWs model in a rotation-invariant way by encoding the triangular relationships among the positions of identical visual words in the 2D image space. Our proposed BoVWs model is based on densely sampled local features for which the dominant orientations are calculated. We thus achieve rotation invariance both globally and locally. We validate the rotation invariance of the proposed method on datasets of ancient coins and butterflies, where it achieves better performance than the conventional BoVWs model.
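The idea of encoding triangular relationships among identical visual words can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which triangle property is encoded or how it is aggregated, so the sketch assumes that the interior angles of every triangle formed by three occurrences of the same visual word are counted in a per-word histogram. The function names, the `n_bins` parameter, and the binning scheme are all hypothetical.

```python
import itertools
import math


def triangle_angles(p, q, r):
    """Return the three interior angles (radians) of triangle pqr.

    Returns None if two vertices coincide, since an angle is undefined there.
    """
    def angle_at(a, b, c):
        # Angle at vertex a between the rays a->b and a->c.
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - a[0], c[1] - a[1])
        n1, n2 = math.hypot(*v1), math.hypot(*v2)
        if n1 == 0.0 or n2 == 0.0:
            return None
        cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
        return math.acos(max(-1.0, min(1.0, cos)))  # clamp rounding error

    angles = [angle_at(p, q, r), angle_at(q, p, r), angle_at(r, p, q)]
    return None if any(a is None for a in angles) else angles


def angle_histogram(positions, words, vocab_size, n_bins=9):
    """Assumed encoding: hist[w][b] counts triangle angles of word w in bin b.

    positions: list of (x, y) feature locations
    words:     visual word index of each feature, same length as positions
    """
    by_word = {}
    for pos, w in zip(positions, words):
        by_word.setdefault(w, []).append(pos)

    hist = [[0] * n_bins for _ in range(vocab_size)]
    for w, pts in by_word.items():
        # Every triple of occurrences of the same visual word forms a triangle.
        for p, q, r in itertools.combinations(pts, 3):
            angles = triangle_angles(p, q, r)
            if angles is None:
                continue
            for a in angles:
                b = min(int(a / math.pi * n_bins), n_bins - 1)
                hist[w][b] += 1
    return hist


def rotate(points, theta):
    """Rotate 2D points about the origin by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]
```

Interior angles depend only on the relative positions of the three points, so the histogram of `rotate(positions, theta)` matches that of `positions` for any `theta`, apart from angles that lie numerically on a bin boundary. Enumerating all triples is cubic in the number of occurrences of a word, so a practical variant would probably subsample them.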


Keywords (machine-generated, not supplied by the authors): Spatial information · Visual word · Image space · Vocabulary construction · Visual vocabulary



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Hafeez Anwar (1), email author
  • Sebastian Zambanini (1)
  • Martin Kampel (1)
  1. Computer Vision Lab, Vienna University of Technology, Vienna, Austria
