Abstract
This paper presents a new approach to encoding the spatial relationships of visual words in the well-known visual dictionary model. The most popular current way to describe images with visual words is the bag-of-words representation, which does not encode any spatial information. We propose a graceful way to capture this information by encoding the spatial arrangement of every visual word in an image. Our experiments show the importance of the spatial information of visual words for image classification, as well as the gain in classification accuracy obtained with the new method. The proposed approach creates opportunities for further improvements in image description under the visual dictionary model.
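The limitation of bags-of-words mentioned in the abstract can be illustrated with a minimal sketch. It is not the proposed method, which the abstract does not detail. The function name `bag_of_words`, the `(x, y, word_id)` tuple layout, and the two toy images are assumptions made only for this illustration.

```python
from collections import Counter


def bag_of_words(assignments, vocabulary_size):
    """Return a normalized histogram of visual-word occurrences.

    `assignments` is a list of (x, y, word_id) tuples (a hypothetical
    layout). The coordinates are ignored on purpose: a plain
    bag-of-words keeps only how often each word occurs.
    """
    if not assignments:
        return [0.0] * vocabulary_size
    counts = Counter(word_id for _, _, word_id in assignments)
    total = len(assignments)
    return [counts[w] / total for w in range(vocabulary_size)]


# Two toy images with the same visual words placed at different positions.
image_a = [(10, 10, 0), (200, 10, 1), (10, 200, 2), (200, 200, 1)]
image_b = [(200, 200, 0), (10, 200, 1), (200, 10, 2), (10, 10, 1)]

hist_a = bag_of_words(image_a, vocabulary_size=3)
hist_b = bag_of_words(image_b, vocabulary_size=3)

# The histograms are identical although the spatial layouts differ.
print(hist_a == hist_b)
```

Because both histograms coincide, a classifier that sees only these vectors cannot tell the two layouts apart, which is the gap a spatial-arrangement encoding is meant to address.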
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Penatti, O.A.B., Valle, E., da S. Torres, R. (2011). Encoding Spatial Arrangement of Visual Words. In: San Martin, C., Kim, SW. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2011. Lecture Notes in Computer Science, vol 7042. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25085-9_28
DOI: https://doi.org/10.1007/978-3-642-25085-9_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25084-2
Online ISBN: 978-3-642-25085-9
eBook Packages: Computer Science (R0)