Learning Hierarchical Bag of Words Using Naive Bayes Clustering

Chandra, Siddhartha; Kumar, Shailesh; Jawahar, C. V.

doi:10.1007/978-3-642-37331-2_29

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7724)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

Image analysis tasks such as classification, clustering, detection, and retrieval are only as good as the feature representation of the images they use. Much research in computer vision is focused on finding better or semantically richer image representations. Bag of visual Words (BoW) is a representation that has emerged as an effective one for a variety of computer vision tasks. BoW methods traditionally use low-level features. We have devised a strategy to use these low-level features to create "higher-level" features by making use of the spatial context in images. In this paper, we propose a novel hierarchical feature learning framework that uses a Naive Bayes Clustering algorithm to convert a 2-D symbolic image at one level to a 2-D symbolic image at the next level with richer features. On two popular datasets, Pascal VOC 2007 and Caltech 101, we empirically show that classification accuracy obtained from the hierarchical features computed using our approach is significantly higher than that of the traditional SIFT-based BoW representation of images, even though our image representations are more compact.
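
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes that a "2-D symbolic image" is a grid of visual-word indices, that spatial context means each cell's 3x3 neighborhood, and that Naive Bayes clustering can be approximated by hard EM over position-wise categorical distributions with Laplace smoothing. The function names, the vocabulary size, and the cluster count are all hypothetical.

```python
import numpy as np

def bow_histogram(symbols, vocab_size):
    """Normalized bag-of-words histogram of a 2-D grid of symbol ids."""
    counts = np.bincount(symbols.ravel(), minlength=vocab_size).astype(float)
    return counts / counts.sum()

def neighborhoods(symbols):
    """Flatten the 3x3 neighborhood of every interior cell into a row of 9 ids."""
    h, w = symbols.shape
    rows = [symbols[i - 1:i + 2, j - 1:j + 2].ravel()
            for i in range(1, h - 1) for j in range(1, w - 1)]
    return np.array(rows)

def naive_bayes_cluster(X, vocab_size, n_clusters, n_iter=20, seed=0):
    """Hard-EM clustering; positions are treated as independent given the cluster."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    labels = rng.integers(n_clusters, size=n)  # random initial assignment
    for _ in range(n_iter):
        # M-step: smoothed log priors and log P(symbol | position, cluster).
        log_prior = np.empty(n_clusters)
        log_cond = np.empty((n_clusters, d, vocab_size))
        for k in range(n_clusters):
            members = X[labels == k]
            log_prior[k] = np.log((len(members) + 1.0) / (n + n_clusters))
            for p in range(d):
                c = np.bincount(members[:, p], minlength=vocab_size) + 1.0
                log_cond[k, p] = np.log(c / c.sum())
        # E-step: assign each neighborhood to its most probable cluster.
        scores = log_prior[None, :] + sum(
            log_cond[:, p, X[:, p]].T for p in range(d))
        new_labels = scores.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# Toy data: a random 12x12 level-0 symbolic image over 8 hypothetical words.
level0 = np.random.default_rng(0).integers(8, size=(12, 12))
X = neighborhoods(level0)
# Cluster ids of the neighborhoods form the next-level symbolic image.
level1 = naive_bayes_cluster(X, vocab_size=8, n_clusters=4).reshape(10, 10)
hist = bow_histogram(level1, 4)  # BoW descriptor over level-1 symbols
```

In this sketch the level-1 histogram has 4 bins against 8 at level 0, which mirrors the abstract's claim of a more compact representation only in spirit. The sizes here are arbitrary and say nothing about the paper's actual settings or results.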

Author information

Authors and Affiliations

CVIT, IIIT Hyderabad, India
Siddhartha Chandra & C. V. Jawahar
Google, Hyderabad, India
Shailesh Kumar

Editor information

Editors and Affiliations

Department of Electrical and Computer Engineering, Seoul National University, 1 Gwanak-ro, 151-744, Gwanak-gu, Seoul, Korea
Kyoung Mu Lee
Microsoft Research Asia, No. 5, Danling st., Haidian district, 100080, Beijing, P.R. China
Yasuyuki Matsushita
School of Interactive Computing, Georgia Institute of Technology, 801 Atlantic Drive, CCB 315, 30332, Atlanta, GA, USA
James M. Rehg
Institute of Automation, National Laboratory of Pattern Recognition, Chinese Academy of Sciences, Zhong Quan Cun East Road 95, Haidian District, 100 190, Beijing, P.R. China
Zhanyi Hu

About this paper

Cite this paper

Chandra, S., Kumar, S., Jawahar, C.V. (2013). Learning Hierarchical Bag of Words Using Naive Bayes Clustering. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds) Computer Vision – ACCV 2012. ACCV 2012. Lecture Notes in Computer Science, vol 7724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37331-2_29

DOI: https://doi.org/10.1007/978-3-642-37331-2_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37330-5
Online ISBN: 978-3-642-37331-2
eBook Packages: Computer Science, Computer Science (R0)