Abstract
This paper augments the Bag-of-Words scheme in several respects: we incorporate a category label into the clustering process, build classifier-tailored codebooks, and weight codewords according to their probability of occurrence. A size-adaptive feature clustering algorithm is also proposed as an alternative to k-means. Experiments on the PASCAL VOC 2007 challenge validate the approach for both classical hard-assignment and VLAD encoding.
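To make the hard-assignment setting concrete, the following is a minimal sketch of a Bag-of-Words histogram whose bins are scaled by a per-codeword weight. The abstract does not give the weighting formula, so the negative log of the occurrence probability used here is an illustrative assumption (an IDF-like choice), not the authors' method. The function names and the toy two-word codebook are likewise hypothetical.

```python
import math


def nearest_codeword(descriptor, codebook):
    """Hard assignment: index of the codeword closest in squared Euclidean distance."""
    best_index, best_dist = 0, float("inf")
    for index, codeword in enumerate(codebook):
        dist = sum((d - c) ** 2 for d, c in zip(descriptor, codeword))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def occurrence_probabilities(training_descriptors, codebook):
    """Estimate how often each codeword is selected over a training set."""
    counts = [0] * len(codebook)
    for descriptor in training_descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    total = sum(counts)
    return [count / total for count in counts]


def weighted_bow_histogram(descriptors, codebook, weights):
    """Count hard assignments, scale each bin by its weight, then L1-normalise."""
    hist = [0.0] * len(codebook)
    for descriptor in descriptors:
        hist[nearest_codeword(descriptor, codebook)] += 1.0
    hist = [h * w for h, w in zip(hist, weights)]
    norm = sum(hist)
    return [h / norm for h in hist] if norm > 0 else hist


# Toy data (hypothetical): a two-word codebook over 2-D descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0)]
training = [(0.1, 0.0), (0.0, 0.2), (0.1, 0.1), (0.9, 1.0)]
probs = occurrence_probabilities(training, codebook)

# Assumed weighting: rarer codewords get larger weights (not taken from the paper).
weights = [-math.log(p) if p > 0 else 0.0 for p in probs]

image_descriptors = [(0.0, 0.1), (1.0, 0.9)]
hist = weighted_bow_histogram(image_descriptors, codebook, weights)
```

In this toy case each codeword receives one descriptor from the image, so the weighting alone decides which bin of the final histogram is larger: the codeword that was rarer in training ends up with the larger share.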
Acknowledgement
This publication has emanated from research conducted with the financial support of Science Foundation Ireland (SFI) under grant number SFI/12/RC/2289.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Trichet, R., O’Connor, N.E. (2016). Vector Quantization Enhancement for Computer Vision Tasks. In: Blanc-Talon, J., Distante, C., Philips, W., Popescu, D., Scheunders, P. (eds) Advanced Concepts for Intelligent Vision Systems. ACIVS 2016. Lecture Notes in Computer Science, vol 10016. Springer, Cham. https://doi.org/10.1007/978-3-319-48680-2_35
DOI: https://doi.org/10.1007/978-3-319-48680-2_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48679-6
Online ISBN: 978-3-319-48680-2
eBook Packages: Computer Science, Computer Science (R0)