Nested Sparse Quantization for Efficient Feature Coding

  • Xavier Boix
  • Gemma Roig
  • Christian Leistner
  • Luc Van Gool
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7573)


Many state-of-the-art methods in object recognition extract features from an image and encode them, followed by a pooling step and classification. Within this processing pipeline, the encoding step is often the bottleneck for both computational efficiency and performance. We present a novel assignment-based encoding formulation that allows assignment-based encoding and sparse coding to be fused into one formulation. We also use this formulation to design a new, very efficient encoding. At the heart of our formulation lies a quantization into a set of k-sparse vectors, which we denote as sparse quantization. We design the new encoding as two nested sparse quantizations. Its efficiency stems from leveraging bit-wise representations. In a series of experiments on standard recognition benchmarks, namely Caltech 101, PASCAL VOC 07 and ImageNet, we demonstrate that our method achieves results that are competitive with the state of the art while requiring orders of magnitude less time and memory. Our method is able to encode one million images using 4 CPUs in a single day, while maintaining good performance.
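The abstract does not spell out the quantizer, so the following is only a minimal sketch of what a single sparse quantization might look like, under stated assumptions: the k-sparse vectors are taken to be binary with exactly k ones, and distance is Euclidean. The function names `sparse_quantize` and `pack_bits` are hypothetical and not taken from the paper, and the nesting of two quantizations is not shown.

```python
import numpy as np

def sparse_quantize(x, k):
    """Return the binary vector with exactly k ones that is closest to x.

    Assumption (not from the abstract): the codebook is the set of all
    binary vectors with k ones, and closeness is Euclidean distance.
    """
    x = np.asarray(x, dtype=float)
    s = np.zeros(x.shape[0], dtype=np.uint8)
    # For binary s with k ones, ||s - x||^2 = ||x||^2 - 2 * sum(x[support]) + k,
    # so the minimizer places its ones on the k largest entries of x.
    top = np.argpartition(-x, k - 1)[:k]
    s[top] = 1
    return s

def pack_bits(s):
    """Pack a binary vector into one Python integer (bit i holds s[i])."""
    code = 0
    for i, bit in enumerate(s):
        if bit:
            code |= 1 << i
    return code

x = [0.1, 0.9, 0.3, 0.7]
s = sparse_quantize(x, k=2)
code = pack_bits(s)
```

Storing the result as an integer is one way to see why bit-wise representations can save memory: a q-dimensional binary code occupies q bits instead of q floating-point values.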


  • Quantization Error
  • Sparse Code
  • Feature Code
  • Codebook Size
  • Feature Encode
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Xavier Boix (1)
  • Gemma Roig (1)
  • Christian Leistner (1)
  • Luc Van Gool (1, 2)
  1. Computer Vision Lab, ETH Zurich, Switzerland
  2. KU Leuven, Belgium
