Fast Approximations to Structured Sparse Coding and Applications to Object Classification

  • Arthur Szlam
  • Karol Gregor
  • Yann LeCun
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7576)

Abstract

We describe a method for fast approximation of sparse coding. A given input vector is passed through a binary tree. Each leaf of the tree contains a subset of dictionary elements; the coefficients corresponding to these dictionary elements are allowed to be nonzero, and their values are computed quickly by multiplication with a precomputed pseudoinverse. The tree parameters, the dictionary, and the subsets of the dictionary corresponding to each leaf are all learned. In the course of describing this algorithm, we discuss the more general problem of learning the groups in group-structured sparse modeling. We show that our method produces good sparse representations by using it in the object recognition framework of [1,2]. With our own fast implementation of the SIFT descriptor, the whole system runs at 20 frames per second on 321×481 images on a laptop with a quad-core CPU, while sacrificing very little accuracy on the Caltech 101, Caltech 256, and 15 Scenes benchmarks.
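The inference step outlined above (route the input down a learned binary tree, then multiply by a precomputed pseudoinverse restricted to the leaf's dictionary atoms) can be summarized in a short NumPy sketch. This is illustrative only, not the authors' implementation: the hyperplane splits, dictionary, leaf index sets, and pseudoinverses below are hypothetical stand-ins for the learned quantities described in the paper.

    # Minimal sketch (not the authors' code) of tree-routed sparse coding:
    # descend to a leaf, then solve least squares on that leaf's atom subset
    # via a precomputed pseudoinverse. All parameters are stand-ins.
    import numpy as np

    class Node:
        def __init__(self, w=None, b=0.0, left=None, right=None,
                     support=None, pinv=None):
            self.w, self.b = w, b              # hyperplane split (internal nodes)
            self.left, self.right = left, right
            self.support = support             # dictionary-atom indices (leaves)
            self.pinv = pinv                   # precomputed pseudoinverse (leaves)

    def encode(x, root, dict_size):
        """Route x to a leaf; only that leaf's coefficients may be nonzero."""
        node = root
        while node.support is None:            # descend to a leaf
            node = node.left if node.w @ x + node.b <= 0 else node.right
        z = np.zeros(dict_size)
        z[node.support] = node.pinv @ x        # one fast matrix-vector product
        return z

    # Toy usage: a depth-1 tree over a random dictionary (illustration only).
    rng = np.random.default_rng(0)
    D = rng.standard_normal((64, 256))         # 256 atoms in R^64
    leaves = [Node(support=s, pinv=np.linalg.pinv(D[:, s]))
              for s in (rng.choice(256, 8, replace=False) for _ in range(2))]
    root = Node(w=rng.standard_normal(64), b=0.0,
                left=leaves[0], right=leaves[1])
    z = encode(rng.standard_normal(64), root, 256)
    print(np.count_nonzero(z))                 # at most 8 nonzero coefficients

The cost per input is a handful of dot products for routing plus one small matrix-vector product at the leaf, which is what makes the approximation fast compared to solving a full sparse coding problem.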

References

  1. Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In: CVPR 2006 (2006)
  2. Yang, J., Yu, K., Gong, Y., Huang, T.: Linear spatial pyramid matching using sparse coding for image classification. In: CVPR 2009 (2009)
  3. Olshausen, B., Field, D.: Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381, 607–609 (1996)
  4. Aharon, M., Elad, M., Bruckstein, A.: K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing 54, 4311–4322 (2006)
  5. Kavukcuoglu, K., Ranzato, M., LeCun, Y.: Fast inference in sparse coding algorithms with applications to object recognition. Technical Report CBLL-TR-2008-12-01, Computational and Biological Learning Lab, Courant Institute, NYU (2008)
  6. Yang, J., Yu, K., Huang, T.: Efficient Highly Over-Complete Sparse Coding Using a Mixture Model. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part V. LNCS, vol. 6315, pp. 113–126. Springer, Heidelberg (2010)
  7. Boureau, Y., Bach, F., LeCun, Y., Ponce, J.: Learning mid-level features for recognition. In: Proc. International Conference on Computer Vision and Pattern Recognition (CVPR 2010). IEEE (2010)
  8. Boureau, Y., Le Roux, N., Bach, F., Ponce, J., LeCun, Y.: Ask the locals: multi-way local pooling for image recognition. In: International Conference on Computer Vision (2011)
  9. Pati, Y.C., Rezaiifar, R., Krishnaprasad, P.S.: Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition. In: Proceedings of the 27th Annual Asilomar Conference on Signals, Systems, and Computers, pp. 40–44 (1993)
  10. Jenatton, R., Mairal, J., Obozinski, G., Bach, F.: Proximal methods for sparse hierarchical dictionary learning. In: International Conference on Machine Learning (ICML) (2010)
  11. Kim, S., Xing, E.P.: Tree-guided group lasso for multi-task regression with structured sparsity. In: ICML, pp. 543–550 (2010)
  12. Jacob, L., Obozinski, G., Vert, J.P.: Group lasso with overlap and graph lasso. In: Proceedings of the 26th Annual International Conference on Machine Learning (ICML 2009), pp. 433–440. ACM, New York (2009)
  13. Baraniuk, R.G., Cevher, V., Duarte, M.F., Hegde, C.: Model-Based Compressive Sensing (2009)
  14. Lloyd, S.P.: Least squares quantization in PCM. IEEE Transactions on Information Theory 28, 129–137 (1982)
  15. Gilbert, A.C., Strauss, M.J., Tropp, J.A.: Simultaneous sparse approximation via greedy pursuit. In: Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2005), vol. 5, pp. 721–724 (2005)
  16. Tropp, J.: Topics in Sparse Approximation. PhD thesis, University of Texas at Austin, Computational and Applied Mathematics (2004)
  17. Ostrovsky, R., Rabani, Y., Schulman, L., Swamy, C.: The effectiveness of Lloyd-type methods for the k-means problem. In: FOCS 2006 (2006)
  18. Dasgupta, S., Freund, Y.: Random projection trees and low dimensional manifolds. In: STOC 2008 (2008)
  19. Gionis, A., Indyk, P., Motwani, R.: Similarity search in high dimensions via hashing. In: Proceedings of the 25th International Conference on Very Large Data Bases (VLDB 1999), pp. 518–529 (1999)
  20. Huang, J., Zhang, T., Metaxas, D.N.: Learning with structured sparsity. In: ICML, p. 53 (2009)
  21. Vidal, R.: Subspace clustering. IEEE Signal Processing Magazine 28, 52–68 (2011)
  22. Wang, F., Lee, N., Sun, J., Hu, J., Ebadollahi, S.: Automatic group sparse coding. In: Burgard, W., Roth, D. (eds.) AAAI. AAAI Press (2011)
  23. Ramírez, I., Sprechmann, P., Sapiro, G.: Classification and clustering via dictionary learning with structured incoherence and shared features. In: CVPR, pp. 3501–3508 (2010)
  24. Allard, W., Chen, G., Maggioni, M.: Multiscale geometric methods for data sets II: Geometric multi-resolution analysis. Applied and Computational Harmonic Analysis 32, 435–462 (2012)
  25. Mairal, J., Bach, F., Ponce, J., Sapiro, G., Zisserman, A.: Non-local sparse models for image restoration. In: International Conference on Computer Vision (2009)
  26. Gregor, K., LeCun, Y.: Learning fast approximations of sparse coding. In: International Conference on Machine Learning (ICML) (2010)
  27. Fan, R.E., Chang, K.W., Hsieh, C.J., Wang, X.R., Lin, C.J.: LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research 9, 1871–1874 (2008)
  28. Fei-Fei, L., Fergus, R., Perona, P.: Caltech 101, http://www.vision.caltech.edu/Image_Datasets/Caltech101/
  29. Griffin, G., Holub, A., Perona, P.: Caltech-256 object category dataset. Technical Report 7694, California Institute of Technology (2007)
  30. Lazebnik, S., Schmid, C., Ponce, J., Li, F., Oliva, A.: 15 scenes, http://www-cvr.ai.uiuc.edu/ponce_grp/data/
  31. Mairal, J.: SPAMS sparse coding toolbox, http://www.di.ens.fr/willow/SPAMS/
  32. Gao, S., Tsang, I.W.-H., Chia, L.-T., Zhao, P.: Local features are not lonely – Laplacian sparse coding for image classification. In: CVPR 2010 (2010)

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Arthur Szlam (1)
  • Karol Gregor (2)
  • Yann LeCun (3)
  1. City College of New York, USA
  2. Howard Hughes Medical Institute, USA
  3. New York University, USA
