International Journal of Computer Vision, Volume 114, Issue 2–3, pp 306–321

Dictionary Learning for Fast Classification Based on Soft-thresholding

Abstract

Classifiers based on sparse representations have recently been shown to provide excellent results in many visual recognition and classification tasks. However, the high cost of computing sparse representations at test time is a major obstacle that limits the applicability of these methods in large-scale problems, or in scenarios where computational power is restricted. We consider in this paper a simple yet efficient alternative to sparse coding for feature extraction. We study a classification scheme that applies the soft-thresholding nonlinear mapping in a dictionary, followed by a linear classifier. A novel supervised dictionary learning algorithm tailored for this low complexity classification architecture is proposed. The dictionary learning problem, which jointly learns the dictionary and linear classifier, is cast as a difference of convex (DC) program and solved efficiently with an iterative DC solver. We conduct experiments on several datasets, and show that our learning algorithm that leverages the structure of the classification problem outperforms generic learning procedures. Our simple classifier based on soft-thresholding also competes with the recent sparse coding classifiers, when the dictionary is learned appropriately. The adopted classification scheme further requires less computational time at the testing stage, compared to other classifiers. The proposed scheme shows the potential of the adequately trained soft-thresholding mapping for classification and paves the way towards the development of very efficient classification methods for vision problems.
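The classification scheme described above (a soft-thresholding map applied to the data's dictionary coefficients, followed by a linear classifier) can be sketched in a few lines. This is an illustrative sketch only, not the paper's implementation: the dictionary `D`, threshold `alpha`, and classifier weights `W` are assumed to have been learned beforehand (the paper learns `D` and the classifier jointly via a DC program), and all names here are hypothetical.

```python
import numpy as np

def soft_threshold_features(X, D, alpha):
    """Encode samples by soft-thresholding their dictionary projections.

    X     : (n_samples, n_dims) data matrix
    D     : (n_dims, n_atoms) dictionary with atoms as columns
    alpha : nonnegative threshold

    Returns the nonnegative feature map max(X @ D - alpha, 0),
    i.e. a one-sided soft-thresholding (a rectifier with bias).
    """
    return np.maximum(X @ D - alpha, 0.0)

def predict(X, D, alpha, W, b):
    """Linear classification on top of the soft-thresholded features."""
    scores = soft_threshold_features(X, D, alpha) @ W + b
    return np.argmax(scores, axis=1)
```

At test time the cost is a single matrix product and an elementwise maximum, which is what makes this encoder far cheaper than solving a sparse coding problem (e.g. a lasso) per sample.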

Keywords

Dictionary learning · Soft-thresholding · Sparse coding · Rectifier linear units · Neural networks


Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Signal Processing Laboratory (LTS4), Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  2. IDCOM, The University of Edinburgh, Edinburgh, UK
