
International Journal of Computer Vision, Volume 109, Issue 3, pp 209–232

Sparse Representation Based Fisher Discrimination Dictionary Learning for Image Classification

  • Meng Yang
  • Lei Zhang (Email author)
  • Xiangchu Feng
  • David Zhang
Article

Abstract

The dictionary employed plays an important role in image reconstruction and classification based on sparse representation (sparse coding), and learning dictionaries from the training data has led to state-of-the-art results in image classification tasks. However, many dictionary learning models exploit the discriminative information in only the representation coefficients or only the representation residual, which limits their performance. In this paper we present a novel dictionary learning method based on the Fisher discrimination criterion. A structured dictionary, whose atoms correspond to the subject class labels, is learned so that the representation residual can be used to distinguish different classes and, in addition, the representation coefficients have small within-class scatter and large between-class scatter. The classification scheme associated with the proposed Fisher discrimination dictionary learning (FDDL) model then exploits the discriminative information in both the representation residual and the representation coefficients. The proposed FDDL model is extensively evaluated on various image datasets and shows superior performance to many state-of-the-art dictionary learning methods in a variety of classification tasks.
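As a rough illustration of the two ingredients named in the abstract, the NumPy sketch below computes the within-class and between-class scatter of a set of coding coefficients, and scores a test sample by combining a class-specific representation residual with the distance of its coefficients to each class mean. It is a minimal sketch under stated assumptions, not the authors' formulation: the function names `fisher_scatter` and `classify`, the weight `w`, and the use of squared Euclidean distances are all hypothetical choices made for illustration, and the sparse coding step that would produce the coefficients is assumed to have been done elsewhere.

```python
import numpy as np

def fisher_scatter(X, labels):
    """Return (tr(S_W), tr(S_B)) for coefficients X, one column per sample.

    Assumption: scatter is measured with squared Euclidean distances.
    """
    labels = np.asarray(labels)
    global_mean = X.mean(axis=1, keepdims=True)
    within, between = 0.0, 0.0
    for c in np.unique(labels):
        Xc = X[:, labels == c]                      # coefficients of class c
        class_mean = Xc.mean(axis=1, keepdims=True)
        within += np.sum((Xc - class_mean) ** 2)    # spread inside the class
        between += Xc.shape[1] * np.sum((class_mean - global_mean) ** 2)
    return within, between

def classify(y, D, atom_labels, class_means, alpha, w=0.5):
    """Pick the class with the smallest residual-plus-coefficient score.

    y: test sample, D: dictionary (columns are atoms), atom_labels: class
    label of each atom, class_means: dict mapping class -> mean coefficient
    vector, alpha: coefficients of y over D (assumed precomputed),
    w: hypothetical weight balancing the two terms.
    """
    atom_labels = np.asarray(atom_labels)
    scores = {}
    for c, mean_c in class_means.items():
        mask = atom_labels == c
        # Residual when y is rebuilt from the atoms of class c only.
        residual = np.sum((y - D[:, mask] @ alpha[mask]) ** 2)
        # Distance of the full coefficient vector to the class mean.
        coeff_dist = np.sum((alpha - mean_c) ** 2)
        scores[c] = residual + w * coeff_dist
    return min(scores, key=scores.get)
```

A dictionary whose coefficients give a small first return value and a large second return value from `fisher_scatter` matches the "small within-class scatter, large between-class scatter" property described above; `classify` shows one way the residual and the coefficients could both contribute to the decision.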

Keywords

Dictionary learning · Sparse representation · Fisher criterion · Image classification

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Meng Yang (1)
  • Lei Zhang (2), Email author
  • Xiangchu Feng (3)
  • David Zhang (2)

  1. College of Computer Science & Software Engineering, Shenzhen University, Shenzhen, China
  2. Department of Computing, The Hong Kong Polytechnic University, Hong Kong, China
  3. Department of Applied Mathematics, Xidian University, Xi’an, China