Cognitive Computation, Volume 11, Issue 5, pp 697–710

Joint Sparse Regularization for Dictionary Learning

  • Jianyu Miao
  • Heling Cao
  • Xiao-Bo Jin
  • Rongrong Ma
  • Xuan Fei
  • Lingfeng Niu


As a powerful data representation framework, dictionary learning has emerged in many domains, including machine learning, signal processing, and statistics. Most existing dictionary learning methods use the ℓ0 or ℓ1 norm as regularization to promote sparsity, which neglects the redundant information in the dictionary. In this paper, a class of joint sparse regularizers is introduced into dictionary learning, leading to a compact dictionary. Unlike previous works, which obtain the sparse representation of each sample independently, we consider all representations over the dictionary simultaneously. An efficient iterative solver based on the ConCave-Convex Procedure (CCCP) framework and the Lagrangian dual is developed to tackle the resulting model. Further, building on dictionary learning with joint sparse regularization, we consider a multi-layer structure, which can extract more abstract representations of the data. Numerical experiments are conducted on several publicly available datasets. The experimental results demonstrate the effectiveness of joint sparse regularization for dictionary learning.
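To illustrate why a joint penalty can favor a compact dictionary where an entrywise penalty does not, the following is a minimal sketch, not the authors' model. The abstract does not state which regularizers the paper uses, so the row-wise ℓ2,1 norm is assumed here as one common example of a joint sparse penalty. The convention that the data matrix is approximated by a dictionary times a coefficient matrix A, with row i of A holding the coefficients of atom i across all samples, is also an assumption, as are the function names and the toy matrices.

```python
import numpy as np


def l1_norm(A):
    """Entrywise l1 norm: penalizes each coefficient independently."""
    return float(np.sum(np.abs(A)))


def l21_norm(A):
    """Row-wise l2,1 norm: sum of the Euclidean norms of the rows of A.

    Assumed here as an example of a joint sparse penalty; it couples
    the coefficients of one atom across all samples.
    """
    return float(np.sum(np.linalg.norm(A, axis=1)))


def used_atoms(A, tol=1e-12):
    """Indices of atoms (rows of A) used by at least one sample."""
    return np.flatnonzero(np.linalg.norm(A, axis=1) > tol)


# Hypothetical coefficient matrices: 4 atoms (rows) x 3 samples (columns).
# Both have exactly one nonzero coefficient per sample.
A_scattered = np.array([[1.0, 0.0, 0.0],
                        [0.0, 1.0, 0.0],
                        [0.0, 0.0, 1.0],
                        [0.0, 0.0, 0.0]])  # each sample uses a different atom
A_shared = np.array([[1.0, 1.0, 1.0],
                     [0.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0]])     # all samples share one atom

for name, A in [("scattered", A_scattered), ("shared", A_shared)]:
    print(name, l1_norm(A), round(l21_norm(A), 3), used_atoms(A).tolist())
```

In this toy case the entrywise ℓ1 norm is the same for both matrices, while the ℓ2,1 norm is smaller for the matrix whose samples share a single atom. Atoms whose rows are entirely zero are never used and could be dropped from the dictionary, which is the sense in which a row-sparse coefficient matrix corresponds to a compact dictionary.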


Keywords: Dictionary learning; Joint sparse regularization; Multi-layer structure


Funding Information

This work was supported by the National Natural Science Foundation of China (11671379, 61602154, and U1804159), by the High-level Talent Fund Project of Henan University of Technology (31401155), by the Natural Science Research Project of the Henan Provincial Department of Science and Technology (182102210092), and by the Fundamental Research Funds for Henan Provincial Colleges and Universities in Henan University of Technology (2016RCJH06).

Compliance with Ethical Standards

Conflict of Interest

The authors declare that they have no conflict of interest.

Ethical Approval

This article does not contain any studies with human participants performed by any of the authors.


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Jianyu Miao (1)
  • Heling Cao (1)
  • Xiao-Bo Jin (2)
  • Rongrong Ma (3, 4)
  • Xuan Fei (1)
  • Lingfeng Niu (4, 5), corresponding author

  1. College of Information Science and Engineering, Henan University of Technology, Zhengzhou, China
  2. Department of Electrical and Electronic Engineering, Xi’an Jiaotong-Liverpool University, Suzhou, China
  3. School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
  4. Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing, China
  5. School of Economics and Management, University of Chinese Academy of Sciences, Beijing, China
