Dictionary Learning Algorithms and Applications, pp. 115–144

# Other Views on the DL Problem

## Abstract

The dictionary learning (DL) problem can be posed in different ways, as we have already seen. In this chapter we first look at the DL problem where the sparsity level is not bounded for each training signal; instead, we bound the average sparsity level. This allows better overall representation power, because the nonzeros can be placed where they are most needed. The simplest way to pose the problem is to combine the error objective with an *ℓ*₁ penalty that encourages sparsity in the whole representation matrix *X*. Several algorithms can solve this problem; we present those based on coordinate descent in AK-SVD style, on majorization, and on proximal gradient. The latter approach can also be used with a 0-norm penalization. Other modifications of the objective include the addition of a regularization term (elastic net) or of a coherence penalty. Another view is given by task-driven DL, where the optimization objective is taken directly from the application and the sparse representation is only an intermediary tool. Returning to the standard DL problem, we present two new types of algorithms. One is based on selection: the atoms are chosen from a pool of candidates and so are no longer free variables. The other is online DL, where the training signals are assumed to be available in small batches and the dictionary is updated for each batch; online DL can thus adapt the dictionary to a time-varying set of signals, following the behavior of the generating source. Two online algorithms are presented, one based on coordinate descent, the other inspired by the classic recursive least squares (RLS) algorithm. Finally, we tackle the DL problem with incomplete data, where some of the signal elements are missing, and present a version of AK-SVD suited to this situation.
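To make the *ℓ*₁-penalized formulation and the proximal gradient approach more concrete, the following is a minimal sketch of the sparse coding step only, with the dictionary held fixed. It assumes the objective 0.5·‖Y − DX‖²_F + λ‖X‖₁, where ‖X‖₁ is the sum of absolute values of all entries of *X*; the exact scaling used in the chapter may differ. The function names (`soft_threshold`, `ista_sparse_coding`, `objective`), the step-size choice, and the random test data are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

def soft_threshold(Z, tau):
    """Proximal operator of tau * ||.||_1: entrywise soft-thresholding."""
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)

def objective(Y, D, X, lam):
    """Assumed objective: 0.5 * ||Y - D X||_F^2 + lam * sum(|X|)."""
    return 0.5 * np.linalg.norm(Y - D @ X, "fro") ** 2 + lam * np.abs(X).sum()

def ista_sparse_coding(Y, D, lam, n_iter=200):
    """Proximal gradient (ISTA) iterations on X with the dictionary D fixed."""
    # Step size 1/L, where L = ||D||_2^2 bounds the Lipschitz constant
    # of the gradient of the quadratic error term.
    L = np.linalg.norm(D, 2) ** 2
    X = np.zeros((D.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        grad = D.T @ (D @ X - Y)                   # gradient of the error term
        X = soft_threshold(X - grad / L, lam / L)  # proximal step for the l1 term
    return X

# Hypothetical data: 20 signals of size 8, dictionary with 16 unit-norm atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((8, 16))
D /= np.linalg.norm(D, axis=0)
Y = rng.standard_normal((8, 20))

X = ista_sparse_coding(Y, D, lam=0.5)
nonzeros_per_signal = np.count_nonzero(X, axis=0)
```

Because the penalty acts on the whole matrix *X* rather than on each column separately, `nonzeros_per_signal` is free to vary from one signal to another; only the overall sparsity is controlled, through `lam`. A full DL algorithm would alternate this step with a dictionary update, which is omitted here.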
