Machine Learning, Volume 108, Issue 8–9, pp 1307–1327

LSALSA: accelerated source separation via learned sparse coding

  • Benjamin Cowen
  • Apoorva Nandini Saridena
  • Anna Choromanska
Article
Part of the following topical collections:
  1. Special Issue of the ECML PKDD 2019 Journal Track

Abstract

We propose an efficient algorithm for the generalized sparse coding (SC) inference problem. The proposed framework applies both to the single-dictionary setting, where each data point is represented as a sparse combination of the columns of one dictionary matrix, and to the multiple-dictionary setting of morphological component analysis (MCA), where the goal is to separate a signal into additive parts such that each part has a distinct sparse representation within an appropriately chosen corresponding dictionary. Both the SC task and its generalization via MCA are classically cast as \(\ell_1\)-regularized optimization problems that minimize a quadratic reconstruction error. To accelerate the traditional acquisition of sparse codes, we propose a deep learning architecture that constitutes a trainable, time-unfolded version of the Split Augmented Lagrangian Shrinkage Algorithm (SALSA), a special case of the alternating direction method of multipliers (ADMM). We empirically validate both variants of the algorithm, which we refer to as Learned-SALSA (LSALSA), on computer vision tasks and demonstrate that at inference time our networks achieve vast improvements in running time and in the quality of the estimated sparse codes over common baselines, on both classic SC and MCA problems. We also demonstrate the visual advantage of our technique on the task of source separation. Finally, we present a theoretical framework for analyzing the LSALSA network: we show that the proposed approach exactly implements a truncated ADMM applied to a new, learned cost function whose curvature is modified by one of the learned parameterized matrices. We extend a recent stochastic alternating optimization analysis framework to show that a gradient descent step along this learned loss landscape is equivalent to a modified gradient descent step along the original loss landscape. Within this framework, the acceleration achieved by LSALSA may be explained by the network's ability to learn a correction to the gradient direction that yields steeper descent.
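To make the two problem statements concrete, a standard formulation reads as follows (generic notation assumed here, not taken from the paper: \(D, D_1, D_2\) are dictionaries, \(z, z_1, z_2\) sparse codes, \(\alpha\) sparsity weights):

```latex
% Single-dictionary sparse coding of a signal x over dictionary D
\min_{z}\; \frac{1}{2}\,\lVert x - D z \rVert_2^2 + \alpha \lVert z \rVert_1

% MCA: separate x into additive parts, each sparse in its own dictionary
\min_{z_1, z_2}\; \frac{1}{2}\,\lVert x - D_1 z_1 - D_2 z_2 \rVert_2^2
  + \alpha_1 \lVert z_1 \rVert_1 + \alpha_2 \lVert z_2 \rVert_1
```

The sketch below illustrates the unrolling idea in PyTorch: SALSA's ADMM iterations for the single-dictionary problem are unfolded into a fixed number of layers whose matrices and threshold become trainable parameters. The class name, the particular parameterization \(A \approx (D^\top D + \mu I)^{-1} D^\top\) and \(B \approx \mu (D^\top D + \mu I)^{-1}\), and the weight sharing across layers are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def soft_threshold(v, lam):
    """Elementwise soft-thresholding: the proximal operator of lam*||.||_1."""
    return torch.sign(v) * F.relu(v.abs() - lam)

class UnrolledSALSA(nn.Module):
    """Hypothetical T-layer unrolling of SALSA (an ADMM instance) for
    min_z 0.5*||x - D z||_2^2 + alpha*||z||_1. Matrices are initialized
    from D, as classical SALSA prescribes, and then trained end-to-end;
    the exact parameterization in the paper may differ."""
    def __init__(self, D, alpha=0.1, mu=1.0, n_layers=5):
        super().__init__()
        n = D.shape[1]
        G = torch.linalg.inv(D.T @ D + mu * torch.eye(n))
        self.A = nn.Parameter(G @ D.T)         # maps input x into code space
        self.B = nn.Parameter(mu * G)          # mixes the split variable
        self.theta = nn.Parameter(torch.tensor(alpha / mu))  # threshold
        self.n_layers = n_layers

    def forward(self, x):                       # x: (batch, signal_dim)
        u = x.new_zeros(x.shape[0], self.B.shape[0])
        d = torch.zeros_like(u)
        for _ in range(self.n_layers):          # weights shared across layers
            z = x @ self.A.T + (u - d) @ self.B.T  # quadratic sub-problem
            u = soft_threshold(z + d, self.theta)  # l1 proximal step
            d = d + z - u                          # Lagrange-multiplier update
        return u
```

In the LISTA line of work that this follows (Gregor and LeCun 2010), such unrolled networks are typically trained to regress sparse codes obtained by running the iterative algorithm to convergence, so that a few learned layers approximate many classical iterations.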

Keywords

Sparse coding · Morphological component analysis · Deep learning

Notes

Supplementary material

Supplementary material 1 (PDF, 4.1 MB): 10994_2019_5812_MOESM1_ESM.pdf


Copyright information

© The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature 2019

Authors and Affiliations

  1. New York University Tandon School of Engineering, Brooklyn, USA
