
Compressive Sensing and Neural Networks from a Statistical Learning Perspective

Chapter in: Compressed Sensing in Information Processing

Abstract

Various iterative reconstruction algorithms for inverse problems can be unfolded as neural networks. Empirically, this approach has often led to improved results, but theoretical guarantees are still scarce. While some progress on generalization properties of neural networks has been made, great challenges remain. In this chapter, we discuss and combine these topics to present a generalization error analysis for a class of neural networks suitable for sparse reconstruction from few linear measurements. The hypothesis class considered is inspired by the classical iterative soft-thresholding algorithm (ISTA). The neural networks in this class are obtained by unfolding iterations of ISTA and learning some of the weights. Based on training samples, we aim to learn the optimal network parameters via empirical risk minimization and thereby the optimal network that reconstructs signals from their compressive linear measurements. In particular, we may learn a sparsity basis that is shared by all of the iterations/layers and thereby obtain a new approach to dictionary learning. For this class of networks, we present a generalization bound, which is based on bounding the Rademacher complexity of hypothesis classes consisting of such deep networks via Dudley's integral. Remarkably, under realistic conditions, the generalization error scales only logarithmically in the number of layers and at most linearly in the number of measurements.
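To make the unfolding idea concrete, the following is a minimal NumPy sketch (not the authors' implementation) of the forward pass of an ISTA network unrolled for a fixed number of layers, with a dictionary shared across all layers. The names A, Phi, tau, step, and T are illustrative placeholders; in the setting described above, Phi (and possibly the threshold) would then be learned by empirical risk minimization over training pairs, while the sketch only shows the map that defines the hypothesis class.

```python
import numpy as np

def soft_threshold(v, tau):
    """Elementwise soft-thresholding, the proximal map of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def unfolded_ista(y, A, Phi, tau, T, step):
    """Forward pass of an ISTA network unrolled for T layers.

    y    : measurements, shape (m,)
    A    : measurement matrix, shape (m, n)
    Phi  : dictionary shared by all layers (the learnable part), shape (n, n)
    tau  : soft-thresholding parameter
    T    : number of unrolled iterations / layers
    step : gradient step size, e.g. 1 / ||A Phi||_2^2

    Returns the reconstruction x = Phi z, where z is the sparse code.
    """
    APhi = A @ Phi
    z = np.zeros(Phi.shape[1])
    for _ in range(T):
        # one layer: gradient step on ||y - A Phi z||_2^2, then shrinkage
        z = soft_threshold(z + step * APhi.T @ (y - APhi @ z), tau)
    return Phi @ z

# Toy usage: recover a signal that is sparse in an (here orthogonal) dictionary Phi.
rng = np.random.default_rng(0)
m, n, T = 50, 120, 20
A = rng.standard_normal((m, n)) / np.sqrt(m)
Phi = np.linalg.qr(rng.standard_normal((n, n)))[0]   # orthogonal dictionary
z_true = np.zeros(n)
z_true[rng.choice(n, 8, replace=False)] = rng.standard_normal(8)
y = A @ (Phi @ z_true)
step = 1.0 / np.linalg.norm(A @ Phi, 2) ** 2
x_hat = unfolded_ista(y, A, Phi, tau=0.01, T=T, step=step)
```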



Acknowledgements

The authors would like to thank Sebastian Lubjuhn for proofreading an earlier version of this paper and giving valuable suggestions for improvement. The third author acknowledges funding from the Deutsche Forschungsgemeinschaft (DFG) through the project Structured Compressive Sensing via Neural Network Learning (SCoSNeL, MA 1184/36-1) within the SPP 1798 Compressed Sensing in Information Processing (CoSIP).

Author information

Correspondence to Ekkehard Schnoor.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter


Cite this chapter

Behboodi, A., Rauhut, H., Schnoor, E. (2022). Compressive Sensing and Neural Networks from a Statistical Learning Perspective. In: Kutyniok, G., Rauhut, H., Kunsch, R.J. (eds) Compressed Sensing in Information Processing. Applied and Numerical Harmonic Analysis. Birkhäuser, Cham. https://doi.org/10.1007/978-3-031-09745-4_8
