Abstract
Importance sampling is a technique originating in Monte Carlo simulation, whereby one samples from a different, weighted distribution in order to reduce the variance of the resulting estimator. More recently, variations of importance sampling have emerged as a means of reducing computational and sample complexity in several problems of modern signal processing. Here we review importance sampling as it is manifested in three such problems: stochastic optimization, compressive sensing, and low-rank matrix approximation. In keeping with a general trend in convex optimization towards analyzing phase transitions for exact recovery, importance sampling in compressive sensing and low-rank matrix recovery can effectively push the phase transition for exact recovery towards fewer measurements.
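As a minimal numerical illustration of the variance-reduction principle described above (a sketch only; the target, proposal, and integrand are illustrative choices, not taken from the chapter), one can estimate a small tail probability under a standard normal target by sampling from a mean-shifted proposal and reweighting by the likelihood ratio:

```python
# Minimal importance-sampling sketch: estimate E_p[f(X)] = P(X > 3)
# for a standard normal target p, using a shifted-normal proposal q.
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Rare-event-style integrand: indicator of a far tail event.
    return (x > 3.0).astype(float)

n = 100_000

# Plain Monte Carlo: sample directly from the target p = N(0, 1).
x_p = rng.standard_normal(n)
plain_est = f(x_p).mean()

# Importance sampling: sample from the proposal q = N(3, 1), which
# concentrates mass where f is nonzero, then reweight each sample by
# the likelihood ratio p(x)/q(x) so the estimator stays unbiased.
x_q = rng.normal(loc=3.0, scale=1.0, size=n)
log_w = -0.5 * x_q**2 + 0.5 * (x_q - 3.0)**2   # log p(x) - log q(x)
is_est = (np.exp(log_w) * f(x_q)).mean()

print(plain_est, is_est)  # both estimate P(X > 3), about 1.35e-3
```

With the proposal centered on the rare event, nearly every sample is informative, so the importance-sampling estimate typically reaches the true value with far fewer samples, and far lower variance, than plain Monte Carlo.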
Notes
- 1. The unknown might also be higher-dimensional, often three-dimensional, but the ideas are analogous and we focus on the 2D example for simplicity.
- 2. This becomes the trace norm for positive semidefinite matrices; it is now well recognized as a convex surrogate for rank minimization (a short derivation follows these notes).
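To make note 2 concrete, here is the standard identity, sketched rather than quoted from the chapter: for positive semidefinite X the nuclear norm reduces to the trace, which is what licenses nuclear-norm minimization as a convex relaxation of rank minimization.

```latex
% For X \succeq 0, singular values coincide with eigenvalues, so the
% nuclear (trace) norm reduces to the trace:
\|X\|_* = \sum_i \sigma_i(X) = \sum_i \lambda_i(X) = \operatorname{tr}(X),
\qquad X \succeq 0.
% The convex relaxation replaces the nonconvex rank objective:
\min_X \operatorname{rank}(X) \ \text{s.t.}\ \mathcal{A}(X) = b
\quad \rightsquigarrow \quad
\min_X \|X\|_* \ \text{s.t.}\ \mathcal{A}(X) = b.
```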
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Ward, R. (2015). Importance sampling in signal processing applications. In: Balan, R., Begué, M., Benedetto, J., Czaja, W., Okoudjou, K. (eds) Excursions in Harmonic Analysis, Volume 4. Applied and Numerical Harmonic Analysis. Birkhäuser, Cham. https://doi.org/10.1007/978-3-319-20188-7_8
DOI: https://doi.org/10.1007/978-3-319-20188-7_8
Publisher Name: Birkhäuser, Cham
Print ISBN: 978-3-319-20187-0
Online ISBN: 978-3-319-20188-7
eBook Packages: Mathematics and Statistics (R0)