Abstract
The randomized sparse Kaczmarz method, designed to seek sparse solutions of linear systems \(Ax=b\), selects the i-th projection hyperplane with probability proportional to \(\Vert a_{i}\Vert _2^2\), where \(a_{i}^{\mathrm{T}}\) is the i-th row of A. In this work, we propose a weighted randomized sparse Kaczmarz method, which selects the i-th projection hyperplane with probability proportional to \(|\langle a_{i},x_{k}\rangle -b_{i}|^p\), where \(0<p<\infty \), for possible acceleration. The parameter p bridges the randomized Kaczmarz method and the greedy Kaczmarz method. Theoretically, we establish a linear convergence rate in expectation with respect to the Bregman distance, in both the noiseless and noisy cases, which is at least as good as that of the randomized sparse Kaczmarz method. The superiority of the proposed method is demonstrated by a group of numerical experiments.
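To make the sampling rule concrete, the following minimal Python sketch combines the residual-weighted row selection above with the soft-shrinkage update of the randomized sparse Kaczmarz method. The function names, the sparsity parameter `lam`, and the fixed (inexact) step size are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def soft_shrink(z, lam):
    """Soft-thresholding operator S_lam(z) = sign(z) * max(|z| - lam, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def wrask(A, b, p=2.0, lam=1.0, iters=20000, seed=None):
    """Sketch of a weighted randomized sparse Kaczmarz iteration.

    Row i is sampled with probability proportional to |<a_i, x_k> - b_i|^p;
    p near 0 approaches residual-independent sampling, while large p
    approaches the greedy max-residual rule.  The sparse update is the
    soft-shrinkage (linearized Bregman) step with sparsity parameter lam.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_sq = np.einsum('ij,ij->i', A, A)   # squared row norms ||a_i||_2^2
    z = np.zeros(n)                        # dual variable
    x = soft_shrink(z, lam)                # primal iterate
    for _ in range(iters):
        r = A @ x - b                      # residual
        w = np.abs(r) ** p                 # sampling weights
        s = w.sum()
        if s < 1e-28:                      # system (numerically) solved
            break
        i = rng.choice(m, p=w / s)
        z -= (r[i] / row_sq[i]) * A[i]     # Kaczmarz-type step on row i
        x = soft_shrink(z, lam)
    return x
```

On a consistent system with a sparse solution, the iterates approach that solution; with `lam = 0` the scheme reduces to a residual-weighted plain Kaczmarz method.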
References
Chen X, Qin J (2021) Regularized Kaczmarz algorithms for tensor recovery. SIAM J Imaging Sci 14(4):1439–1471. https://doi.org/10.1137/21M1398562
Chen SS, Donoho DL, Saunders MA (2001) Atomic decomposition by basis pursuit. SIAM Rev 43(1):129–159. https://doi.org/10.1137/S003614450037906X
Davis TA, Hu Y (2011) The University of Florida sparse matrix collection. ACM Trans Math Softw (TOMS) 38(1):1–25. https://doi.org/10.1145/2049662.2049663
Du K, Sun XH (2021) Randomized regularized extended Kaczmarz algorithms for tensor recovery. Preprint arXiv:2112.08566
Elad M (2010) Sparse and redundant representations: from theory to applications in signal and image processing. Springer, London. https://doi.org/10.1007/978-1-4419-7011-4
Feichtinger HG, Cenker C, Mayer M et al (1992) New variants of the POCS method using affine subspaces of finite codimension with applications to irregular sampling. In: Visual communications and image processing’92, pp 299–310. https://doi.org/10.1117/12.131447
Groß J (2021) A note on the randomized Kaczmarz method with a partially weighted selection step. Preprint arXiv:2105.14583
Jiang Y, Wu G, Jiang L (2020) A Kaczmarz method with simple random sampling for solving large linear systems. Preprint arXiv:2011.14693
Kaczmarz S (1937) Angenäherte Auflösung von Systemen linearer Gleichungen. Bull Int Acad Pol Sci Lett Cl Sci Math Nat 1:355–357
Li RR, Liu H (2022) On randomized partial block Kaczmarz method for solving huge linear algebraic systems. Comput Appl Math 41(6):1–10. https://doi.org/10.1007/s40314-022-01978-0
Lorenz DA, Wenger S, Schöpfer F et al (2014a) A sparse Kaczmarz solver and a linearized Bregman method for online compressed sensing. In: 2014 IEEE international conference on image processing (ICIP), pp 1347–1351. https://doi.org/10.1109/ICIP.2014.7025269
Lorenz DA, Schöpfer F, Wenger S (2014b) The linearized Bregman method via split feasibility problems: analysis and generalizations. SIAM J Imaging Sci 7(2):1237–1262. https://doi.org/10.1137/130936269
Needell D (2010) Randomized Kaczmarz solver for noisy linear systems. BIT Numer Math 50(2):395–403. https://doi.org/10.1007/s10543-010-0265-5
Nesterov Y (2003) Introductory lectures on convex optimization: a basic course, vol 87. Springer, London. https://doi.org/10.1007/978-1-4419-8853-9
Patel V, Jahangoshahi M, Maldonado DA (2021) Convergence of adaptive, randomized, iterative linear solvers. Preprint arXiv:2104.04816
Petra S (2015) Randomized sparse block Kaczmarz as randomized dual block-coordinate descent. Anal Univ Ovidius Const Ser Mat 23(3):129–149. https://doi.org/10.1515/auom-2015-0052
Rockafellar RT, Wets RJB (2009) Variational analysis, vol 317. Springer, Berlin. https://doi.org/10.1007/978-3-642-02431-3
Schöpfer F (2012) Exact regularization of polyhedral norms. SIAM J Optim 22(4):1206–1223. https://doi.org/10.1137/11085236X
Schöpfer F, Lorenz DA (2019) Linear convergence of the randomized sparse Kaczmarz method. Math Program 173(1):509–536. https://doi.org/10.1007/s10107-017-1229-1
Steinerberger S (2021) A weighted randomized Kaczmarz method for solving linear systems. Math Comput 90(332):2815–2826. https://doi.org/10.1090/mcom/3644
Strohmer T, Vershynin R (2009) A randomized Kaczmarz algorithm with exponential convergence. J Fourier Anal Appl 15(2):262–278. https://doi.org/10.1007/s00041-008-9030-4
Tan YS, Vershynin R (2019) Phase retrieval via randomized Kaczmarz: theoretical guarantees. Inf Infer J IMA 8(1):97–123. https://doi.org/10.1093/imaiai/iay005
Wang X, Che M, Mo C et al (2022) Solving the system of nonsingular tensor equations via randomized Kaczmarz-like method. J Comput Appl Math. https://doi.org/10.1016/j.cam.2022.114856
Yuan ZY, Zhang H, Wang H (2022a) Sparse sampling Kaczmarz–Motzkin method with linear convergence. Math Methods Appl Sci 45(7):3463–3478. https://doi.org/10.1002/mma.7990
Yuan ZY, Zhang L, Wang H et al (2022b) Adaptively sketched Bregman projection methods for linear systems. Inverse Prob 38(6):065005. https://doi.org/10.1088/1361-6420/ac5f76
Zouzias A, Freris NM (2013) Randomized extended Kaczmarz for solving least squares. SIAM J Matrix Anal Appl 34(2):773–793. https://doi.org/10.1137/120889897
Acknowledgements
The authors would like to thank the anonymous referees and the associate editor for valuable suggestions and comments, which allowed us to improve the original presentation. This work was supported by the National Natural Science Foundation of China (Nos. 11971480, 61977065), the Natural Science Fund of Hunan for Excellent Youth (No. 2020JJ3038), and the Fund for NUDT Young Innovator Awards (No. 20190105).
Author information
Contributions
The data used in the manuscript are available in the SuiteSparse Matrix Collection. All authors contributed to the study's conception and design. The first draft of the manuscript was written by LZ, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript and are aware of the current submission to COAM.
Ethics declarations
Conflict of interest
All authors declare that they have no conflict of interest.
Additional information
Communicated by Yimin Wei.
Appendices
Appendix A Proof of Theorem 1
Proof
The proof is divided into two parts: in the first, we derive the convergence rate of WRaSK; in the second, we compare the convergence rates of WRaSK and RaSK.
First, we derive the convergence rate of WRaSK. By Theorem 2.8 in Lorenz et al. (2014b), we know that (11) in Lemma 2 holds for both the exact and inexact step sizes. Note that f is 1-strongly convex and \(\Vert a_{i_k}\Vert _2=1\); it follows that
Next, we fix the values of the indices \(i_0,\ldots ,i_{k-1}\) and consider only \(i_k\) as a random variable. Taking the conditional expectation on both sides, we derive that
The last inequality follows by invoking Lemma 3. Now considering all indices \(i_0,\ldots ,i_k\) as random variables and taking the full expectation on both sides, we have that
where \(z=x-{\hat{x}}\). By Lemma 1 and the 1-strong convexity of f, we obtain
Thus, we get
Next, we compare the convergence rates between RaSK and WRaSK. Hölder’s inequality implies that for any \(0\ne x\in {\mathbb {R}}^m,\)
and
Based on (A2) and (A3), for \(0\ne Az\in {\mathbb {R}}^m\) we deduce that
Hence,
It follows that
with which we further derive that
Thus, we conclude that the convergence rate of WRaSK is at least as fast as that of RaSK.
Recall that Hölder’s inequality holds with equality if and only if one of the two vectors is a constant multiple of the other. Since the derivation uses Hölder’s inequality twice, (A5) holds with equality if and only if Az is a constant multiple of the unit vector. The proof is completed. \(\square \)
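The Hölder-type comparison used here can be stated in a plausible consolidated form (our reconstruction, since the displays (A2)–(A5) are not reproduced here): \(\sum _i |r_i|^{p+2}/\sum _i |r_i|^{p}\ge \Vert r\Vert _2^2/m\) for \(0\ne r=Az\in {\mathbb {R}}^m\), with equality exactly when all entries of r share the same magnitude. A quick numerical check:

```python
import numpy as np

rng = np.random.default_rng(1)

def weighted_factor(r, p):
    """sum_i |r_i|^(p+2) / sum_i |r_i|^p -- the factor favoring WRaSK."""
    a = np.abs(r)
    return np.sum(a ** (p + 2)) / np.sum(a ** p)

# Randomized check of  weighted_factor(r, p) >= ||r||_2^2 / m,
# which follows from applying Hoelder's inequality twice.
for _ in range(1000):
    m = int(rng.integers(2, 50))
    r = rng.standard_normal(m)
    p = float(rng.uniform(0.1, 6.0))
    assert weighted_factor(r, p) >= np.dot(r, r) / m - 1e-9

# Equality case: all entries of r have the same magnitude.
r = 3.0 * np.ones(7)
assert abs(weighted_factor(r, 2.0) - np.dot(r, r) / 7) < 1e-9
```

Since the uniform rate corresponds to the factor \(\Vert r\Vert _2^2/m\), the inequality reflects that the weighted rule contracts at least as fast, with equality precisely in the constant-magnitude case noted above.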
Appendix B Proof of Theorem 2
Proof
We make use of the observation in Needell (2010) that
Note that f is 1-strongly convex and \(\Vert a_{i_k}\Vert _2=1\); hence, by Lemma 2, we deduce that
Reformulating (B7) by (B6), we derive that
(a) In the WRaSK method, we have
Recalling that \(x_k^{\delta }-{\hat{x}}=(b_{i_k}^{\delta }-b_{i_k})a_{i_k}\), we get
and
Plugging the reformulations (B9) and (B10) into (B8), we have
We fix the values of the indices \(i_0,\ldots ,i_{k-1}\) and only consider \(i_k\) as a random variable. Taking the conditional expectation on both sides, we get
The last inequality can be deduced by using the conclusion of Theorem 1 and Hölder’s inequality
Now considering all indices \(i_0,\ldots ,i_k\) as random variables and taking the full expectation on both sides, we can derive that
By the equivalence of vector norms in \({\mathbb {R}}^m\), there exists a constant \(c>0\) such that for any vector \(z\in {\mathbb {R}}^m\) we have that
Thus,
Using \(\sqrt{u+v}\le \sqrt{u}+\sqrt{v}\) for \(u,v\ge 0\) and the 1-strong convexity of f, we further deduce that
(b) In the EWRaSK method, according to Example 1 we have \(x_{k+1}^*=x_k+\lambda \cdot s_k\), where \(\Vert s_k\Vert _{\infty },\Vert s_{k+1}\Vert _{\infty }\le 1\). The exact linesearch guarantees \(\langle x_{k+1},a_{i_k}\rangle =b_{i_k}^{\delta };\) thus,
Plugging (B10) and (B11) into (B8) and noting that \(\Vert a_{i_k}\Vert _2=1\), we derive
Using Hölder’s inequality, we reformulate this as
Similar to (a), we get
The proof is completed. \(\square \)
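The qualitative content of Theorem 2 — linear decay of the error down to a stagnation level set by the noise — can be illustrated numerically. The sketch below runs a weighted (p = 2) sparse Kaczmarz iteration on a noisy right-hand side; the problem size, `lam`, `delta`, and the choice p = 2 are all illustrative assumptions, not the paper's experiment.

```python
import numpy as np

def soft_shrink(z, lam):
    """Soft-thresholding operator S_lam(z) = sign(z) * max(|z| - lam, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

rng = np.random.default_rng(3)
m, n, lam, delta = 60, 20, 1.0, 1e-3
A = rng.standard_normal((m, n))
A /= np.linalg.norm(A, axis=1, keepdims=True)      # unit rows, as in the proof
x_hat = np.zeros(n)
x_hat[:3] = [2.0, -1.5, 1.0]                       # sparse solution
b_noisy = A @ x_hat + delta * rng.standard_normal(m)

z = np.zeros(n)
x = soft_shrink(z, lam)
errs = []
for _ in range(20000):
    r = A @ x - b_noisy                            # noisy residual
    w = r ** 2                                     # p = 2 weighting
    i = rng.choice(m, p=w / w.sum())
    z -= r[i] * A[i]                               # unit rows: step = residual
    x = soft_shrink(z, lam)
    errs.append(np.linalg.norm(x - x_hat))

noise_level = delta * np.sqrt(m)
print(f"final error {errs[-1]:.2e} vs noise level {noise_level:.2e}")
```

The error falls rapidly at first and then hovers near the noise level rather than converging to zero, which is the behavior the noisy-case bound describes.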
Appendix C Proof of Lemma 4
Proof
First, we compute the derivative of the function f(x) as follows:
Denote
It follows that \(f'(x)=\frac{h(x)}{\left( \sum _{i=1}^{n}e^{d_ix}\right) ^2}\). Let \(t_{ij}=e^{2d_i}e^{(d_i+d_j)x}\); then we have
Exchanging the role of i and j, we obtain
Based on (C14) and (C15), we deduce that
It follows that \(h(x)\ge 0\) and hence \(f'(x)\ge 0\) for all \(x\in (0,\infty )\). Therefore, f(x) is monotonically increasing on \((0,\infty )\). The proof is completed. \(\square \)
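The conclusion of Lemma 4 can also be sanity-checked numerically. Assuming, consistently with the terms \(t_{ij}=e^{2d_i}e^{(d_i+d_j)x}\) above, that f has the form \(f(x)=\sum _i e^{2d_i}e^{d_i x}/\sum _i e^{d_i x}\) (our reconstruction; the display defining f is not reproduced here), monotonicity holds on a grid for random coefficients:

```python
import numpy as np

rng = np.random.default_rng(2)

def f(x, d):
    """Assumed form f(x) = sum_i e^{2 d_i} e^{d_i x} / sum_i e^{d_i x}."""
    w = np.exp(d * x)
    return np.sum(np.exp(2 * d) * w) / np.sum(w)

# Monotonicity check on a grid in (0, 5] for random coefficients d_i.
for _ in range(200):
    d = rng.uniform(-2.0, 2.0, size=int(rng.integers(2, 10)))
    vals = np.array([f(x, d) for x in np.linspace(0.01, 5.0, 100)])
    assert np.all(np.diff(vals) >= -1e-12)
```

Note that with \(d_i=\log |r_i|\) this f is exactly the weighted factor \(\sum _i |r_i|^{x+2}/\sum _i |r_i|^{x}\) from the comparison in Appendix A, so the lemma says that the factor improves monotonically in p.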
Cite this article
Zhang, L., Yuan, Z., Wang, H. et al. A weighted randomized sparse Kaczmarz method for solving linear systems. Comp. Appl. Math. 41, 383 (2022). https://doi.org/10.1007/s40314-022-02105-9
Keywords
- Weighted sampling rule
- Bregman distance
- Bregman projection
- Sparse solution
- Kaczmarz method
- Linear convergence