
On two-subspace randomized extended Kaczmarz method for solving large linear least-squares problems

  • Original Paper, published in Numerical Algorithms

Abstract

For solving large-scale linear least-squares problems, we propose a block version of the randomized extended Kaczmarz method, called the two-subspace randomized extended Kaczmarz method, which does not require any row or column paving. Theoretical analysis and numerical results show that the two-subspace randomized extended Kaczmarz method is much more efficient than the randomized extended Kaczmarz method; when the coefficient matrix has full column rank, it can also outperform the randomized coordinate descent method. If the linear system is consistent, we remove the iteration sequence of the two-subspace randomized extended Kaczmarz method that approximates the projection of the right-hand-side vector onto the orthogonal complement of the range of the coefficient matrix, and obtain the generalized two-subspace randomized Kaczmarz method. The latter generalizes the two-subspace randomized Kaczmarz method by dropping the assumptions of unit row norms and full column rank on the coefficient matrix. We give an upper bound for the convergence rate of the generalized two-subspace randomized Kaczmarz method, which also yields a sharper upper bound for the convergence rate of the two-subspace randomized Kaczmarz method.
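The randomized extended Kaczmarz (REK) iteration that the proposed method builds on can be sketched as follows. This is a minimal real-arithmetic illustration, not the paper's block variant; the function name, iteration budget, and test data are ours. Rows and columns are sampled with probability proportional to their squared norms, as in the REK method of Zouzias and Freris.

```python
import numpy as np

def rek(A, b, iters, rng):
    """Minimal sketch of randomized extended Kaczmarz for min ||Ax - b||_2."""
    m, n = A.shape
    row_p = (A**2).sum(axis=1) / (A**2).sum()   # row sampling probabilities
    col_p = (A**2).sum(axis=0) / (A**2).sum()   # column sampling probabilities
    x, z = np.zeros(n), b.astype(float).copy()
    for _ in range(iters):
        j = rng.choice(n, p=col_p)
        # column step: z converges to the projection of b onto range(A)^perp
        z -= (A[:, j] @ z) / (A[:, j] @ A[:, j]) * A[:, j]
        i = rng.choice(m, p=row_p)
        # row step: Kaczmarz projection for the "corrected" system Ax = b - z
        x += (b[i] - z[i] - A[i] @ x) / (A[i] @ A[i]) * A[i]
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
b = rng.standard_normal(50)                     # generally inconsistent
x_rek = rek(A, b, 20000, rng)
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]     # least-squares reference
```

For a well-conditioned matrix like this one, `x_rek` agrees with the least-squares solution to high accuracy.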



Acknowledgements

The author is thankful to the referees for their constructive comments and valuable suggestions, which greatly improved the original manuscript of this paper.

Funding

This work was supported by the National Natural Science Foundation of China (Nos. 12001043 and 12071472), in part by the Beijing Institute of Technology Research Fund Program for Young Scholars, and in part by the Science and Technology Commission of Shanghai Municipality (No. 18dz2271000).

Author information

Correspondence to Wen-Ting Wu.


Appendix

Proof of Lemma 4.1

From the definition of the GTRK method we can obtain

$$ \begin{array}{@{}rcl@{}} x_{k+1} &=&x_{k}+ \frac{\left( b^{(i_{k_2})}-A^{(i_{k_2})} x_{k}\right)} {\|A^{(i_{k_2})}\|_{2}^{2}} (A^{(i_{k_2})})^{*} \\&&+\left( \frac{b^{(i_{k_1})}}{\|A^{(i_{k_1})}\|_{2}}-\bar{\mu}_{k}\frac{b^{(i_{k_2})}}{\|A^{(i_{k_2})}\|_{2}} -u_{k}^{*} x_{k}\right) \frac{u_{k}}{\|u_{k}\|_{2}^{2}}, \end{array} $$

where \(u_{k}\) is defined as in (3.3) with \(\|u_{k}\|_{2}^{2}=1-|\mu _{k}|^{2}\). Since the linear system \(Ax=b\) is assumed to be consistent, it holds that \(b=Ax_{\star}\), and hence

$$ \begin{array}{@{}rcl@{}} x_{k} - x_{\star}+ \frac{\left( b^{(i_{k_2})}-A^{(i_{k_2})} x_{k}\right)} {\|A^{(i_{k_2})}\|_{2}^{2}} (A^{(i_{k_2})})^{*} &=&x_{k} - x_{\star}+ \frac{\left( A^{(i_{k_2})} x_{\star} - A^{(i_{k_2})} x_{k}\right)} {\|A^{(i_{k_2})}\|_{2}^{2}} (A^{(i_{k_2})})^{*} \\ &=&\left( I_{n}-\frac{(A^{(i_{k_2})})^{*}A^{(i_{k_2})}} {\|A^{(i_{k_2})}\|_{2}^{2}}\right)(x_{k}-x_{\star}) \end{array} $$

and

$$ \begin{array}{@{}rcl@{}} \frac{b^{(i_{k_1})}}{\|A^{(i_{k_1})}\|_{2}} - \bar{\mu}_{k}\frac{b^{(i_{k_2})}}{\|A^{(i_{k_2})}\|_{2}} - u_{k}^{*} x_{k} = \frac{A^{(i_{k_1})} x_{\star}}{\|A^{(i_{k_1})}\|_{2}} - \bar{\mu}_{k}\frac{A^{(i_{k_2})} x_{\star}}{\|A^{(i_{k_2})}\|_{2}} -u_{k}^{*} x_{k} =u_{k}^{*}(x_{\star}-x_{k}). \end{array} $$

Therefore, it holds that

$$ \begin{array}{@{}rcl@{}} x_{k+1}-x_{\star} =\left( I_{n}-\frac{(A^{(i_{k_2})})^{*}A^{(i_{k_2})}}{\|A^{(i_{k_2})}\|_{2}^{2}} -\frac{u_{k}u_{k}^{*}}{\|u_{k}\|_{2}^{2}}\right)(x_{k}-x_{\star}). \end{array} $$
(A.1)
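The error recursion (A.1) can be checked directly on a small example. The sketch below implements one GTRK update in real arithmetic (the function name and test data are ours; the two working rows are assumed non-parallel so that \(u_k\neq 0\)) and compares it with the projector form on the right-hand side of (A.1).

```python
import numpy as np

def gtrk_step(A, b, x, i1, i2):
    """One GTRK update (real case) for the consistent system Ax = b,
    with working rows i1 and i2 (assumed non-parallel)."""
    a1, a2 = A[i1], A[i2]
    n1, n2 = np.linalg.norm(a1), np.linalg.norm(a2)
    mu = (a2 @ a1) / (n1 * n2)       # cosine of the angle between the rows
    u = a1 / n1 - mu * a2 / n2       # part of a1/||a1|| orthogonal to a2
    return (x + (b[i2] - a2 @ x) / n2**2 * a2
              + (b[i1] / n1 - mu * b[i2] / n2 - u @ x) / (u @ u) * u)

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))
x_star = rng.standard_normal(3)
b = A @ x_star                       # consistent by construction
x = rng.standard_normal(3)
x_new = gtrk_step(A, b, x, 0, 1)

# right-hand side of (A.1): (I - P_{a2} - P_u)(x - x_star)
a1, a2 = A[0], A[1]
mu = (a2 @ a1) / (np.linalg.norm(a1) * np.linalg.norm(a2))
u = a1 / np.linalg.norm(a1) - mu * a2 / np.linalg.norm(a2)
P = np.eye(3) - np.outer(a2, a2) / (a2 @ a2) - np.outer(u, u) / (u @ u)
```

After one step both working equations are satisfied exactly, and `x_new - x_star` coincides with `P @ (x - x_star)`.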

We denote by \(\check {y}_{k}\) and \(\check {x}_{k+1}\) the vectors obtained by applying one RK iteration to \(x_{k}\) and to \(\check {y}_{k}\), with the working rows \(A^{(i_{k_1})}\) and \(A^{(i_{k_2})}\), respectively, for the consistent linear system \(Ax=b\). That is,

$$ \check{y}_{k} = x_{k} +\frac{\left( b^{(i_{k_1})}-A^{(i_{k_1})} x_{k}\right)} {\|A^{(i_{k_1})}\|_{2}^{2}} (A^{(i_{k_1})})^{*} $$

and

$$ \check{x}_{k+1} =\check{y}_{k} + \frac{\left( b^{(i_{k_2})}-A^{(i_{k_2})} \check{y}_{k}\right)} {\|A^{(i_{k_2})}\|_{2}^{2}} (A^{(i_{k_2})})^{*}. $$

Then, it follows from \(b=Ax_{\star}\) that

$$ \begin{array}{@{}rcl@{}} \check{x}_{k+1}-x_{\star} &=&\check{y}_{k}-x_{\star}+ \frac{\left( A^{(i_{k_2})} x_{\star}-A^{(i_{k_2})} \check{y}_{k}\right)} {\|A^{(i_{k_2})}\|_{2}^{2}} (A^{(i_{k_2})})^{*}\\ &=&\left( I_{n}-\frac{(A^{(i_{k_2})})^{*}A^{(i_{k_2})}}{\|A^{(i_{k_2})}\|_{2}^{2}}\right) (\check{y}_{k}-x_{\star}) \\ & =&\left( I_{n}-\frac{(A^{(i_{k_2})})^{*}A^{(i_{k_2})}}{\|A^{(i_{k_2})}\|_{2}^{2}}\right) \left( x_{k}-x_{\star} +\frac{\left( A^{(i_{k_1})} x_{\star}-A^{(i_{k_1})} x_{k}\right)} {\|A^{(i_{k_1})}\|_{2}^{2}} (A^{(i_{k_1})})^{*}\right) \\ & =&\left( I_{n}-\frac{(A^{(i_{k_2})})^{*}A^{(i_{k_2})}}{\|A^{(i_{k_2})}\|_{2}^{2}}\right) \left( I_{n}-\frac{(A^{(i_{k_1})})^{*}A^{(i_{k_1})}}{\|A^{(i_{k_1})}\|_{2}^{2}}\right) (x_{k}-x_{\star}). \end{array} $$
(A.2)
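Identity (A.2), which states that two successive RK projections act on the error as the product of the two orthogonal projectors, can be verified numerically; the variable names and the small test system below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3))
x_star = rng.standard_normal(3)
b = A @ x_star                                   # consistent system
x = rng.standard_normal(3)

def rk_step(y, i):
    """One RK projection of y onto the hyperplane A[i] @ y = b[i]."""
    return y + (b[i] - A[i] @ y) / (A[i] @ A[i]) * A[i]

y_check = rk_step(x, 0)                          # working row i_{k_1}
x_check = rk_step(y_check, 1)                    # working row i_{k_2}

def Q(i):
    """Orthogonal projector onto the complement of span(A[i])."""
    return np.eye(3) - np.outer(A[i], A[i]) / (A[i] @ A[i])

lhs = x_check - x_star
rhs = Q(1) @ Q(0) @ (x - x_star)                 # right-hand side of (A.2)
```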

From the definition of \(u_{k}\) we have

$$ \begin{array}{@{}rcl@{}} \check{x}_{k+1} - x_{\star} \!& =&\!\left( \!I_{n} - \frac{(A^{(i_{k_2})})^{*}A^{(i_{k_2})}}{\|A^{(i_{k_2})}\|_{2}^{2}} - \left( I_{n} - \frac{(A^{(i_{k_2})})^{*}A^{(i_{k_2})}}{\|A^{(i_{k_2})}\|_{2}^{2}}\right) \frac{(A^{(i_{k_1})})^{*}A^{(i_{k_1})}}{\|A^{(i_{k_1})}\|_{2}^{2}}\!\right) (x_{k} - x_{\star}) \\ \!& =&\!\left( I_{n}-\frac{(A^{(i_{k_2})})^{*}A^{(i_{k_2})}}{\|A^{(i_{k_2})}\|_{2}^{2}}- \frac{u_{k}A^{(i_{k_1})}}{\|A^{(i_{k_1})}\|_{2}}\right) (x_{k}-x_{\star}) \\ \!& =&\!\left( I_{n}-\frac{(A^{(i_{k_2})})^{*}A^{(i_{k_2})}}{\|A^{(i_{k_2})}\|_{2}^{2}}- \frac{u_{k} u_{k}^{*}}{\|u_{k}\|_{2}^{2}}\right)(x_{k}-x_{\star}) \\&&\!+\left( \frac{u_{k} u_{k}^{*}}{\|u_{k}\|_{2}^{2}}- \frac{u_{k}A^{(i_{k_1})}}{\|A^{(i_{k_1})}\|_{2}}\right)(x_{k}-x_{\star}). \end{array} $$

Since the vectors \(A^{(i_{k_2})}\) and \(u_{k}\) are orthogonal, that is,

$$ \begin{array}{@{}rcl@{}} A^{(i_{k_2})} u_{k} =A^{(i_{k_2})}\left( \frac{(A^{(i_{k_1})})^{*}}{\|A^{(i_{k_1})}\|_{2}} -\mu_{k}\frac{(A^{(i_{k_2})})^{*}}{\|A^{(i_{k_2})}\|_{2}}\right) =0, \end{array} $$

we have

$$ \begin{array}{@{}rcl@{}} &&\!\!\|\check{x}_{k+1}-x_{\star}\|_{2}^{2} \\ \!\! & =& \!\!\left\|\!\left( \!I_{n} - \frac{(A^{(i_{k_2})})^{*}A^{(i_{k_2})}}{\|A^{(i_{k_2})}\|_{2}^{2}} - \frac{u_{k} u_{k}^{*}}{\|u_{k}\|_{2}^{2}}\right)\!(x_{k}\! -\! x_{\star})\right\|_{2}^{2} \! +\! \left\|\!\left( \frac{u_{k} u_{k}^{*}}{\|u_{k}\|_{2}^{2}}\! -\! \frac{u_{k}A^{(i_{k_1})}}{\|A^{(i_{k_1})}\|_{2}}\right)\! (x_{k}\! -\! x_{\star})\right\|_{2}^{2}. \end{array} $$

Then, it follows from (A.1) that

$$ \begin{array}{@{}rcl@{}} \|x_{k+1}-x_{\star}\|_{2}^{2} =\|\check{x}_{k+1}-x_{\star}\|_{2}^{2} -\left|\frac{A^{(i_{k_1})} (x_{k}-x_{\star})}{\|A^{(i_{k_1})}\|_{2}} -\frac{u_{k}^{*}(x_{k}-x_{\star})}{\|u_{k}\|_{2}^{2}} \right|^{2}\|u_{k}\|_{2}^{2}. \end{array} $$

By taking the conditional expectation on both sides of this equality, we can obtain

$$ \begin{array}{@{}rcl@{}} {\mathbb{E}}_{k}\|x_{k+1}-x_{\star}\|_{2}^{2} &=&{\mathbb{E}}_{k}\|\check{x}_{k+1}-x_{\star}\|_{2}^{2} \\&&-{\mathbb{E}}_{k}\left|\frac{A^{(i_{k_1})} (x_{k}-x_{\star})}{\|A^{(i_{k_1})}\|_{2}} -\frac{u_{k}^{*}(x_{k}-x_{\star})}{\|u_{k}\|_{2}^{2}}\right|^{2}\|u_{k}\|_{2}^{2}. \end{array} $$
(A.3)

Next, we estimate the first and second terms on the right-hand side of equality (A.3) in turn. Denote the two orthogonal projection matrices as

$$ \begin{array}{@{}rcl@{}} Q_{i_{k_{1}}}=I_{n}-\frac{(A^{(i_{k_1})})^{*}A^{(i_{k_1})}}{\|A^{(i_{k_1})}\|_{2}^{2}} \quad\text{and}\quad Q_{i_{k_{2}}}=I_{n}-\frac{(A^{(i_{k_2})})^{*}A^{(i_{k_2})}}{\|A^{(i_{k_2})}\|_{2}^{2}}, \end{array} $$

then from equality (A.2) we have

$$ \begin{array}{@{}rcl@{}} &&{\mathbb{E}}_{k}\|\check{x}_{k+1}-x_{\star}\|_{2}^{2} \\& =&\sum\limits_{i_{k_{1}}=1}^{m}\frac{\|A^{(i_{k_1})}\|_{2}^{2}}{\|A\|_{F}^{2}} \underset{i_{k_{2}}\neq i_{k_{1}}}{\sum\limits_{i_{k_{2}}=1}^{m}} \frac{\|A^{(i_{k_2})}\|_{2}^{2}}{\|A\|_{F}^{2}-\|A^{(i_{k_1})}\|_{2}^{2}} (x_{k}-x_{\star})^{*} Q_{i_{k_{1}}}Q_{i_{k_{2}}}Q_{i_{k_{1}}} (x_{k}-x_{\star}) \\ & =& \sum\limits_{i_{k_{1}}=1}^{m}\frac{\|A^{(i_{k_1})}\|_{2}^{2}}{\|A\|_{F}^{2}} (x_{k}-x_{\star})^{*}Q_{i_{k_{1}}} \left( I_{n}-\frac{A^{*}A-(A^{(i_{k_1})})^{*}A^{(i_{k_1})}}{\|A\|_{F}^{2}-\|A^{(i_{k_1})}\|_{2}^{2}}\right)Q_{i_{k_{1}}} (x_{k}-x_{\star}) \\ & =& \sum\limits_{i_{k_{1}}=1}^{m}\frac{\|A^{(i_{k_1})}\|_{2}^{2}}{\|A\|_{F}^{2}} (x_{k}-x_{\star})^{*}Q_{i_{k_{1}}} \left( I_{n}-\frac{A^{*}A}{\|A\|_{F}^{2}-\|A^{(i_{k_1})}\|_{2}^{2}}\right) Q_{i_{k_{1}}}(x_{k}-x_{\star}). \end{array} $$
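The collapse of the inner sum above uses the fact that the squared-row-norm-weighted average of the projectors \(Q_{i_{k_2}}\) over \(i_{k_2}\neq i_{k_1}\) equals \(I_{n}-(A^{*}A-(A^{(i_{k_1})})^{*}A^{(i_{k_1})})/(\|A\|_{F}^{2}-\|A^{(i_{k_1})}\|_{2}^{2})\). A quick numerical check in the real case (illustrative matrix):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 4))
m, n = A.shape
r2 = (A**2).sum(axis=1)                # squared row norms
fro2 = r2.sum()                        # ||A||_F^2

def Q(i):
    """Orthogonal projector onto the complement of span(A[i])."""
    return np.eye(n) - np.outer(A[i], A[i]) / r2[i]

for i1 in range(m):
    # weighted average of Q_{i2} over i2 != i1, weights r2[i2]/(fro2 - r2[i1])
    avg = sum(r2[i2] / (fro2 - r2[i1]) * Q(i2) for i2 in range(m) if i2 != i1)
    expected = np.eye(n) - (A.T @ A - np.outer(A[i1], A[i1])) / (fro2 - r2[i1])
    assert np.allclose(avg, expected)
```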

Since \(x_{k}-x_{\star }\in {\mathcal R}(A^{*})\) and \(Q_{i_{k_{1}}}(x_{k}-x_{\star })\in {\mathcal R}(A^{*})\), it holds that

$$ \begin{array}{@{}rcl@{}} {\mathbb{E}}_{k}\|\check{x}_{k+1}-x_{\star}\|_{2}^{2} & \leq& \left( 1-\frac{\lambda_{\min}(A^{*}A)} {\tau_{\max}}\right) \sum\limits_{i_{k_{1}}=1}^{m}\frac{\|A^{(i_{k_1})}\|_{2}^{2}}{\|A\|_{F}^{2}} (x_{k}-x_{\star})^{*}Q_{i_{k_{1}}}(x_{k}-x_{\star}) \\ & =& \left( 1-\frac{\lambda_{\min}(A^{*}A)} {\tau_{\max}}\right) (x_{k}-x_{\star})^{*} \left( I_{n}-\frac{A^{*} A}{\|A\|_{F}^{2}}\right) (x_{k}-x_{\star}) \\ & \leq& \left( 1-\frac{\lambda_{\min}(A^{*}A)} {\tau_{\max}}\right) \left( 1-\frac{\lambda_{\min}(A^{*}A)} {\|A\|_{F}^{2}}\right)\|x_{k}-x_{\star}\|_{2}^{2}. \end{array} $$
(A.4)

From the definition of \(u_{k}\) and the identity \(\|u_{k}\|_{2}^{2}=1-|\mu _{k}|^{2}\), we have

$$ \begin{array}{@{}rcl@{}} && {\mathbb{E}}_{k}\left|\frac{A^{(i_{k_1})} (x_{k}-x_{\star})}{\|A^{(i_{k_1})}\|_{2}} -\frac{u_{k}^{*}(x_{k}-x_{\star})}{\|u_{k}\|_{2}^{2}}\right|^{2}\|u_{k}\|_{2}^{2}\\ & =& {\mathbb{E}}_{k}\left|\|u_{k}\|_{2}\frac{A^{(i_{k_1})}}{\|A^{(i_{k_1})}\|_{2}}(x_{k}-x_{\star}) -\frac{1}{\|u_{k}\|_{2}}\left( \frac{A^{(i_{k_1})}}{\|A^{(i_{k_1})}\|_{2}} -\bar{\mu}_{k}\frac{A^{(i_{k_2})}}{\|A^{(i_{k_2})}\|_{2}}\right)(x_{k}-x_{\star}) \right|^{2} \\ & =& \sum\limits_{i_{k_{1}}=1}^{m}\frac{\|A^{(i_{k_1})}\|_{2}^{2}}{\|A\|_{F}^{2}} \underset{i_{k_{2}}\neq i_{k_{1}}}{\sum\limits_{i_{k_{2}}=1}^{m}} \frac{\|A^{(i_{k_2})}\|_{2}^{2}}{\|A\|_{F}^{2}-\|A^{(i_{k_1})}\|_{2}^{2}} \left|\frac{|\mu_{k}|^{2}}{\|u_{k}\|_{2}}\frac{A^{(i_{k_1})}}{\|A^{(i_{k_1})}\|_{2}}(x_{k}-x_{\star}) \right.\\&&\left.-\frac{\bar{\mu}_{k}}{\|u_{k}\|_{2}}\frac{A^{(i_{k_2})}}{\|A^{(i_{k_2})}\|_{2}}(x_{k}-x_{\star}) \right|^{2} \\ & \geq& \sum\limits_{i_{k_{1}}=1}^{m} \underset{i_{k_{2}}\neq i_{k_{1}}}{\sum\limits_{i_{k_{2}}=1}^{m}} \frac{\|A^{(i_{k_1})}\|_{2}^{2}\|A^{(i_{k_2})}\|_{2}^{2}} {\|A\|_{F}^{2} \tau_{\max}} \left|\frac{|\mu_{k}|^{2}}{\|u_{k}\|_{2}}\frac{A^{(i_{k_1})}}{\|A^{(i_{k_1})}\|_{2}}(x_{k}-x_{\star}) \right.\\&&\left.-\frac{\bar{\mu}_{k}}{\|u_{k}\|_{2}}\frac{A^{(i_{k_2})}}{\|A^{(i_{k_2})}\|_{2}}(x_{k}-x_{\star}) \right|^{2}. \end{array} $$

Then, with the notations

$$ \begin{array}{@{}rcl@{}} \theta_{p,q} =\frac{\frac{|A^{(q)}(A^{(p)})^{*}|^{2}}{\|A^{(q)}\|_{2}^{2}\|A^{(p)}\|_{2}^{2}}} {\sqrt{1-\frac{|A^{(q)}(A^{(p)})^{*}|^{2}}{\|A^{(q)}\|_{2}^{2}\|A^{(p)}\|_{2}^{2}}}} \quad\text{and}\quad \eta_{p,q} =\frac{\frac{A^{(p)}(A^{(q)})^{*}}{\|A^{(q)}\|_{2}\|A^{(p)}\|_{2}}} {\sqrt{1-\frac{|A^{(q)}(A^{(p)})^{*}|^{2}}{\|A^{(q)}\|_{2}^{2}\|A^{(p)}\|_{2}^{2}}}} \end{array} $$

for p, q ∈ {1, 2,…, m} with p ≠ q, we can obtain

$$ \begin{array}{@{}rcl@{}} && {\mathbb{E}}_{k}\left|\frac{A^{(i_{k_1})} (x_{k}-x_{\star})}{\|A^{(i_{k_1})}\|_{2}} -\frac{u_{k}^{*}(x_{k}-x_{\star})}{\|u_{k}\|_{2}^{2}}\right|^{2}\|u_{k}\|_{2}^{2} \\ & \geq& \frac{1}{\|A\|_{F}^{2} \tau_{\max}} \sum\limits_{p<q}\|A^{(p)}\|_{2}^{2}\|A^{(q)}\|_{2}^{2} \left( \left|\theta_{p,q}\frac{A^{(p)}}{\|A^{(p)}\|_{2}}(x_{k}-x_{\star}) \right.\right.\\&&\left.-\eta_{p,q}\frac{A^{(q)}}{\|A^{(q)}\|_{2}}(x_{k}-x_{\star})\right|^{2} \\ &&+\left.\left|\theta_{p,q}\frac{A^{(q)}}{\|A^{(q)}\|_{2}}(x_{k}-x_{\star}) -\bar{\eta}_{p,q}\frac{A^{(p)}}{\|A^{(p)}\|_{2}}(x_{k}-x_{\star})\right|^{2}\right). \end{array} $$

Since for any \(\theta , \eta , \phi , \psi \in {\mathbb {C}}\), it holds that

$$ \begin{array}{@{}rcl@{}} |\theta \phi-\eta \psi|^{2}+|\theta \psi-\bar{\eta} \phi|^{2} \geq (|\eta|-|\theta|)^{2}(|\phi|^{2}+|\psi|^{2}), \end{array} $$

we have

$$ \begin{array}{@{}rcl@{}} && {\mathbb{E}}_{k}\left|\frac{A^{(i_{k_1})} (x_{k}-x_{\star})}{\|A^{(i_{k_1})}\|_{2}} -\frac{u_{k}^{*}(x_{k}-x_{\star})}{\|u_{k}\|_{2}^{2}}\right|^{2}\|u_{k}\|_{2}^{2} \\ & \geq& \frac{1}{\|A\|_{F}^{2} \tau_{\max}} \sum\limits_{p<q}\|A^{(p)}\|_{2}^{2}\|A^{(q)}\|_{2}^{2} (|\eta_{p,q}|-|\theta_{p,q}|)^{2}\\ &&\cdot\left( \left|\frac{A^{(p)}}{\|A^{(p)}\|_{2}}(x_{k}-x_{\star})\right|^{2} +\left|\frac{A^{(q)}}{\|A^{(q)}\|_{2}}(x_{k}-x_{\star})\right|^{2}\right). \end{array} $$
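The scalar inequality invoked here can be spot-checked on random data; note that \(\theta_{p,q}\) is real and nonnegative, which is the case used in the proof (the sampling below is purely illustrative).

```python
import numpy as np

rng = np.random.default_rng(4)
for _ in range(1000):
    theta = rng.random()                           # theta_{p,q} is real, >= 0
    eta, phi, psi = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    # |theta*phi - eta*psi|^2 + |theta*psi - conj(eta)*phi|^2
    lhs = (abs(theta * phi - eta * psi)**2
           + abs(theta * psi - np.conj(eta) * phi)**2)
    # (|eta| - |theta|)^2 (|phi|^2 + |psi|^2)
    rhs = (abs(eta) - abs(theta))**2 * (abs(phi)**2 + abs(psi)**2)
    assert lhs >= rhs - 1e-12
```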

Then from

$$ \begin{array}{@{}rcl@{}} (|\eta_{p,q}|-|\theta_{p,q}|)^{2} &=&\left( \frac{\frac{|A^{(p)}(A^{(q)})^{*}|}{\|A^{(q)}\|_{2}\|A^{(p)}\|_{2}} -\frac{|A^{(q)}(A^{(p)})^{*}|^{2}}{\|A^{(q)}\|_{2}^{2}\|A^{(p)}\|_{2}^{2}}} {\sqrt{1-\frac{|A^{(q)}(A^{(p)})^{*}|^{2}}{\|A^{(q)}\|_{2}^{2}\|A^{(p)}\|_{2}^{2}}}}\right)^{2} \\&=&\frac{\frac{|A^{(q)}(A^{(p)})^{*}|^{2}}{\|A^{(q)}\|_{2}^{2}\|A^{(p)}\|_{2}^{2}} \left( 1-\frac{|A^{(q)}(A^{(p)})^{*}|}{\|A^{(q)}\|_{2}\|A^{(p)}\|_{2}}\right)} {1+\frac{|A^{(q)}(A^{(p)})^{*}|}{\|A^{(q)}\|_{2}\|A^{(p)}\|_{2}}} \geq \gamma, \end{array} $$

we know that

$$ \begin{array}{@{}rcl@{}} && {\mathbb{E}}_{k}\left|\frac{A^{(i_{k_1})} (x_{k}-x_{\star})}{\|A^{(i_{k_1})}\|_{2}} -\frac{u_{k}^{*}(x_{k}-x_{\star})}{\|u_{k}\|_{2}^{2}}\right|^{2}\|u_{k}\|_{2}^{2} \\ & \geq& \frac{\gamma}{\|A\|_{F}^{2} \tau_{\max}} \sum\limits_{p<q}\left( \|A^{(q)}\|_{2}^{2}\left|A^{(p)}(x_{k}-x_{\star})\right|^{2} +\|A^{(p)}\|_{2}^{2}\left|A^{(q)}(x_{k}-x_{\star})\right|^{2}\right) \\ & =&\frac{\gamma}{\|A\|_{F}^{2} \tau_{\max}} \sum\limits_{p=1}^{m}\underset{q \neq p}{\sum\limits_{q=1}^{m}} \|A^{(q)}\|_{2}^{2}\left|A^{(p)}(x_{k}-x_{\star})\right|^{2} \\ & =&\frac{\gamma}{\|A\|_{F}^{2} \tau_{\max}} \sum\limits_{p=1}^{m}(\|A\|_{F}^{2}-\|A^{(p)}\|_{2}^{2}) \left|A^{(p)}(x_{k}-x_{\star})\right|^{2} \\ & \geq& \frac{\gamma \tau_{\min}} {\|A\|_{F}^{2} \tau_{\max}} \sum\limits_{p=1}^{m}\left|A^{(p)}(x_{k}-x_{\star})\right|^{2} =\frac{\gamma \tau_{\min}} {\|A\|_{F}^{2} \tau_{\max}} \|A(x_{k}-x_{\star})\|_{2}^{2}. \end{array} $$

It follows from \(x_{k}-x_{\star }\in {\mathcal R}(A^{*})\) that

$$ \begin{array}{@{}rcl@{}} \!\!\!\!\!\!\!\!\!\!\!{\mathbb{E}}_{k}\left|\frac{A^{(i_{k_1})} (x_{k} - x_{\star})}{\|A^{(i_{k_1})}\|_{2}} - \frac{u_{k}^{*}(x_{k}-x_{\star})}{\|u_{k}\|_{2}^{2}}\right|^{2}\|u_{k}\|_{2}^{2} \!\geq\! \frac{\lambda_{\min}(A^{*}A)}{\|A\|_{F}^{2}} \frac{\tau_{\min}}{\tau_{\max}} \gamma\|x_{k} - x_{\star}\|_{2}^{2} . \end{array} $$
(A.5)

Substituting (A.4) and (A.5) into (A.3), we have

$$ \begin{array}{@{}rcl@{}} {\mathbb{E}}_{k}\|x_{k+1} - x_{\star}\|_{2}^{2} &\leq& \left[ \left( 1-\frac{\lambda_{\min}(A^{*}A)} {\tau_{\max}}\right) \left( 1-\frac{\lambda_{\min}(A^{*}A)}{\|A\|_{F}^{2}}\right) -\frac{\lambda_{\min}(A^{*}A)}{\|A\|_{F}^{2}} \frac{\tau_{\min}}{\tau_{\max}} \gamma \right]\\&& \cdot \|x_{k}-x_{\star}\|_{2}^{2}. \end{array} $$

Finally, taking the full expectation on both sides of this inequality and using induction on the iteration index k, we obtain the result in Lemma 4.1. □
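The one-step contraction just established can be verified exactly on a small example by enumerating all row pairs with their sampling probabilities instead of sampling. Here γ is taken as the minimum over row pairs of the quantity bounded below in the display preceding (A.5), and τ_min, τ_max are the extreme values of \(\|A\|_{F}^{2}-\|A^{(p)}\|_{2}^{2}\); the matrix, iterate, and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((6, 3))
x_star = rng.standard_normal(3)
b = A @ x_star                                  # consistent system
x = x_star + A.T @ rng.standard_normal(6)       # error lies in R(A^*)

m, n = A.shape
r2 = (A**2).sum(axis=1)                         # squared row norms
fro2 = r2.sum()                                 # ||A||_F^2
lam = np.linalg.eigvalsh(A.T @ A)[0]            # lambda_min(A^* A)
tau = fro2 - r2                                 # tau_p = ||A||_F^2 - ||A^(p)||^2
gamma = min(                                    # min over pairs of c^2(1-c)/(1+c)
    (c := abs(A[p] @ A[q]) / np.sqrt(r2[p] * r2[q]))**2 * (1 - c) / (1 + c)
    for p in range(m) for q in range(m) if p != q)

def gtrk(x, i1, i2):
    """One GTRK step (real case) with working rows i1, i2."""
    a1, a2 = A[i1], A[i2]
    n1, n2 = np.sqrt(r2[i1]), np.sqrt(r2[i2])
    mu = (a2 @ a1) / (n1 * n2)
    u = a1 / n1 - mu * a2 / n2
    return (x + (b[i2] - a2 @ x) / n2**2 * a2
              + (b[i1] / n1 - mu * b[i2] / n2 - u @ x) / (u @ u) * u)

# exact conditional expectation of the squared error after one GTRK step
exp_err = sum(r2[i1] / fro2 * r2[i2] / (fro2 - r2[i1])
              * ((gtrk(x, i1, i2) - x_star)**2).sum()
              for i1 in range(m) for i2 in range(m) if i2 != i1)

# contraction factor of Lemma 4.1
rho = ((1 - lam / tau.max()) * (1 - lam / fro2)
       - lam / fro2 * tau.min() / tau.max() * gamma)
assert exp_err <= rho * ((x - x_star)**2).sum() + 1e-12
```

Each inequality in the proof holds pointwise once the error lies in \({\mathcal R}(A^{*})\), so the assertion above is deterministic rather than a statistical test.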


Cite this article

Wu, WT. On two-subspace randomized extended Kaczmarz method for solving large linear least-squares problems. Numer Algor 89, 1–31 (2022). https://doi.org/10.1007/s11075-021-01104-x

