Convergence Theory for Preconditioned Eigenvalue Solvers in a Nutshell

Abstract

Preconditioned iterative methods for numerical solution of large matrix eigenvalue problems are increasingly gaining importance in various application areas, ranging from materials science to data mining. Some of them, e.g., those using multilevel preconditioning for elliptic differential operators or graph Laplacian eigenvalue problems, exhibit almost optimal complexity in practice; i.e., their computational costs to calculate a fixed number of eigenvalues and eigenvectors grow linearly with the matrix problem size. Theoretical justification of their optimality requires convergence rate bounds that do not deteriorate as the problem size increases. Such bounds were pioneered by E. D’yakonov over three decades ago, but to date only a handful have been derived, mostly for symmetric eigenvalue problems, and only a few of the known bounds are sharp. One of them is proved in doi:10.1016/S0024-3795(01)00461-X for the simplest preconditioned eigensolver with a fixed step size. The original proof has been greatly simplified and shortened in doi:10.1137/080727567 by using a gradient flow integration approach. In the present work, we give an even more succinct proof, using novel ideas based on Karush–Kuhn–Tucker theory and nonlinear programming.
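
For context, the iteration analyzed here is a single-vector preconditioned eigensolver with a fixed step size. The following is a minimal numpy sketch of such an iteration; the function name, the unit fixed step, and the toy diagonal test problem with an exact (Jacobi) preconditioner are illustrative assumptions for this page, not the authors' exact formulation.

```python
import numpy as np

def preconditioned_fixed_step(A, T, x0, iters=100):
    """Sketch of a fixed-step preconditioned eigensolver (PINVIT-type).

    A : symmetric positive definite matrix (n x n)
    T : preconditioner approximating inv(A), passed here as a dense matrix
    x0: initial guess
    Returns the final Rayleigh quotient and iterate.
    """
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        mu = x @ A @ x                 # Rayleigh quotient (x has unit norm)
        r = A @ x - mu * x             # eigenvalue residual
        x = x - T @ r                  # fixed-step preconditioned update
        x = x / np.linalg.norm(x)
    return x @ A @ x, x

# toy usage: diagonal A with eigenvalues 1..100 and a Jacobi preconditioner
n = 200
A = np.diag(np.linspace(1.0, 100.0, n))
T = np.diag(1.0 / np.diag(A))
mu, _ = preconditioned_fixed_step(A, T, np.random.default_rng(0).standard_normal(n))
print(mu)  # approaches the smallest eigenvalue, 1.0
```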

References

  1. Z. Bai, J. Demmel, J. Dongarra, A. Ruhe, and H. van der Vorst, eds., Templates for the solution of algebraic eigenvalue problems: A practical guide, SIAM, Philadelphia, 2000.

  2. F. Bottin, S. Leroux, A. Knyazev, G. Zerah, Large-scale ab initio calculations based on three levels of parallelization, Computational Materials Science, 42 (2008), 2, pp. 329–336. doi:10.1016/j.commatsci.2007.07.019

  3. H. Bouwmeester, A. Dougherty, A. V. Knyazev, Nonsymmetric Preconditioning for Conjugate Gradient and Steepest Descent Methods, Procedia Computer Science, 51 (2015), pp. 276–285. doi:10.1016/j.procs.2015.05.241. A preliminary version is available at http://math.ucdenver.edu/~aknyazev/research/papers/old/k.pdf

  4. S. Brin and L. Page, The anatomy of a large-scale hypertextual Web search engine, Computer Networks and ISDN Systems, 30 (1998), 1–7, pp. 107–117. doi:10.1016/S0169-7552(98)00110-X

  5. E. G. D’yakonov, Optimization in Solving Elliptic Problems, CRC Press, Boca Raton, Florida, 1996. ISBN: 978-0849328725

  6. R. Fletcher, Practical Methods of Optimization, John Wiley & Sons, Second Edition, 1987.

  7. A. V. Knyazev, Computation of eigenvalues and eigenvectors for mesh problems: algorithms and error estimates, (In Russian), Dept. Num. Math., USSR Ac. Sci., Moscow, 1986. http://math.ucdenver.edu/~aknyazev/research/papers/old/k.pdf

  8. A. V. Knyazev, Convergence rate estimates for iterative methods for a mesh symmetric eigenvalue problem, Russian J. Numer. Anal. Math. Modelling, 2 (1987), pp. 371–396. doi:10.1515/rnam.1987.2.5.371

  9. A. V. Knyazev, Preconditioned eigensolvers—an oxymoron?, Electronic Transactions on Numerical Analysis, 7 (1998), pp. 104–123. http://etna.mcs.kent.edu/vol.7.1998/pp104-123.dir/pp104-123.pdf

  10. A. V. Knyazev, Modern Preconditioned Eigensolvers for Spectral Image Segmentation and Graph Bisection, Workshop on Clustering Large Data Sets, Third IEEE International Conference on Data Mining (ICDM 2003), 2003. http://math.ucdenver.edu/~aknyazev/research/conf/ICDM03

  11. A. V. Knyazev and K. Neymeyr, A geometric theory for preconditioned inverse iteration. III: A short and sharp convergence estimate for generalized eigenvalue problems, Linear Algebra Appl., 358 (2003), pp. 95–114. doi:10.1016/S0024-3795(01)00461-X

  12. A. V. Knyazev and K. Neymeyr, Efficient solution of symmetric eigenvalue problems using multigrid preconditioners in the locally optimal block conjugate gradient method, Electronic Transactions on Numerical Analysis, 15 (2003), pp. 38–55. http://etna.mcs.kent.edu/vol.15.2003/pp38-55.dir/pp38-55.pdf

  13. A. V. Knyazev and K. Neymeyr, Gradient flow approach to geometric convergence analysis of preconditioned eigensolvers, SIAM J. Matrix Anal. Appl., 31 (2009), pp. 621–628. doi:10.1137/080727567

  14. D. Kressner, M. Steinlechner, and A. Uschmajew, Low-rank tensor methods with subspace correction for symmetric eigenvalue problems, SIAM J. Sci. Comput., 36 (2014), 5, pp. A2346–A2368. http://sma.epfl.ch/~anchpcommon/publications/EVAMEN.pdf

  15. D. Kressner, M. M. Pandur, M. Shao, An indefinite variant of LOBPCG for definite matrix pencils, Numerical Algorithms, 66 (2014), 4, pp. 681–703. doi:10.1007/s11075-013-9754-3

  16. K. Neymeyr, A geometric convergence theory for the preconditioned steepest descent iteration, SIAM J. Numer. Anal., 50 (2012), pp. 3188–3207. doi:10.1137/11084488X

  17. K. Neymeyr, E. Ovtchinnikov, and M. Zhou, Convergence analysis of gradient iterations for the symmetric eigenvalue problem, SIAM J. Matrix Anal. Appl., 32 (2011), pp. 443–456. doi:10.1137/100784928

  18. J. Nocedal and S.J. Wright, Numerical Optimization, Springer, 2006.

  19. E. E. Ovtchinnikov, Sharp convergence estimates for the preconditioned steepest descent method for Hermitian eigenvalue problems, SIAM J. Numer. Anal., 43 (2006), 6, pp. 2668–2689. doi:10.1137/040620643

  20. D. B. Szyld and F. Xue, Preconditioned eigensolvers for large-scale nonlinear Hermitian eigenproblems with variational characterizations. I. Conjugate gradient methods, Research Report 14-08-26, Department of Mathematics, Temple University, August 2014. Revised April 2015. To appear in Mathematics of Computation. https://www.math.temple.edu/~szyld/reports/NLPCG.report.rev

  21. D. B. Szyld, E. Vecharynski and F. Xue, Preconditioned eigensolvers for large-scale nonlinear Hermitian eigenproblems with variational characterizations. II. Interior eigenvalues, Research Report 15-04-10, Department of Mathematics, Temple University, April 2015. To appear in SIAM Journal on Scientific Computing. arXiv:1504.02811

  22. E. Vecharynski, Y. Saad, and M. Sosonkina, Graph partitioning using matrix values for preconditioning symmetric positive definite systems, SIAM J. Sci. Comput., 36 (2014), 1, pp. A63–A87. doi:10.1137/120898760

  23. E. Vecharynski, C. Yang, and J. E. Pask, A projected preconditioned conjugate gradient algorithm for computing a large invariant subspace of a Hermitian matrix, Journal of Computational Physics, 290 (2015), pp. 73–89. doi:10.1016/j.jcp.2015.02.030

  24. S. Yamada, T. Imamura, T. Kano, and M. Machida, High-performance computing for exact numerical approaches to quantum many-body problems on the earth simulator, In Proceedings of the 2006 ACM/IEEE conference on Supercomputing (SC ’06). ACM, New York, NY, USA, article 47, 2006. doi:10.1145/1188455.1188504

Author information

Corresponding author

Correspondence to Ming Zhou.

Additional information

Dedicated to the memory of Evgenii G. D’yakonov, Moscow, Russia, 1935–2006.

Communicated by Nicholas Higham. A preliminary version is posted at http://arxiv.org.

Appendix

I. An alternative estimate for the left-hand side of (4.8), which is sharp with respect to all variables, can be derived as follows: With \(\delta :=\beta /\alpha \) and \(\varepsilon :=\mu _l/\mu _k\), a new representation of \(\gamma ^2\) is given by

$$\begin{aligned} \gamma ^2 =\frac{(\delta -\varepsilon )^2(1+\alpha ^2)}{(1-\varepsilon )^2(1+\alpha ^2\delta ^2)}. \end{aligned}$$

This results in the quadratic equation

$$\begin{aligned} \big [(1+\alpha ^2)-\alpha ^2\gamma ^2(1-\varepsilon )^2\big ]\,\delta ^2 -2\varepsilon (1+\alpha ^2)\,\delta +\varepsilon ^2(1+\alpha ^2)-\gamma ^2(1-\varepsilon )^2=0 \end{aligned}$$

for \(\delta \), with the roots

$$\begin{aligned} \delta _{\pm }=\frac{\varepsilon (1+\alpha ^2)\pm \gamma (1-\varepsilon ) \sqrt{(1+\alpha ^2)(1+\alpha ^2\varepsilon ^2)-\alpha ^2\gamma ^2(1-\varepsilon )^2}}{(1+\alpha ^2)-\alpha ^2\gamma ^2(1-\varepsilon )^2}. \end{aligned}$$

Since \(\delta ^2=\dfrac{\beta ^2}{\alpha ^2}=\dfrac{\mu _k-\mu (y)}{\mu (y)-\mu _l}\; \dfrac{\mu (x)-\mu _l}{\mu _k-\mu (x)}\), a strictly sharp bound for the estimate in (4.8) is given by \(\max \{\delta _{+}^2,\delta _{-}^2\}=\delta _{+}^2\). We note that in the limit case \(\mu (x)\rightarrow \mu _k\) it holds that \(\alpha \rightarrow 0\), and \(\delta _{+}^2\) turns into \((\varepsilon +\gamma (1-\varepsilon ))^2\). This coincides with the known bound in (4.8).
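
As an illustrative numerical check (not part of the proof), the following Python snippet samples admissible parameters \(0<\varepsilon <1\), \(0\le \gamma <1\), \(\alpha >0\), verifies that \(\delta _{\pm }\) satisfy the above representation of \(\gamma ^2\), and confirms that \(\delta _{+}\) approaches \(\varepsilon +\gamma (1-\varepsilon )\) as \(\alpha \rightarrow 0\).

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    alpha = rng.uniform(0.01, 5.0)     # alpha > 0
    eps = rng.uniform(0.01, 0.99)      # eps = mu_l / mu_k, 0 < eps < 1
    gamma = rng.uniform(0.0, 0.99)     # 0 <= gamma < 1

    root = np.sqrt((1 + alpha**2) * (1 + alpha**2 * eps**2)
                   - alpha**2 * gamma**2 * (1 - eps)**2)
    denom = (1 + alpha**2) - alpha**2 * gamma**2 * (1 - eps)**2

    for sign in (+1.0, -1.0):
        d = (eps * (1 + alpha**2) + sign * gamma * (1 - eps) * root) / denom
        # plug delta_{+/-} back into the representation of gamma^2
        g2 = (d - eps)**2 * (1 + alpha**2) / ((1 - eps)**2 * (1 + alpha**2 * d**2))
        assert abs(g2 - gamma**2) < 1e-9

    # limit case mu(x) -> mu_k, i.e. alpha -> 0: delta_+ -> eps + gamma*(1 - eps)
    a0 = 1e-8
    root0 = np.sqrt((1 + a0**2) * (1 + a0**2 * eps**2) - a0**2 * gamma**2 * (1 - eps)**2)
    denom0 = (1 + a0**2) - a0**2 * gamma**2 * (1 - eps)**2
    d_plus = (eps * (1 + a0**2) + gamma * (1 - eps) * root0) / denom0
    assert abs(d_plus - (eps + gamma * (1 - eps))) < 1e-6
print("all checks passed")
```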

II. The bound in (4.8) contains a convex combination of 1 and \(\mu _l/\mu _k\). Interestingly, this bound can also be derived by using a convex function as follows: Without loss of generality, we assume that x has a positive \(x_k\) coordinate. Then, \(\angle (x,x_k)\) is an acute angle. Since \(B>0\), the angles \(\angle (Bx,x)\) and \(\angle (Bx,x_k)\) are also acute. The equality (4.1) together with \(\gamma <1\) further shows that \(\angle (Bx,y)<\angle (Bx,x)<\pi /2\). Since \(\angle (Bx,x_k)\) and \(\angle (Bx,y)\) are acute angles, the vectors \(x_k\) and y are located in a half-plane whose boundary line is orthogonal to Bx. A simple case distinction shows that \(\angle (y,x_k)\) is either equal to \(|\angle (Bx,x_k)-\angle (Bx,y)|\) or equal to \(\angle (Bx,x_k)+\angle (Bx,y)\). Further, we use the equalities

$$\begin{aligned} \tan ^2\angle (y,x_k)=\frac{\mu _k-\mu (y)}{\mu (y)-\mu _l},\;\; \tan ^2\angle (x,x_k)=\frac{\mu _k-\mu (x)}{\mu (x)-\mu _l},\;\; \frac{\tan ^2\angle (Bx,x_k)}{\tan ^2\angle (x,x_k)}=\frac{\mu _l^2}{\mu _k^2}, \end{aligned}$$

which can be derived in a similar way to Sect. 3. The last equality proves \(\angle (Bx,x_k)<\angle (x,x_k)\), since the tangent is an increasing function for acute angles, and \(\mu _l<\mu _k\). This leads to \(\angle (x,x_k)=\angle (Bx,x_k)+\angle (Bx,x)\), since \(x, Bx, x_k\) are all in the same quadrant. In summary, it holds that

$$\begin{aligned} \angle (y,x_k)\le \angle (Bx,x_k)+\angle (Bx,y)<\angle (Bx,x_k)+\angle (Bx,x) =\angle (x,x_k)<\pi /2, \end{aligned}$$
(4.9)

i.e., \(\angle (y,x_k)\) is also an acute angle. Using these acute angles, we write (4.8) equivalently as

$$\begin{aligned} \tan \angle (y,x_k)\le \gamma \tan \angle (x,x_k) +(1-\gamma )\tan \angle (Bx,x_k). \end{aligned}$$
(4.10)

In order to prove (4.10), we use (4.1) again, together with \(\varphi :=\angle (Bx,x)\), \(\vartheta :=\angle (Bx,x_k)\), and the first inequality in (4.9). It holds that

$$\begin{aligned} \tan \angle (y,x_k)\le \tan [\vartheta +\arcsin \big (\gamma \sin (\varphi )\big )]=:f(\gamma ). \end{aligned}$$

Because

$$\begin{aligned} f'(\gamma )= \frac{\big (1+f(\gamma )^2\big )\sin (\varphi )}{\sqrt{1-\big (\gamma \sin (\varphi )\big )^2}}\ge 0 \quad \text{ for }\quad \gamma \in [0,1], \end{aligned}$$

\(f(\gamma )\) is a monotonically increasing function in [0, 1]. Moreover, the numerator of \(f'(\gamma )\) is monotonically increasing and its denominator is monotonically decreasing in \(\gamma \in [0,1]\); since both are positive, \(f'(\gamma )\) itself is monotonically increasing. Thus, \(f(\gamma )\) is a convex function in [0, 1], and

$$\begin{aligned} \tan \angle (y,x_k)\le f(\gamma )\le (1-\gamma )f(0)+\gamma f(1) =(1-\gamma )\tan (\vartheta )+\gamma \tan (\vartheta +\varphi ), \end{aligned}$$

which proves (4.10) and hence (4.8).
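
The monotonicity and convexity of \(f(\gamma )=\tan [\vartheta +\arcsin (\gamma \sin \varphi )]\) used above are easy to confirm numerically. The following snippet, an illustrative check only, samples acute angles with \(\vartheta +\varphi <\pi /2\), compares the closed-form derivative \(f'(\gamma )\) with a central finite difference, and verifies the chord bound \(f(\gamma )\le (1-\gamma )f(0)+\gamma f(1)\).

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    # acute angles theta = angle(Bx, x_k), phi = angle(Bx, x) with theta + phi < pi/2
    theta = rng.uniform(0.0, np.pi / 2 - 1e-3)
    phi = rng.uniform(0.0, np.pi / 2 - theta)

    def f(g):
        return np.tan(theta + np.arcsin(g * np.sin(phi)))

    # closed-form derivative vs. central finite difference at an interior point
    g0 = rng.uniform(0.05, 0.95)
    h = 1e-6
    closed_form = (1 + f(g0)**2) * np.sin(phi) / np.sqrt(1 - (g0 * np.sin(phi))**2)
    finite_diff = (f(g0 + h) - f(g0 - h)) / (2 * h)
    assert abs(closed_form - finite_diff) <= 1e-4 * max(1.0, abs(closed_form))

    # convexity: the graph of f on [0, 1] lies below the chord through f(0) and f(1)
    g = np.linspace(0.0, 1.0, 101)
    chord = (1.0 - g) * f(0.0) + g * f(1.0)   # = (1-g) tan(theta) + g tan(theta+phi)
    assert np.all(f(g) <= chord + 1e-9)
print("all checks passed")
```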

Cite this article

Argentati, M.E., Knyazev, A.V., Neymeyr, K. et al. Convergence Theory for Preconditioned Eigenvalue Solvers in a Nutshell. Found Comput Math 17, 713–727 (2017). https://doi.org/10.1007/s10208-015-9297-1
