Minimization principles and computation for the generalized linear response eigenvalue problem

Bai, Zhaojun; Li, Ren-Cang

doi:10.1007/s10543-014-0472-6


Published: 14 February 2014

Volume 54, pages 31–54, (2014)

BIT Numerical Mathematics

Zhaojun Bai¹ &
Ren-Cang Li²


Abstract

The minimization principle and Cauchy-like interlacing inequalities for the generalized linear response eigenvalue problem are presented. Based on these theoretical results, the best approximations through structure-preserving subspace projection and a locally optimal block conjugate gradient-like algorithm for simultaneously computing the first few smallest eigenvalues with the positive sign are proposed. Numerical results are presented to illustrate essential convergence behaviors of the proposed algorithm.


Notes

1. This condition is equivalent to both $A\pm B$ being positive definite. In [2, 3] and this article, we focus mostly on this case, except that one of $A\pm B$ is allowed to be positive semi-definite.
2. It suffices to assume one of $E_{\pm }$ is nonsingular since $E_{\pm }^{{{\mathrm{T}}}}=E_{\mp }$.
3. A similar statement for the case in which $K\succ 0$ but $M\succeq 0$ can be made, noting that the decompositions in (2.7) no longer hold but similar decompositions exist.
4. How this factorization is done is not essential mathematically, but it is included to accommodate cases in which such a factorization may offer certain conveniences. In general, simply taking $W_1=W^{{{\mathrm{T}}}}$ and $W_2=I_{\ell }$ or $W_1=I_{\ell }$ and $W_2=W$ may be sufficient.
5. Computationally, this can be realized by the QR decompositions of $W_i^{{{\mathrm{T}}}}$. For more generality in presentation, we do not assume that they have to be QR decompositions.

References

1. Bai, Z., Li, R.C.: Minimization principle for linear response eigenvalue problem III: general case. Technical Report 2013–01, Department of Mathematics, University of Texas at Arlington (2011). Available at http://www.uta.edu/math/preprint/
2. Bai, Z., Li, R.C.: Minimization principles for the linear response eigenvalue problem I: theory. SIAM J. Matrix Anal. Appl. 33(4), 1075–1100 (2012)
3. Bai, Z., Li, R.C.: Minimization principles for linear response eigenvalue problem II: computation. SIAM J. Matrix Anal. Appl. 34(2), 392–416 (2013)
4. Challacombe, M.: Linear scaling solution of the time-dependent self-consistent-field equations. e-print arXiv:1001.2586v2 (2010)
5. Davis, T., Hu, Y.: The University of Florida sparse matrix collection. ACM Trans. Math. Softw. 38(1), 1:1–1:25 (2011)
6. Demmel, J.: Applied Numerical Linear Algebra. SIAM, Philadelphia (1997)
7. Flaschka, U., Lin, W.W., Wu, J.L.: A KQZ algorithm for solving linear-response eigenvalue equations. Linear Algebra Appl. 165, 93–123 (1992)
8. Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. Johns Hopkins University Press, Baltimore (1996)
9. Grüning, M., Marini, A., Gonze, X.: Exciton-plasmon states in nanoscale materials: breakdown of the Tamm-Dancoff approximation. Nano Lett. 9, 2820–2824 (2009)
10. Lucero, M.J., Niklasson, A.M.N., Tretiak, S., Challacombe, M.: Molecular-orbital-free algorithm for excited states in time-dependent perturbation theory. J. Chem. Phys. 129(6), 64–114 (2008)
11. Mehl, C., Mehrmann, V., Xu, H.: On doubly structured matrices and pencils that arise in linear response theory. Linear Algebra Appl. 380, 3–51 (2004)
12. Muta, A., Iwata, J.I., Hashimoto, Y., Yabana, K.: Solving the RPA eigenvalue equation in real-space. Prog. Theor. Phys. 108(6), 1065–1076 (2002)
13. Nocedal, J., Wright, S.: Numerical Optimization, 2nd edn. Springer, New York (2006)
14. Olsen, J., Jensen, H.J.A., Jørgensen, P.: Solution of the large matrix equations which occur in response theory. J. Comput. Phys. 74(2), 265–282 (1988)
15. Olsen, J., Jorgensen, P.: Linear and nonlinear response functions for an exact state and for an MCSCF state. J. Chem. Phys. 82(7), 3235–3264 (1985)
16. Ring, P., Schuck, P.: The Nuclear Many-Body Problem. Springer, New York (1980)
17. Rocca, D., Bai, Z., Li, R.C., Galli, G.: A block variational procedure for the iterative diagonalization of non-Hermitian random-phase approximation matrices. J. Chem. Phys. 136, 034–111 (2012)
18. Stratmann, R.E., Scuseria, G.E., Frisch, M.J.: An efficient implementation of time-dependent density-functional theory for the calculation of excitation of large molecules. J. Chem. Phys. 109, 8218–8224 (1998)
19. Thouless, D.J.: Vibrational states of nuclei in the random phase approximation. Nucl. Phys. 22(1), 78–95 (1961)
20. Thouless, D.J.: The Quantum Mechanics of Many-Body Systems. Academic Press, New York (1972)
21. Tsiper, E.V.: Variational procedure and generalized Lanczos recursion for small-amplitude classical oscillations. JETP Lett. 70(11), 751–755 (1999)


Acknowledgments

We thank the referees for valuable comments and suggestions to improve the presentation of the paper. Bai is supported in part by NSF grants DMR-1035468 and DMS-1115817. Li is supported in part by NSF grant DMS-1115834.

Author information

Authors and Affiliations

Department of Computer Science and Department of Mathematics, University of California, Davis, CA, 95616, USA
Zhaojun Bai
Department of Mathematics, University of Texas at Arlington, P.O. Box 19408, Arlington, TX, 76019, USA
Ren-Cang Li


Corresponding author

Correspondence to Zhaojun Bai.

Additional information

Communicated by Peter Benner.

Dedicated to Professor Axel Ruhe on the occasion of his 70th birthday.

Appendix: Best approximations: the singular/unequal dimension case

This appendix continues the investigation in Sect. 4 to seek best approximate eigenpairs of $H-\lambda E$ for given $\{\mathcal{U}, \mathcal{V}\}$, a pair of approximate deflating subspaces of $H-\lambda E$ with $\dim (\mathcal{U})=\ell _1$ and $\dim (\mathcal{V})=\ell _2$. In Sect. 4, we have treated the case in which $\ell _1=\ell _2$ and $W \mathop {=}\limits ^{{\hbox {def}}}U^{{{\mathrm{T}}}}E_+V$ is nonsingular, where $U\in {\mathbb R}^{n\times \ell _1},\,V\in {\mathbb R}^{n\times \ell _2}$ are the basis matrices of $\mathcal{U}$ and $\mathcal{V}$, respectively. In what follows, we will focus on the general case: $\ell _1$ and $\ell _2$ are not necessarily equal or $W$ may be singular.

This case is much more complicated than the one in Sect. 4, but it can be handled in a similar way to [3], which treats $E=I_{2n}$. We therefore simply summarize the results and refer the reader to [1, Appendix A] for details.

Factorize

$$\begin{aligned} W=W_1^{{{\mathrm{T}}}}W_2, \quad W_i\in {\mathbb R}^{r\times \ell _i},\quad r={{\mathrm{rank}}}(W)\le \min _i\ell _i. \end{aligned}$$

(8.1)

Both $W_i$ have full row rank. Factorize (Footnote 5)

$$\begin{aligned} W_i^{{{\mathrm{T}}}}=Q_i\begin{bmatrix} R_i \\ 0 \end{bmatrix} \quad \hbox {for } i=1,2, \end{aligned}$$

(8.2)

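where $R_i\in {\mathbb R}^{r\times r},\,Q_i\in {\mathbb R}^{\ell _i\times \ell _i}$ ($i=1,2$) are nonsingular. Partition

$$\begin{aligned} Q_1^{-1}U^{{{\mathrm{T}}}}KUQ_1^{-{{\mathrm{T}}}}=\begin{bmatrix} K_{11}&K_{12} \\ K_{12}^{{{\mathrm{T}}}}&K_{22} \end{bmatrix}, \quad Q_2^{-1}V^{{{\mathrm{T}}}}MVQ_2^{-{{\mathrm{T}}}}=\begin{bmatrix} M_{11}&M_{12} \\ M_{12}^{{{\mathrm{T}}}}&M_{22} \end{bmatrix}, \quad K_{11},\,M_{11}\in {\mathbb R}^{r\times r}. \end{aligned}$$

(8.3)

Set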

$$\begin{aligned} \widehat{H}_{\hbox {SR}}=\begin{bmatrix} 0&R_1^{-1}\mathcal {K}_{11}R_1^{-{{\mathrm{T}}}} \\ R_2^{-1}\mathcal {M}_{11}R_2^{-{{\mathrm{T}}}}&0 \end{bmatrix}\in {\mathbb R}^{2r\times 2r}, \end{aligned}$$

(8.4)

where $K_{22}^{\dagger }$ and $M_{22}^{\dagger }$ are the Moore-Penrose inverses of $K_{22}$ and $M_{22}$, respectively, and

$$\begin{aligned} \mathcal {K}_{11} =K_{11}-K_{12}K_{22}^{\dagger }K_{12}^{{{\mathrm{T}}}}, \quad \mathcal {M}_{11} =M_{11}-M_{12}M_{22}^{\dagger }M_{12}^{{{\mathrm{T}}}}. \end{aligned}$$

(8.5)

Denote by $\mu _j$ for $j=1,\ldots ,r$ the eigenvalues with the positive sign of $\widehat{H}_{\hbox {SR}}$ in ascending order and by $\hat{z}_j$ the associated eigenvectors:

$$\begin{aligned} \widehat{H}_{\hbox {SR}}\hat{z}_j=\mu _j\hat{z}_j, \quad \hat{z}_j=\begin{bmatrix} \hat{y}_j \\ \hat{x}_j \end{bmatrix}. \end{aligned}$$

(8.6)

It can be verified that $\rho (\tilde{x}_j,\tilde{y}_j)=\mu _j$ for $j=1,\ldots ,r$, where

$$\begin{aligned} \tilde{x}_j=UQ_1^{-{{\mathrm{T}}}}\begin{bmatrix} R_1^{-{{\mathrm{T}}}}\hat{x}_j \\ u_j \end{bmatrix} , \quad \tilde{y}_j=VQ_2^{-{{\mathrm{T}}}}\begin{bmatrix} R_2^{-{{\mathrm{T}}}}\hat{y}_j \\ v_j \end{bmatrix} \end{aligned}$$

(8.7)

for any $u_j$ and $v_j$ satisfying

$$\begin{aligned} K_{22}u_j=-K_{12}^{{{\mathrm{T}}}}R_1^{-{{\mathrm{T}}}}\hat{x}_j, \quad M_{22}v_j=-M_{12}^{{{\mathrm{T}}}}R_2^{-{{\mathrm{T}}}}\hat{y}_j. \end{aligned}$$

(8.8)

Naturally the approximate eigenvectors of $H-\lambda E$ should be taken as

$$\begin{aligned} \tilde{z}_j=\begin{bmatrix} \tilde{y}_j \\ \tilde{x}_j \end{bmatrix}\quad \hbox {for } j=1,\ldots ,r. \end{aligned}$$

(8.9)
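The construction in (8.1)–(8.5) can be sketched numerically as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names `projected_hsr` and `exact_positive_eigs` are hypothetical, the rank factorization (8.1) is obtained from an SVD with an ad hoc tolerance `tol` (any other rank factorization would do), the factorization (8.2) is taken to be a complete QR decomposition so that $Q_i^{-1}=Q_i^{{{\mathrm{T}}}}$, and $K$, $M$ are assumed positive definite so that all square roots are real.

```python
import numpy as np


def exact_positive_eigs(K, M, Ep):
    """Positive-sign eigenvalues of H - lambda*E for K, M > 0 and nonsingular E_+.

    From K x = lambda E_+ y and M y = lambda E_- x with E_- = E_+^T, they are
    the square roots of the eigenvalues of E_+^{-1} K E_+^{-T} M.
    """
    Ei = np.linalg.inv(Ep)
    return np.sort(np.sqrt(np.linalg.eigvals(Ei @ K @ Ei.T @ M).real))


def projected_hsr(U, V, K, M, Ep, tol=1e-12):
    """Form the matrix of (8.4) and its positive eigenvalues mu_1 <= ... <= mu_r."""
    W = U.T @ Ep @ V                              # W = U^T E_+ V, l1 x l2

    # (8.1): rank factorization W = W1^T W2 via the SVD (one possible choice).
    X, s, Yt = np.linalg.svd(W)
    r = int(np.sum(s > tol * s[0]))
    W1 = (X[:, :r] * np.sqrt(s[:r])).T            # r x l1, full row rank
    W2 = np.sqrt(s[:r])[:, None] * Yt[:r, :]      # r x l2, full row rank

    # (8.2): W_i^T = Q_i [R_i; 0] with square orthogonal Q_i.
    Q1, R1 = np.linalg.qr(W1.T, mode="complete")
    Q2, R2 = np.linalg.qr(W2.T, mode="complete")
    R1, R2 = R1[:r, :], R2[:r, :]

    # (8.3): partition Q_1^{-1} U^T K U Q_1^{-T} and Q_2^{-1} V^T M V Q_2^{-T}.
    Kp = Q1.T @ (U.T @ K @ U) @ Q1
    Mp = Q2.T @ (V.T @ M @ V) @ Q2

    def schur(A):
        # (8.5): A11 - A12 A22^+ A12^T; the 22 block is empty when l_i = r.
        A11, A12, A22 = A[:r, :r], A[:r, r:], A[r:, r:]
        if A22.size == 0:
            return A11
        return A11 - A12 @ np.linalg.pinv(A22) @ A12.T

    R1inv, R2inv = np.linalg.inv(R1), np.linalg.inv(R2)
    Kh = R1inv @ schur(Kp) @ R1inv.T              # R_1^{-1} K11cal R_1^{-T}
    Mh = R2inv @ schur(Mp) @ R2inv.T              # R_2^{-1} M11cal R_2^{-T}

    # (8.4): the 2r x 2r matrix; its eigenvalues are +/- mu_j.
    Z = np.zeros((r, r))
    H_sr = np.block([[Z, Kh], [Mh, Z]])
    mu = np.sort(np.sqrt(np.linalg.eigvals(Kh @ Mh).real))
    return H_sr, mu
```

With $U=V=I_n$ the projection is exact, so `projected_hsr` should reproduce the values returned by `exact_positive_eigs` up to rounding; with thin random $U$ and $V$ of different widths it returns $r=\mathrm{rank}(W)$ approximations.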

Theorem 8.1

Let $\{\mathcal{U}, \mathcal{V}\}$ be a pair of approximate deflating subspaces of $H-\lambda E$ with $\dim (\mathcal{U})=\ell _1$ and $\dim (\mathcal{V})=\ell _2$, and let $U\in {\mathbb R}^{n\times \ell _1},\,V\in {\mathbb R}^{n\times \ell _2}$ be the basis matrices of $\mathcal{U}$ and $\mathcal{V}$, respectively. Let $\widehat{H}_{\hbox {SR}}$ be defined by (8.4). Then the best approximations to $\lambda _j$ for $1\le j\le k$ in the sense of (4.1) are the corresponding eigenvalues of $\widehat{H}_{\hbox {SR}}$, with the corresponding approximate eigenvectors given by (8.7)–(8.9).

Despite the much more complicated appearance of $\widehat{H}_{\hbox {SR}}$ compared to $H_{\hbox {SR}}$ in Sect. 4, our next theorem surprisingly unifies both.

Theorem 8.2

The eigenvalues of $\widehat{H}_{\hbox {SR}}$ in (8.4) are the same as the finite eigenvalues of

$$\begin{aligned} \check{H}-\lambda \check{E}:&=\begin{bmatrix} U&0\\ 0&V \end{bmatrix}^{{{\mathrm{T}}}}(H-\lambda E)\begin{bmatrix} V&0\\ 0&U \end{bmatrix} \\&=\begin{bmatrix} 0&U^{{{\mathrm{T}}}}KU \\ V^{{{\mathrm{T}}}}MV&0 \end{bmatrix}-\lambda \begin{bmatrix} U^{{{\mathrm{T}}}}E_+V&\\&V^{{{\mathrm{T}}}}E_-U \end{bmatrix} \nonumber \end{aligned}$$

(8.10)

and the eigenvector $\hat{z}=\begin{bmatrix} \hat{y} \\ \hat{x} \end{bmatrix}$ of $\widehat{H}_{\hbox {SR}}$ and the eigenvector $\check{z}=\begin{bmatrix} \check{y} \\ \check{x} \end{bmatrix}$ of the pencil (8.10) associated with a finite eigenvalue are related by

$$\begin{aligned} \check{x}=Q_1^{-{{\mathrm{T}}}}\begin{bmatrix} R_1^{-{{\mathrm{T}}}}\hat{x} \\ -K_{22}^{\dagger }K_{12}^{{{\mathrm{T}}}}R_1^{-{{\mathrm{T}}}}\hat{x}+g \end{bmatrix}, \quad \check{y}=Q_2^{-{{\mathrm{T}}}}\begin{bmatrix} R_2^{-{{\mathrm{T}}}}\hat{y} \\ -M_{22}^{\dagger }M_{12}^{{{\mathrm{T}}}}R_2^{-{{\mathrm{T}}}}\hat{y}+h \end{bmatrix}, \end{aligned}$$

(8.11)

where $g$ is any vector in the kernel of $K_{22}$ and $h$ is any vector in the kernel of $M_{22}$. In particular, if $\ell _1=\ell _2=r$, the relation in (8.11) simplifies to $\hat{z}=(W_2\oplus W_1)\check{z}$ as in Theorem 4.2.

Proof

Let $P_i=Q_i^{-{{\mathrm{T}}}}(R_i^{-{{\mathrm{T}}}}\oplus I_{\ell _i-r})$ for $i=1,2$; both are nonsingular. It can be verified that

$$\begin{aligned} (P_1\oplus P_2)^{{{\mathrm{T}}}}(\check{H}-\lambda \check{E})(P_2\oplus P_1) =\begin{bmatrix} 0&\widehat{K} \\ \widehat{M}&0 \end{bmatrix}-\lambda \begin{bmatrix} \widehat{I}&\\ 0&\widehat{I}^{\,{{\mathrm{T}}}} \end{bmatrix}, \end{aligned}$$

where

$$\begin{aligned} \widehat{M}&=\begin{bmatrix} R_2^{-1}&\\&I_{\ell _2-r} \end{bmatrix} \begin{bmatrix} M_{11}&M_{12} \\ M_{12}^{{{\mathrm{T}}}}&M_{22} \end{bmatrix} \begin{bmatrix} R_2^{-{{\mathrm{T}}}}&\\&I_{\ell _2-r} \end{bmatrix}, \end{aligned}$$

(8.12)

$$\begin{aligned} \widehat{K}&=\begin{bmatrix} R_1^{-1}&\\&I_{\ell _1-r} \end{bmatrix} \begin{bmatrix} K_{11}&K_{12} \\ K_{12}^{{{\mathrm{T}}}}&K_{22} \end{bmatrix} \begin{bmatrix} R_1^{-{{\mathrm{T}}}}&\\&I_{\ell _1-r} \end{bmatrix}, \end{aligned}$$

(8.13)

$$\begin{aligned} \widehat{I}&=\begin{bmatrix} I_r&\\&0 \end{bmatrix}\in {\mathbb R}^{\ell _1\times \ell _2}, \end{aligned}$$

(8.14)

and $K_{ij}$ and $M_{ij}$ are defined by (8.3). Since $K$ and $M$ are positive (semi)definite, we have ${{\mathrm{span}}}(K_{12}^{{{\mathrm{T}}}})\subseteq {{\mathrm{span}}}(K_{22})$ and ${{\mathrm{span}}}(M_{12}^{{{\mathrm{T}}}})\subseteq {{\mathrm{span}}}(M_{22})$ and consequently

$$\begin{aligned} K_{22}K_{22}^{\dagger }K_{12}^{{{\mathrm{T}}}}=K_{12}^{{{\mathrm{T}}}}, \quad M_{22}M_{22}^{\dagger }M_{12}^{{{\mathrm{T}}}}=M_{12}^{{{\mathrm{T}}}}. \end{aligned}$$

(8.15)
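One standard way to see the range inclusion behind (8.15) is sketched here for $K$; the argument for $M$ is identical. It is offered as a supporting remark and is not necessarily the argument the authors have in mind. Suppose $K_{22}g=0$. For any $f\in {\mathbb R}^{r}$ and $t\in {\mathbb R}$, positive semidefiniteness of the partitioned matrix gives

$$\begin{aligned} 0\le \begin{bmatrix} f \\ tg \end{bmatrix}^{{{\mathrm{T}}}}\begin{bmatrix} K_{11}&K_{12} \\ K_{12}^{{{\mathrm{T}}}}&K_{22} \end{bmatrix}\begin{bmatrix} f \\ tg \end{bmatrix} =f^{{{\mathrm{T}}}}K_{11}f+2t\,f^{{{\mathrm{T}}}}K_{12}g. \end{aligned}$$

Since this holds for every $t$, we must have $f^{{{\mathrm{T}}}}K_{12}g=0$ for all $f$, that is, $K_{12}g=0$. Hence the kernel of $K_{22}$ is contained in the kernel of $K_{12}$, which for symmetric $K_{22}$ is the same as ${{\mathrm{span}}}(K_{12}^{{{\mathrm{T}}}})\subseteq {{\mathrm{span}}}(K_{22})$. Because $K_{22}K_{22}^{\dagger }$ is the orthogonal projector onto ${{\mathrm{span}}}(K_{22})$, it leaves $K_{12}^{{{\mathrm{T}}}}$ unchanged.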

Let

$$\begin{aligned} Z_1=\begin{bmatrix} I_r&0 \\ -K_{22}^{\dagger }K_{12}^{{{\mathrm{T}}}}R_1^{-{{\mathrm{T}}}}&I_{\ell _1-r} \end{bmatrix}, \quad Z_2=\begin{bmatrix} I_r&0 \\ -M_{22}^{\dagger }M_{12}^{{{\mathrm{T}}}}R_2^{-{{\mathrm{T}}}}&I_{\ell _2-r} \end{bmatrix}. \end{aligned}$$

It can be verified that $Z_1^{{{\mathrm{T}}}}\widehat{I} Z_2=\widehat{I}$ and, after using (8.15),

$$\begin{aligned} Z_1^{{{\mathrm{T}}}}\widehat{K} Z_1=\begin{bmatrix} R_1^{-1}\mathcal {K}_{11} R_1^{-{{\mathrm{T}}}}&0 \\ 0&K_{22} \end{bmatrix}, \quad Z_2^{{{\mathrm{T}}}}\widehat{M} Z_2=\begin{bmatrix} R_2^{-1}\mathcal {M}_{11} R_2^{-{{\mathrm{T}}}}&0 \\ 0&M_{22} \end{bmatrix}, \end{aligned}$$

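where $\mathcal {K}_{11}$ and $\mathcal {M}_{11}$ are defined in (8.5). Hence $(P_1Z_1\oplus P_2Z_2)^{{{\mathrm{T}}}}(\check{H}-\lambda \check{E})(P_2Z_2\oplus P_1Z_1)$ is

$$\begin{aligned} \begin{bmatrix} 0&0&R_1^{-1}\mathcal {K}_{11} R_1^{-{{\mathrm{T}}}}&0 \\ 0&0&0&K_{22} \\ R_2^{-1}\mathcal {M}_{11} R_2^{-{{\mathrm{T}}}}&0&0&0 \\ 0&M_{22}&0&0 \end{bmatrix}-\lambda \begin{bmatrix} I_r&&&\\&0&&\\&&I_r&\\&&&0 \end{bmatrix}, \end{aligned}$$

(8.16)

whose finite eigenvalues are the eigenvalues of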

$$\begin{aligned} \begin{bmatrix} 0&R_1^{-1}\mathcal {K}_{11} R_1^{-{{\mathrm{T}}}} \\ R_2^{-1}\mathcal {M}_{11} R_2^{-{{\mathrm{T}}}}&0 \end{bmatrix}-\lambda I_{2r}=\widehat{H}_{\hbox {SR}}-\lambda I_{2r}. \end{aligned}$$

(8.17)

Now we turn to look for the eigenvector relation. Given an eigenvector $\hat{z}=\begin{bmatrix} \hat{y} \\ \hat{x} \end{bmatrix}$ of $\widehat{H}_{\hbox {SR}}$, we conclude by comparing (8.16) and (8.17) that the corresponding eigenvector of the matrix pencil (8.16) is

$$\begin{aligned} \begin{bmatrix} \hat{y} \\ h \\ \hat{x} \\ g \end{bmatrix}, \end{aligned}$$

where $g$ is any vector in the kernel of $K_{22}$ and $h$ is any vector in the kernel of $M_{22}$. Therefore the corresponding eigenvector $\check{z}=\begin{bmatrix} \check{y} \\ \check{x} \end{bmatrix}$ of $\check{H}-\lambda \check{E}$ is given by

$$\begin{aligned} \check{x}=P_1Z_1\begin{bmatrix} \hat{x} \\ g \end{bmatrix}, \quad \check{y}=P_2Z_2\begin{bmatrix} \hat{y} \\ h \end{bmatrix} \end{aligned}$$

which, after simplification, yields (8.11). $\square $
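As a sanity check of Theorem 8.2, consider the smallest possible instance, $\ell _1=\ell _2=r=1$. The scalars $\kappa $, $m$ and $w$ below are notation introduced only for this illustration. Write $U=u$, $V=v$, $\kappa =u^{{{\mathrm{T}}}}Ku$, $m=v^{{{\mathrm{T}}}}Mv$ and $w=u^{{{\mathrm{T}}}}E_+v\ne 0$, and take $W_1=w$, $W_2=1$, so that $Q_1=Q_2=1$, $R_1=w$, $R_2=1$ and there are no $22$ blocks. Then

$$\begin{aligned} \widehat{H}_{\hbox {SR}}=\begin{bmatrix} 0&\kappa /w^2 \\ m&0 \end{bmatrix}, \quad \check{H}-\lambda \check{E}=\begin{bmatrix} 0&\kappa \\ m&0 \end{bmatrix}-\lambda \begin{bmatrix} w&0 \\ 0&w \end{bmatrix}. \end{aligned}$$

Both have the eigenvalues $\pm \sqrt{\kappa m}/|w|$: for the matrix because $\mu ^2=\kappa m/w^2$, and for the pencil because $\det (\check{H}-\lambda \check{E})=\lambda ^2w^2-\kappa m$. The eigenvector relation reduces to $\hat{y}=\check{y}$ and $\hat{x}=w\check{x}$, that is, $\hat{z}=(W_2\oplus W_1)\check{z}$; substituting it into $(\kappa /w^2)\hat{x}=\mu \hat{y}$ and $m\hat{y}=\mu \hat{x}$ recovers $\kappa \check{x}=\mu w\check{y}$ and $m\check{y}=\mu w\check{x}$, which are the two rows of the pencil equation.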

The next theorem says that there are Cauchy-like interlacing inequalities for $\widehat{H}_{\hbox {SR}}$, too. We omit its proof because of its similarity to [3, Theorem 8.3] (see also [1, Appendix A]).

Theorem 8.3

Assume the conditions of Theorem 8.1. Then

$$\begin{aligned} \lambda _i\le \mu _i\le \,\lambda _{i+2n-(\ell _1+\ell _2)}\quad \hbox {for } 1\le i\le r, \end{aligned}$$

(8.18)

where $\lambda _{i+2n-(\ell _1+\ell _2)}=\infty $ if $i+2n-(\ell _1+\ell _2)>n$.
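To make the index shift in (8.18) concrete, here is a small illustration with dimensions chosen arbitrarily for the example. Take $n=100$ and $\ell _1=\ell _2=98$, so that $2n-(\ell _1+\ell _2)=4$ and

$$\begin{aligned} \lambda _i\le \mu _i\le \lambda _{i+4}\quad \hbox {for } 1\le i\le r, \end{aligned}$$

where the upper bound is finite only for $i\le 96$, since $\lambda _{i+4}=\infty $ once $i+4>100$. At the other extreme, if $\ell _1=\ell _2=n$ the shift is zero and (8.18) forces $\mu _i=\lambda _i$, consistent with the projection being exact when $\mathcal{U}$ and $\mathcal{V}$ are the whole space. For thin subspaces, say $\ell _1+\ell _2\le n$, the shifted index always exceeds $n$ and only the lower bound $\lambda _i\le \mu _i$ carries information.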


About this article


Bai, Z., Li, RC. Minimization principles and computation for the generalized linear response eigenvalue problem. Bit Numer Math 54, 31–54 (2014). https://doi.org/10.1007/s10543-014-0472-6


Received: 28 January 2013
Accepted: 19 January 2014
Published: 14 February 2014
Issue Date: March 2014
DOI: https://doi.org/10.1007/s10543-014-0472-6
