Abstract
The eigenvectors of the particle number operator in second quantization are characterized by the block sparsity of their matrix product state representations. This is shown to generalize to other classes of operators. Imposing block sparsity yields a scheme for conserving the particle number that is commonly used in applications in physics. Operations on such block structures, their rank truncation, and implications for numerical algorithms are discussed. Explicit and rank-reduced matrix product operator representations of one- and two-particle operators are constructed that operate only on the nonzero blocks of matrix product states.
1 Introduction
In wavefunction methods of quantum chemistry, one aims to directly approximate the wavefunction of the electrons in a given molecular system. In the many-electron case, these are defined on extremely high-dimensional spaces. In addition, due to the fermionic nature of electrons, these wavefunctions need to respect certain antisymmetry requirements. Post–Hartree–Fock methods are an established class of wavefunction methods based on approximations of wavefunctions by Slater determinants, that is, by antisymmetrized tensor products of single-electron basis functions on \(\mathbb {R}^3\). With a judicious choice of such lower-dimensional basis functions, called orbitals, these methods can achieve high-accuracy approximations of the wavefunctions corresponding to the lowest-energy states of the system. This is achieved essentially by exploiting near-sparsity of wavefunctions in the basis of all Slater determinants formed from the orbitals.
However, for certain types of problems, for instance strongly correlated systems with several competing states of lowest energy, these classical methods typically fail to yield good approximations. Thus more flexible data-sparse parametrizations of the linear combinations of Slater determinants that can be formed from a given finite set of orbitals are of interest. An elegant way of representing such linear combinations of antisymmetric functions is the formalism of second quantization, where wavefunctions are represented in terms of the occupation of each orbital by a particle. With respect to a sequence of orthonormal orbitals \(\{ \phi _k\}_{k\in \mathbb {N}}\), this leads to a representation of the wavefunction by an occupation number tensor in \(({\mathbb {C}}^2)^\infty \), corresponding to an occupied and an unoccupied state for each orbital. The corresponding space of functions is called Fock space.
For electrons with Coulomb interaction in an external potential V, one has the corresponding representation of the Hamiltonian acting on occupation number tensors,
with coefficient tensors \((t_{ij})\) and \((v_{ijkl})\) depending on the orbitals, in terms of the creation operators \(\varvec{a}^*_i\) and annihilation operators \(\varvec{a}_i\). These can be thought of as switching particles from the unoccupied to the occupied state of orbital i or back, respectively. The antisymmetry of wavefunctions corresponds to the anticommutation relations
The second-quantized representation is particularly suitable for the application of low-rank tensor formats such as matrix product states (abbreviated MPS), also known as tensor trains (abbreviated TT) in the numerical analysis context, or the more general tree tensor networks (or hierarchical tensors); see [4, 16, 28]. Whereas the implementation of wavefunction antisymmetry in such tensor formats is problematic in the real-space representation of wavefunctions, this does not present a problem in the second-quantized representation: the antisymmetry properties are encoded in the representation of operators, and the corresponding occupation numbers describing the wavefunctions can be directly approximated in low-rank tensor formats. However, in contrast to real-space approximations of wavefunctions [15], where the number of electrons is tied to the spatial dimensionality of the problem, this particle number is not fixed in the second-quantized formulation and thus needs to be prescribed explicitly.
Prescribing a number N of particles amounts to restricting the eigenvalue problem for \(\varvec{H}\) to the subspace of those occupation numbers that are also eigenvectors of the particle number operator
with eigenvalue N. The particle number constraint does not need to be implemented explicitly: as we investigate in detail in this work, every particle number eigenspace corresponds to a certain block-sparse structure in the cores of MPS. This fact has a long history in the physical literature (see, e.g., [5, 8, 26, 30, 33, 34]), where such block sparsity is usually derived from gauge symmetries, such as the U(1) symmetry corresponding to particle number conservation. Here we use elementary linear algebra to arrive at this block structure, which to the best of our knowledge has not received any attention thus far in a mathematical context.
Block sparsity can not only be used to build the particle number constraint into the lowrank tensor representations, but it can also be exploited to reduce the costs of operations on MPS. We also consider the implications for analogous representations of linear operators acting on MPS, which are called matrix product operators (MPOs). As we show, these can be applied in a form that preserves their lowrank structure and at the same time maintains the block structure of MPS.
The existence of such block structures is commonly exploited in applications of MPS in physics for solving eigenvalue problems for general Hamiltonians with one- and two-particle interactions as in (1.1); see, for instance, [13, 17, 27, 31]. In these applications, the focus is mainly on density matrix renormalization group (DMRG) algorithms [33, 40], which operate locally on components of the MPS. The block structures of the MPS in this case appear as block diagonal structures of density matrices computed from the MPS. However, DMRG schemes are known to fail in certain circumstances [9], a fact that is related to their local mode of operation.
For designing eigenvalue solvers for MPS that can be guaranteed to converge, an important building block is the eigenvalue residual \(\varvec{H}\varvec{x} - \langle \varvec{H}\varvec{x},\varvec{x}\rangle \varvec{x}\). For its efficient evaluation for an MPO representation of \(\varvec{H}\) and an MPS representation of \(\varvec{x}\), the respective global block structures of these quantities that we consider here become important. In addition to describing the block-structure-preserving action of \(\varvec{H}\), we also show that the representation rank of \(\varvec{H}\) can be substantially reduced if the tensor \((v_{ijkl})\) satisfies certain sparsity conditions, which can be ensured by a suitable choice of orbitals.
As one main contribution of this work, we thus consider from the point of view of numerical linear algebra the use of block-sparse MPS as a means of enforcing particle number constraints in solving eigenvalue problems as they arise in quantum chemistry. In particular, we consider the realization of basic operations on block-structured MPS, how Hamiltonians acting on MPS in a compatible block-structured form can be implemented, and their respective computational complexity. We also consider some basic effects of the block structure of MPS on the convergence of eigensolvers.
More generally, we show that a similar block structure is present whenever MPS (or tensor trains) are restricted to an eigenspace of a diagonal operator with a certain Laplacian-type structure, with the particle number operator as a particular example. Without explicitly enforcing the block structure, in exact arithmetic such a constraint is also preserved by many operations on MPS; due to issues of numerical stability, however, this is in general no longer true in numerical computations: when working on full MPS without explicit block structure, the particle number will in general accumulate numerical errors over the course of iterative schemes.
The outline of this paper is as follows: In Sect. 2, we introduce basic notions and notation of MPS. In Sect. 3, we consider the block structure of MPS under particle number constraints and some of its consequences, and we consider the realization of standard operations on MPS exploiting this block structure in Sect. 4. In Sect. 5, we consider low-rank representations of one- and two-electron operators in Hamiltonians as well as their interaction with block-structured MPS. Finally, in Sect. 6, we discuss basic implications for iterative eigensolvers and numerical illustrations.
2 Preliminaries
Since we are mainly interested in real-valued Hamiltonians as they arise in molecular systems, we restrict ourselves to real-valued occupation numbers in \((\mathbb {R}^2)^\infty \). However, the following considerations immediately generalize to the complex-valued case. We consider Fock space restricted to a fixed number \(K \in \mathbb {N}\) of orbitals, corresponding to occupation numbers in \({\mathcal {F}}^K {:}{=} (\mathbb {R}^2)^{K}\), which we regard as tensors of order K with indices \(\alpha \in \{ 0, 1\}^K\). This space is spanned by the unit vectors \(e^\alpha = e^{\alpha _1} \otimes \cdots \otimes e^{\alpha _K}\), where \(e^{\alpha _k}=(\delta _{\alpha _k,\beta })_{\beta =0,1}\) are Kronecker vectors for \(k = 1,\ldots , K\).
2.1 Matrix product states and operators
In our notation, we follow [3, 22] with some adaptations. The matrix product state (or tensor train) representation of \(\varvec{x} \in {\mathcal {F}}^K\) with ranks \(r_1, \ldots , r_{K-1} \in \mathbb {N}_0\) reads
where for notational reasons we set \(j_0 = j_K = 1\), \(r_0 = r_K = 1\). For the third-order component tensors \(X_k\) in such a representation, called cores, we write
For linear mappings on \({\mathcal {F}}^K\), we have an analogous matrix product operator (MPO) representation
where we similarly write \(\mathsf {M} = (M_1,\ldots ,M_K)\).
For specifying the kth component of an MPS or MPO explicitly, we use the notation
Note that here and in the following, we write row indices in subscript and column indices in superscript, indicating that \(M_k^{ [ j_{k-1}, j_k ] }\) is a matrix. In terms of the vectors \(X_k^{[j_{k-1}, j_k]} \in \mathbb {R}^{\{0,1\}}\), a core \(X_k\) is then given by the rank-wise block representation
with the analogous notation for the matrices \( M_k^{ [j_{k-1}, j_k] }\).
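As an illustration of (2.1), storing each core as a third-order array makes evaluating a single entry of \(\varvec{x}\) a product of K small matrices. The following NumPy sketch uses our own shape convention \((r_{k-1}, n_k, r_k)\) and function name, which are not from the paper:

```python
import numpy as np

def mps_entry(cores, alpha):
    """Evaluate x_alpha = X_1^{[alpha_1]} X_2^{[alpha_2]} ... X_K^{[alpha_K]}
    for cores stored as arrays of shape (r_{k-1}, n_k, r_k), r_0 = r_K = 1."""
    m = np.ones((1, 1))
    for X, a in zip(cores, alpha):
        m = m @ X[:, a, :]   # multiply by the alpha_k-th slice, an r_{k-1} x r_k matrix
    return m.item()
```

Each entry thus costs \(\mathcal {O}(K r^2)\) operations, as opposed to storing all \(2^K\) values of the full tensor.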
We define multiplication of a core \(X_k\) by a matrix G of appropriate size from the left or right on its indices \(j_{k-1}\) or \(j_k\), respectively, by
with the analogous definition for the components \(M_k\) in the representation of operators.
Complementing (2.3), we also introduce
Again, the subscripts and superscripts indicate that \(X^{ \{ \alpha _k \} }_k\) and \(M_k^{ \{ \alpha _k, \beta _k \} }\) are matrices.
For a compact way of writing (2.1) in terms of its cores, we introduce the strong Kronecker product,
For example, for two cores X, Y of ranks \(2\times 2\), we obtain
With this notation, we have
where the block \([\varvec{x}] \in \mathbb {R}^{ 1 \times \{0,1\}^K \times 1}\) has leading and trailing dimensions of mode size 1. For simplicity, we ignore such singleton dimensions, that is, we identify \([\varvec{x}]\) with the tensor \(\varvec{x} \in \mathbb {R}^{\{0,1\}^K}\) of order K. With this identification, for the representation mapping \(\tau \), we write (2.1) as
where we have \(\varvec{x} = \tau (\mathsf {X})\). We use partial representation mappings that assemble the first or last cores of a matrix product state. We set
and analogously
The representation \(\mathsf {X}\) is called left-orthogonal if the vectors \(\tau _{K,j}^{<}{(\mathsf {X})}\) are orthonormal for \(j=1,\ldots ,r_{K-1}\), and right-orthogonal if \(\tau _{1,j}^{>}{(\mathsf {X})}\) are orthonormal for \(j=1,\ldots ,r_{1}\).
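A left-orthogonal representation can be produced by a sweep of QR decompositions along the cores. The following NumPy sketch is our own minimal version (not the paper's algorithm), again with the core shape convention \((r_{k-1}, n_k, r_k)\):

```python
import numpy as np

def tt_contract(cores):
    """Assemble the full tensor from cores of shape (r_{k-1}, n_k, r_k)."""
    t = np.ones((1, 1))
    for X in cores:
        t = np.tensordot(t, X, axes=(-1, 0))
    return t.reshape(t.shape[1:-1])

def left_orthogonalize(cores):
    """QR sweep: make all cores but the last left-orthogonal, so that the
    partial products tau^{<}_{K,j} become orthonormal; tau(X) is unchanged."""
    out = [X.copy() for X in cores]
    for k in range(len(out) - 1):
        r0, n, r1 = out[k].shape
        Q, R = np.linalg.qr(out[k].reshape(r0 * n, r1))
        out[k] = Q.reshape(r0, n, Q.shape[1])
        out[k + 1] = np.tensordot(R, out[k + 1], axes=(1, 0))  # absorb R into the next core
    return out
```

Since the left unfolding of each transformed core has orthonormal columns, the Gram matrices of the partial products telescope to the identity.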
As a second product operation between an MPS core \(X_k\) of ranks r and an MPO core \(M_k\) of ranks \(r'\) (or analogously between two MPO cores), with \(j_k = 1,\ldots ,r_k\), \(j_k'=1,\ldots ,r_k'\) for each k, we introduce the mode core product,
For a matrix product operator \(\varvec{M} = \tau (\mathsf {M})\) and a matrix product state \(\varvec{x} = \tau (\mathsf {X})\), we have
Finally, we introduce a lift product of a matrix \(W \in \mathbb {R}^{r \times r'}\) with a matrix \(M \in \mathbb {R}^{2 \times 2}\) or vector \(x \in \mathbb {R}^2\), that is to be understood as a Kronecker product with reordered indices,
thus resulting in an MPO core or an MPS core, respectively.
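In core-wise terms, applying an MPO to an MPS amounts to taking the mode core product of corresponding cores, which multiplies the rank dimensions. A minimal NumPy sketch of the mode core product and the lift product follows; the shape conventions and function names are our assumptions:

```python
import numpy as np

def mode_core_product(M, X):
    """Core of the MPO-MPS product: for an MPO core M of shape
    (R_{k-1}, n_k, n_k, R_k) and an MPS core X of shape (r_{k-1}, n_k, r_k),
    the result has ranks R_{k-1} r_{k-1} and R_k r_k."""
    R0, n, _, R1 = M.shape
    r0, _, r1 = X.shape
    Y = np.einsum('iabj,kbl->ikajl', M, X)   # contract the column mode index
    return Y.reshape(R0 * r0, n, R1 * r1)

def lift(W, x):
    """Lift product of W in R^{r x r'} with a vector x in R^n: the MPS core
    with entries W[i, j] * x[a], i.e. of shape (r, n, r')."""
    return np.einsum('ij,a->iaj', W, x)
```

Applying an identity MPO core (of ranks \(1\times 1\)) leaves an MPS core unchanged, and for a single site the product core realizes the ordinary matrix-vector product.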
2.2 Singular value decomposition
For \(k=1,\ldots ,K\), using \((\alpha _1,\ldots ,\alpha _k)\) as row index and \((\alpha _{k+1},\ldots , \alpha _K)\) as column index, one obtains corresponding matricizations (or unfoldings) of a tensor \(\varvec{x} \in {\mathcal {F}}^K\). Any representation \(\mathsf {X}\) of \(\varvec{x}\) can be transformed by operations on its cores such that these matricizations are in SVD form. This representation is known as the Vidal decomposition [38] or as the tensor train SVD (TT-SVD) [28] and also arises as a special case of the hierarchical SVD [14] of more general tree tensor networks.
Specifically, \(\mathsf {X}\) with \(\varvec{x}=\tau (\mathsf {X})\) is in left-orthogonal TT-SVD form if for \(k = 1,\ldots , K-1\), \(\{ \tau ^{\le }_{k,j} (\mathsf {X}) \}_{j = 1,\ldots , r_{k}}\) are orthonormal and \(\{ \tau _{k,j}^{>}(\mathsf {X})\}_{j = 1,\ldots , r_{k}}\) are orthogonal, where \(\sigma _{k,j}(\varvec{x}) := \Vert \tau _{k,j}^{>}(\mathsf {X})\Vert _2\) with \(\sigma _{k,1}(\varvec{x}) \ge \ldots \ge \sigma _{k,r_k}(\varvec{x})\) are the singular values of the kth matricization. Analogously, \(\mathsf {X}\) is in right-orthogonal TT-SVD form if for \(k = 1,\ldots , K-1\), \(\{ \tau ^{\le }_{k,j} (\mathsf {X}) \}_{j = 1,\ldots , r_{k}}\) are orthogonal with \(\Vert \tau ^{\le }_{k,1} (\mathsf {X})\Vert _2 \ge \ldots \ge \Vert \tau ^{\le }_{k,r_k} (\mathsf {X})\Vert _2\) and \(\{ \tau ^{>}_{k,j} (\mathsf {X}) \}_{j = 1,\ldots , r_{k}}\) are orthonormal. These forms can be obtained by the scheme given in [28, Algorithm 1].
The rank truncation of either SVD form yields quasi-optimal approximations of lower ranks [14, 28]: let \(\mathsf {X}\) be given in TT-SVD form with ranks \(r_1,\ldots ,r_{K-1}\), and denote by \({\text {trunc}}_{s_1,\ldots ,s_{K-1}}(\mathsf {X})\) its truncation to ranks \(s_k\le r_k\), \(k=1,\ldots ,K-1\); then
Using the above error bound in terms of the matricization singular values, one obtains an approximation \({\text {trunc}}_\varepsilon (\mathsf {X})\) with \(\Vert \tau (\mathsf {X}) - \tau ({\text {trunc}}_\varepsilon (\mathsf {X}))\Vert _2 \le \varepsilon \) for any \(\varepsilon >0\) by truncating ranks according to the smallest singular values.
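The TT-SVD and its rank truncation can be sketched as follows, loosely following [28, Algorithm 1]; distributing the error budget as \(\varepsilon /\sqrt{K-1}\) per mode reflects the error bound above. The function names and the absolute-error convention are ours:

```python
import numpy as np

def tt_svd(x, eps=0.0):
    """TT-SVD with rank truncation: successive SVDs of the unfoldings,
    discarding trailing singular values with norm at most eps / sqrt(K-1)
    per mode, so that the total error is at most eps."""
    dims = x.shape
    K = len(dims)
    delta = eps / max(1, K - 1) ** 0.5
    cores, r = [], 1
    m = x.reshape(dims[0], -1)
    for k in range(K - 1):
        U, s, Vt = np.linalg.svd(m, full_matrices=False)
        tail = np.sqrt(np.cumsum((s ** 2)[::-1]))[::-1]   # tail[i] = ||s[i:]||_2
        keep = max(1, int(np.sum(tail > delta)))
        cores.append(U[:, :keep].reshape(r, dims[k], keep))
        r = keep
        m = (s[:keep, None] * Vt[:keep]).reshape(r * dims[k + 1], -1)
    cores.append(m.reshape(r, dims[-1], 1))
    return cores

def tt_full(cores):
    """Contract MPS cores back into the full tensor."""
    t = np.ones((1, 1))
    for X in cores:
        t = np.tensordot(t, X, axes=(-1, 0))
    return t.reshape(t.shape[1:-1])
```

With `eps=0` this reproduces the tensor exactly (up to roundoff); for `eps>0` the reconstruction error is bounded by `eps` by the quasi-optimality estimate.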
2.3 Tangent space projection
It is well known that MPS of fixed multilinear rank constitute an embedded smooth submanifold of the tensor space \({\mathcal {F}}^K\) [20]. The tangent space of this manifold can be explicitly characterized, and the ranks of the tangent vectors at \(\varvec{x}\) in MPS representation are at most twice the ranks of \(\varvec{x}\).
Let \(\varvec{x} = \tau (\mathsf {U}) = \tau (\mathsf {V})\), where \(\mathsf {U} = (U_1,\ldots ,U_K)\) is in left-orthogonal and \(\mathsf {V} = (V_1,\ldots ,V_K)\) is in right-orthogonal form. The projection operator onto the tangent space at \(\varvec{x}\) is given by
where, identifying mappings with their representation matrices, for \(k = 1,\ldots ,K\),
for \(k = 1,\ldots ,K1\),
and \(\varvec{Q}_{\varvec{x}}^{k,2} = 0\) for \(k=K\).
2.4 Second quantization
The operators of second quantization can be represented as mappings on \({\mathcal {F}}^K\) as follows: with the elementary components
the annihilation operator \(\varvec{a}_i\) on \({\mathcal {F}}^K\) reads
and the corresponding creation operator is \(\varvec{a}_i^*\). The particle number operator on \({\mathcal {F}}^K\) is given by
In addition, we introduce the truncated versions
which act only on the left and right sections, respectively, of a matrix product state.
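For small K, the operators of this subsection can be assembled explicitly as Kronecker products and checked numerically. In the sketch below we assume the standard convention \(A e^1 = e^0\) for the elementary annihilation component and a phase factor \(S = \mathrm {diag}(1,-1)\) encoding the antisymmetry; the function names are ours:

```python
import numpy as np

A = np.array([[0.0, 1.0], [0.0, 0.0]])  # elementary annihilation component: A e^1 = e^0
S = np.diag([1.0, -1.0])                # phase factor encoding antisymmetry
I2 = np.eye(2)

def kron_all(factors):
    out = np.ones((1, 1))
    for f in factors:
        out = np.kron(out, f)
    return out

def annihilation(i, K):
    """a_i = S x ... x S x A x I x ... x I with i-1 leading phase factors
    (assumed ordering convention)."""
    return kron_all([S] * (i - 1) + [A] + [I2] * (K - i))

K = 4
# Particle number operator P = sum_i a_i^* a_i on F^K.
P = sum(annihilation(i, K).T @ annihilation(i, K) for i in range(1, K + 1))
```

P is diagonal, with the diagonal entry for index \(\alpha \) equal to the number of occupied orbitals in \(\alpha \); the anticommutation relations can be verified with the same matrices.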
3 Block structure of matrix product states
In this section we characterize the block sparsity of an MPS \(\mathsf {X}\) such that \(\varvec{x}=\tau (\mathsf {X})\) is an eigenvector of the particle number operator \(\varvec{P}\), or in fact of any operator that shares a certain structural feature of \(\varvec{P}\). We first formulate the result for general tensors \(\varvec{x} \in \mathbb {R}^{n_1\times \cdots \times n_K}\) and then obtain the corresponding result for eigenvectors of \(\varvec{P}\) in \({\mathcal {F}}^K\) as a special case. While the definitions of Sect. 2 are given for tensors in \({\mathcal {F}}^K\) for simplicity, they immediately carry over to general tensors in \(\mathbb {R}^{n_1\times \cdots \times n_K}\) with indices in \(\mathcal {N} := \{0,\ldots ,n_1-1\} \times \cdots \times \{0,\ldots ,n_K-1\}\), with the analogous abbreviation for partial index sets for modes \(k_1\) to \(k_2\).
For \(\varvec{P}\), the block sparsity of \(\mathsf {X}\) that we obtain is of the following form: For each k, the matrices \(X^{\{0\}}_k\) and \(X^{\{1\}}_k\) have block structure with nonzero blocks only on the main diagonal for \(X^{\{0\}}_k\), and only on the first superdiagonal for \(X^{\{1\}}_k\). Intuitively, this can be interpreted as follows: each block corresponds to a certain number of occupied orbitals to the left of k. For \(X^{\{0\}}_k\), corresponding to the unoccupied state, this number does not change; for \(X^{\{1\}}_k\), corresponding to the occupied state, the positions of the blocks correspond to increasing the number of particles by one.
More generally, we obtain such block sparsity for so-called Laplace-like operators [22] on \(\mathbb {R}^{n_1\times \cdots \times n_K}\) of the form
with diagonal matrices
where in the case of \(\varvec{P}\), we have \(L_k = A^*A\). The unit vectors \(e^{\alpha _1} \otimes \cdots \otimes e^{\alpha _K}\) are eigenvectors of such \(\varvec{L}\), with eigenvalues given by
Remark 3.1
Note that for any \(\varvec{L}\) of the form (3.1a) with general symmetric matrices \(L_k\), there exists \(\varvec{U} = U_1 \otimes \cdots \otimes U_K\) with orthogonal matrices \(U_1,\ldots , U_K\) such that \(\varvec{{\tilde{L}}} = \varvec{U} \varvec{L} \varvec{U}^\top \) satisfies also (3.1b), and thus the following considerations apply to \(\varvec{ {\tilde{L}}}\).
Let \(\varvec{x} = \tau (\mathsf {X})\) as above satisfy \(\varvec{L}\varvec{x} = \lambda \varvec{x}\), \(\varvec{x} \ne 0\). For such \(\lambda \), we define the subset \(I_\lambda \subset \mathcal {N}\) of all \(\alpha \) such that \(\lambda = \lambda _\alpha \). Furthermore, for each \(\alpha \in I_\lambda \) and \(k=1,\ldots ,K1\) we can split the eigenvalue \(\lambda \) in the form
Using the notation from (2.6), the summands \(\lambda _{k,\alpha }^{\le }, \lambda _{k,\alpha }^{>}\) are eigenvalues of the truncated versions \(\varvec{L}_k^{\le }\) and \(\varvec{L}_k^{>}\) of \(\varvec{L}\). We write \({\mathcal {K}}_{\lambda ,k}\) for the set of all \(\lambda _{k,\alpha }^{\le }\) for given \(\lambda \) and k, that is,
where we have \(\mathcal {K}_{\lambda ,0} = \{0\}\). In full representation, \(\varvec{x}\) necessarily has a certain sparsity pattern, since \(\varvec{x}_\alpha = 0\) if \(\alpha \notin I_\lambda \). We can exploit the invariance
for invertible \(G_k \in \mathbb {R}^{r_k \times r_k}\) for \(k = 1,\ldots ,K-1\) in order to obtain a block structure for the component tensors \(X_k\), which is our following main result. Here and in what follows, for \({\mathcal {K}}\subset \mathbb {R}\) and \(\lambda \in \mathbb {R}\), we write \(\mathcal {K} - \lambda := \{\mu \in \mathbb {R}: \mu +\lambda \in \mathcal {K} \}\).
Theorem 3.2
Let \(\varvec{x} \in \mathbb {R}^{n_1\times \cdots \times n_K}\), \(\varvec{x}\ne 0\), have the representation \(\varvec{x} = \tau (\mathsf {X})\) with minimal ranks \(\mathsf {r} = (r_1,\ldots , r_{K-1})\). Then one has \(\varvec{L} \varvec{x} = \lambda \varvec{x}\) with \(\varvec{L}\) as in (3.1) precisely when \(\mathsf {X}\) can be chosen such that the following holds: for \(k=1,\ldots , K\) and for all \(\mu \in {\mathcal {K}}_{\lambda ,k}\), there exist \(\mathcal {S}_{k,\mu } \subseteq \{ 1,\ldots , r_k\}\) such that
and the matrices \(X^{\{\beta \}}_{k}\), \(\beta = 0,1,\ldots ,n_k-1\), have nonzero entries only in the blocks
where we set \(\mathcal {S}_{0,0} = \mathcal {S}_{K,\lambda } = \{1\}\).
Proof
We first show that \(\varvec{L} \varvec{x} = \lambda \varvec{x}\) implies that (3.3), (3.4) hold, proceeding by induction over k. Thus, let \(k = 1\). For fixed \(\beta \in \{ 0,\ldots , n_1-1 \}\) we define
Then, by the definition of \(\varvec{L}\) and by our assumption,
Consequently \(\varvec{L}^{>}_{1}\varvec{y}^\beta = (\lambda - \lambda _{1,\beta })\varvec{y}^\beta \) for each \(\beta \), and thus either \(\varvec{y}^\beta = 0\) or \(\varvec{y}^\beta \) is an eigenvector of a self-adjoint linear mapping. Orthogonality of eigenvectors with distinct eigenvalues implies \(\langle \varvec{y}^{\beta }, \varvec{y}^{\beta '}\rangle = 0\) if \( \lambda _{1,\beta } \ne \lambda _{1,\beta '}\). Writing \(\varvec{y}^\beta = X_1^{\{\beta \}} X_2 {{\,\mathrm{\bowtie }\,}}\cdots {{\,\mathrm{\bowtie }\,}}X_K\), we obtain
with
which is invertible since the ranks of \(\mathsf {X}\) are minimal. This means that \(X^{\{\beta \}}_1G_1^{1/2}, X^{\{\beta '\}}_1G_1^{1/2}\) are pairwise orthogonal. Thus using (3.2), replacing \(X_1\) by \(X_1 G_1^{1/2}\) and \(X_2\) by \(G_1^{-1/2}X_2\), we can ensure that \(\langle X^{\{\beta \}}_1, X^{\{\beta '\}}_1 \rangle = 0\) if \( \lambda _{1,\beta } \ne \lambda _{1,\beta '}\). By minimality of ranks, there exist precisely \(r_1\) different \(\beta _1,\ldots ,\beta _{r_1} \in \{ 0,\ldots ,n_1-1\}\) such that \(X^{\{\beta _1\}}_1,\ldots , X^{\{\beta _{r_1}\}}_1\) are linearly independent. For \(\mu \in {\mathcal {K}}_{\lambda , 1} = \{ \lambda _{1,\beta } :\beta = 0,\ldots ,n_1-1\}\), we now define
Again making use of (3.2), by Householder reflectors we can construct an orthogonal transformation \(Q_1 \in \mathrm {O}(r_1)\) such that replacing \(X_1\) by \(X_1 Q_1\) and \(X_2\) by \(Q_1^\top X_2\), we have
This means that \(X_1^{[1,j]} \in {{\,\mathrm{span}\,}}\{ e^{\beta } :\lambda _{1,\beta } = \mu \}\) for \(j \in \mathcal {S}_{1,\mu }\). Noting that the eigenspaces of \(L_1\) are given by \({{\,\mathrm{span}\,}}\{ e^{\beta '} :\beta ' = 0,\ldots ,n_1-1 \text { with } \mu = \lambda _{1,\beta '} \}\) for \(\mu \in {\mathcal {K}}_{\lambda , 1}\), as well as \(X_1^{[1,j]} = \tau ^{\le }_{1,j} (\mathsf {X})\), we thus have
This shows (3.4) and the first statement in (3.3) for \(k=1\). Moreover, combining
and \(\varvec{L}\varvec{x} = \lambda \varvec{x}\) we obtain
Since \(\tau ^{\le }_{1,j} (\mathsf {X}) \), \(j = 1,\ldots , r_1\), are linearly independent by our assumption of minimal ranks, the second statement in (3.3) for \(k=1\) follows.
Suppose we have sets \(\mathcal {S}_{k,\mu }\) with \(\mu \in {\mathcal {K}}_{\lambda ,k}\) such that (3.3) and (3.4) hold for some k with \(1 \le k < K-1\), where \(0 \le \mu \le \lambda \) by construction. Then
For \(\beta = 0,\ldots , n_{k+1}-1\), let
For each \(\mu \in {\mathcal {K}}_{\lambda ,k}\), we then have
By the orthogonality of eigenvectors corresponding to distinct eigenvalues, this implies
We write \((X_{k+1}^{\{\beta \}})_j\) for the jth row of \(X_{k+1}^{\{\beta \}}\), \(1\le k < K-1\). For \(\mu \in \mathcal {K}_{\lambda ,k+1}\), we define
where for each k, we set \(\mathcal {S}_{k,\mu } = \emptyset \) for \(\mu \notin \mathcal {K}_{\lambda ,k}\). Since \(k < K-1\), we have \( \varvec{y}^\beta _{k,j} = (X_{k+1}^{\{\beta \}})_j X_{k+2} {{\,\mathrm{\bowtie }\,}}\cdots {{\,\mathrm{\bowtie }\,}}X_K \). Under the conditions in (3.6), we thus have
where again, \(G_{k+1}\) is invertible since the ranks of \(\mathsf {X}\) are minimal. Consequently, for \(z\in \mathcal {Z}_{k+1,\mu }\), \(z' \in \mathcal {Z}_{k+1,\mu '}\) and \(\mu \ne \mu '\), (3.6) means that \(\langle G_{k+1} z, z'\rangle = 0\). Again, there exists an orthogonal transformation \(Q_{k+1}\in \mathrm {O}(r_{k+1})\) such that by replacing \(X_{k+1}\) by \(X_{k+1}G_{k+1}^{1/2}Q_{k+1}\) and \(X_{k+2}\) by \(Q_{k+1}^\top G_{k+1}^{-1/2}X_{k+2}\) as before, we can ensure pairwise orthogonality of the spaces \(\mathcal {Z}_{k+1,\mu }\), \(\mu \in \mathcal {K}_{\lambda ,k+1}\), in the Euclidean inner product. Additionally, for \(\mu \in {\mathcal {K}}_{\lambda ,k+1}\), we can define subsets \(\mathcal {S}_{k+1,\mu }\) that form a partition of \(\{1, \ldots , r_{k+1}\}\) with \(\# \mathcal {S}_{k+1,\mu } = \dim \mathcal {Z}_{k+1,\mu }\), such that for all \(z \in \mathcal {Z}_{k+1,\mu }\) we have \(\mathop \mathrm{supp}(z) \subseteq \mathcal {S}_{k+1,\mu }\). For \(k < K-1\), this implies the block structure (3.4) for \(X_{k+1}\), and \(\varvec{L}^{\le }_{k+1} \tau ^{\le }_{k+1,j} (\mathsf {X}) = \mu \tau ^{\le }_{k+1,j} (\mathsf {X})\) for \(j \in \mathcal {S}_{k+1,\mu }\), which is the first statement in (3.3) for \(k+1\), holds by construction. Thus we also have
which by minimality of ranks yields the second statement in (3.3) for \(k+1\). By induction, the statement thus follows for all \(k \le K-1\).
Finally, for \(k=K-1\), if \(\mu \in {\mathcal {K}}_{\lambda ,K-1}\), then \(\lambda - \mu = \lambda _{K,\beta }\) for some \(\beta \in \{ 0,\ldots ,n_K-1\}\), and (3.5) becomes
noting that \(r_K = 1\). As \(L_K\) is a diagonal matrix, the eigenspaces in (3.7) are given by
For \(\beta \in \{ 0,\ldots ,n_K-1 \}\) with \(\mu = \lambda - \lambda _{K,\beta }\), let \(j' \in \{ 1, \ldots , r_{K-1} \}\). If \(j' \notin \mathcal {S}_{K-1,\mu }\), then \(j' \in \mathcal {S}_{K-1,\mu '}\) for some \(\mu ' \ne \lambda - \lambda _{K,\beta }\), and \(X_K^{[j',1]}\) is orthogonal to \(X_K^{[j,1]}\) for all \(j \in \mathcal {S}_{K-1,\mu }\), which by (3.8) implies \(X_K(j',\beta ,1) = 0\). Thus also \(X_K\) satisfies (3.4).
Conversely, suppose now that \(\varvec{x} \in \mathbb {R}^{n_1\times \cdots \times n_K}\), \(\varvec{x}\ne 0\), has the block structure (3.3), (3.4). This means that expanding the representation \(\mathsf {X}\) in terms of elementary tensors of order K, each of the resulting summands is an eigenfunction of \(\varvec{L}\) with eigenvalue \(\lambda \), and thus the same holds for \(\varvec{x}\). \(\square \)
Remark 3.3
The above proof uses the explicit tensor network structure of an MPS. Analogous block sparsity results can be derived for other tree tensor network representations of tensors that are eigenvectors of an operator with the structure (3.1). Similar use of block sparsity for enforcing physical symmetries has also been considered in [5, 34] for tensor networks without tree structure, such as PEPS [37] and MERA [39]. In such cases, however, it is not clear whether every tensor with a given symmetry has a representation with a particular block-sparsity pattern; that is, in the absence of tree structure we cannot establish an equivalence between membership in certain eigenspaces and representability with block structure as in Theorem 3.2. Block sparsity can then still be used for obtaining approximations, see [5, Sec. III.C].
We state the specific result for the particle number operator \(\varvec{P}\) as a corollary. This corresponds to the special case \(n_k = 2\) and \(\lambda = N\). For \(k=1,\ldots ,K\), we have \(\lambda _{k,0} = 0\) and \(\lambda _{k,1} = 1\), and hence \(\lambda _{k,\alpha }^{\le }, \lambda _{k,\alpha }^{>}\in \{ 0, \ldots , N\}\) in Theorem 3.2.
Corollary 3.4
Let \(\varvec{x} \in {\mathcal {F}}^K\), \(\varvec{x}\ne 0\), have the representation \(\varvec{x} = \tau (\mathsf {X})\) with minimal ranks \(\mathsf {r} = (r_1,\ldots , r_{K-1})\). Then for \(N=1,\ldots , K\), one has \(\varvec{P} \varvec{x} = N \varvec{x}\) precisely when \(\mathsf {X}\) can be chosen such that the following holds: for \(k=1,\ldots , K\) and for all
there exist \(\mathcal {S}_{k,n} \subseteq \{ 1,\ldots , r_k\}\) such that
and the matrices \(X^{\{\beta \}}_{k}\), \(\beta = 0,1\), have nonzero entries only in the blocks
where we set \(\mathcal {S}_{0,0} = \mathcal {S}_{K,N} = \{1\}\).
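The block structure of Corollary 3.4 can be observed numerically: for a random tensor supported on the N-particle sector, a plain TT-SVD (with no structure enforced) generically produces cores in which every rank index carries a well-defined particle count, with \(X_k^{\{0\}}\) coupling equal counts and \(X_k^{\{1\}}\) increasing the count by one. A NumPy sketch of this check follows; all names, the seed, and the tolerances are our own, and the recovered labels correspond to the sets \(\mathcal {S}_{k,n}\):

```python
import numpy as np

rng = np.random.default_rng(0)
K, N = 6, 3

# Random tensor in the N-particle eigenspace of P: x_alpha != 0 only if |alpha| = N.
x = np.zeros((2,) * K)
for alpha in np.ndindex(*x.shape):
    if sum(alpha) == N:
        x[alpha] = rng.standard_normal()

# Plain left-to-right TT-SVD, without enforcing any block structure.
cores, m, r = [], x.reshape(2, -1), 1
for k in range(K - 1):
    U, s, Vt = np.linalg.svd(m, full_matrices=False)
    keep = int(np.sum(s > 1e-12))
    cores.append(U[:, :keep].reshape(r, 2, keep))
    r = keep
    m = (s[:keep, None] * Vt[:keep]).reshape(r * 2, -1)
cores.append(m.reshape(r, 2, 1))

# Each rank index j at cut k carries a well-defined particle count: the nonzero
# rows of the left partial products in column j all share one popcount.
labels, left, counts = [np.array([0])], np.ones((1, 1)), np.array([0])
for X in cores[:-1]:
    left = np.einsum('ai,ibj->abj', left, X).reshape(-1, X.shape[2])
    counts = (counts[:, None] + np.arange(2)).ravel()
    lab = []
    for j in range(left.shape[1]):
        sec = set(counts[np.abs(left[:, j]) > 1e-10])
        assert len(sec) == 1            # no mixing of particle sectors
        lab.append(sec.pop())
    labels.append(np.array(lab))
labels.append(np.array([N]))

# Nonzero entries of X_k^{{0}} connect equal counts (diagonal blocks);
# those of X_k^{{1}} increase the count by one (superdiagonal blocks).
for k in range(K):
    for b in (0, 1):
        rows, cols = np.nonzero(np.abs(cores[k][:, b, :]) > 1e-10)
        assert all(labels[k + 1][c] == labels[k][r_] + b
                   for r_, c in zip(rows, cols))
```

Generically the singular values are distinct, so the singular vectors computed by the SVD separate into the sectors automatically; for degenerate singular values the block structure would have to be enforced by an explicit rotation as in the proof of Theorem 3.2.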
The block structure described by Corollary 3.4 is equivalent to the one used in the physics literature (see, e.g., [26, 33, 34]), where it is usually stated differently in terms of quantum numbers and derived via the U(1) symmetry of operators. For other symmetries considered in physics, such as SU(2), the linear algebraic structure is different and therefore cannot be described by Laplace-like operators (3.1) as done above.
In addition to the fermionic particle number operator \(\varvec{P}\), there is a variety of other settings where the structure described by Theorem 3.2 can be used.
Example 3.5
In quantum chemistry, not only the particle number is conserved, but also the numbers of spin-up and spin-down particles. The MPS is then an eigenvector of two associated Laplace-like operators \(\varvec{P}_\mathrm {up}\) and \(\varvec{P}_\mathrm {down}\). In both cases we have \(n_k = 2\), with \(\lambda _{k,1} = 1\) if k is even or odd for the up or down operator, respectively, and \(\lambda _{k,1} = 0\) otherwise. We then have partial eigenvalues \({\mathcal {K}}_k^{\mathrm {up}}\) and \({\mathcal {K}}_k^{\mathrm {down}}\), and the blocks depend on two partial eigenvalues, i.e., we have sets \({\mathcal {S}}_{k,n_1,n_2}\subseteq \{1,\ldots ,r_k\}\) for \(n_1 \in {\mathcal {K}}_k^{\mathrm {up}}\) and \(n_2 \in {\mathcal {K}}_k^{\mathrm {down}}\). The blocks can then be ordered such that the blocks from the first operator have block structure themselves.
Alternatively, one can introduce spatial orbitals that can carry one spin-up and one spin-down electron [36]. In this case, all dimensions are \(n_k = 4\), and a similar block structure can be derived that also takes into account the antisymmetry of the particles.
Example 3.6
Another example is the bosonic particle number operator with \(n_k = n\) and \(\lambda _{k,\alpha _k} = \alpha _k\). Tensor trains have frequently been applied in the parametrization of elements of high-dimensional polynomial spaces such as
see, e.g., [2, 10, 12, 29]. In this context, the bosonic particle number operator can be seen as a polynomial degree operator. That is, if a polynomial is a linear combination of homogeneous polynomials of the same degree, its coefficient vector is an eigenvector of the polynomial degree operator with the eigenvalue equal to the degree. In \(V_n^K\), the degree is precisely the cardinality of the multi-index \(\alpha \).
Another interesting example is the case \(n_k = n\) and \(\lambda _{k,\alpha _k} = 1\) for \(\alpha _k > 0\), which in the context of polynomials with \(V_n^K\) as above measures the number of variables occurring in a polynomial. This means that eigenvectors of this operator are associated with coefficient vectors in which the multi-index \(\alpha \) is nonzero only in a fixed number of positions (the associated eigenvalue), that is, with polynomials depending on a fixed number of variables.
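The degree-operator interpretation can be checked directly on a small example: applying the Laplace-like operator with \(L_k = \mathrm {diag}(0,\ldots ,n-1)\) to the coefficient tensor of a homogeneous polynomial returns the tensor scaled by the total degree. The concrete polynomial below is our own toy example:

```python
import numpy as np

n, K = 3, 4   # coefficient tensors in V_n^K: per-variable degrees 0, ..., n-1

# Coefficient tensor of the homogeneous degree-2 polynomial
# p(y) = y1*y2 + y3^2 + 2*y2*y4 (multi-index = tuple of exponents).
c = np.zeros((n,) * K)
c[1, 1, 0, 0] = 1.0
c[0, 0, 2, 0] = 1.0
c[0, 1, 0, 1] = 2.0

# Degree operator L = sum_k I x ... x diag(0, ..., n-1) x ... x I applied to c.
deg = np.arange(n, dtype=float)
Lc = np.zeros_like(c)
for k in range(K):
    shape = [1] * K
    shape[k] = n
    Lc += c * deg.reshape(shape)
```

Here `Lc` equals `2 * c`, i.e., the coefficient tensor is an eigenvector with eigenvalue equal to the degree.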
We define the block sizes \(\rho _{k,\mu } := \# \mathcal {S}_{k,\mu } \), where \(\sum _{\mu \in {\mathcal {K}}_{\lambda ,k}} \rho _{k,\mu } = r_k\), and derive the following upper bounds.
Lemma 3.7
Let \(\varvec{x} \in \mathbb {R}^{n_1\times \cdots \times n_K}\), \(\varvec{x}\ne 0\), have the representation \(\varvec{x} = \tau (\mathsf {X})\) with minimal ranks \(\mathsf {r} = (r_1,\ldots , r_{K-1})\) and \(\varvec{L} \varvec{x} = \lambda \varvec{x}\). Furthermore, let \(E_{\mu ,k}^{\le }\) and \(E_{\lambda -\mu ,k}^{>}\) be the eigenspaces of \(\varvec{L}^{\le }_{k}\) and \(\varvec{L}^{>}_{k}\) for the eigenvalues \(\mu \) and \(\lambda -\mu \), respectively. Then for \(k=1,\ldots , K\) and \(\mu \in {\mathcal {K}}_{\lambda ,k}\), we have
Proof
With Theorem 3.2, for \(j \in \mathcal {S}_{k,\mu }\), we obtain \(\varvec{L}^{\le }_{k} \tau ^{\le }_{k,j} (\mathsf {X}) = \mu \tau ^{\le }_{k,j} (\mathsf {X})\) and \(\varvec{L}^{>}_{k} \tau ^{>}_{k,j} (\mathsf {X}) = (\lambda -\mu ) \tau ^{>}_{k,j} (\mathsf {X})\). The partial tensors \(\tau ^{\le }_{k,j} (\mathsf {X})\) are linearly independent, since otherwise the rank \(r_k\), which is assumed to be minimal, could be reduced. Therefore \(\rho _{k,\mu }\) is at most the dimension of the eigenspace of \(\varvec{L}^{\le }_{k}\) for the eigenvalue \(\mu \); the bound in terms of the eigenspace of \(\varvec{L}^{>}_{k}\) for the eigenvalue \(\lambda -\mu \) follows analogously. \(\square \)
Note that for the particle number operator \(\varvec{P}\), where \(\rho _{k,n} = \# \mathcal {S}_{k,n} \) for \(n \in {\mathcal {K}}_k\),
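For the particle number operator, the eigenspace dimensions appearing in Lemma 3.7 are binomial coefficients: on the first k modes, the eigenspace for eigenvalue n has dimension \(\binom{k}{n}\), and on the remaining \(K-k\) modes, the eigenspace for \(N-n\) has dimension \(\binom{K-k}{N-n}\). A short Python sketch (function names are ours) tabulating the resulting block size bounds:

```python
from math import comb

def admissible_counts(K, N, k):
    """Particle counts n that can occur in the first k of K orbitals
    when the total particle number is N (cf. (3.9))."""
    return range(max(0, N - (K - k)), min(k, N) + 1)

def block_size_bound(K, N, k, n):
    """Bound on rho_{k,n} from Lemma 3.7: the smaller of the eigenspace
    dimensions to the left and to the right of the cut after mode k."""
    return min(comb(k, n), comb(K - k, N - n))

K, N = 5, 2
bounds = {k: {n: block_size_bound(K, N, k, n)
              for n in admissible_counts(K, N, k)}
          for k in range(1, K)}
```

For \(K=5\), \(N=2\), this reproduces the ranks visible in Example 4.1 below: the largest block bound, 2, occurs at the middle cuts.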
As a final result in this section, we show that Laplace-like operators commute with the tangent space projection, and thus the two share eigenvectors.
Corollary 3.8
For \(\varvec{x}\in \mathbb {R}^{n_1\times \cdots \times n_K}\) and \(\varvec{L}\varvec{x} = \lambda \varvec{x}\), we have \(\varvec{Q}_{\varvec{x}}\varvec{L} = \varvec{L}\varvec{Q}_{\varvec{x}}\).
Proof
We apply Theorem 3.2. For two arbitrary but fixed multi-indices \(\alpha ,\beta \in {\mathcal {N}}\), let \(\mathsf {e}^\alpha = (e^{\alpha _1},\ldots ,e^{\alpha _K})\) and \(\mathsf {e}^\beta = (e^{\beta _1},\ldots ,e^{\beta _K})\) be two representations such that \(\tau (\mathsf {e}^\alpha )\) and \(\tau (\mathsf {e}^\beta )\) are unit vectors in \(\mathbb {R}^{n_1\times \cdots \times n_K}\). Then it suffices to show that
and since \(\tau (\mathsf {e}^\alpha )\) and \(\tau (\mathsf {e}^\beta )\) are eigenvectors of \(\varvec{L}\) with eigenvalues \(\lambda _\alpha \) and \(\lambda _\beta \), this simplifies to showing that for \(\lambda _\alpha \ne \lambda _\beta \) we have \(\langle \varvec{Q}_{\varvec{x}}\tau (\mathsf {e}^\beta ), \tau (\mathsf {e}^\alpha ) \rangle = 0 \). Now for \(k=1,\ldots ,K\), we show that
We give the proof for \(i=1\); the case \(i=2\) can be treated analogously. Note that if \(\lambda _\alpha \ne \lambda _\beta \), we can assume without loss of generality that \(\tau ^{<}_{k,1} (\mathsf {e}^\alpha )\) and \(\tau ^{<}_{k,1} (\mathsf {e}^\beta )\) are eigenvectors for different eigenvalues \(\lambda _{k-1,\alpha }^{\le }\ne \lambda _{k-1,\beta }^{\le }\) of \(\varvec{L}^{\le }_{k-1}\). But since \(\varvec{x}\) is also an eigenvector of \(\varvec{L}\), it has the properties shown in Theorem 3.2, and consequently,
As \(\mathcal {S}_{k-1,\lambda _{k-1,\alpha }^{\le }}\cap \mathcal {S}_{k-1, \lambda _{k-1,\beta }^{\le }} = \emptyset \), this implies
concluding the proof. \(\square \)
4 Basic operations on block-structured matrix product states
In the remainder of this article, we restrict ourselves again to the case \(\varvec{L} = \varvec{P}\) and \({\mathcal {N}} = \{0,1\}^K\). For fixed \(N \le K\), we denote the space of all tensors \(\varvec{x} \in {\mathcal {F}}^K\) with \(\varvec{P} \varvec{x} = N \varvec{x}\) as
and we represent them in the block-sparse MPS format. If exploited correctly, the block structure of the matrix product states leads to more efficient storage and computation. Furthermore, restricting to one of the eigenspaces of the particle number operator eliminates redundancies in iterative minimization schemes.
We simplify notation by first noting that the sets \(\mathcal {S}_{k,n}\) are disjoint and can be ordered arbitrarily due to the invariance of the components. This means that the matrices \(X^{\{0\}}_k\) and \(X^{\{1\}}_k\) are either block-diagonal or have nonzero blocks only directly above or below the diagonal. We denote the blocks representing an unoccupied kth orbital by
and those representing an occupied orbital by
For k such that \(N< k < K-N+1\), which we refer to as the generic case, we have \({\mathcal {K}}_{k-1} = {\mathcal {K}}_k = \{0,\ldots ,N\}\); otherwise, the number of particles to the right and to the left of orbital k, and hence the elements of \({\mathcal {K}}_{k-1}\) and \({\mathcal {K}}_k\), are restricted according to (3.9). The corresponding block structure according to Corollary 3.4 has the form
Since nonzero blocks for the unoccupied orbital never occur in the same position as the ones for the occupied orbital, the two layers \(\alpha = 0\) and \(\alpha = 1\) can be summarized in the core representation
where each block is composed of vectors, and where we define
and
In the cases \(k \le N\) or \(k \ge K-N+1\), the last rows or the first columns (along with zero rows and columns) are removed, respectively, as illustrated in the following example.
Example 4.1
A tensor \(\varvec{x} \in {\mathcal {F}}^5_2\) of order \(K=5\) and particle number \(N=2\), representing 5 orbitals and 2 particles, has the form
As an eigenspace of a linear operator, \({\mathcal {F}}^K_N\) is a linear subspace of \({\mathcal {F}}^K\). Addition and scalar multiplication of MPS in this subspace therefore work as for regular MPS: addition of two tensors in block-sparse MPS format amounts to concatenating the corresponding blocks, and scalar multiplication to multiplying all blocks in a single component. We now describe in detail some further, more involved operations, as well as left- and right-orthogonalization and rank truncation procedures.
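As a concrete illustration, the following Python sketch (our own minimal implementation, written for plain dense MPS) realizes addition by rank concatenation: the first cores are concatenated horizontally, the last vertically, and the middle cores are embedded block-diagonally per physical slice. The block-sparse version applies the same concatenation to each pair of corresponding blocks.

```python
import numpy as np

def mps_to_tensor(cores):
    """Contract cores of shape (r_{k-1}, n_k, r_k) into the full tensor."""
    x = cores[0]
    for c in cores[1:]:
        x = np.tensordot(x, c, axes=(-1, 0))
    return np.squeeze(x, axis=(0, -1))

def mps_add(X, Y):
    """MPS addition: ranks add up, cores are concatenated block-wise."""
    K, Z = len(X), []
    for k, (a, b) in enumerate(zip(X, Y)):
        if k == 0:                                 # row concatenation
            Z.append(np.concatenate([a, b], axis=2))
        elif k == K - 1:                           # column concatenation
            Z.append(np.concatenate([a, b], axis=0))
        else:                                      # block-diagonal embedding
            ra0, n, ra1 = a.shape
            rb0, _, rb1 = b.shape
            c = np.zeros((ra0 + rb0, n, ra1 + rb1))
            c[:ra0, :, :ra1] = a
            c[ra0:, :, ra1:] = b
            Z.append(c)
    return Z
```

The representation ranks of the sum add up; a subsequent truncation (Algorithm 4 below) removes redundancy.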
Many operations on MPS can be performed either left-to-right or right-to-left, and in general, both versions are required. Here we state the right-to-left procedures, as their description is notationally more compact; apart from the notation, the left-to-right procedures are completely analogous.
Alg. 1 describes the scheme for computing the inner product of two elements of \({\mathcal {F}}^K_N\) in block-sparse MPS format, each of which may have arbitrary ranks. The scheme consists of a right-to-left procedure that successively contracts the blocks of corresponding size. Note that in each step, one ultimately constructs the partial representation mapping \(\tau ^{>}_{k,j} (\mathsf {X})\) with \(j \in \mathcal {S}_{k,n}\), \(n \in {\mathcal {K}}_k\). This mapping, and its left-to-right counterpart \(\tau ^{<}_{k,j} (\mathsf {X})\), is then given in block-diagonal form, which can be exploited in the construction of subproblems in the DMRG algorithm discussed in Sect. 6.1.
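The dense analogue of this right-to-left contraction takes only a few lines of Python (our own sketch; the block-sparse version of Alg. 1 performs the same recursion blockwise, so that the intermediate matrices G are block-diagonal):

```python
import numpy as np

def mps_to_tensor(cores):
    """Contract cores of shape (r_{k-1}, n_k, r_k) into the full tensor."""
    x = cores[0]
    for c in cores[1:]:
        x = np.tensordot(x, c, axes=(-1, 0))
    return np.squeeze(x, axis=(0, -1))

def mps_inner(X, Y):
    """<tau(X), tau(Y)> by successive right-to-left contraction of the cores."""
    G = np.ones((1, 1))                          # boundary matrix
    for a, b in zip(reversed(X), reversed(Y)):
        # contract the physical index n and both right rank indices
        G = np.einsum('inj,jl,mnl->im', a, G, b)
    return G.item()
```

The cost of each step is governed by the rank sizes of the two factors, never by the full tensor dimension.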
In Alg. 2, we state the procedure for orthogonalizing from right to left, resulting in a right-orthogonal tensor. The method for bringing the tensor into its right-orthogonal TT-SVD representation, described in Alg. 3, follows a similar pattern. Here, the input tensor needs to be given in left-orthogonal format (to which one transforms analogously to Alg. 2); singular value decompositions of joined blocks are computed from right to left, and the singular values of each step are stored. These singular values are used to truncate the ranks of the tensor based on the estimates in [14, 28]. Alg. 4 summarizes this procedure: the smallest singular values are selected such that they do not exceed the upper bound \(\varepsilon \) on the truncation error, and the rows and columns of the corresponding blocks are deleted. A similar procedure can be used for truncation to given ranks. Note that if the error threshold \(\varepsilon \) is chosen too large, it is possible that the whole tensor is truncated to zero; this means that zero is in fact the best low-rank approximation of the given tensor. In the present context, we want to avoid this anomaly, since the zero tensor is physically meaningless (we emphasize that it does not represent the vacuum state). As long as \(\varepsilon < \Vert \varvec{x}\Vert \) in Alg. 4, this cannot occur.
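The selection step of Alg. 4 can be sketched as follows (Python; the function name and data layout are our own choosing): the singular values of all blocks at a given cut are pooled, and the smallest ones are marked for deletion as long as the total discarded weight \(\sum \sigma ^2\) stays below \(\varepsilon ^2\).

```python
def discard_sets(sigma_blocks, eps):
    """Mark the smallest singular values (pooled over all blocks n at one cut)
    for deletion, keeping the squared truncation error below eps**2."""
    tagged = sorted((s, n, i)
                    for n, sig in sigma_blocks.items()
                    for i, s in enumerate(sig))
    budget = eps ** 2
    deleted = {n: set() for n in sigma_blocks}
    for s, n, i in tagged:
        if s ** 2 > budget:
            break
        budget -= s ** 2
        deleted[n].add(i)
    return deleted
```

The rows and columns indexed by `deleted[n]` are then removed from the blocks adjacent to the cut; since deletion happens per block, the particle number labels of the surviving blocks are untouched.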
Remark 4.2
(Optimality) If the standard TT-SVD is unique, then for each k it differs from the block-sparse TT-SVD representation produced by Algorithm 3 only in the ordering of the singular values. In particular, when the entries of the diagonal matrices \(\Sigma _{k,n}\), \(n \in {\mathcal {K}}_k\), are ordered by size, the optimality properties (2.4) of the TT-SVD truncation also hold in this setting.
Remark 4.3
(Particle number conservation) The above remark implies that rank truncation of the TT-SVD is a particle number-preserving operation. Non-uniqueness of the TT-SVD can occur only if a matricization has a multiple singular value; in such a case, an arbitrary choice of the TT-SVD is in general not block-sparse, and truncation of the TT-SVD may change the particle number. As illustrated in Sect. 6.3, when singular values are distinct but close, the associated numerical instability of the singular vectors can lead to deviations in the particle number when the numerically computed TT-SVD is truncated, unless the block structure is enforced explicitly.
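The instability referred to here is easy to reproduce in isolation (a Python sketch of our own, not taken from the paper): for a matrix with two nearly equal singular values, an off-diagonal perturbation of size \(10^{-6}\) rotates the leading singular vector by almost \(45^\circ \). Without an enforced block structure, the truncated singular vectors can therefore mix contributions belonging to different particle numbers.

```python
import numpy as np

M = np.diag([1.0, 1.0 - 1e-8])               # two nearly equal singular values
E = np.array([[0.0, 1e-6], [1e-6, 0.0]])     # tiny symmetric perturbation

U0 = np.linalg.svd(M)[0]
U1 = np.linalg.svd(M + E)[0]
# angle between the leading singular vectors before and after perturbation
angle = np.degrees(np.arccos(min(1.0, abs(U0[:, 0] @ U1[:, 0]))))
```

Although the singular values barely move, the singular vectors rotate by an angle of order \(\arctan (2\cdot 10^{-6}/10^{-8})/2 \approx 45^\circ \).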
Remark 4.4
(Number of operations) Depending on the relation between the block sizes and the total MPS ranks, the block-sparse representation can allow for a substantial reduction of computational costs. As an example, we consider the TT-SVD procedure as in Algorithm 3 with \(K\gg N\). For convenience, let \(\rho _{k,n} = \#{ \mathcal {S}_{k,n} }\) if \(n \in {\mathcal {K}}_k\) and \(\rho _{k,n} = 0\) otherwise. The number of operations of this algorithm is then dominated by the SVDs of the joined blocks for each k and n, and thus in total is of order
If \(\rho _{k,n} = {\bar{\rho }}\) for all k, n, the corresponding total ranks are \(r_k = (N+1){\bar{\rho }}\), with the exception of the 2N lowest and highest values of k, and thus the total operation costs scale as \(\mathcal {O}(K N {\bar{\rho }}^3)\). In comparison, the TT-SVD of a full MPS representation costs
where \(\sum _{k=2}^K \min \{r_{k-1},r_k\}^2 \max \{r_{k-1},r_k\} \approx K N^3{\bar{\rho }}^3\). In such a case, exploiting block sparsity thus reduces the costs approximately by a factor of \(N^2\). However, if for each k one has \(r_k = \rho _{k,n}\) for some n, corresponding to only a single nonzero block in each component, there is no reduction of storage or operation costs compared to the full MPS representation.
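The comparison can be made concrete with a back-of-the-envelope count (Python; constants are omitted and the parameter values are generic choices of ours):

```python
K, N, rho = 100, 10, 20            # modes, particles, uniform block size

# block-sparse sweep: SVDs of the individual blocks, O(K * N * rho^3)
cost_blocksparse = K * N * rho ** 3
# full MPS sweep with total ranks r_k ~ (N+1)*rho, O(K * N^3 * rho^3)
cost_full = K * N ** 3 * rho ** 3

speedup = cost_full // cost_blocksparse        # ~ N^2
```

For these values the predicted speedup is \(N^2 = 100\).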
Finally, we show how the tangent space of fixed-rank MPS manifolds (see [20]) can be handled algorithmically in the block-sparse MPS format. To this end, we state the parametrization in block-sparse MPS format of an arbitrary element \(\varvec{y}\) of the tangent space at a given \(\varvec{x} \in \mathbb {R}^{n_1\times \cdots \times n_K}\) of ranks \(r = (r_1,\ldots ,r_{K-1})\). For such \(\varvec{y}\), there exist cores \(\delta Y_k\in \mathbb {R}^{r_{k-1}\times n_k\times r_{k}}\) such that
where as in Sec. 2.3, we assume that \(\varvec{x} = \tau (\mathsf {U}) = \tau (\mathsf {V})\) with \(\mathsf {U} = (U_1,\ldots ,U_K)\) in left-orthogonal and \(\mathsf {V} = (V_1,\ldots ,V_K)\) in right-orthogonal form. Note that the cores \(\delta Y_k\) necessarily have the same block-sparse structure as \(U_k\) and \(V_k\).
The projection of \(\varvec{z} = \tau (\mathsf {Z}) \in {\mathcal {F}}^K_N\) onto the tangent space at \(\varvec{x}\) can be obtained by first computing the components \(\delta Y_k\) of the projection \(\varvec{y}\) and then assembling the MPS representation of \(\varvec{y}\) via (4.1). The computation of the \(\delta Y_k\) can be performed similarly to the inner product in Algorithm 1, once from left to right and once from right to left. That is, for \(k=1,\ldots ,K-1\), one recursively evaluates
With these quantities at hand, one obtains \(\delta Y_k\) as described in [35]. Then one has the representation
where the rank indices in each core can be reordered to yield the same block-sparse structure as in \(\mathsf {U}\) and \(\mathsf {V}\), with at most doubled rank parameters.
5 Matrix product operators
It is well known that the Hamiltonian (1.1) commutes with the particle number operator \(\varvec{P}\) [18, §1.3.2]:
Lemma 5.1
We have that the Hamiltonian and the particle number operator commute, that is, \(\varvec{H}\varvec{P} = \varvec{P} \varvec{H}\). Furthermore, all eigenvectors of \(\varvec{H}\) are eigenvectors of \(\varvec{P}\).
Thus \(\varvec{H}\) preserves the particle number of a state as well as its block structure. In fact, we can show that every particle number-preserving operator can be written as a sum of rank-one particle number-preserving operators of the form \(\varvec{a}_{D^+}^*\varvec{a}_{D^-}\) with subsets \({D^-,D^+\subseteq \{1,\dots ,K\}}\) such that \(\#{D^-}=\#{D^+}\), where \(\varvec{a}_D = \prod _{i\in D} \varvec{a}_i\). Note that we define \(\varvec{a}_{\emptyset }\) to be the identity mapping. Furthermore, we associate each \(D \subseteq \{ 1,\ldots , K\}\) with a unit vector \(\varvec{e}_D = \varvec{a}_D^*\varvec{e}_\mathrm {vac}\), where \(\varvec{e}_\mathrm {vac}\) is the vacuum state
We then have the following result, which is shown in Appendix 1.
Lemma 5.2
Let \(\varvec{B}: {\mathcal {F}}^K \rightarrow {\mathcal {F}}^K\) be a particle number-preserving operator, that is, \(\varvec{B}\) maps each eigenspace of \(\varvec{P}\) to itself. Then there exist coefficients \(v_{D^+,D^-}\in \mathbb {R}\) such that
in other words, \(\varvec{B}\) can be written as a sum of rank-one particle number-preserving operators.
Linear operators on matrix product states can be represented in the MPO format (2.2) with cores of order four. We now investigate the ranks of the MPO representation of Hamiltonians of the form (1.1), that is, of particle number-preserving operators with one- and two-particle terms. As shown below, the MPO ranks of such operators grow at most quadratically with the order K of the tensor. Furthermore, since both of these operators preserve the particle number, their action on a block-sparse MPS can be expressed entirely in terms of the blocks. At the end of this section, we show that each of the summands in (5.1) amounts to nothing but a shift and a scalar multiplication of some of the blocks. The application of the Hamiltonian to a block-sparse MPS can thus be expressed in a matrix-free way, leading to an elegant and efficient algorithmic treatment.
5.1 Compact forms of operators
We now turn to the ranks of Hamiltonians as in (1.1) in second quantization in MPO format. As shown in this section, compared to the number of rank-one terms in the representation (1.1), one can obtain substantially reduced ranks in MPO representations. The basic mechanism behind this rank reduction is described in [6] for projected Hamiltonians in the context of DMRG solvers and, in an MPO form for full Hamiltonians similar to the one given here, in [7, 23]. For one-particle operators, MPO representations are also given in the mathematical literature [11, 21]. Here we use similar considerations to construct an explicit MPO representation of the full Hamiltonian with near-minimal ranks and a unified treatment of one- and two-particle operators. To avoid technicalities, in what follows we assume K to be even, but this is not essential for the construction.
In preparation for the MPO representations of the one- and two-particle operators, we start for illustrative purposes with the case of Laplace-like operators
We denote this operator by \(\varvec{F}\), since the Fock operator is of this form when its eigenfunctions are used as orbitals. From (5.2) and (2.5), we immediately obtain a representation of \(\varvec{F}\) of MPO rank K. However, using the components in (2.5), we can write \(\varvec{F}\) in MPO format with rank 2. To this end, we define
With these blocks, one immediately verifies as in [22] that the following representation holds, reducing the linear scaling with respect to K of the term-wise representation (5.2) to a constant rank in the MPO format.
Lemma 5.3
We have \(\varvec{F} = F_1 {{\,\mathrm{\bowtie }\,}}F_2 {{\,\mathrm{\bowtie }\,}}\cdots {{\,\mathrm{\bowtie }\,}}F_K\), that is, \(\varvec{F}\) has an MPO representation of rank 2.
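This rank-2 construction can be verified numerically. The following Python sketch (our own; it takes the local term to be \(\lambda _k A^*A\), matching the Fock-operator example, and uses a core layout convention of our choosing) assembles the operator-valued matrices \([ \lambda _1 N, I]\), \(\bigl [{\begin{smallmatrix} I & 0 \\ \lambda _k N & I\end{smallmatrix}}\bigr ]\), \([I, \lambda _K N]^T\) and checks their contraction against the explicit Kronecker sum.

```python
import numpy as np

I2 = np.eye(2)
N2 = np.diag([0.0, 1.0])                     # local number operator A*A

def laplace_mpo(lam):
    """Rank-2 MPO cores for F = sum_k lam[k] * (I x ... x A*A x ... x I)."""
    K = len(lam)
    first = np.zeros((1, 2, 2, 2))
    first[0, :, :, 0] = lam[0] * N2          # operator-valued row [lam_1 N, I]
    first[0, :, :, 1] = I2
    cores = [first]
    for k in range(1, K - 1):
        W = np.zeros((2, 2, 2, 2))           # [[I, 0], [lam_k N, I]]
        W[0, :, :, 0] = I2
        W[1, :, :, 0] = lam[k] * N2
        W[1, :, :, 1] = I2
        cores.append(W)
    last = np.zeros((2, 2, 2, 1))            # operator-valued column [I, lam_K N]^T
    last[0, :, :, 0] = I2
    last[1, :, :, 0] = lam[-1] * N2
    cores.append(last)
    return cores

def mpo_to_matrix(cores):
    """Contract MPO cores (left rank, row, column, right rank) to a dense matrix."""
    M = cores[0]
    for W in cores[1:]:
        M = np.einsum('aijb,bklc->aikjlc', M, W)
        r0, d1, d2, d3, d4, r1 = M.shape
        M = M.reshape(r0, d1 * d2, d3 * d4, r1)
    return M[0, :, :, 0]

def kron_sum(lam):
    """Reference: explicit sum of Kronecker products."""
    K = len(lam)
    F = np.zeros((2 ** K, 2 ** K))
    for k, c in enumerate(lam):
        ops = [I2] * K
        ops[k] = c * N2
        term = ops[0]
        for o in ops[1:]:
            term = np.kron(term, o)
        F += term
    return F
```

Multiplying the operator-valued matrices from left to right accumulates exactly the partial sums \(\sum _{j\le k}\lambda _j\,(A^*A)_j\) in the first entry, which is the mechanism behind Lemma 5.3.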
Before we turn to the one- and two-particle operators, we fix some notation that will be needed in both cases. To abbreviate blocks with repeated components, we introduce
and analogously \(\mathsf {A}_k = I_k \uparrow A\) and \(\mathsf {A}^*_k = I_k \uparrow A^*\), where we write \(I_k\) for the identity matrix of size k.
With this, we can turn to the one-particle operator \(\varvec{S}\) given by
with symmetric coefficient matrix \((t_{ij})_{i,j=1,\ldots , K}\). Naively, \(\varvec{S}\) can be written in MPO format with rank \(K^2\), but again one can do better: we now show that \(\varvec{S}\) can in fact be written in MPO format with rank \(K+2\).
For each \(k \in \{1,\ldots ,K\}\), we define some slices of the coefficient matrix \(T = (t_{ij})_{i=1,\ldots , K}^{j=1,\ldots , K}\), where the subscript indices correspond to rows and the superscript indices to columns:
Furthermore, the top-right and bottom-left blocks of T are given by
We define the components
for \(k = 2,\ldots ,\frac{K}{2}\),
and for \(k = \frac{K}{2} + 1, \ldots ,K-1\),
Finally, let
This allows us to state the one-particle operator explicitly and with (near-)minimal ranks. The same rank bounds can also be extracted from the alternative MPO representation in [11, Thm. 4.2] (see also [21, Lemma 3.2]); the construction we describe here, however, also serves as a preparation for our analogous approach to the two-particle case.
Theorem 5.4
We have
Furthermore, the MPO rank of \(\varvec{S}\) is bounded by \(K+2\). If for some \(d \in \mathbb {N}_0\) we have \( t_{ij} = 0\) whenever \(|i-j| > d\), then the MPO ranks of \(\varvec{S}\) are bounded by \(2d+2\).
For the proof, we proceed as follows: we divide the claim into the two cases \(k \le \frac{K}{2}\) and \(k > \frac{K}{2}\) and show by induction over k that the rank of \(T_k\) for \(k\le \frac{K}{2}\) can be bounded by \(2+2k\). Consequently, for \(k=\frac{K}{2}\), the rank of \(\varvec{S}\) can be bounded by \(K+2\); this is also the bound for the rank of the matrix \(M_T\). For the sparse coefficient matrix, we directly consider \(M_T\), since the rank is maximized at the center of the representation, where the ranks of \(W_{T}^5\) and \(W_{T}^6\) can each be bounded by d. Thus the rank of \(M_T\) is bounded by \(2d+2\). The details of the proof are given in Appendix 2.
The case of two-electron operators \(\varvec{D}\) given by
is more involved but can be dealt with quite analogously. We briefly note that due to the anticommutation relations (1.2), the sum only needs to be taken over \(i_1 < i_2\) and \(j_1 < j_2\), which reduces the number of terms: by grouping together
we obtain
We denote the tensor grouping the coefficients of V by \({\tilde{V}}=\left( {\tilde{v}}_{i_1 i_2 j_1 j_2}\right) _{i_1 i_2 j_1 j_2=1}^K\). Again, the Kronecker rank of this operator is \(\left( {\begin{array}{c}K\\ 2\end{array}}\right) ^2 = O(K^4)\), so naively \(\varvec{D}\) could be written with MPO rank \(\left( {\begin{array}{c}K\\ 2\end{array}}\right) ^2\). But we can do better: with the help of the matrices in (2.5), we can write \(\varvec{D}\) in MPO format with rank \(\frac{1}{2} K^2 + \frac{3}{2}{K} + 2\).
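For concreteness, the two rank counts compare as follows (a trivial Python check; the value of K is an arbitrary choice of ours):

```python
from math import comb

K = 20
kronecker_rank = comb(K, 2) ** 2               # naive number of rank-one terms
mpo_rank = K * K // 2 + 3 * K // 2 + 2         # bound from the construction below
```

For \(K = 20\) this is 36100 naive terms versus an MPO rank of 232.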
As before we need to extract different matrix slices from \({\tilde{V}}\), where again subscript indices correspond to rows and superscript indices to columns:
For \(k = 1,\ldots , \frac{K}{2}\) let us define the blocks
as well as
With these blocks we have
Furthermore, we set
where there are \(\frac{K}{2}+1\) ones on each of the two antidiagonals above and below \(V_{\mathrm {mid}}\). Finally, for \(k = \frac{K}{2}+1,\ldots , K\), we analogously obtain
where \(\left( V_k^{2,1}\right) ^T\) is similar to \(V_k^{1,2}\) with modified coefficients.
With the necessary notation in place, we can now state the MPO representation of the two-particle operator with near-minimal ranks.
Theorem 5.5
We have
implying that \(\varvec{D}\) has an MPO representation of rank \(\frac{1}{2} {K^2}+ \frac{3}{2} {K}+2\). If there exists \(d\in \mathbb {N}_0\) such that \(v_{i_1 i_2 j_1 j_2} = 0\) whenever
then the MPO ranks of \(\varvec{D}\) are bounded by \(d^2+3d-1\) if d is odd and by \(d^2+3d-2\) if d is even.
Proof
We can proceed as in the proof of Theorem 5.4. Again, a detailed proof can be found in Appendix 3. \(\square \)
Remark 5.6
It is possible to incorporate the one-electron operator into the two-electron operator, so that the MPO ranks of \(\varvec{S}+\varvec{D}\) are also bounded by \(\frac{1}{2} {K^2} + \frac{3}{2} {K} +2\). Note that the stated rank bounds are not sharp for the leading and trailing cores, where further reductions are possible with additional technical effort (see also Sect. 6.4).
5.2 Matrix-free operations on block structures
Corollary 3.4 implies that particle number-preserving operators also preserve the block structure. In other words, if \(\varvec{x} \in {\mathcal {F}}^K_N\) and \(\varvec{B}\) is any particle number-preserving operator, then \(\varvec{y} := \varvec{B} \varvec{x} \in {\mathcal {F}}^K_N\) has a representation with the block structure of Corollary 3.4. However, if \(\mathsf {B}\) is an MPO representation of \(\varvec{B}\) as derived in Sect. 5.1 and \(\varvec{x} = \tau (\mathsf {X})\) with block-sparse \(\mathsf {X}\), this leaves open the question whether the block-sparse representation of \(\varvec{y}\) can be obtained directly from the standard representation \(\mathsf {Y} := \mathsf {B} \bullet \mathsf {X}\) of the matrix-vector product, or whether additional transformations of \(\mathsf {Y}\) are required to extract the block structure. We now describe how the blocks of \(\varvec{y}\) can be obtained directly by replacing the component matrices I, S, A, \(A^*\), \(A^* A\) in \(\mathsf {B}\) by certain matrix-free operations on the blocks of \(\mathsf {X}\). These operations are performed entirely componentwise; that is, each component can be computed separately from the others, and even partial evaluations are possible.
It turns out that each summand in (5.1) acts on the block-sparse MPS by shifting and deleting some of the blocks. This can be visualized by considering the one-particle part of the Hamiltonian, specifically the case \(i = j\):
Since this operator has Kronecker rank one, each matrix in the product acts only on the corresponding component in the block MPS. Clearly, the identity matrices leave their components and their respective block structure unchanged. Only the matrix
has the immediate effect of assigning zero to all unoccupied blocks,
Thus in this case, the particle number is preserved locally at orbital i, and the block structure therefore remains otherwise unchanged.
Additional difficulties appear, however, when \(i\ne j\), since the particle number is then conserved only by the combination of operations on different modes. Let us first consider \(i<j\),
To avoid technicalities, for the moment we assume \(N< i< j < K - N + 1\), corresponding to the generic case in which all blocks appear in each core. Again, the identity matrices leave everything unchanged. The creation matrix \(A^*\) replaces the occupied layer \(X_i^{\{1\}}\) by the unoccupied layer \(X_i^{\{0\}}\). However, this clearly violates the block structure, because occupied blocks should be located only on the off-diagonal and because \(N \notin ({\mathcal {K}}_i - 1)\). This inconsistency can only be resolved by noting that the added particle will be removed further on, at the jth position of the tensor. Additionally, we have to take into account that a particle has been added to the left of all following components, increasing the particle count n by one in each block. The resolution of the block structure violation therefore lies in shifting the corresponding blocks and deleting those that violate the particle number counts. We summarize the case \(N< i< j < K - N + 1\) as follows:
where application of \(A^*\) corresponds to the block operations
application of A to the block operations
and for \(i< k < j\), applying S amounts to the block operations
The case \(j < i\) can be dealt with analogously, but with the opposite shift, because a particle is removed on the left, and all particle counts n have to be decreased by one until a particle is added again in component i. This means that we have to distinguish the different cases in the implementation of, for instance, the action of the matrix S.
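The string structure underlying these block operations can be checked numerically. Assuming the standard Jordan-Wigner form of the matrices in (2.5), \(\varvec{a}_k = S^{\otimes (k-1)} \otimes A \otimes I^{\otimes (K-k)}\), the following Python sketch (helper names are ours) verifies that \(\varvec{a}_i^* \varvec{a}_j\) for \(i<j\) indeed carries \(A^*\) at position i, the sign matrix S strictly between i and j, A at position j, and identities elsewhere.

```python
import numpy as np

I2 = np.eye(2)
S = np.diag([1.0, -1.0])                     # sign (string) matrix
A = np.array([[0.0, 1.0], [0.0, 0.0]])      # local annihilation, A e_1 = e_0

def kron_chain(ops):
    out = ops[0]
    for o in ops[1:]:
        out = np.kron(out, o)
    return out

def a(K, k):
    """Annihilation operator a_k in Jordan-Wigner form."""
    return kron_chain([S] * (k - 1) + [A] + [I2] * (K - k))

K, i, j = 5, 2, 4
lhs = a(K, i).T @ a(K, j)                    # a_i^* a_j
rhs = kron_chain([I2] * (i - 1) + [A.T] + [S] * (j - i - 1) + [A] + [I2] * (K - j))
```

The sign matrices to the left of i cancel pairwise (\(S^2 = I\)), and \(A^T S = A^T\), which is why identities appear before position i even though both factors carry strings there.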
Remark 5.7
Some further technicalities need to be taken into account in an implementation:

(i)
The sizes of the blocks that are set to zero in the above operations are dictated by the consistency of the representation: zero blocks of nontrivial size need to be kept, whereas redundancies due to zero blocks with a vanishing dimension need to be removed.

(ii)
For the border cases, such as \(i < N\) or \(j > K-N+1\), certain blocks do not occur. The accordingly modified sets \({\mathcal {K}}_k\) and the corresponding differences in the blocks that are present lead to modifications of the above operations.
Figure 1 shows the different cases in the implementation, including the blocks that need to be deleted in order to avoid irregularities. Here \(A^*A\) corresponds to the block operation (5.4); \(A^*_\ell \), \(A_r\) and \(S^+\) correspond to (5.5), (5.6), and (5.7), respectively; and \(A^*_r\), \(A_\ell \) and \(S^-\) are the analogous operations with opposite shifts. The figure assumes the generic block structure for \(i,j \in \{N+1,\ldots ,K-N\}\) and thus needs to be modified for the border cases mentioned in Remark 5.7(ii).
In order to apply tensor representations of general operators of the form \(\varvec{a}_i^* \varvec{a}_j\) with a result given in the same block structure, the components need to be replaced by block operations with particle number semantics as in Fig. 1, depending on their position \(k = 1,\ldots ,K\):
In addition, we have the k-independent replacement of \(A^*A\) by the operation (5.4), and we replace zero components by the operation Z, which assigns zero to all blocks.
These operations can be performed efficiently by exchanging, removing, or changing the signs of blocks in the tensor representation. One can proceed analogously for two-particle operators \(\varvec{a}_{i_1}^* \varvec{a}_{i_2}^* \varvec{a}_{j_1} \varvec{a}_{j_2}\), as shown in Fig. 2, where again appropriate adjustments are needed for the border cases. In a similar manner, this can be generalized to interactions of three or more particles.
5.3 Automatic rank reduction
The particle number semantics of the block operations according to Figs. 1 and 2 are compatible with forming linear combinations of operators. In particular, the full one- and two-particle operator representations constructed in Sect. 5.1 can in the same manner be applied entirely in terms of block operations: replacing \(A^* A\), \(A^*\), A, I, S, and 0 by the respective k-dependent block operations according to Figs. 1 and 2 again leads to a consistent representation, and its application directly produces the correct block structure. To this end, let \(\mathsf {S}_{\mathrm {b}}\) and \(\mathsf {D}_{\mathrm {b}}\) be the resulting representations of \(\varvec{S}\) and \(\varvec{D}\) with particle number semantics.
Proposition 5.8
Let \(\varvec{x} = \tau (\mathsf {X})\) be a blocksparse MPS representation, then \(\mathsf {U} = \mathsf {S}_{\mathrm {b}} \bullet \mathsf {X}\) and \(\mathsf {V} = \mathsf {D}_{\mathrm {b}} \bullet \mathsf {X}\) are blocksparse MPS representations with \(\varvec{S} \varvec{x} = \tau (\mathsf {U})\), \(\varvec{D} \varvec{x} = \tau (\mathsf {V})\).
Proof
By inspection of the proofs of Theorems 5.4 and 5.5, one finds that each rank index in the MPO representations constructed there corresponds to precisely one case in Fig. 1 or 2, respectively. The resulting representations operating on blocks thus directly yield a consistent block-sparse representation of the matrix-vector products. \(\square \)
The addition of such symbolic MPO representations can be carried out analogously to the addition of MPS and block-sparse MPS. In each core, these symbolic representations are composed of scalar multiples of the elementary matrix-free block operations discussed above, where the corresponding scalars can be collected in a separate coefficient matrix. We say that a collection of columns of cores in the rank-wise representation of matrix-free operators is linearly dependent if the columns contain the same symbols but the coefficient matrix is rank-deficient. In this case, the operator ranks can be reduced, provided that the corresponding rows in the next component are compatible, meaning that entrywise, their symbols may only differ if all but one of them are the zero symbol Z, which in turn means that they can be added. The resulting algorithm can be performed from left to right, and the procedure can then be repeated from right to left, where linearly dependent rows can be merged; see Algorithm 5. This automatically reduces the ranks of sums of operators in the above symbolic representation, which can be very useful if a rank-reduced format for an operator is not known. In fact, as we show experimentally in Sect. 6.4, automatic rank reduction can even improve upon the operator representation derived in Sect. 5.1.
Example 5.9
Let \(K = 5\). The operators \(\varvec{a}_2^* \varvec{a}_2\), \(\varvec{a}_2^* \varvec{a}_4\) and \(\varvec{a}_4^* \varvec{a}_2\) can be represented with particle number semantics by
The sum of the operators is given by
After the rank reduction, the operator has the form
6 Numerical aspects
This section serves as an outlook on numerical solvers for the eigenvalue problem \( \varvec{H} \varvec{x} = \lambda \varvec{x} \) with the additional constraint \(\varvec{P} \varvec{x} = N \varvec{x}\), implemented by keeping \(\varvec{x}\) in block-sparse format. We comment on standard iterative solvers and discuss their relation to this representation format. Furthermore, we give an example of the effect of enforcing block sparsity on the numerical stability of particle numbers under TT-SVD truncation. Finally, we show that the ranks of the one- and two-particle operators, as discussed in Sect. 5.1, are indeed near-optimal.
6.1 Iterative methods with fixed and variable ranks
A standard method for computations with MPS is the DMRG algorithm. All modern implementations of this method (see, for instance, [13, 17, 27, 31]) exploit the block sparsity in some form. For the sake of completeness, we give a brief overview of both the one-site and the two-site DMRG. We then turn to methods using global eigenvalue residuals. These methods are nonstandard in physical computations, but may become competitive when block sparsity is taken into account. A detailed numerical comparison of the methods will be the subject of further research.
6.1.1 One-site DMRG/ALS
The one-site DMRG or ALS algorithm [19] optimizes one component of the MPS \(\varvec{x}\) at a time. With the appropriate orthogonalization, each subiteration consists of an optimization step on the linear part of the fixed-rank manifold, which coincides with its own tangent space. As such, the one-site DMRG can be formulated as a tangent space procedure: let \(\varvec{x}_{k,\ell }\) be the current iterate after \(\ell \) sweeps and the kth subiteration; that is, we have previously optimized the kth component and orthogonalized accordingly. Now, we optimize the \((k+1)\)st component by minimizing the energy
If \(k = K\), we can go back to \(k = 1\) or perform the sweep in reverse. We note that \(\varvec{Q}_{\varvec{x}_{k,\ell }}^{k+1,1}\) is exactly the projection onto the part of the tangent space at \(\varvec{x}_{k,\ell }\) that corresponds to the \((k+1)\)st component. If \(\varvec{x}_{k,\ell }\) is an eigenvector of the particle number operator \(\varvec{P}\), then by Corollary 3.8, \(\varvec{Q}_{\varvec{x}_{k,\ell }}^{k+1,1}\) commutes with \(\varvec{P}\); by Lemma 5.1, so does the Hamiltonian \(\varvec{H}\). Thus, the next iterate \(\varvec{x}_{k+1,\ell }\) lies in the same eigenspace of \(\varvec{P}\). Therefore, if the one-site DMRG algorithm is initialized with a block-sparse MPS of fixed particle number, block sparsity is preserved for every iterate, and the algorithm can be performed by operating only on the nonzero blocks.
6.1.2 Two-site DMRG
The classical (two-site) DMRG [19, 40] optimizes two neighboring components at once, which allows for a certain rank adaptivity between these components. While this gives the algorithm more flexibility, it also means that the subiterates can leave the fixed-rank manifold and even the tangent space. Nevertheless, we can show that the particle number is preserved. To this end, for \(k=1,\ldots ,K-1\), we define the operation \({\varvec{\tilde{Q}}}_{\varvec{x}}^{k,1}\) similarly to \( \varvec{Q}_{\varvec{x}}^{k,1}\) by
As in Corollary 3.8, it can be shown that \({\varvec{\tilde{Q}}}_{\varvec{x}}^{k,1}\) and \(\varvec{P}\) commute. Thus, with the same argument as above, if the first iterate is an eigenvector of \(\varvec{P}\), then all iterates are in the same eigenspace.
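The rank adaptivity between the two optimized components can be seen in the merge-and-split step at the heart of a two-site update. The following schematic NumPy sketch (with hypothetical names; the local solver acting on the merged block is omitted) shows where the rank between the two sites is chosen:

```python
import numpy as np

def two_site_split(c1, c2, max_rank, tol=1e-14):
    """Merge two neighboring MPS cores into one block and re-split it
    by a truncated SVD. In an actual two-site DMRG step, the merged
    block would first be updated by a local eigenvalue solver; the
    re-splitting is where the bond rank can grow or shrink."""
    r0, n1, _ = c1.shape
    _, n2, r2 = c2.shape
    block = np.einsum('aib,bjc->aijc', c1, c2).reshape(r0 * n1, n2 * r2)
    u, s, vt = np.linalg.svd(block, full_matrices=False)
    r = min(max_rank, int(np.sum(s > tol * s[0])))
    left = u[:, :r].reshape(r0, n1, r)
    right = (s[:r, None] * vt[:r]).reshape(r, n2, r2)
    return left, right
```

When the merged block has particle-number block structure, the SVD factors inherit it, which is the mechanism behind the preservation argument above.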
6.1.3 (Preconditioned) Gradient descent
An alternative to the DMRG algorithm is given by methods operating globally on the MPS representation, such as (preconditioned) gradient descent or more involved variants such as LOBPCG [25]. For basic gradient descent, one can control the ranks by choosing a threshold \(\epsilon > 0\) and performing the update scheme
Since all involved steps preserve the particle number, this scheme produces a sequence \(\varvec{x}_\ell \) with the same particle number if the initial value \(\varvec{x}_0\) has a fixed particle number. Convergence can be accelerated by using an optimized step size \(\alpha _\ell \) or by preconditioning the system [32].
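In the matrix case (\(K=2\)), the truncated gradient update reduces to the following NumPy sketch (an illustrative toy problem with hypothetical names; objective and step size are chosen only for demonstration):

```python
import numpy as np

def trunc(x, eps):
    """Truncated SVD: discard singular values below eps (the matrix
    analogue of TT-SVD rounding with threshold eps)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    keep = s > eps
    return (u[:, keep] * s[keep]) @ vt[keep]

def truncated_gradient_descent(grad, x0, alpha, eps, steps):
    """Update scheme x_{l+1} = trunc_eps(x_l - alpha * grad(x_l)).
    The truncation keeps the iterates low-rank; in the block-sparse MPS
    setting it acts blockwise and hence preserves the particle number."""
    x = x0
    for _ in range(steps):
        x = trunc(x - alpha * grad(x), eps)
    return x
```

For example, minimizing \(\tfrac12\Vert x - a\Vert^2\) for a rank-2 target \(a\) keeps every iterate at rank at most 2 while converging to \(a\).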
6.1.4 Riemannian gradient descent
One could also consider Riemannian methods, where the gradient is projected first onto the tangent space and the step is performed on the fixed-rank manifold [24]. Generalizations are possible that allow for rank adaptivity. This method is often used because the ranks can be fixed and because the projected gradient in the tangent space can be stated explicitly and compactly, thus reducing computational overhead. We construct a sequence \(\varvec{x}_\ell \) from an initial value \(\varvec{x}_0\) with initial rank r. If \(\varvec{x}_0\) has a fixed particle number, then so does the entire sequence
where \(\alpha _\ell \) is the step size. In [35], it is shown that the truncation to fixed rank is a retraction, and thus the stated scheme can be regarded as a Riemannian optimization method. These methods can be accelerated by typical techniques for gradient descent, such as nonlinear conjugate gradient descent, see [1].
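In the matrix case, the projection onto the tangent space of the fixed-rank manifold and the retraction by rank truncation can be sketched as follows (an illustrative NumPy toy, assuming the standard tangent-space formula for fixed-rank matrices; all names are hypothetical):

```python
import numpy as np

def project_tangent(u, vt, z):
    """Project z onto the tangent space of the rank-r manifold at
    x = u @ diag(s) @ vt, where u and vt hold orthonormal factors of x."""
    return u @ (u.T @ z) + (z @ vt.T) @ vt - u @ (u.T @ z @ vt.T) @ vt

def retract(y, r):
    """Retraction by truncation to rank r (truncated SVD)."""
    u, s, vt = np.linalg.svd(y, full_matrices=False)
    return (u[:, :r] * s[:r]) @ vt[:r]

def riemannian_gd(grad, x0, r, alpha, steps):
    """x_{l+1} = retract(x_l - alpha * P_{T_{x_l}} grad(x_l), r)."""
    x = x0
    for _ in range(steps):
        u, s, vt = np.linalg.svd(x, full_matrices=False)
        g = project_tangent(u[:, :r], vt[:r], grad(x))
        x = retract(x - alpha * g, r)
    return x
```

For a rank-r target and an initial guess close to it, the iteration converges while every iterate stays exactly on the rank-r manifold.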
6.2 Blocks of zero size
We usually assume a tensor \(\varvec{x}\) to be represented with minimal ranks; otherwise, we can perform a TT-SVD truncation with a given error threshold or to a fixed multilinear rank as in Algorithm 4. This means that in the block-sparse format, all blocks that contain only zeros will actually be set to size zero, which has several implications.
First of all, as already mentioned in Sect. 3, we stress that truncating the ranks of a tensor \(\varvec{x}\) to a fixed multilinear rank \(r_1,\ldots ,r_{K-1}\) can lead to the tensor being set to zero, that is, \({\text {trunc}}_{r_1,\ldots ,r_{K-1}}(\varvec{x}) = 0\). The block-sparse format allows for a deeper understanding of this fact: Setting a block to zero can lead to the tensor as a whole being set to zero, if all other nonzero blocks depend on it.
Furthermore, the question arises whether the above iterative methods can recover a block after it has been momentarily set to zero during an iteration step. We know that the one-site DMRG is not rank-adaptive and the block sizes are fixed during the iteration. In the other three methods it is possible to increase the rank based on some threshold (in the Riemannian case this can be achieved by modification of the retraction onto the manifold).
If \(\rho _{k,n} = 0\) for some k and n, then there exists a basis element \(\varvec{e}^\alpha \) of the eigenspace of the particle number operator with eigenvalue N (that is, \(\varvec{P} \varvec{e}^\alpha = N \varvec{e}^\alpha \)), such that \(\langle \varvec{e}^\alpha , \varvec{x}\rangle = 0\). Then we have \(\varvec{Q}_{\varvec{x}}^{k,1} \varvec{e}^\alpha = 0\), since \(\tau ^{<}_{k,j} (\mathsf {U})\) is not present for \(j\in \mathcal {S}_{k,n} = \emptyset \). A similar argument can be made for \(\varvec{Q}_{\varvec{x}}^{k,2}\). This means that in Riemannian gradient descent, once \(\rho _{k,n} = 0\) in some iterative step, then also \(\rho _{k,n} = 0\) for all subsequent steps. One can overcome this problem by choosing the initial point \(\varvec{x}_0\) in such a way that all \(\rho _{k,n}\) are at least 1. However, when retracting back onto the manifold, care needs to be taken that blocks are not set to zero.
For the two-site DMRG, a similar argument implies that some blocks can be created in each substep, depending on neighboring block sizes. A thorough analysis shows that the rank adaptivity of the two-site DMRG is always local and thus not all points can necessarily be reached from a given starting point \(\varvec{x}_0\). However, if we start from a generic point (with some block sizes possibly zero), we can expect a favorable behavior of the method.
The general gradient case is in this regard the most versatile as it has the fewest restrictions on the update step. Starting in \(\varvec{x}_0 \ne 0\) will allow us to optimize on the whole linear space of fixed particle number N throughout the procedure. A more detailed investigation will be given elsewhere.
6.3 Numerical stability of rounding
As mentioned in Remark 4.3, if the singular values in each matricization are distinct, the TT-SVD in Algorithm 4 is unique up to the signs of singular vectors. Therefore, performing a TT-SVD on a tensor with fixed particle number will automatically result in a (reordered) block-sparse format. However, when two singular values coincide, there is an additional rotational freedom in the corresponding subspaces that needs to be factored out in order to enforce block sparsity.
This fact has an important implication for the SVD truncation of a tensor of fixed particle number that is not in block-sparse format. If the TT-SVD is unique, the SVD truncation will result in a tensor of the same fixed particle number (or the zero tensor, which is still an element of the same eigenspace). If two or more singular values are equal, a general SVD truncation can destroy the natural block structure, as it could remove parts of two blocks simultaneously. Numerically, this already occurs when two singular values are close to each other, since singular vectors become increasingly ill-conditioned with decreasing difference of the corresponding singular values, resulting in numerical errors in the particle number upon truncation. We emphasize that if the block-sparse structure is enforced, this is ruled out.
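The sensitivity of singular vectors to nearby singular values can be seen already for \(2\times 2\) matrices. The following NumPy check (illustrative; the constants are arbitrary) measures how far the leading singular vector rotates under a fixed perturbation, for a large and a small singular-value gap:

```python
import numpy as np

def singular_vector_rotation(gap, delta):
    """|sin(angle)| by which the leading left singular vector turns when
    a matrix with singular-value gap `gap` is perturbed by `delta`.
    First-order perturbation theory predicts an O(delta / gap) rotation."""
    a = np.diag([1.0, 1.0 - gap])
    e = np.zeros((2, 2))
    e[0, 1] = delta
    u0 = np.linalg.svd(a)[0][:, 0]
    u1 = np.linalg.svd(a + e)[0][:, 0]
    # magnitude of the sine of the angle between the two unit vectors
    return abs(u0[0] * u1[1] - u0[1] * u1[0])
```

For a gap of \(10^{-3}\), a perturbation of size \(10^{-12}\) barely moves the singular vectors; for a gap of \(10^{-9}\), the same perturbation rotates them by many orders of magnitude more, which is precisely the mechanism that destroys the block structure in a naive SVD truncation.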
To illustrate this numerical issue, we conduct the following artificial experiment: Let \(K=20\) and \(N=6\). We pick a tensor with blocks of size 1, left-orthogonal components \(U_1,\ldots ,U_{10}\), and right-orthogonal components \(V_{11},\ldots ,V_{20}\). This tensor has rank at most 7 because there can be no more than 7 blocks of size 1 on the unoccupied or occupied layer, respectively. For \(\epsilon \ge 0\), we choose a diagonal matrix of singular values
and construct the tensor
which has the singular values \(\sigma _1,\ldots ,\sigma _7\) between its middle components \(U_{10}\) and \(V_{11}\). When \(\epsilon = 0\), truncating the last singular value in full MPS format will therefore in general lead to a deviation from the particle number \(N=6\). This effect can also be observed when the smallest singular values are only roughly equal. We consider the Rayleigh quotient of \(\varvec{x}_{6,\epsilon } = {\text {trunc}}_{6,\ldots ,6}(\varvec{x}_\epsilon )\) with the particle number operator for \(\epsilon \rightarrow 0\), where \(\epsilon \) is precisely the difference of the two smallest singular values, \(\sigma _6 - \sigma _7 = \epsilon \). This is shown in Fig. 3.
Ideally, this Rayleigh quotient should be constantly equal to \(N=6\). This is the case for \(\epsilon > 10^{-8}\). However, as we can see, if we keep the MPS in its full format, then for small differences in the singular values the Rayleigh quotient deviates and the natural block structure is destroyed. This is due to the ill-conditioning of singular vectors in the TT-SVD when the smallest singular values of \(\varvec{x}_\epsilon \) are close. If we keep the tensor in block-sparse format, this problem, as expected, cannot occur.
6.4 Operator ranks
We have discussed in Sect. 5.1 that the one- and two-particle operators \(\varvec{S}\) and \(\varvec{D}\) can be explicitly stated in rank-compressed format. Here, we numerically show that these representations indeed have near-optimal ranks, in the sense that the rank compression procedure for operators outlined in Sect. 5.2 can reduce the ranks only for the border cases at the beginning and end of the MPO chain.
Table 1 shows the ranks of the two operators for different numbers of orbitals K and normally distributed random values for T and V. In the left column, we see the ranks of these operators when they are represented in the rank-reduced form discussed in Sect. 5.1. In the right column, we have performed an extra rank compression from left to right and one from right to left, as discussed in Sect. 5.2. One can see that the ranks in the left column are almost optimal; only for the border cases of the two-particle operator can the ranks be reduced further.
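The simplest instance of such an explicit low-rank MPO is the particle number operator itself, a one-particle operator with identity coefficients, which admits an exact rank-2 representation. The following NumPy sketch (hypothetical helper names; dense contraction only for tiny K) builds it and compares against the Kronecker-sum definition:

```python
import numpy as np

I2 = np.eye(2)
n_op = np.diag([0.0, 1.0])  # occupation number operator on a single site

def particle_number_mpo(K):
    """Exact rank-2 MPO cores for P = sum_k n_k.
    Core index order: (left rank, row, column, right rank)."""
    mid = np.zeros((2, 2, 2, 2))
    mid[0, :, :, 0] = I2
    mid[0, :, :, 1] = n_op
    mid[1, :, :, 1] = I2
    first = mid[0:1]          # row block (I, n)
    last = mid[:, :, :, 1:2]  # column block (n, I)^T
    return [first] + [mid] * (K - 2) + [last]

def mpo_to_matrix(cores):
    """Contract the MPO to a dense 2^K x 2^K matrix (tiny K only)."""
    m = cores[0]
    for c in cores[1:]:
        m = np.einsum('arcb,bsdt->arscdt', m, c)
        a, r1, r2, c1, c2, b = m.shape
        m = m.reshape(a, r1 * r2, c1 * c2, b)
    return m[0, :, :, 0]
```

The same finite-automaton pattern, with additional states, underlies the explicit representations of \(\varvec{S}\) and \(\varvec{D}\).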
Finally, for \(K=32\), we apply these operators \(\varvec{S}\) and \(\varvec{D}\) to a random MPS of rank 1 that is not in block-sparse format. The entries in the components of this tensor are chosen to be \({\mathcal {N}}(0,1)\)-distributed and the tensor is then normalized. The operators are in the rank-reduced format, but they are applied explicitly and not in their matrix-free versions as in Sect. 5.2, since the latter are applicable only to tensors in block-sparse format.
Figure 4 shows the ranks of the output tensor for the two operators (top and bottom) and different truncation parameters \(\varepsilon \). One can see that with minimal truncation, the ranks of the output are about the same as the ranks of the operators. However, the output ranks can be further reduced by about a factor of 2 if one is willing to accept an error threshold of \(\varepsilon = 10^{-12}\). A more substantial truncation does not reduce the ranks further, indicating that the ranks of the two operators are indeed linear and quadratic in K, respectively.
References
Absil, P.-A., Mahony, R., Sepulchre, R.: Optimization Algorithms on Matrix Manifolds. Princeton University Press, Princeton (2008)
Bachmayr, M., Cohen, A., Dahmen, W.: Parametric PDEs: sparse or low-rank approximations? IMA J. Numer. Anal. 38, 1661–1708 (2018)
Bachmayr, M., Kazeev, V.: Stability of low-rank tensor representations and structured multilevel preconditioning for elliptic PDEs. Found. Comput. Math. 20, 1175–1236 (2020)
Bachmayr, M., Schneider, R., Uschmajew, A.: Tensor networks and hierarchical tensors for the solution of high-dimensional partial differential equations. Found. Comput. Math. 16(6), 1423–1472 (2016)
Bauer, B., Corboz, P., Orús, R., Troyer, M.: Implementing global Abelian symmetries in projected entangled-pair state algorithms. Phys. Rev. B 83(12), 125106 (2011)
Chan, G.K.-L., Keselman, A., Nakatani, N., Li, Z., White, S.R.: Matrix product operators, matrix product states, and ab initio density matrix renormalization group algorithms. J. Chem. Phys. 145(1), 014102 (2016)
Crosswhite, G.M., Bacon, D.: Finite automata for caching in matrix product algorithms. Phys. Rev. A 78(1), 012356 (2008)
Daley, A.J., Kollath, C., Schollwöck, U., Vidal, G.: Time-dependent density-matrix renormalization-group using adaptive effective Hilbert spaces. J. Stat. Mech. Theory Exp. 2004(04), P04005 (2004)
Dolfi, M., Bauer, B., Troyer, M., Ristivojevic, Z.: Multigrid algorithms for tensor network states. Phys. Rev. Lett. 109(2), 020604 (2012)
Dolgov, S., Kalise, D., Kunisch, K.: Tensor decompositions for high-dimensional Hamilton–Jacobi–Bellman equations
Dolgov, S., Khoromskij, B.: Two-level QTT-Tucker format for optimized tensor calculus. SIAM J. Matrix Anal. Appl. 34(2), 593–623 (2013)
Eigel, M., Pfeffer, M., Schneider, R.: Adaptive stochastic Galerkin FEM with hierarchical tensor representations. Numer. Math. 136(3), 765–803 (2017)
Fishman, M., White, S.R., Stoudenmire, E.M.: The ITensor software library for tensor network calculations. arXiv:2007.14822 (2020)
Grasedyck, L.: Hierarchical singular value decomposition of tensors. SIAM J. Matrix Anal. Appl. 31(4), 2029–2054 (2010)
Hackbusch, W.: On the representation of symmetric and antisymmetric tensors. In: Contemporary Computational Mathematics – A Celebration of the 80th Birthday of Ian Sloan, pp. 483–515. Springer (2018)
Hackbusch, W., Kühn, S.: A new scheme for the tensor representation. J. Fourier Anal. Appl. 15(5), 706–722 (2009)
Hauschild, J., Pollmann, F.: Efficient numerical simulations with tensor networks: Tensor Network Python (TeNPy). SciPost Phys. Lect. Notes (2018)
Helgaker, T., Jørgensen, P., Olsen, J.: Molecular Electronic-Structure Theory. John Wiley & Sons (2000)
Holtz, S., Rohwedder, T., Schneider, R.: The alternating linear scheme for tensor optimization in the tensor train format. SIAM J. Sci. Comput. 34(2), A683–A713 (2012)
Holtz, S., Rohwedder, T., Schneider, R.: On manifolds of tensors of fixed TT-rank. Numer. Math. 120(4), 701–731 (2012)
Kazeev, V., Reichmann, O., Schwab, C.: Low-rank tensor structure of linear diffusion operators in the TT and QTT formats. Linear Algebra Appl. 438(11), 4204–4221 (2013)
Kazeev, V.A., Khoromskij, B.N.: Low-rank explicit QTT representation of the Laplace operator and its inverse. SIAM J. Matrix Anal. Appl. 33(3), 742–758 (2012)
Keller, S., Dolfi, M., Troyer, M., Reiher, M.: An efficient matrix product operator representation of the quantum chemical Hamiltonian. J. Chem. Phys. 143(24) (2015)
Kressner, D., Steinlechner, M., Vandereycken, B.: Low-rank tensor completion by Riemannian optimization. BIT Numer. Math. 54(2), 447–468 (2014)
Kressner, D., Tobler, C.: Preconditioned low-rank methods for high-dimensional elliptic PDE eigenvalue problems. Comput. Methods Appl. Math. 11(3), 363–381 (2011)
McCulloch, I.P.: From density-matrix renormalization group to matrix product states. J. Stat. Mech. Theory Exp. 2007(10), P10014 (2007)
Mendl, C.B.: PyTeNet: A concise Python implementation of quantum tensor network algorithms. J. Open Source Softw. 3(30), 948 (2018)
Oseledets, I.V.: Tensor-train decomposition. SIAM J. Sci. Comput. 33(5), 2295–2317 (2011)
Oster, M., Sallandt, L., Schneider, R.: Approximating the stationary Hamilton–Jacobi–Bellman equation by hierarchical tensor products. arXiv:1911.00279 (2020)
Östlund, S., Rommer, S.: Thermodynamic limit of density matrix renormalization. Phys. Rev. Lett. 75(19), 3537 (1995)
Roberts, C., Milsted, A., Ganahl, M., Zalcman, A., Fontaine, B., Zou, Y., Hidary, J., Vidal, G., Leichenauer, S.: TensorNetwork: A library for physics and machine learning. arXiv:1905.01330 (2019)
Rohwedder, T., Schneider, R., Zeiser, A.: Perturbed preconditioned inverse iteration for operator eigenvalue problems with applications to adaptive wavelet discretization. Adv. Comput. Math. 34(1), 43–66 (2011)
Schollwöck, U.: The density-matrix renormalization group in the age of matrix product states. Ann. Phys. 326(1), 96–192 (2011)
Singh, S., Pfeifer, R.N.C., Vidal, G.: Tensor network states and algorithms in the presence of a global \(U(1)\) symmetry. Phys. Rev. B 83, 115125 (2011)
Steinlechner, M.: Riemannian optimization for high-dimensional tensor completion. SIAM J. Sci. Comput. 38(5), S461–S484 (2016)
Szalay, S., Pfeffer, M., Murg, V., Barcza, G., Verstraete, F., Schneider, R., Legeza, Ö.: Tensor product methods and entanglement optimization for ab initio quantum chemistry. Int. J. Quantum Chem. 115(19), 1342–1391 (2015)
Verstraete, F., Cirac, J.I.: Renormalization algorithms for quantum many-body systems in two and higher dimensions. arXiv:cond-mat/0407066 (2004)
Vidal, G.: Efficient classical simulation of slightly entangled quantum computations. Phys. Rev. Lett. 91(14), 147902 (2003)
Vidal, G.: Entanglement renormalization. Phys. Rev. Lett. 99, 220405 (2007)
White, S.R.: Density matrix formulation for quantum renormalization groups. Phys. Rev. Lett. 69, 2863–2866 (1992)
Acknowledgements
Open Access funding enabled and organized by Projekt DEAL.
M.B. acknowledges funding by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Projektnummern 233630050; 211504053 – TRR 146; SFB 1060. M.P. was funded by DFG – Projektnummern 448293816; 211504053 – SFB 1060. M.G. was funded by DFG (SCHN530/151).
Appendices
Appendix A. Proof of Lemma 5.2
Proof
We use induction over the eigenvalues of \(\varvec{P}\), which are \(\{0,1,\ldots ,K\}\). We write \(U_N\) for the eigenspace corresponding to eigenvalue N. Let
We aim to show that \(\varvec{B}_K = \varvec{B}\) for appropriate coefficients \(v_{D^+,D^-}\). For \(\varvec{B}_0\), we set \(v_{\emptyset ,\emptyset } = \langle \varvec{e}_{\emptyset }, \varvec{B} \varvec{e}_{\emptyset } \rangle = \langle \varvec{e}_\mathrm {vac}, \varvec{B} \varvec{e}_\mathrm {vac}\rangle \). Thus, \(\varvec{B}_0\) and \(\varvec{B}\) coincide on \(U_0\).
Now assume we have \(\varvec{B}_N\) such that \(\varvec{B}_N\) and \(\varvec{B}\) coincide on \(U_L\) with \(L\le N\). Then for \({\# D^+ = \# D^- = N+1}\) we set
where \(\varvec{e}_{D^+}^T\varvec{a}_{D^+}^*\varvec{a}_{D^-}\varvec{e}_{D^-}\), in view of (2.5), is \(\pm 1\). We now show that \(\varvec{B}_{N+1}\) and \(\varvec{B}\) coincide on \(U_L\) for \(L\le N+1\). Since for all \(E^+, E^- \subset \{ 1,\ldots , K\}\) with \({\# E^+ = \# E^- \le N}\) we have \(\langle \varvec{e}_{E^+},\varvec{a}_{D^+}^*\varvec{a}_{D^-}\varvec{e}_{E^-}\rangle =0\), by the induction hypothesis, we obtain
Furthermore, we also have \(\langle \varvec{e}_{E^+},\varvec{a}_{D^+}^*\varvec{a}_{D^-}\varvec{e}_{E^-}\rangle =0\) for \({\# E^+ =\# E^- = N+1}\) with \(E^+\ne D^+\) or \(E^-\ne D^-\). Consequently,
\(\square \)
Appendix B. Proof of Thm. 5.4
Proof
For \(k=1,\ldots , K\), we have the two cases \(k\le \frac{K}{2}\) and \( k > \frac{K}{2}\). We begin with \(k\le \frac{K}{2}\) and define the first k factors of \(\varvec{a}_i^*\) and \(\varvec{a}_j\) by
Clearly, if \(\max \{i, j\} > k\), the actual values of i, j are irrelevant. Thus, we write
and in particular
We further define
With this, for \(k \le \frac{K}{2}\), we can show by induction that
This holds true by definition for \(k=1\). For \(k>1\) we calculate
Similarly, it can be shown for \(k > \frac{K}{2}\) that
where
and \(\tilde{\varvec{a}}_{k,i}^*, \tilde{\varvec{a}}_{k,j}, \tilde{\varvec{a}}_{k,<}^*\) and \(\tilde{\varvec{a}}_{k,<}\) are defined analogously to the above for the last k factors of \(\varvec{a}_i^*\) and \(\varvec{a}_j\) respectively. With this we calculate
because
The rank of \(T_k\) for \(k \le \frac{K}{2}\) can be bounded by \(2+2k\). Consequently, for \(k=\frac{K}{2}\) we find that the rank of \(\varvec{S}\) can be bounded by \(K+2\). This is also the bound for the rank of the matrix \(M_T\). For the sparse coefficient matrix we directly consider \(M_T\), since the rank is maximized at the center of the representation, where the rank of \(W_{T}^5\) and the rank of \(W_{T}^6\) can be bounded by d in both cases. Thus the rank of \(M_T\) is bounded by \(2d+2\). \(\square \)
Appendix C. Proof of Thm. 5.5
Proof
We use the same notation as in the proof of Theorem 5.4. With this, we define
as well as
We again proceed by induction and show that
This holds for \(k=1\) and for \(k=2,\ldots , \frac{K}{2}\), we want to show
The product with \(V_k^{1,1}\) is rather straightforward. For the products with \(V_k^{1,2}\) and \(V_k^{2,2}\), we calculate
and
as well as
and with this
Similarly, we show
and
We proceed in the same fashion for \(k = \frac{K}{2} +1 ,\ldots , K\). We define \(\varvec{{\tilde{I}}}_{k}\), \({\tilde{\mathsf {a}}}_{k}^{*}\), \({\tilde{\mathsf {a}}}_{k}\), \({\tilde{\mathsf {b}}}_{k}\), \({\tilde{\mathsf {c}}}_{k}^*\), \({\tilde{\mathsf {c}}}_{k}\), \({\tilde{\mathsf {e}}}_{k}\), \({\tilde{\mathsf {e}}}_{k}^*\), and \({\tilde{\mathsf {D}}}_{k}\) accordingly, such that
It remains to show that
This holds due to the antidiagonal structure of \(M_V\) and with an explicit calculation of \(\mathsf {b}_{K/2} W_{V,k}^8 \mathsf {b}_{K/2 + 1}\), \(\mathsf {c}_{K/2}^* W_{V,k}^9 \mathsf {c}_{K/2 + 1}\), and \(\mathsf {c}_{K/2} W_{V,k}^{10} \mathsf {c}_{K/2 + 1}^*\).
Finally, we consider representation ranks. For nonsparse \({\tilde{V}}\), the number of columns in \(V_k^{1,1}\) and \(V_k^{1,2}\) are given by
respectively. So for \(k = \frac{K}{2}\) we get
which is an upper bound for the MPOrank of \(\varvec{D}\).
For the sparse case (5.3), we only consider the matrix \(M_V\), since it determines the highest rank in the representation of \(\varvec{D}\). In this case, the contributions to the rank of the left and right components in (9.1) are as follows: Each of the two combinations of components \(\varvec{I}_{K/2}\) and \(\varvec{\tilde{D}}_{K/2+1}\) as well as \(\varvec{D}_{K/2}\) and \(\varvec{\tilde{I}}_{K/2+1}\) contributes one to the rank. Each of the four combinations of \(\mathsf {a}_{K/2}^*\) and \(\tilde{\mathsf {e}}^*_{K/2+1}\); \(\mathsf {a}_{K/2}\) and \(\tilde{\mathsf {e}}_{K/2+1}\); \(\mathsf {e}_{K/2}\) and \(\tilde{\mathsf {a}}_{K/2+1}\); \(\mathsf {e}_{K/2}^*\) and \(\tilde{\mathsf {a}}_{K/2+1}^*\) contributes \(d-1\). In the case of \(\mathsf {b}_{K/2}\) and \(\tilde{\mathsf {b}}_{K/2+1}\), we obtain
in the case of \(\mathsf {c}_{K/2}^*\) and \(\tilde{\mathsf {c}}_{K/2+1}\),
and in the case of \(\mathsf {c}_{K/2}\) and \(\tilde{\mathsf {c}}_{K/2+1}^*\),
and summing up these contributions completes the proof. \(\square \)
About this article
Cite this article
Bachmayr, M., Götte, M. & Pfeffer, M. Particle number conservation and block structures in matrix product states. Calcolo 59, 24 (2022). https://doi.org/10.1007/s10092022004629
Keywords
 Second quantization
 Particle number conservation
 Matrix product states
 Matrix product operators
Mathematics Subject Classification
 15A69
 65F15
 65Y20
 65Z05