1 Introduction

Computational quantum chemistry is by now widely regarded as one of the central pillars of modern chemistry, as evidenced by the award of two Nobel prizes (Walter Kohn and John Pople (1998) [51]; Martin Karplus, Michael Levitt, and Arieh Warshel (2013) [52]) in recent years. The field is typically thought to have begun with the pioneering work of Heitler and London [23] in the 1920s, but major, concurrent advances were due to Vladimir Fock, Douglas Hartree, Egil Hylleraas and John Slater [19, 22, 27, 28, 48, 49] among others. These first developments were followed by seminal contributions in the post-war period by the likes of Boys [4], Čížek [8], Hall [21], Roothaan [44, 45] and many others (see, for instance, [6, Chapter 1.2] for a more comprehensive account). The subsequent explosion in available computing resources which began in the 1970s helped spur a tremendous development in the field (see, e.g., the development of computer software such as POLYATOM [3], IBMOL [9], and GAUSSIAN 70 which is still used today [20]), and quantum-chemical simulations are today routinely performed by thousands of researchers, complementing painstaking laboratory work on the design of new compounds for sustainable energy, green catalysis, and pharmaceutical drugs (see, e.g., [14, 15, 25, 33, 35] and the references therein). Indeed, according to the 2021 annual report of the European High Performance Computing Joint Undertaking (EuroHPC-JU), nearly a quarter of the simulations running on the supercomputers Karolina and Vega pertained to chemistry and materials science, with similar or higher numbers reported by supercomputing centers in Germany [16], Italy [7], and Switzerland [1], for instance.

The goal of quantum chemistry is to obtain a quantitative description of the behaviour of matter at the atomic scale, i.e., when matter is viewed as a collection of nuclei and electrons. In the so-called non-relativistic Born–Oppenheimer setting, the nuclei of the molecule under study are treated as clamped, point-like particles, and the aim is to describe the evolution of the electrons in the effective electrostatic potential generated by the static configuration of positively charged nuclei, this field of study being known as electronic structure theory. The behaviour of the electrons in this situation is governed by the spectrum of the so-called electronic Hamiltonian—a semi-unbounded, self-adjoint operator acting on an \(L^2\)-type Hilbert space of antisymmetric functions. It has been known since the seminal work of Zhislin and Sigalov [55, 56] that for neutral molecules and positively charged ions, the electronic Hamiltonian possesses a lowest eigenvalue, frequently called the ground state energy, and a great deal of quantum chemical simulations are concerned with approximating this ground state energy.

The primary difficulty in the numerical computation of the lowest eigenvalue of the electronic Hamiltonian is the extremely high dimensionality of the underlying Hilbert space. Indeed, for a system containing N electrons, the sought-after ground state of the electronic Hamiltonian depends on 3N spatial variables. A naive application of traditional numerical methods such as finite element approximations or spectral schemes therefore fails spectacularly, and specialised approximation strategies have to be developed. Broadly speaking, ab initio (first principles-based) deterministic numerical methods for approximating the ground state energy can be divided into three categories, each of which has a vast variety of subcategories and flavours (see, e.g., [6, Chapter 1] for a concise but comprehensive overview).

  • Wave-function methods which focus on approximating directly the ground state of the electronic Hamiltonian.

  • Density functional methods which are based on a reformulation of the minimisation problem for the electronic Hamiltonian (which acts on functions of 3N spatial variables) in terms of an equivalent minimisation problem over a set of electronic densities (which are functions of 3 spatial variables).

  • Reduced density matrix approaches which are based on the electronic one-body and two-body reduced density matrices.

The coupled cluster (CC) methodology, which belongs to the class of wave-function methods, is based on a non-linear ansatz for the sought-after ground state of the electronic Hamiltonian. In its most common form—the so-called single reference CC method—the unknown ground state is expressed as the action of an exponential cluster operator, i.e., the operator exponential of a linear combination of linear maps (so-called excitation operators), acting on a judiciously chosen reference function (usually a so-called discrete Hartree–Fock determinant). Using this ansatz the eigenvalue problem for the ground state energy of the electronic Hamiltonian can be reformulated as a non-linear system of equations for the unknown coefficients appearing in the linear combination of excitation operators entering the operator exponential. Approximations to the ground state energy are then obtained by restricting the class of excitation operators that appear inside the exponential, which leads to a hierarchy of computationally more tractable non-linear, root-finding problems. Usually these truncations are done on the basis of the excitation orders (see Sect. 2.3 below) and one thus speaks of CCD (double excitation operators only), CCSD (single and double excitation operators), CCSDT (single, double and triple excitation operators) and so on. We emphasise that these methods are not variational since the sought-after restricted coupled cluster wave-function is no longer the critical point of an energy functional.

Coupled cluster methods were originally introduced in the field of nuclear physics in the late 1950s by Coester and Kümmel [10, 11] but were reformulated for use in quantum chemistry in the following decades by pioneers such as Čížek [8], Paldus et al. [36], and Sinanoğlu [47]. The original motivation for introducing such methods was the fact that they were size consistent: the approximate coupled cluster ground state energy of a molecular system composed of two independent sub-systems can be shown to be the sum of the individual approximate coupled cluster energies of the two sub-systems. Since size consistency is a vital chemical property that is not conserved by some other numerical methods, and since the CC methods work extremely well in practice, achieving in many cases the chemical accuracy of 1 kcal/mol, they quickly found wide adoption in the quantum chemical community [32]. In particular, the so-called CCSD(T) variant, which can be applied to small and medium-sized molecules at a reasonable computational cost, is widely regarded as the ‘gold standard’ of quantum chemistry [40].

Despite the ubiquitous use of this ‘gold standard’ computational method in the quantum chemical community, there is a shockingly limited amount of mathematical literature on the numerical analysis of the coupled cluster methodology. Indeed, a simple search with the keyword “coupled cluster” on Google Scholar and MathSciNet, two databases that are representative of the scientific literature as a whole and of its mathematical subset respectively, reveals that there are more than 100,000 articles pertaining to coupled cluster theory, of which fewer than 40 are listed on MathSciNet. Limiting ourselves to the subset of numerical analysis journals, there are a total of seven articles on coupled cluster methods.

The first systematic study of the single reference coupled cluster method from a numerical analysis perspective was undertaken by Reinhold Schneider and Thorsten Rohwedder slightly more than 10 years ago. In a series of three remarkable papers [42, 43, 46], they were able to show that the excitation operators that appear inside the coupled cluster exponential are bounded linear maps between Hilbert spaces of antisymmetric functions with appropriate regularity and that consequently, the continuous (infinite-dimensional) coupled cluster equations could be given a precise functional-analytic meaning. The coupled cluster approximations (built by restricting the class of excitation operators that enter the exponential operator) could thus be viewed as classical Galerkin discretisations of an infinite-dimensional non-linear problem. Schneider also showed that under some assumptions, the underlying non-linear coupled cluster function is locally, strongly monotone, which could be exploited to prove the local well-posedness of both the continuous CC equations and their Galerkin discretisations. Schneider and Rohwedder also derived optimal error estimates for the coupled cluster energies using the dual-weighted residual approach of Rolf Rannacher and co-workers [50]. Since this pioneering work, two further contributions have been published which provide a similar numerical analysis for two other flavours of coupled cluster methods, namely, the extended coupled cluster method [31] and the tailored coupled cluster method [17] (see also [30]). In addition to the aforementioned contributions which tackle the coupled cluster equations from a functional analysis perspective, there has been recent interest in analysing the CC equations using tools from other fields. Thus, the contributions [12, 13] use concepts from graph theory to present a unified framework for constructing different variants of coupled cluster methods and topological index theory to study the solutions of the coupled cluster equations in finite dimensions. Recently, an additional contribution has appeared which investigates the root structure of the CC equations using tools from algebraic geometry [18].

While the articles [17, 31, 42, 43, 46] listed above lay the groundwork for a rigorous a priori error analysis of the coupled cluster methods, they have one rather unfortunate drawback: in all cases, the well-posedness of the CC equations is established by demonstrating that the underlying CC function is locally strongly monotone, and this demonstration can only be shown to hold if the targeted root \({\varvec{t}}^*\) of the CC function is sufficiently close to zero. In other words, the local well-posedness analysis and the resulting error estimates only hold in a perturbative regime \({\varvec{t}}^* \approx 0\). On the other hand, as we discuss in more detail in Remark 25 in Sect. 4, in many practical situations where the CC method is known numerically to yield accurate approximations, the sought-after root \({\varvec{t}}^*\) is not in the perturbative regime. For such problems, the existing a priori analysis yields estimates with negative constants! The a priori analysis having failed, there is also no hope of developing a posteriori error estimates for practical coupled cluster simulations which, in our opinion, would be the ultimate goal of the numerical analysis.

The aim of the current contribution is to develop a new a priori error analysis for the single reference coupled cluster equations that is valid under more general conditions. The analysis we present here—motivated by the existing literature on non-linear numerical analysis (see, for instance, [5, 53])—is based on the invertibility of the Fréchet derivative of the non-linear coupled cluster function, which is established using a classical inf-sup-type approach. In contrast to the local, strong monotonicity approach pioneered by Schneider, our analysis does not require the sought-after root \({\varvec{t}}^*\) of the coupled cluster function to be close to zero. In this article, we will focus on the continuous (infinite-dimensional) CC equations and a specific version of the discrete CC equations, namely, the Full-CC equations in a finite basis (see Sect. 5). The extension of our analysis to more general discretisations (the so-called truncated CC equations [24, Chapter 13]) will be addressed in a forthcoming contribution.

The remainder of this article is organised as follows. In Sect. 2, we introduce more rigorously the problem formulation, i.e., the electronic Hamiltonian and the Hilbert spaces on which it acts. In Sect. 3, we introduce excitation operators and the coupled cluster ansatz, and we state the continuous and discrete coupled cluster equations. We begin our analysis in Sect. 4 where we prove, under the minimal assumptions that the sought-after eigenfunction is intermediately normalisable and the associated eigenvalue is non-degenerate, that the continuous (infinite-dimensional) CC equations are always locally well-posed. In Sect. 5, we analyse a specific discretisation of the CC equations, namely, the Full-CC equations in a finite basis. We prove under the same minimal assumptions of eigenpair non-degeneracy and CC ansatz validity that these equations are locally well-posed provided that the discretisation is fine enough, and we derive residual-based error estimates with guaranteed positive constants. Preliminary numerical experiments indicate that the constants that appear in our estimates are a significant improvement over those obtained from the local monotonicity approach.

2 Problem formulation and setting

Computational quantum chemistry is the study of the properties of matter through modelling at the molecular scale, i.e., when matter is viewed as a collection of positively charged nuclei and negatively charged electrons. To formalise the problem setting, we assume that we are given a molecule composed of \(M \in {\mathbb {N}}\) nuclei carrying charges \(\{Z_{\alpha }\}_{\alpha =1}^M \subset {\mathbb {R}}_+\) and located at positions \(\{\textbf{x}_{\alpha }\}_{\alpha =1}^{M} \subset {\mathbb {R}}^3\), respectively. We further assume the presence of \(N\in {\mathbb {N}}\) electrons whose spatial coordinates are denoted by \(\{\textbf{x}_i\}_{i=1}^N \subset {\mathbb {R}}^3\). Throughout this article, we will assume that the Born–Oppenheimer approximation holds, i.e., we will treat the nuclei as fixed, classical particles and we will focus purely on the quantum mechanical description of the electrons.

In order to describe the behaviour of this system of nuclei and electrons under the Born–Oppenheimer approximation, we require the notion of several function spaces. The following construction is partially based on [41].

2.1 Function spaces and norms

To begin with, we denote by \(\textrm{L}^2({\mathbb {R}}^3)\) the space of real-valued square integrable functions of three variables, and we denote by \(\textrm{H}^1({\mathbb {R}}^3)\) the subspace of \(\textrm{L}^2({\mathbb {R}}^3)\) consisting of functions that additionally possess square integrable first derivatives. Both spaces are equipped with their usual inner products. Following the convention in the quantum chemical literature, we will frequently refer to \(\textrm{L}^2({\mathbb {R}}^3)\) and \(\textrm{H}^1({\mathbb {R}}^3)\) as infinite-dimensional single particle spaces.

Next, we define the tensor space

$$\begin{aligned} {\mathcal {L}}^2:= \bigotimes _{j=1}^N \textrm{L}^2({\mathbb {R}}^3), \end{aligned}$$

which is equipped with an inner product that is constructed by defining first for all elementary tensors \( {\mathcal {f}}, {\mathcal {g}} \in {\mathcal {L}}^2\) with \({\mathcal {f}}= \otimes _{j=1}^N {\mathcal {f}}_j\) and \({\mathcal {g}}= \otimes _{j=1}^N {\mathcal {g}}_j\)

$$\begin{aligned} \begin{aligned} \left( {\mathcal {f}}, {\mathcal {g}}\right) _{{\mathcal {L}}^2}:= \prod _{j=1}^N \left( {\mathcal {f}}_j, {\mathcal {g}}_j\right) _{\textrm{L}^2({\mathbb {R}}^3)}, \end{aligned} \end{aligned}$$
(1)

and then extending bilinearly for general tensorial elements of \({\mathcal {L}}^2\).
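As a small numerical illustration of Eq. (1), the following sketch replaces each single particle function by a short vector of sample values (a purely illustrative discretisation that is not part of the construction above) and checks that the inner product of two elementary tensors equals the product of the single particle inner products. The names `n`, `f`, `g`, and `elementary_tensor` are hypothetical.

```python
import itertools
import math
import random

random.seed(0)
n, N = 4, 3  # n sample values per factor, N particles (illustrative sizes)

# Discrete stand-ins for the factors f_j and g_j of two elementary tensors.
f = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(N)]
g = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(N)]

def elementary_tensor(factors):
    """Map each multi-index (i_1, ..., i_N) to f_1[i_1] * ... * f_N[i_N]."""
    size = len(factors[0])
    return {
        idx: math.prod(fac[i] for fac, i in zip(factors, idx))
        for idx in itertools.product(range(size), repeat=len(factors))
    }

F, G = elementary_tensor(f), elementary_tensor(g)

# Left-hand side of Eq. (1): inner product of the full tensors.
lhs = sum(F[idx] * G[idx] for idx in F)
# Right-hand side of Eq. (1): product of the single particle inner products.
rhs = math.prod(sum(a * b for a, b in zip(fj, gj)) for fj, gj in zip(f, g))

print(lhs, rhs)  # the two numbers agree up to rounding
```

The tensor has \(n^N\) entries, which already hints at the exponential growth in storage that makes a direct discretisation of \({\mathcal {L}}^2\) impractical.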

It is a consequence of Fubini’s theorem that the tensor space \({\mathcal {L}}^2\) is isometrically isomorphic to the space \(\textrm{L}^2({\mathbb {R}}^{3N})\) of real-valued square integrable functions of 3N variables with the associated \(\textrm{L}^2\)-inner product. Thanks to this result, we can define the tensor space \({\mathcal {H}}^1 \subset {\mathcal {L}}^2 \) as the closure of \({\mathscr {C}}_0^{\infty }({\mathbb {R}}^{3N})\) in \(\textrm{L}^2({\mathbb {R}}^{3N})\) with respect to the usual gradient-gradient inner product on \({\mathbb {R}}^{3N}\).

In quantum mechanics, a fundamental distinction is made between so-called bosonic and fermionic particles, the latter obeying the so-called Pauli-exclusion principle and thus being described in terms of antisymmetric functions. We are therefore obligated to also define tensor spaces of antisymmetric functions. To this end, we first introduce the so-called antisymmetric projection operator \({\mathbb {P}}^{\textrm{as}} :{\mathcal {L}}^2\rightarrow {\mathcal {L}}^2\) that is defined through the action

$$\begin{aligned} \forall {\mathcal {f}} \in {\mathcal {L}}^2:\quad ({\mathbb {P}}^\textrm{as} {\mathcal {f}})(\textbf{x}_1, \ldots , \textbf{x}_N) :=\frac{1}{\sqrt{N!}} \sum _{\pi \in {\textrm{S}}(N)} (-1)^{\mathrm{sgn(\pi )}} {\mathcal {f}}(\textbf{x}_{\pi (1)}, \ldots ,\textbf{x}_{\pi (N)} ), \end{aligned}$$

where \({\textrm{S}}(N)\) denotes the symmetric group of all permutations of \(\{1, \ldots , N\}\), and \(\mathrm{sgn(\pi )} \) denotes the signature of \(\pi ~\in ~{\textrm{S}}(N)\).

It is easy to establish that \({\mathbb {P}}^{\textrm{as}}\) is an \({\mathcal {L}}^2\)-orthogonal projection with a closed range. We therefore define the antisymmetric tensor spaces \(\widehat{{\mathcal {L}}}^{\, 2} \subset {\mathcal {L}}^2\) and \( \widehat{{\mathcal {H}}}^1 \subset {\mathcal {H}}^1\) as

$$\begin{aligned} \widehat{{\mathcal {L}}}^{\, 2}:= \bigwedge _{j=1}^N \textrm{L}^2({\mathbb {R}}^3) := \text { ran}\; {\mathbb {P}}^{\textrm{as}} \quad \text {and} \quad \widehat{{\mathcal {H}}}^1 := \widehat{{\mathcal {L}}}^{\, 2} \cap {\mathcal {H}}^1, \end{aligned}$$

equipped with the \((\cdot , \cdot )_{{\mathcal {L}}^2}\) and \((\cdot , \cdot )_{{\mathcal {H}}^1}\) inner products respectively. We remark that normalised elements of \(\widehat{{\mathcal {L}}}^{\, 2}\) are known as wave-functions, and these are antisymmetric in the sense that for any \({\mathcal {f}} \in \widehat{{\mathcal {L}}}^{\, 2} \) we have that

$$\begin{aligned}&{\mathcal {f}}(\textbf{x}_1, \ldots , \textbf{x}_i, \ldots , \textbf{x}_j, \ldots \textbf{x}_N)\\&\quad = - {\mathcal {f}}(\textbf{x}_1, \ldots , \textbf{x}_j, \ldots , \textbf{x}_i, \ldots \textbf{x}_N) \qquad \forall ~ i, j \in \{1, \ldots , N\} \text { with } i\ne j. \end{aligned}$$

In the sequel, we will also occasionally make use of the dual space of \(\widehat{{\mathcal {H}}}^1\). We therefore denote \(\widehat{{\mathcal {H}}}^{-1}:= \big (\widehat{{\mathcal {H}}}^1\big )^*\), we equip \(\widehat{{\mathcal {H}}}^{-1} \) with the canonical dual norm, and we write \(\langle \cdot , \cdot \rangle _{\widehat{{\mathcal {H}}}^1, \widehat{{\mathcal {H}}}^{-1}} \) for the associated duality pairing. Note that higher regularity Sobolev spaces \(\widehat{{\mathcal {H}}}^s, ~s\ge 1\) can be defined similarly to \(\widehat{{\mathcal {H}}}^1\).

Finally, let us comment on the construction of basis sets for the tensor spaces \({\mathcal {H}}^1\) and \(\widehat{{\mathcal {H}}}^1\). Given an \(\textrm{L}^2\)-orthonormal, complete basis \({\mathcal {B}}:=\{\phi _k\}_{k \in {\mathbb {N}}} \subset \textrm{H}^1({\mathbb {R}}^3)\), we can construct a complete basis \({\mathcal {B}}_{\otimes }\) for \({\mathcal {H}}^1\) by setting

$$\begin{aligned} {\mathcal {B}}_{\otimes }= \Big \{\phi _{k_1} \otimes \phi _{k_2} \otimes \cdots \otimes \phi _{k_N} :~k_1, k_2, \ldots , k_N \in {\mathbb {N}}\Big \}, \end{aligned}$$

and it follows immediately that \({\mathcal {B}}_{\otimes }\) is \({\mathcal {L}}^2\)-orthonormal.

In order to construct a basis for the antisymmetric tensor space \(\widehat{{\mathcal {H}}}^1\), we must first define a suitable subset of \({\mathcal {B}}_{\otimes }\). To this end, we introduce an index set \({\mathcal {J}}_{\infty }^N \subset {\mathbb {N}}^N\) given by

$$\begin{aligned} {\mathcal {J}}_{\infty }^N := \Big \{\alpha = (\alpha _1, \alpha _2, \ldots , \alpha _N)\in {\mathbb {N}}^N :\alpha _1< \alpha _2< \cdots < \alpha _N\Big \}. \end{aligned}$$

We can thus define the subset \({\mathcal {B}}_{\otimes }^{\textrm{ord}} \) of the basis \({\mathcal {B}}_{\otimes }\) given by

$$\begin{aligned} {\mathcal {B}}_{\otimes }^{\textrm{ord}}:= \Big \{\Phi _{\textbf{k}}:= \phi _{k_1} \otimes \phi _{k_2} \otimes \cdots \otimes \phi _{k_N} :\textbf{k}= (k_1, \ldots , k_N) \in {\mathcal {J}}_{\infty }^N \Big \}. \end{aligned}$$

A complete basis for the antisymmetric tensor space \(\widehat{{\mathcal {H}}}^1\) is then given by

$$\begin{aligned} {\mathcal {B}}_{\wedge }:=&\{ {\mathbb {P}}^{\textrm{as}} \Phi :\Phi \in {\mathcal {B}}_{\otimes }^{\textrm{ord}}\}\\ =&\left\{ \Phi _{\textbf{k}}(\textbf{x}_1, \textbf{x}_2, \ldots , \textbf{x}_N)=\frac{1}{\sqrt{N!}} \sum _{\pi \in {\textrm{S}}(N)} (-1)^\mathrm{sgn(\pi )} \otimes _{i=1}^N \phi _{k_i}\big (\textbf{x}_{\pi (i)}\big ) :\hspace{1mm} \textbf{k}=(k_1, k_2, \ldots , k_N) \in {\mathcal {J}}_{\infty }^N\right\} . \end{aligned}$$

Elements of the basis set \({\mathcal {B}}_{\wedge }\) are called Slater determinants. For simplicity, given \(\textbf{k} \in {\mathcal {J}}_{\infty }^N\) and \(\Phi _{\textbf{k}} \in {\mathcal {B}}_{\wedge }\) of the form

$$\begin{aligned} \Phi _{\textbf{k}}(\textbf{x}_1, \textbf{x}_2, \ldots , \textbf{x}_N)=\frac{1}{\sqrt{N!}} \sum _{\pi \in {\textrm{S}}(N)} (-1)^\mathrm{sgn(\pi )} \otimes _{i=1}^N \phi _{k_i}\big (\textbf{x}_{\pi (i)}\big ), \end{aligned}$$

we will write \(\Phi _{\textbf{k}}\) in the succinct form

$$\begin{aligned} \Phi _{\textbf{k}}(\textbf{x}_1, \textbf{x}_2, \ldots , \textbf{x}_N)= \frac{1}{\sqrt{N!}}\text { det} \big (\phi _{k_i}(\textbf{x}_j)\big )_{i, j=1}^N. \end{aligned}$$
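The equivalence between the permutation sum and the determinant form of a Slater determinant can be checked numerically. The sketch below is only an illustration: the three orbitals are hypothetical, unnormalised Gaussian-type functions chosen for convenience (they are not \(\textrm{L}^2\)-orthonormal), and the sign of a permutation is taken to be \(\pm 1\) according to its parity. The code also verifies the antisymmetry under an exchange of two particle coordinates.

```python
import itertools
import math

def gaussian(x):
    """exp(-|x|^2) for a point x in R^3."""
    return math.exp(-sum(c * c for c in x))

# Hypothetical single particle functions, used purely for illustration.
orbitals = [
    lambda x: gaussian(x),
    lambda x: x[0] * gaussian(x),
    lambda x: x[1] * gaussian(x),
]

def sign(perm):
    """Parity sign (+1 or -1) of a permutation given as a tuple of indices."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def slater_by_permutations(orbs, xs):
    """(1/sqrt(N!)) * sum over pi of sign(pi) * prod_i phi_{k_i}(x_{pi(i)})."""
    N = len(orbs)
    total = sum(
        sign(p) * math.prod(orbs[i](xs[p[i]]) for i in range(N))
        for p in itertools.permutations(range(N))
    )
    return total / math.sqrt(math.factorial(N))

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )

def slater_by_determinant(orbs, xs):
    """(1/sqrt(N!)) * det( phi_{k_i}(x_j) )_{i,j}."""
    matrix = [[phi(x) for x in xs] for phi in orbs]
    return det(matrix) / math.sqrt(math.factorial(len(orbs)))

xs = [(0.1, -0.4, 0.3), (0.7, 0.2, -0.5), (-0.3, 0.6, 0.1)]
value = slater_by_permutations(orbitals, xs)
swapped = slater_by_permutations(orbitals, [xs[1], xs[0], xs[2]])
```

Here `value` coincides with `slater_by_determinant(orbitals, xs)` up to rounding, and `swapped` equals `-value`, in agreement with the antisymmetry property stated in Sect. 2.1.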

2.2 Governing operators and problem statement

Throughout this article, we assume that the electronic properties of the molecule that we study can be described by the action of a many-body electronic Hamiltonian given by

$$\begin{aligned} H := -\frac{1}{2} \sum _{j=1}^N \Delta _{\textbf{x}_j} + \sum _{j =1}^{N} \sum _{\alpha =1}^{M} \frac{-Z_{\alpha }}{\vert \textbf{x}_{\alpha }- \textbf{x}_j\vert } + \sum _{j =1}^{N} \sum _{i =1}^{j-1} \frac{1}{\vert \textbf{x}_i - \textbf{x}_j\vert }\qquad \text {acting on } \widehat{{\mathcal {L}}}^{\, 2} \quad \text {with domain } \widehat{{\mathcal {H}}}^2. \end{aligned}$$
(2)

The electronic properties of the molecule that we study are functions of the spectrum of the electronic Hamiltonian H, and we are therefore interested in its analysis and computation. It is a classical result (see, e.g., the review article [26]) that the operator H is self-adjoint on \(\widehat{{\mathcal {L}}}^{\, 2}\) with form domain \(\widehat{{\mathcal {H}}}^1\), and under the additional assumption that \(Z:= \sum _{\alpha =1}^M Z_{\alpha } \ge N\), it holds that

  1. The operator H has an essential spectrum \(\sigma _\textrm{ess}\) of the form \(\sigma _{\textrm{ess}}:= [\Sigma , \infty )\) where \(-\infty <\Sigma \le 0\);

  2. The operator H has a bounded-below discrete spectrum that consists of a countably infinite number of eigenvalues, each with finite multiplicity, accumulating at \(\Sigma \).

Consequently, under the assumption that \(\sum _{\alpha =1}^M Z_{\alpha } \ge N\), the electronic Hamiltonian H possesses a lowest eigenvalue \({\mathcal {E}}^*_{\textrm{GS}} \in {\mathbb {R}}\), frequently called the ground state energy, such that

$$\begin{aligned} {\mathcal {E}}_{\textrm{GS}}^*= \min _{0\ne \Psi \in \widehat{{\mathcal {H}}}^1} \frac{\big \langle \Psi ,H \Psi \big \rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}}{\Vert \Psi \Vert ^2_{{\mathcal {L}}^2}}. \end{aligned}$$
(3a)

Any function \(\Psi ^*_{\textrm{GS}} \in \widehat{{\mathcal {H}}}^1\) that achieves the minimum in Eq. (3a) is called a ground state of H and obviously satisfies

$$\begin{aligned} H\Psi _{\textrm{GS}}^* = {\mathcal {E}}^*_{\textrm{GS}} \Psi ^*_{\textrm{GS}}. \end{aligned}$$
(3b)

For the purpose of this article, we will assume that indeed \(Z=\sum _{\alpha =1}^M Z_{\alpha } \ge N\). Note that if the ground state eigenvalue \({\mathcal {E}}^*_{\textrm{GS}}\) is simple (which is not always the case), normalised ground states \(\Psi _{\textrm{GS}}^*\) (being elements of a real Hilbert space) are unique up to sign.

From a functional analysis point of view, the electronic Hamiltonian H possesses certain desirable properties, namely continuity and ellipticity on appropriate Sobolev spaces. More precisely (see, for instance, [54, Chapter 4]),

  • The electronic Hamiltonian defined through Eq. (2) is bounded as a mapping from \(\widehat{{\mathcal {H}}}^1\) to \(\widehat{{\mathcal {H}}}^{-1}\):

    $$\begin{aligned} \forall \Phi , \Psi \in \widehat{{\mathcal {H}}}^1 :\qquad \left| \left\langle \Phi , H\Psi \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}} \right| \le \left( \frac{1}{2}+ 3\sqrt{N}Z\right) \Vert \Phi \Vert _{\widehat{{\mathcal {H}}}^1}\Vert \Psi \Vert _{\widehat{{\mathcal {H}}}^1}; \end{aligned}$$
    (4)
  • The electronic Hamiltonian defined through Eq. (2) satisfies the following ellipticity condition on the Gelfand triple \(\widehat{{\mathcal {H}}}^1 \hookrightarrow \widehat{{\mathcal {L}}}^{\, 2} \hookrightarrow \widehat{{\mathcal {H}}}^{-1}\):

    $$\begin{aligned} \forall \Phi \in \widehat{{\mathcal {H}}}^1 :\qquad \left\langle \Phi , H\Phi \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}} \ge \frac{1}{4}\Vert \Phi \Vert _{\widehat{{\mathcal {H}}}^1}^2 - \left( 9NZ^2 -\frac{1}{4}\right) \Vert \Phi \Vert ^2_{\widehat{{\mathcal {L}}}^{\, 2}}. \end{aligned}$$
    (5)

An important consequence of the above ellipticity estimate is that the electronic Hamiltonian, modified by any suitable shift, defines an invertible operator on a subspace of \(\widehat{{\mathcal {H}}}^1\). This fact will be of great importance in our analysis and will be the subject of further discussion in Sect. 4 (see Remark 34).
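To give a feeling for the size of the constants in estimates (4) and (5), the following sketch evaluates them for a given molecule. Rearranging (5) shows that the shifted operator \(H + \mu \) with \(\mu = 9NZ^2 - \frac{1}{4}\) satisfies \(\langle \Phi , (H+\mu )\Phi \rangle \ge \frac{1}{4}\Vert \Phi \Vert ^2_{\widehat{{\mathcal {H}}}^1}\). The function names and the choice of the water molecule (nuclear charges 8, 1, 1 and 10 electrons) are illustrative assumptions and do not appear in the analysis above.

```python
import math

def continuity_constant(N, Z):
    """Constant 1/2 + 3*sqrt(N)*Z appearing in the boundedness estimate (4)."""
    return 0.5 + 3.0 * math.sqrt(N) * Z

def coercive_shift(N, Z):
    """Shift mu = 9*N*Z**2 - 1/4 for which H + mu is coercive by estimate (5)."""
    return 9.0 * N * Z ** 2 - 0.25

# Illustrative example: a neutral water molecule.
charges = [8, 1, 1]   # nuclear charges Z_alpha
N = 10                # number of electrons
Z = sum(charges)      # total nuclear charge

# The spectral results of Sect. 2.2 are stated under the assumption Z >= N.
if Z < N:
    raise ValueError("the assumption Z >= N is violated")

cont = continuity_constant(N, Z)
mu = coercive_shift(N, Z)
print(f"continuity constant: {cont:.3f}, coercive shift: {mu:.2f}")
```

Both constants are far from sharp for a concrete molecule; they are worst-case bounds that depend only on N and Z.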

Remark 1

(Restriction to function spaces of real-valued, spin-independent functions) To avoid notational complexity, we have restricted our analysis in this article to real-valued function spaces and we have not taken into account spin variables. The restriction to real-valued functions does not result in any loss in generality since the governing operator that we consider, namely the electronic Hamiltonian H defined through Eq. (2), consists entirely of real terms, and thus the real and imaginary parts of any eigenfunction of H are themselves eigenfunctions of H (see, e.g., [6, Chapter 6.1] for a brief discussion of this point).

Our choice to neglect spin is motivated by the same observation, i.e., that the electronic Hamiltonian does not contain any explicit spin dependencies. Consequently, in order to take spin variables into account we need simply replace the single particle function spaces

$$\begin{aligned}&\textrm{L}^2({\mathbb {R}}^3) \text { and } \textrm{H}^1({\mathbb {R}}^3) \quad \text {with} \quad \textrm{L}^2\left( {\mathbb {R}}^3 \times \Big \{\pm \frac{1}{2}\Big \}\right) \text { and } \textrm{H}^1\left( {\mathbb {R}}^3 \times \Big \{\pm \frac{1}{2}\Big \}\right) \text { respectively}. \end{aligned}$$

Here, \(\textrm{L}^2\left( {\mathbb {R}}^3 \times \left\{ \pm \frac{1}{2}\right\} \right) \) can be seen as the space of (equivalence classes of) square-integrable functions of three spatial variables and an additional spin variable \(s=\pm \frac{1}{2}\), equipped with the inner product

$$\begin{aligned} \forall f,g \in \textrm{L}^2\left( {\mathbb {R}}^3 \times \Big \{\pm \frac{1}{2}\Big \}\right) :\quad \left( f, g \right) _{\textrm{L}^2\left( {\mathbb {R}}^3 \times \left\{ \pm \frac{1}{2}\right\} \right) }= \sum _{s= \pm \frac{1}{2}} \int _{{\mathbb {R}}^3} f(\textbf{x}, s)g(\textbf{x}, s)\; d\textbf{x}, \end{aligned}$$

and an analogous interpretation holds for \(\textrm{H}^1\left( {\mathbb {R}}^3 \times \left\{ \pm \frac{1}{2}\right\} \right) \).

Equipped with the spin-dependent single particle spaces \(\textrm{L}^2\left( {\mathbb {R}}^3 \times \left\{ \pm \frac{1}{2}\right\} \right) \) and \(\textrm{H}^1\left( {\mathbb {R}}^3 \times \left\{ \pm \frac{1}{2}\right\} \right) \), the spin-dependent tensorial N-particle function spaces and basis sets can be constructed by following, mutatis mutandis, the procedure described in Sect. 2.1 above. Our subsequent analysis can then be readily applied to such spin-dependent function spaces without any significant modifications.

One additional feature of the spin-dependent formalism deserves mention. The analysis that we present in this contribution frequently requires assumptions on the simplicity of certain eigenvalues of the electronic Hamiltonian H. Unfortunately, considering H as an operator on the full spin-dependent tensorial space \(\widehat{{\mathcal {L}}}^{\, 2}_{\textrm{s}}= \wedge _{j=1}^N \textrm{L}^2\left( {\mathbb {R}}^3 \times \left\{ \pm \frac{1}{2}\right\} \right) \) often introduces degeneracies in these eigenvalues, and in order to remove these degeneracies, it is necessary to restrict the functional setting to a suitable subspace of \(\widehat{{\mathcal {L}}}^{\, 2}_{\textrm{s}}\), typically an eigenspace of the so-called z-spin operator (see [41] for a detailed construction).

2.3 Computing the ground state energy in a finite-dimensional subspace

From a practical point of view, the ground state energy of the electronic Hamiltonian defined through Eq. (2) can only be approximated in a finite-dimensional subspace. The conceptually simplest such approach (albeit tremendously computationally expensive and therefore not widely used) is known in the quantum chemical literature as Full Configuration Interaction. In this subsection, we introduce the terminology and briefly discuss the methodology of the full configuration interaction procedure since the underlying notions will be useful when we discuss the discrete coupled cluster equations in Sect. 5.

At its core, the full configuration interaction method (Full-CI) is based on a straightforward Galerkin approximation of the minimisation problem (3a). We will therefore begin by defining an approximation space. To do so, we fix some \(K \in {\mathbb {N}}\) with \(K > N\) and assume that we are given a set \(\{\phi _j\}_{j=1}^K \subset \textrm{H}^1({\mathbb {R}}^3)\) of \(\textrm{L}^2({\mathbb {R}}^3)\)-orthonormal functions. We also introduce an index set \({\mathcal {J}}_{K}^N \subset \{1, \ldots , K\}^N\) given by

$$\begin{aligned} {\mathcal {J}}_{K}^N := \Big \{\alpha = (\alpha _1, \alpha _2, \ldots , \alpha _N)\in \{1, \ldots , K\}^N :\alpha _1< \alpha _2< \cdots < \alpha _N\Big \}. \end{aligned}$$

Definition 2

(Finite Dimensional Single-Particle Basis)

We define the K-dimensional single particle basis \( {\mathcal {B}}_K \subset \textrm{H}^1({\mathbb {R}}^3)\) as \({\mathcal {B}}_K:= \{\phi _j\}_{j=1}^K \). Additionally, we define the subspace spanned by this basis set as \(\textrm{X}_K:= \text { span}\; {\mathcal {B}}_K\) and we refer to \(\textrm{X}_K\) as the single particle approximation space.

Definition 3

(Finite Dimensional N-Particle Basis)

We define the \({\mathcal {L}}^2\)-orthonormal, \(\binom{K}{N}\)-dimensional N-particle basis \( {\mathcal {B}}_K^{N} \subset \widehat{{\mathcal {H}}}^1\) as

$$\begin{aligned} {\mathcal {B}}^{N}_K := \left\{ \Phi _{\textbf{k}}(\textbf{x}_1, \textbf{x}_2, \ldots , \textbf{x}_N)=\frac{1}{\sqrt{N!}}\text { det} \big (\phi _{k_i}(\textbf{x}_j)\big )_{i, j=1}^N :\hspace{1mm} \textbf{k}=(k_1, k_2, \ldots , k_N) \in {\mathcal {J}}_{K}^N \right\} . \end{aligned}$$

Additionally, we define the subspace spanned by this basis set as \({\mathcal {V}}_K:= \text { span}\; {\mathcal {B}}^N_K\) and we refer to \({\mathcal {V}}_K\) as the N-particle approximation space.
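The index set \({\mathcal {J}}_{K}^N\) and the dimension count in Definition 3 can be made concrete with a few lines of code. The sketch below enumerates the strictly increasing multi-indices and confirms that their number is the binomial coefficient \(\binom{K}{N}\). The function name `ordered_multi_indices` and the choice \(K = 2N\) in the growth table are illustrative assumptions.

```python
import itertools
import math

def ordered_multi_indices(K, N):
    """Return J_K^N: all (a_1, ..., a_N) in {1, ..., K}^N with a_1 < ... < a_N."""
    return list(itertools.combinations(range(1, K + 1), N))

# Small example: K = 5 single particle functions, N = 3 electrons.
J = ordered_multi_indices(5, 3)
print(len(J), math.comb(5, 3))  # both counts coincide
print(J[:3])

# Dimension of the N-particle approximation space for the (hypothetical)
# choice K = 2N, computed without enumerating the index set.
dims = {N: math.comb(2 * N, N) for N in (2, 4, 8, 16)}
for N, dim in dims.items():
    print(f"N = {N:2d}, K = {2 * N:2d}: dim V_K = {dim}")
```

Each multi-index in `J` labels exactly one Slater determinant in \({\mathcal {B}}^N_K\), so the length of `J` is the dimension of \({\mathcal {V}}_K\).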

Full Configuration Interaction Approximation of Minimisation Problem (3a)

Let the N-particle approximation space \({\mathcal {V}}_K\) be defined through Definition 3. We seek the pair(s) \(({\mathcal {E}}^*_{\textrm{FCI}}, \Psi ^*_{\textrm{FCI}}) \in {\mathbb {R}} \times {\mathcal {V}}_K\) with \(\Vert \Psi _\textrm{FCI}^*\Vert _{{\mathcal {L}}^2}^2=1\) that satisfy

$$\begin{aligned} {\mathcal {E}}_{\textrm{FCI}}^*:= & {} \min _{0\ne \Psi \in {\mathcal {V}}_K} \frac{\left\langle \Psi , H\Psi \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}}{\Vert \Psi \Vert ^2_{{\mathcal {L}}^2}} \quad \text { and} \quad \left\langle \Psi , H\Psi ^*_{\textrm{FCI}}\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}\nonumber \\= & {} {\mathcal {E}}_{\textrm{FCI}}^*\left( \Psi ,\Psi ^*_{\textrm{FCI}}\right) _{\widehat{{\mathcal {L}}}^{\, 2}} \hspace{2mm}~\forall \Psi \in {\mathcal {V}}_K. \end{aligned}$$
(6)

Several remarks are now in order.

First, it follows from the variational principle that the minimum in Eq. (6) satisfies \({\mathcal {E}}_{\textrm{FCI}}^* \ge {\mathcal {E}}^*_{\textrm{GS}}\).

Second, in practice the Full-CI minimisation problem (6) is very often solved by writing first the associated Euler–Lagrange equations, i.e., the first order optimality conditions. This yields a linear eigenvalue problem on the finite-dimensional space \({\mathcal {V}}_K\) which can, in principle, be solved through the use of some iterative eigenvalue solver.

Third, although the Full-CI methodology (6) seems very amenable to numerical analysis, being a Galerkin approximation to the exact minimisation problem (3a), it has a fundamental computational drawback: the dimension of the N-particle approximation space \({\mathcal {V}}_K\) grows combinatorially in N, which renders this approach computationally intractable for N even moderately large. As a consequence, we are very often forced to introduce further approximations to the Full-CI methodology.
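The combinatorial growth mentioned in the third remark can be illustrated with a short Python sketch. The helper name `fci_dimension` and the choice K = 2N are assumptions made purely for illustration.

```python
from math import comb


def fci_dimension(K, N):
    """Dimension of the N-particle approximation space for K basis functions."""
    return comb(K, N)


# Hypothetical choice: twice as many single-particle functions as electrons.
for N in (2, 4, 8, 16, 32):
    print(N, fci_dimension(2 * N, N))
```

Under this assumption, the dimension already exceeds \(10^{18}\) for N = 32, which is far beyond what any direct eigenvalue solver could handle.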

We end this section by defining the Full-CI Hamiltonian which will be referenced in Sect. 5 below.

Definition 4

(Full-CI Hamiltonian)

Let the N-particle approximation space \({\mathcal {V}}_K\) be defined through Definition 3 and let the electronic Hamiltonian \(H:\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^{-1}\) be defined through Eq. (2). We define the Full-CI Hamiltonian \(H_{K} :{\mathcal {V}}_K \rightarrow {\mathcal {V}}_K^*\) as the mapping with the property that for all \(\Psi _K, \Phi _K \in {\mathcal {V}}_K\) it holds that

$$\begin{aligned} \langle \Psi _K, H_K \Phi _K\rangle _{{\mathcal {V}}_K \times {\mathcal {V}}_K^*}:= \langle \Psi _K, H \Phi _K \rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}. \end{aligned}$$
(7)

3 Excitation operators and the coupled cluster ansatz

Throughout this section, we assume the setting of Sect. 2. Our goal now is to introduce the notions of excitation indices, excitation operators and the coupled cluster non-linear parameterisation ansatz.

Let us begin by recalling that we have introduced complete single-particle and N-particle basis sets \({\mathcal {B}}= \{\phi _j\}_{j \in {\mathbb {N}}} \subset \textrm{H}^1({\mathbb {R}}^3)\) and \({\mathcal {B}}_{\wedge } \subset \widehat{{\mathcal {H}}}^1\) respectively in Sect. 2.1. Next, we define a collection of index sets.

Definition 5

(Excitation Index Sets)

For each \(j \in \{1, \ldots , N\}\) we define the index set \({\mathcal {I}}_j\) as

$$\begin{aligned} {\mathcal {I}}_j := \left\{ {{a_1, \ldots , a_j}\atopwithdelims (){\ell _1, \ldots , \ell _j}} :\ell _1< \cdots< \ell _j \in \{1, \ldots , N\} \text { and } a_1< \cdots < a_j \in \{N+1, N+2, \ldots \} \right\} , \end{aligned}$$

and we say that \({\mathcal {I}}_j\) is the excitation index set of order j. Additionally, we define

$$\begin{aligned} {\mathcal {I}}:= \bigcup _{j=1}^N {\mathcal {I}}_j, \end{aligned}$$

and we say that \({\mathcal {I}}\) is the global excitation index set.

The excitation index sets \(\{{\mathcal {I}}_j\}_{j=1}^N\) will be used to construct the so-called excitation and de-excitation operators which play a central role in post-Hartree Fock wave-function methods for further approximating the minimisation problem (6).

Definition 6

(Excitation Operators)

Let \(j \in \{1, \ldots , N\}\) and let \(\mu \in {\mathcal {I}}_j\) be of the form

$$\begin{aligned} \mu {=} {{a_1, \ldots , a_j}\atopwithdelims (){\ell _1, \ldots , \ell _j}} :\ell _1 {<} \cdots {<} \ell _j \in \{1, \ldots , N\} \text { and } a_1 {<} \cdots {<} a_j \in \{N+1, N+2, \ldots \}. \end{aligned}$$

We define the excitation operator \({\mathcal {X}}_{\mu } :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1\) through its action on the N-particle basis set \({\mathcal {B}}_{\wedge }\): For \(\Psi _{\nu }(\textbf{x}_1, \ldots , \textbf{x}_N) = \frac{1}{\sqrt{N!}} \text { det}\; \big (\phi _{\nu _j}(\textbf{x}_i)\big )_{ i, j=1}^N\), we set

$$\begin{aligned} {\mathcal {X}}_{\mu } \Psi _{\nu } = {\left\{ \begin{array}{ll} 0 &{}\quad \text { if } \{\ell _1, \ldots , \ell _j\} \not \subset \{\nu _1, \ldots , \nu _N\},\\ 0 &{} \quad \text { if } \exists a_{m} \in \{a_1, \ldots , a_j\} \text { such that } a_m \in \{\nu _1, \ldots , \nu _N\},\\ \Psi _{\nu }^{a} \in {\mathcal {B}}_{\wedge } &{} \quad \text { otherwise}, \end{array}\right. } \end{aligned}$$

where the determinant \(\Psi _{\nu }^{a} \) is constructed from \(\Psi _{\nu }\) by replacing the functions \(\phi _{\ell _1}, \ldots , \phi _{\ell _j} \) used to construct \(\Psi _{\nu }\) with the functions \(\phi _{a_1}, \ldots , \phi _{a_j}\) respectively.

Definition 7

(De-excitation Operators)

Let \(j \in \{1, \ldots , N\}\) and let \(\mu \in {\mathcal {I}}_j\) be of the form

$$\begin{aligned} \mu = {{a_1, \ldots , a_j}\atopwithdelims (){\ell _1, \ldots , \ell _j}} :\ell _1< \cdots< \ell _j \in \{1, \ldots , N\} \text { and } a_1< \cdots < a_j \in \{N+1, N+2, \ldots \}. \end{aligned}$$

We define the de-excitation operator \({\mathcal {X}}^{\dagger }_{\mu } :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1\) through its action on the N-particle basis set \({\mathcal {B}}_{\wedge }\): For \(\Psi _{\nu }(\textbf{x}_1, \ldots , \textbf{x}_N) = \frac{1}{\sqrt{N!}} \text { det}\; \big (\phi _{\nu _j}(\textbf{x}_i)\big )_{ i, j=1}^N\), we set

$$\begin{aligned} {\mathcal {X}}_{\mu }^{\dagger } \Psi _{\nu } = {\left\{ \begin{array}{ll} 0 &{}\quad \text { if } \{a_1, \ldots , a_j\} \not \subset \{\nu _1, \ldots , \nu _N\},\\ 0 &{} \quad \text { if } \exists ~\ell _{m} \in \{\ell _1, \ldots , \ell _j\} \text { such that } \ell _m \in \{\nu _1, \ldots , \nu _N\},\\ \Psi _{\nu , \ell } \in {\mathcal {B}}_{\wedge } &{} \quad \text { otherwise}, \end{array}\right. } \end{aligned}$$

where the determinant \(\Psi _{\nu , \ell }\) is constructed from \(\Psi _{\nu }\) by replacing the functions \(\phi _{a_1}, \ldots , \phi _{a_j} \) used to construct \(\Psi _{\nu }\) with the functions \(\phi _{\ell _1}, \ldots , \phi _{\ell _j}\) respectively.

It is natural to ask how de-excitation operators are related to excitation operators. The following remark summarises this relationship.

Remark 8

(Relationship between Excitation and De-excitation Operators) Consider the setting of Definitions 6 and 7. In some sense, each de-excitation operator reverses the action of the corresponding excitation operator. More precisely, it can be shown that for any \(\mu \in {\mathcal {I}}\), the de-excitation operator \({\mathcal {X}}^{\dagger }_{\mu } :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1\) is the \(\widehat{{\mathcal {L}}}^{\, 2}\)-adjoint of the excitation operator \({\mathcal {X}}_\mu :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1\), i.e.,

$$\begin{aligned} \forall \Phi , {\widetilde{\Phi }} \in \widehat{{\mathcal {H}}}^1, ~\forall \mu \in {\mathcal {I}} :\qquad \left( {\widetilde{\Phi }}, {\mathcal {X}}_{\mu } \Phi \ \right) _{\widehat{{\mathcal {L}}}^{\, 2}} = \left( {\mathcal {X}}_{\mu }^{\dagger }{\widetilde{\Phi }}, \Phi \right) _{\widehat{{\mathcal {L}}}^{\, 2}}. \end{aligned}$$
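The case distinctions in Definitions 6 and 7 and the adjoint relation above can be checked on a small example. The following Python sketch is only an occupation-level toy model: it assumes that a determinant is represented by the sorted tuple of its occupied orbital labels, it ignores any sign convention, and it truncates the basis to K = 4 orbitals with N = 2. The names `excite` and `deexcite` are hypothetical.

```python
from itertools import combinations


def excite(nu, holes, particles):
    """Occupation-level action of X_mu; None stands for the zero vector."""
    occ = set(nu)
    if not set(holes) <= occ:   # some phi_l is missing from the determinant
        return None
    if set(particles) & occ:    # some phi_a is already present
        return None
    return tuple(sorted((occ - set(holes)) | set(particles)))


def deexcite(nu, holes, particles):
    """Occupation-level action of X_mu^dagger; None stands for zero."""
    occ = set(nu)
    if not set(particles) <= occ:
        return None
    if set(holes) & occ:
        return None
    return tuple(sorted((occ - set(particles)) | set(holes)))


N, K = 2, 4  # hypothetical toy sizes
basis = list(combinations(range(1, K + 1), N))
mu = ((1,), (3,))  # holes (l_1,) and particles (a_1,)

# Matrix elements of X_mu and of X_mu^dagger agree after swapping arguments.
adjoint_ok = all(
    (excite(nu, *mu) == left) == (deexcite(left, *mu) == nu)
    for nu in basis
    for left in basis
)
print(excite((1, 2), *mu))
print(adjoint_ok)
```

In this model the matrix of the de-excitation operator is the transpose of the matrix of the excitation operator, which is the finite-dimensional counterpart of the adjoint relation.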

Several properties of the excitation operators can now be deduced. We begin with a remark.

Remark 9

(Interpretation of N-particle Basis in Terms of Excited Determinants) It is a simple exercise to show that the entire N-particle basis set \({\mathcal {B}}_{\wedge }\) can be generated through the action of the excitation operators on a so-called reference determinant. More precisely, we define \(\Psi _0(\textbf{x}_1, \ldots , \textbf{x}_N):=\frac{1}{\sqrt{N!}} \text { det}\; \big (\phi _{j}(\textbf{x}_i)\big )_{ i, j=1}^N\), and it then follows that

$$\begin{aligned} \begin{aligned} {\mathcal {B}}_{\wedge }&= \{\Psi _0 \} \cup \left\{ {\mathcal {X}}_\mu \Psi _0 :\mu \in {\mathcal {I}}_1\right\} \cup \left\{ {\mathcal {X}}_\mu \Psi _0 :\mu \in {\mathcal {I}}_2\right\} \cup \ldots \cup \left\{ {\mathcal {X}}_\mu \Psi _0 :\mu \in {\mathcal {I}}_N\right\} \\&=\{\Psi _0 \} \cup \left\{ {\mathcal {X}}_\mu \Psi _0 :\mu \in {\mathcal {I}}\right\} . \end{aligned} \end{aligned}$$
(8)
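A truncated analogue of Eq. (8) can be verified by direct enumeration. The Python sketch below assumes that the virtual orbitals are restricted to \(\{N+1, \ldots , K\}\) for a hypothetical K = 4 with N = 2, represents determinants by their occupied labels only, and ignores sign conventions. The function names are illustrative.

```python
from itertools import combinations


def excitation_indices(N, K):
    """All (holes, particles) pairs of order j = 1..N with virtuals <= K."""
    occupied = range(1, N + 1)
    virtual = range(N + 1, K + 1)
    for j in range(1, N + 1):
        for holes in combinations(occupied, j):
            for particles in combinations(virtual, j):
                yield holes, particles


def excite_reference(N, holes, particles):
    """Occupied labels of X_mu Psi_0 (sign conventions ignored)."""
    kept = set(range(1, N + 1)) - set(holes)
    return tuple(sorted(kept | set(particles)))


N, K = 2, 4  # hypothetical toy sizes
generated = {tuple(range(1, N + 1))}  # the reference determinant
generated |= {excite_reference(N, h, p) for h, p in excitation_indices(N, K)}
print(generated == set(combinations(range(1, K + 1), N)))
```

The comparison confirms that, in this truncated setting, the reference determinant together with its excited determinants exhausts the N-particle basis.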

This observation motivates the following convention and definition.

Convention 10

(Reference Determinant)  Consider the setting of Remark 9. In the sequel, we will refer to the function \({\mathcal {B}}_{\wedge } \ni \Psi _0(\textbf{x}_1, \ldots , \textbf{x}_N)=\frac{1}{\sqrt{N!}} \text { det}\; \big (\phi _{j}(\textbf{x}_i)\big )_{ i, j=1}^N\), i.e., the determinant constructed from the first N single particle basis functions \(\{\phi _i\}_{i=1}^N\) as the reference determinant. Moreover, for any \(\mu \in {\mathcal {I}}\), we will frequently denote \(\Psi _{\mu }:= {\mathcal {X}}_{\mu } \Psi _0\). Finally, we will often refer to each set \(\left\{ {\mathcal {X}}_\mu \Psi _0 :\mu \in {\mathcal {I}}_j\right\} \) as the set of j-excited determinants.

Definition 11

(Orthogonal Complement of the Reference Determinant)

Let the excitation index set \({\mathcal {I}}\) be defined through Definition 5, let the excitation operators \(\{{\mathcal {X}}_{\mu }\}_{\mu \in {\mathcal {I}}}\) be defined through Definition 6, and let \(\Psi _0(\textbf{x}_1, \textbf{x}_2, \ldots , \textbf{x}_N):=\frac{1}{\sqrt{N!}} \text { det}\; \big (\phi _{j}(\textbf{x}_i)\big )_{ i, j=1}^N\) denote the reference determinant. Then we define the set \(\widetilde{{\mathcal {B}}_{\wedge }} \subset {\mathcal {B}}_{\wedge }\) and the subspace \(\widetilde{{\mathcal {V}}} \subset \widehat{{\mathcal {H}}}^1\) as

$$\begin{aligned} \widetilde{{\mathcal {B}}_{\wedge }}:=&\left\{ {\mathcal {X}}_\mu \Psi _0 :\mu \in {\mathcal {I}}\right\} , \qquad \text {and}\\ \widetilde{{\mathcal {V}}}:=&\left\{ \Psi _0\right\} ^{\perp }:= \left\{ \Phi \in \widehat{{\mathcal {H}}}^1:\quad (\Phi , \Psi _0)_{\widehat{{\mathcal {L}}}^{\, 2}} =0\right\} , \end{aligned}$$

and we observe that \(\widetilde{{\mathcal {B}}_{\wedge }}\) is a complete, \(\widehat{{\mathcal {L}}}^{\, 2}\)-orthonormal basis for \(\widetilde{{\mathcal {V}}}\).

Definition 12

(Complementary Decomposition of \(\widehat{{\mathcal {H}}}^1\))

Let the excitation index set \({\mathcal {I}}\) be defined through Definition 5, let the excitation operators \(\{{\mathcal {X}}_{\mu }\}_{\mu \in {\mathcal {I}}}\) be defined through Definition 6, and let \(\Psi _0(\textbf{x}_1, \textbf{x}_2, \ldots , \textbf{x}_N):=\frac{1}{\sqrt{N!}} \text { det}\; \big (\phi _{j}(\textbf{x}_i)\big )_{ i, j=1}^N\) denote the reference determinant. We define \({\mathbb {P}}_0 :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1\) as the \(\widehat{{\mathcal {L}}}^{\, 2}\)-orthogonal projection operator onto \( \text {span} \left\{ \Psi _0\right\} \), and we define \({\mathbb {P}}_0^{\perp }:={\mathbb {I}}-{\mathbb {P}}_0 :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1\) as its complement. Additionally, we introduce the complementary decomposition of the N-particle space \(\widehat{{\mathcal {H}}}^1\) given by

$$\begin{aligned} \widehat{{\mathcal {H}}}^1= \text { span} \left\{ \Psi _0\right\} \oplus \widetilde{{\mathcal {V}}}, \quad \text {where we emphasise that }~ \widetilde{{\mathcal {V}}}=\text { Ran}{\mathbb {P}}_0^{\perp }. \end{aligned}$$
(9)

The complementary decomposition introduced through Eq. (9) will be particularly important in our subsequent analysis of the coupled cluster method in Sect. 4. Let us emphasise that the construction of these complementary spaces is based on \(\widehat{{\mathcal {L}}}^{\, 2}\)-orthogonality rather than \(\widehat{{\mathcal {H}}}^1\) orthogonality. This choice is intentional as it simplifies considerably the analysis in Sects. 4 and 5. Let us also remark that the projection operators \({\mathbb {P}}_0 :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1 \) and \({\mathbb {P}}^{\perp }_0 :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1 \) are both, nevertheless, bounded operators with respect to the \(\Vert \cdot \Vert _{\widehat{{\mathcal {H}}}^1}\) norm since they both possess a closed range and a closed kernel.

Returning for the moment to the notion of excitation operators, we see that it is easy to deduce that each excitation operator \({\mathcal {X}}_{\mu }, ~ \mu \in {\mathcal {I}}\) is a bounded linear operator from \(\widehat{{\mathcal {H}}}^1\) to \(\widehat{{\mathcal {H}}}^1\). However, we will frequently be interested in so-called cluster operators which are summations of the excitation operators \({\mathcal {X}}_{\mu }, ~\mu \in {\mathcal {I}}\), and such summations need not be bounded operators from \(\widehat{{\mathcal {H}}}^1\) to \(\widehat{{\mathcal {H}}}^1\) or even from \(\widehat{{\mathcal {L}}}^{\, 2}\) to \(\widehat{{\mathcal {L}}}^{\, 2}\). Fortunately, the following result was proven in [42].

Proposition 13

(Cluster Operators as Bounded Maps on \(\widehat{{\mathcal {L}}}^{\, 2}\)) Let the excitation index set \({\mathcal {I}}\) be defined through Definition 5 and let \({\varvec{t}}= \{{\varvec{t}}_{\mu }\}_{\mu \in {\mathcal {I}}} \in \ell ^2({\mathcal {I}})\). Then there exists a unique bounded linear operator \({\mathcal {T}} :\widehat{{\mathcal {L}}}^{\, 2} \rightarrow \widehat{{\mathcal {L}}}^{\, 2}\), the so-called cluster operator generated by \({\varvec{t}}\), such that \({\mathcal {T}}=\sum _{\mu \in {\mathcal {I}}} {\varvec{t}}_{\mu }{\mathcal {X}}_{\mu }\) where the series convergence holds with respect to the operator norm \(\Vert \cdot \Vert _{\widehat{{\mathcal {L}}}^{\, 2} \rightarrow \widehat{{\mathcal {L}}}^{\, 2}} \).

Next, we introduce a coefficient subspace of \(\ell ^2({\mathcal {I}})\), i.e., the space of square summable sequences of real numbers indexed by \({\mathcal {I}}\), which will limit the class of cluster operators that we consider in the sequel.

Definition 14

(Coefficient Space For Cluster Operators)

Let the excitation index set \({\mathcal {I}}\) be defined through Definition 5 and let \(\ell ^2({\mathcal {I}})\) denote the space of square summable sequences of real numbers indexed by \({\mathcal {I}}\). We define the Hilbert space of sequences \({\mathbb {V}} \subset \ell ^2({\mathcal {I}})\) as the set

$$\begin{aligned} {\mathbb {V}}:= \left\{ \textbf{t}:= \{{\varvec{t}}_{\mu }\}_{\mu \in {\mathcal {I}}} \in \ell ^2({\mathcal {I}}) :\quad \sum _{\mu \in {\mathcal {I}}}{\varvec{t}}_{\mu } \Psi _\mu ~\in \widehat{{\mathcal {H}}}^1 \right\} , \end{aligned}$$
(10)

equipped with the inner product

$$\begin{aligned} \forall \textbf{t}, {\varvec{s}}\in {\mathbb {V}} ~\text { with }~ {\varvec{t}}:= ({\varvec{t}}_{\mu })_{\mu \in {\mathcal {I}}}, ~{\varvec{s}}:= ({\varvec{s}}_{\mu })_{\mu \in {\mathcal {I}}} :\quad \left( {\varvec{s}}, {\varvec{t}}\right) _{{\mathbb {V}}}:= \left( \sum _{\mu \in {\mathcal {I}}}{\varvec{s}}_{\mu } \Psi _\mu , \sum _{\mu \in {\mathcal {I}}}{\varvec{t}}_{\mu } \Psi _\mu \right) _{\widehat{{\mathcal {H}}}^1}.\nonumber \\ \end{aligned}$$
(11)

Additionally, we define \({\mathbb {V}}^*\) as the topological dual space of \({\mathbb {V}}\), equipped with the canonical dual norm

$$\begin{aligned} \forall {\varvec{w}}\in {\mathbb {V}}^* :\qquad \Vert {\varvec{w}}\Vert _{{\mathbb {V}}^*}:= \sup _{0\ne {\varvec{t}}\in {\mathbb {V}}}\frac{\big \vert \left\langle {\varvec{w}}, {\varvec{t}}\right\rangle _{{\mathbb {V}}^* \times {\mathbb {V}}} \big \vert }{\Vert {\varvec{t}}\Vert _{{\mathbb {V}}}}. \end{aligned}$$

Some remarks are now in order.

Remark 15

(Clarification of the Definition of the Coefficient Space) Consider Definition 14 of the coefficient space \({\mathbb {V}} \subset \ell ^2({\mathcal {I}})\) and let \({\varvec{t}}= \{{\varvec{t}}_{\mu }\}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}}\). We emphasise here that the assertion \(\sum _{\mu \in {\mathcal {I}}}{\varvec{t}}_{\mu } \Psi _\mu \in \widehat{{\mathcal {H}}}^1\) should be understood in the following sense: there exists \(\Psi _{\textbf{t}} \in \widehat{{\mathcal {H}}}^1 \subset \widehat{{\mathcal {L}}}^{\, 2}\) such that \(\Psi _{\textbf{t}} =\sum _{\mu \in {\mathcal {I}}}{\varvec{t}}_{\mu } \Psi _\mu \) where the series convergence holds with respect to the \(\widehat{{\mathcal {L}}}^{\, 2}\) norm. In particular, this series convergence does not a priori hold with respect to the \(\widehat{{\mathcal {H}}}^1\)-norm, and it is only the limit function \(\Psi _{{\varvec{t}}}\) that is an element of \(\widehat{{\mathcal {H}}}^1\).

Remark 16

(Dual Coefficient Space) Consider the setting of Definition 14. Throughout this article, we will denote by \({\mathbb {V}}^*\) the topological dual space of \({\mathbb {V}}\) equipped with the canonical dual norm. Note that since \(\widehat{{\mathcal {H}}}^1\) is dense and continuously embedded in \(\widehat{{\mathcal {L}}}^{\, 2}\), we can deduce that the coefficient space \({\mathbb {V}}\) is dense and continuously embedded in \(\ell ^2({\mathcal {I}})\). As a consequence, the inner product \((\cdot , \cdot )_{\ell ^2}\) can be continuously extended to the duality pairing \(\langle \cdot , \cdot \rangle _{{\mathbb {V}} \times {\mathbb {V}}^*}\) on \({\mathbb {V}} \times {\mathbb {V}}^*\). This fact will be of occasional use in the sequel.

Notation 17

(Coefficient Sequences and Cluster Operators) Let \({\varvec{t}}=\{{\varvec{t}}_{\mu }\}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}}\). As mentioned previously, the operator \({\mathcal {T}}:=\sum _{\mu \in {\mathcal {I}}}{\varvec{t}}_{\mu } {\mathcal {X}}_{\mu }\) is known as the cluster operator generated by \({\varvec{t}}\), and it plays a key role in the coupled cluster formalism.

For clarity of exposition, we will adopt the convention of denoting by small bold letters such as \({\varvec{r}}, {\varvec{s}}, {\varvec{t}}\), and \({\varvec{w}}\), etc., elements of the coefficient space \({\mathbb {V}}\) and denoting by capital curly letters such as \({\mathcal {R}}, {\mathcal {S}}, {\mathcal {T}}\), and \({\mathcal {W}}\), etc., the corresponding cluster operators with the understanding that \({\mathcal {R}}:= \sum _{\mu \in {\mathcal {I}}}{\varvec{r}}_{\mu } {\mathcal {X}}_{\mu }\), \({\mathcal {S}}:= \sum _{\mu \in {\mathcal {I}}}{\varvec{s}}_{\mu } {\mathcal {X}}_{\mu }\), and so on.

Remark 18

(Representation of Elements of the Complementary Subspace \(\widetilde{{\mathcal {V}}}\)) Consider the setting of Definition 14 and recall Definition 11 of the space \(\widetilde{{\mathcal {V}}} \subset \widehat{{\mathcal {H}}}^1\). It is not difficult to see that every element \({\Phi }_{{\varvec{s}}}:= \sum _{\mu \in {\mathcal {I}}} \textbf{s}_{\mu } {\mathcal {X}}_{\mu } \Psi _0 \in \widetilde{{\mathcal {V}}}\) generates a sequence \(\textbf{s}:= \{\textbf{s}_\mu \}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}}\) such that

$$\begin{aligned} {\Phi }_{{\varvec{s}}}= \sum _{\mu \in {\mathcal {I}}}\textbf{s}_{\mu } {\mathcal {X}}_{\mu } \Psi _0 = {\mathcal {S}}\Psi _0. \end{aligned}$$

Conversely, given any sequence \(\textbf{w}:= \{\textbf{w}_\mu \}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}}\), we can define the function \({\Phi }_{{\varvec{w}}} \in \widetilde{{\mathcal {V}}}\) as

$$\begin{aligned} {\Phi }_{{\varvec{w}}}=\sum _{\mu \in {\mathcal {I}}}\textbf{w}_{\mu } {\mathcal {X}}_{\mu } \Psi _0= {\mathcal {W}}\Psi _0. \end{aligned}$$

Therefore, in the sequel (in particular in Sect. 4), we will occasionally write elements of the space \(\widetilde{{\mathcal {V}}}\) as, for instance, \({\mathcal {S}}\Psi _0\) or \({\mathcal {W}}\Psi _0\) where \({\mathcal {S}}:= \sum _{\mu \in {\mathcal {I}}}\textbf{s}_{\mu } {\mathcal {X}}_{\mu }\) and \({\mathcal {W}}:= \sum _{\mu \in {\mathcal {I}}}\textbf{w}_{\mu } {\mathcal {X}}_{\mu }\) for some sequences \(\textbf{s}:= \{\textbf{s}_\mu \}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}}\) and \(\textbf{w}:= \{\textbf{w}_\mu \}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}}\).

The following theorem now summarises the main properties of the excitation operators \({\mathcal {X}}_\mu , ~\mu \in {\mathcal {I}}\) and cluster operators constructed from these excitation operators. The establishment of these properties in infinite dimensions was the main achievement of the article [42]. In finite-dimensions, where the situation is considerably simpler from a topological point of view, these results were first proven in the mathematical literature in [46].

Theorem 19

(Properties of Excitation and Cluster Operators) Let the excitation index set \({\mathcal {I}}\) be defined through Definition 5, let the excitation operators \(\{{\mathcal {X}}_{\mu }\}_{\mu \in {\mathcal {I}}}\) and de-excitation operators \(\{{\mathcal {X}}^{\dagger }_{\mu }\}_{\mu \in {\mathcal {I}}}\) be defined through Definitions 6 and 7 respectively, and let the Hilbert space \({\mathbb {V}}\) of sequences be defined through Definition 14. Then

  1. (1)

    For all \(\mu , \nu \in {\mathcal {I}}\), it holds that \({\mathcal {X}}_{\mu } {\mathcal {X}}_{\nu }={\mathcal {X}}_{\nu } {\mathcal {X}}_{\mu }\) and \({\mathcal {X}}^{\dagger }_{\mu } {\mathcal {X}}^{\dagger }_{\nu }={\mathcal {X}}^{\dagger }_{\nu } {\mathcal {X}}^{\dagger }_{\mu }\).

  2. (2)

    For every \(\Phi \in \widehat{{\mathcal {H}}}^1\) that satisfies the so-called intermediate normalisation condition \(\left( \Phi , \Psi _0\right) _{{\mathcal {L}}^2}=1\), there exists a unique sequence \(\textbf{r}=\left\{ \textbf{r}_\mu \right\} _{\mu \in {\mathcal {I}}} \in {\mathbb {V}}\) with corresponding cluster operator \({\mathcal {R}}=\sum _{\mu \in {\mathcal {I}}}{\varvec{r}}_{\mu }{\mathcal {X}}_{\mu }\) such that

    $$\begin{aligned} \Phi = \Psi _0 + {\mathcal {R}} \Psi _0. \end{aligned}$$
  3. (3)

    Let \({\varvec{t}}\in {\mathbb {V}}\). Then

    • The cluster operator \({\mathcal {T}}= \sum _{\mu \in {\mathcal {I}}}{\varvec{t}}_{\mu }{\mathcal {X}}_{\mu }\) is a bounded linear map from \(\widehat{{\mathcal {H}}}^1\) to \(\widehat{{\mathcal {H}}}^1\) and there exists a constant \(\beta >0\) depending only on N such that

      $$\begin{aligned} \Vert {\varvec{t}}\Vert _{{\mathbb {V}}} \le \Vert {\mathcal {T}}\Vert _{\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1} \le \beta \Vert {\varvec{t}}\Vert _{{\mathbb {V}}}. \end{aligned}$$
    • The de-excitation cluster operator \({\mathcal {T}}^{\dagger }= \sum _{\mu \in {\mathcal {I}}}{\varvec{t}}_{\mu }{\mathcal {X}}^{\dagger }_{\mu }\) is also a bounded linear map from \(\widehat{{\mathcal {H}}}^1\) to \(\widehat{{\mathcal {H}}}^1\) and there exists a constant \(\beta ^{\dagger }>0\) depending only on N such that

      $$\begin{aligned} \Vert {\mathcal {T}}^{\dagger }\Vert _{\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1} \le \beta ^{\dagger } \Vert {\varvec{t}}\Vert _{{\mathbb {V}}}. \end{aligned}$$
    • The cluster operator \({\mathcal {T}}= \sum _{\mu \in {\mathcal {I}}}{\varvec{t}}_{\mu }{\mathcal {X}}_{\mu }\) has an extension to a bounded linear operator from \(\widehat{{\mathcal {H}}}^{-1}\) to \(\widehat{{\mathcal {H}}}^{-1}\).

  4. (4)

    Define the set of operators

    $$\begin{aligned} {\mathfrak {L}}:= \left\{ t_0 \textrm{I} + {\mathcal {T}} :\quad t_0 \in {\mathbb {R}}, ~{\mathcal {T}}=\sum _{\mu \in {\mathcal {I}}} {\varvec{t}}_{\mu } {\mathcal {X}}_{\mu } \quad \text {such that }~ {\varvec{t}}=\{{\varvec{t}}_{\mu }\}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}} \right\} \end{aligned}$$

    The following hold:

    • The set \({\mathfrak {L}}\) forms a closed commutative subalgebra in the algebra of bounded linear operators acting from \(\widehat{{\mathcal {H}}}^{1}\) to \(\widehat{{\mathcal {H}}}^{1}\) (and also from \(\widehat{{\mathcal {H}}}^{-1}\) to \(\widehat{{\mathcal {H}}}^{-1}\)).

    • The subalgebra \({\mathfrak {L}}\) is closed under inversion and the spectrum of any \({\mathfrak {L}}\ni {\mathcal {L}}= t_0\textrm{I} + {\mathcal {T}}\) is exactly \(\sigma ({\mathcal {L}})=\{t_0\}\).

    • Any element in \({\mathfrak {L}}\) of the form \({\mathcal {T}} =\sum _{\mu \in {\mathcal {I}}} {\varvec{t}}_{\mu } {\mathcal {X}}_{\mu }\) with \({\varvec{t}}\in {\mathbb {V}}\) is nilpotent: it holds that \({\mathcal {T}}^{N+1}\equiv 0\).

    • The exponential function is a locally \({\mathscr {C}}^{\infty }\) map on \({\mathfrak {L}}\), and is a bijection from the sub-algebra

      $$\begin{aligned} \left\{ {\mathcal {T}} :\quad {\mathcal {T}}=\sum _{\mu \in {\mathcal {I}}} {\varvec{t}}_{\mu } {\mathcal {X}}_{\mu } \quad \text {such that }~ {\varvec{t}}=\{{\varvec{t}}_{\mu }\}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}} \right\} \end{aligned}$$

      to the sub-algebra

      $$\begin{aligned} \left\{ \textrm{I} + {\mathcal {T}} :\quad {\mathcal {T}}=\sum _{\mu \in {\mathcal {I}}} {\varvec{t}}_{\mu } {\mathcal {X}}_{\mu } \quad \text {such that }~ {\varvec{t}}=\{{\varvec{t}}_{\mu }\}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}} \right\} . \end{aligned}$$
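The nilpotency statement and the resulting finite exponential series in Property (4) can be observed in a small matrix model. The Python sketch below is a toy illustration under explicit assumptions: the basis is truncated to K = 4 orbitals with N = 2, the excitation operators are represented by 0/1 matrices that ignore all sign conventions, and the amplitudes are arbitrary numbers chosen for the example. It reproduces the algebraic structure (nilpotency, invertibility of the exponential) but is not meant to reproduce the exact coefficients of a signed fermionic calculation.

```python
from itertools import combinations
from math import factorial

N, K = 2, 4  # hypothetical toy sizes
basis = list(combinations(range(1, K + 1), N))
pos = {det: i for i, det in enumerate(basis)}
dim = len(basis)


def excitation_matrix(holes, particles):
    """0/1 matrix of X_mu in the determinant basis (signs ignored)."""
    X = [[0.0] * dim for _ in range(dim)]
    for det in basis:
        occ = set(det)
        if set(holes) <= occ and not set(particles) & occ:
            new = tuple(sorted((occ - set(holes)) | set(particles)))
            X[pos[new]][pos[det]] = 1.0
    return X


def matmul(A, B):
    """Plain dense matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]


def add(A, B, scale=1.0):
    """Return A + scale * B."""
    return [[A[i][j] + scale * B[i][j] for j in range(dim)] for i in range(dim)]


# Arbitrary illustrative amplitudes t_mu for every excitation index.
amplitudes = {((1,), (3,)): 0.3, ((1,), (4,)): -0.2, ((2,), (3,)): 0.1,
              ((2,), (4,)): 0.4, ((1, 2), (3, 4)): 0.5}
T = [[0.0] * dim for _ in range(dim)]
for (holes, particles), t in amplitudes.items():
    T = add(T, excitation_matrix(holes, particles), t)

identity = [[float(i == j) for j in range(dim)] for i in range(dim)]
powers = [identity]
for _ in range(N + 1):
    powers.append(matmul(powers[-1], T))  # powers[k] holds T^k

nilpotent = all(abs(x) < 1e-14 for row in powers[N + 1] for x in row)

exp_T = [[0.0] * dim for _ in range(dim)]
exp_neg_T = [[0.0] * dim for _ in range(dim)]
for k in range(N + 1):  # the exponential series stops at T^N
    exp_T = add(exp_T, powers[k], 1.0 / factorial(k))
    exp_neg_T = add(exp_neg_T, powers[k], (-1.0) ** k / factorial(k))

product = matmul(exp_neg_T, exp_T)
inverse_ok = all(abs(product[i][j] - identity[i][j]) < 1e-12
                 for i in range(dim) for j in range(dim))
print(nilpotent, inverse_ok)
```

Since every excitation raises the number of occupied virtual orbitals by at least one, no product of more than N excitations can be non-zero, which is why the series for the exponential terminates.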

As a consequence of Theorem 19 (cf. Properties \((\textit{2})\) and \((\textit{4})\)), one can prove that any intermediately normalised element of the N-particle space can be parameterised through an exponential cluster operator. More precisely, given the excitation index set \({\mathcal {I}}\) defined through Definition 5 and the excitation operators \(\{{\mathcal {X}}_{\mu }\}_{\mu \in {\mathcal {I}}}\) defined through Definition 6, for any \(\Phi \in \widehat{{\mathcal {H}}}^1\) such that \(\left( \Phi , \Psi _0\right) _{{\mathcal {L}}^2}=1\), there exists a unique sequence \({\varvec{t}}=\{{\varvec{t}}_{\mu }\}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}}\) and a unique cluster operator \({\mathcal {T}}=\sum _{\mu \in {\mathcal {I}}} {\varvec{t}}_{\mu } {\mathcal {X}}_{\mu }\) such that

$$\begin{aligned} \Phi = e^{{\mathcal {T}}} \Psi _0. \end{aligned}$$
(12)

A proof of this statement in the infinite-dimensional setting can be found in [42] while the corresponding proof for the finite-dimensional case is given in [46].

Equation (12) implies in particular that if the sought-after ground state wave-function \(\Psi ^*_{\textrm{GS}} \in \widehat{{\mathcal {H}}}^1\) that solves the minimisation problem (3a) is intermediately normalised, then it can also be written in the form

$$\begin{aligned} \Psi ^*_{\textrm{GS}} = e^{{\mathcal {T}}^*}\Psi _0, \end{aligned}$$

for some sequence \({\varvec{t}}^* = \{{\varvec{t}}^*_{\mu }\}_{\mu \in {\mathcal {I}}}\) and corresponding cluster operator \({\mathcal {T}}^*=\sum _{\mu \in {\mathcal {I}}} {\varvec{t}}_{\mu }^* {\mathcal {X}}_{\mu }\). In other words, the minimisation problem (3a) can be replaced by an equivalent problem which consists of finding the sequence \({\varvec{t}}^* =\{{\varvec{t}}^*_{\mu }\}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}}\) used to construct the appropriate cluster operator \({\mathcal {T}}^*\) that appears in the exponential parametrisation of \(\Psi ^*_{\textrm{GS}}\). Indeed, it follows from the definition of \(\Psi ^*_{\textrm{GS}} \) and such an exponential cluster operator \(e^{{\mathcal {T}}^*}\) that

$$\begin{aligned} {\mathcal {E}}^{*}_{\textrm{GS}}e^{{\mathcal {T}}^*}\Psi _0 ={\mathcal {E}}^{*}_{\textrm{GS}}\Psi ^*_{\textrm{GS}} =H\Psi ^*_{\textrm{GS}} = H e^{{\mathcal {T}}^*}\Psi _0, \quad \text {and therefore}\quad {\mathcal {E}}^{*}_{\textrm{GS}}\Psi _0 =e^{-{\mathcal {T}}^*}H e^{{\mathcal {T}}^*}\Psi _0. \end{aligned}$$

Recalling now that for any excitation index \(\mu \in {\mathcal {I}}\), the excited determinant \(\Psi _{\mu }= {\mathcal {X}}_{\mu }\Psi _0\) is \(\widehat{{\mathcal {L}}}^{\, 2}\)-orthogonal to the reference determinant \(\Psi _0\), we are led to the continuous coupled cluster equations.

Continuous Coupled Cluster Equations:

Let the excitation index set \({\mathcal {I}}\) be defined through Definition 5 and let the excitation operators \(\{{\mathcal {X}}_{\mu }\}_{\mu \in {\mathcal {I}}}\) be defined through Definition 6. We seek a sequence \({\varvec{t}}^* =\{{\varvec{t}}^*_{\mu }\}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}}\) such that for all \(\mu \in {\mathcal {I}}\) we have

$$\begin{aligned} \left\langle {\mathcal {X}}_\mu \Psi _0, e^{-{\mathcal {T}}^*}H e^{{\mathcal {T}}^*} \Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}} =0, \quad \text { where } ~{\mathcal {T}}^*= \sum _{\mu \in {\mathcal {I}}} {\varvec{t}}^*_{\mu } {\mathcal {X}}_{\mu }. \end{aligned}$$
(13)

Once Eq. (13) has been solved, the associated coupled cluster energy \({\mathcal {E}}_{\textrm{CC}}^*\) is given by

$$\begin{aligned} {\mathcal {E}}_{\textrm{CC}}^*:= \left\langle \Psi _0, e^{-{\mathcal {T}}^*}H e^{{\mathcal {T}}^*} \Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}}, \quad \text { where }~ {\mathcal {T}}^*= \sum _{\mu \in {\mathcal {I}}} {\varvec{t}}^*_{\mu } {\mathcal {X}}_{\mu }. \end{aligned}$$
(14)

Remark 20

(Solutions to the Continuous Coupled Cluster Equations) Consider the continuous coupled cluster equations (13). Under the assumption that the ground state wave-function \(\Psi ^*_{\textrm{GS}}\) of the electronic Hamiltonian \(H :\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{-1}\) is intermediately normalisable with respect to the chosen reference determinant \(\Psi _0\), i.e., it is not orthogonal to \(\Psi _0\), it is obvious that there exists a corresponding solution to this non-linear system of equations. Indeed, by Eq. (12), there exists a sequence \({\varvec{t}}_{\textrm{GS}}^* =\{{\varvec{t}}^*_{\mu }\}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}}\) such that \(\Psi ^*_{\textrm{GS}}= e^{{\mathcal {T}}_{\textrm{GS}}^*} \Psi _0\) with \({\mathcal {T}}_{\textrm{GS}}^*= \sum _{\mu \in {\mathcal {I}}} {\varvec{t}}^*_{\mu } {\mathcal {X}}_{\mu }\), and it can readily be verified that this sequence \({\varvec{t}}_{\textrm{GS}}^*\) solves exactly Eq. (13), and consequently \({\mathcal {E}}_\textrm{CC}^* = {\mathcal {E}}_{\textrm{GS}}^*\).

Of course, \({\varvec{t}}_{\textrm{GS}}^*\) defined as above need not be the unique solution to the coupled cluster equations (13). In fact, as we discuss in Sect. 4, every intermediately normalisable eigenfunction of the electronic Hamiltonian will generate a solution to Eq. (13). From a theoretical point of view, this means that only local well-posedness results can be expected to hold for the continuous CC equations.
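The non-uniqueness described above is already visible in the smallest possible model. The Python sketch below assumes a hypothetical two-dimensional space spanned by \(\Psi _0\) and a single excited determinant \(\Psi _1\), a made-up symmetric matrix with entries a, b, c playing the role of the Hamiltonian, and a single amplitude t with \({\mathcal {T}}\Psi _0 = t\Psi _1\) and \({\mathcal {T}}^2 = 0\). In that setting \(e^{-{\mathcal {T}}}He^{{\mathcal {T}}}\Psi _0\) has the component \(b + (c-a)t - bt^2\) along \(\Psi _1\) and the component \(a + bt\) along \(\Psi _0\).

```python
from math import sqrt

# Hypothetical 2x2 symmetric "Hamiltonian" [[a, b], [b, c]] in (Psi_0, Psi_1).
a, b, c = -1.0, 0.2, 0.5


def cc_residual(t):
    """<Psi_1, exp(-T) H exp(T) Psi_0> with T Psi_0 = t Psi_1 and T^2 = 0."""
    return b + (c - a) * t - b * t * t


def cc_energy(t):
    """<Psi_0, exp(-T) H exp(T) Psi_0>."""
    return a + b * t


disc = sqrt((c - a) ** 2 + 4.0 * b * b)
roots = [((c - a) - disc) / (2.0 * b), ((c - a) + disc) / (2.0 * b)]
eigenvalues = [(a + c - disc) / 2.0, (a + c + disc) / 2.0]

for t, lam in zip(roots, eigenvalues):
    # Each root solves the toy CC equation and reproduces one eigenvalue.
    print(abs(cc_residual(t)) < 1e-10, abs(cc_energy(t) - lam) < 1e-10)
```

Both roots of the quadratic solve the toy coupled cluster equation, and each one returns a different eigenvalue of the matrix as its energy, so the equation alone does not single out the ground state.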

The continuous coupled cluster equations are an infinite system of non-linear equations and thus cannot be solved exactly. Instead, one introduces an approximation of the continuous coupled cluster equations by considering, instead of the global excitation index set \({\mathcal {I}}\), some finite subset \({{\mathcal {I}}_h} \subset {\mathcal {I}}\) and solving only the equations associated with this subset of excitation indices. This procedure results in the so-called discrete coupled cluster equations.

Discrete Coupled Cluster Equations:

Let the excitation index set \({\mathcal {I}}\) be defined through Definition 5, let \({\mathcal {I}}_h \subset {\mathcal {I}}\) denote any finite subset of excitation indices, and let the excitation operators \(\{{\mathcal {X}}_{\mu }\}_{\mu \in {\mathcal {I}}}\) be defined through Definition 6. We seek a coefficient vector \({\varvec{t}}^{*}_h=\{{\varvec{t}}^*_{\mu }\}_{\mu \in {\mathcal {I}}_h} \in \ell ^2({\mathcal {I}}_h)\) such that for all \(\mu \in {{\mathcal {I}}}_h\) we have

$$\begin{aligned} \left\langle {\mathcal {X}}_\mu \Psi _0, e^{-{\mathcal {T}}^*_h}H e^{{\mathcal {T}}^*_h} \Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^{-1} \times \widehat{{\mathcal {H}}}^{1}} =0, \quad \text { where } {\mathcal {T}}^*_h= \sum _{\mu \in {{\mathcal {I}}_h}} {\varvec{t}}^*_{\mu } {\mathcal {X}}_{\mu }. \end{aligned}$$
(15)

The associated discrete ground state energy \({\mathcal {E}}^*_{h, \mathrm {CC}}\) is given by

$$\begin{aligned} {\mathcal {E}}^*_{h, \mathrm {CC}}:= \left\langle \Psi _0, e^{-{\mathcal {T}}_h^*}He^{{\mathcal {T}}^*_h} \Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^{-1} \times \widehat{{\mathcal {H}}}^{1}}, \quad \text { where } {\mathcal {T}}^*_h= \sum _{\mu \in {\mathcal {I}}_h}{\varvec{t}}^*_{\mu } {\mathcal {X}}_{\mu }. \end{aligned}$$
(16)

The discrete CC equations will be the subject of further discussion in Sect. 5 where we will analyse their well-posedness for some specific choices of the excitation index subsets and N-particle basis sets. For the moment, we conclude this section with a remark on the nature of the solutions to these discrete equations.

Remark 21

(Solutions to the Discrete Coupled Cluster Equations) Consider the discrete coupled cluster equations (15). As in the continuous case, there is a priori no reason for solutions of Eq. (15) to be globally unique. Indeed, numerical experience confirms that solutions to Eq. (15) are very often not unique (see, e.g., [29, 37,38,39, 57]). Nevertheless, in practice the discrete CC equations are solved very frequently by the quantum chemical community when performing electronic structure calculations, usually using some type of iterative Newton method, and it is hoped that if one starts from a sufficiently accurate initial point, then the resulting solution \({\varvec{t}}_h^* \in \ell ^2({\mathcal {I}}_h) \) of Eq. (15) approximates, in some sense, an exact solution \({\varvec{t}}^*\) of the continuous CC equations (13) that generates the intermediately normalised ground state wave-function. Of course there are no mathematical guarantees that this procedure works, and the current reputation of coupled cluster methods as a ‘gold-standard’ in computational quantum chemistry seems to be based mostly on successful empirical experience.
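To make the iterative procedure of Remark 21 concrete, the following minimal sketch solves the discrete equations (15) by a plain Newton iteration in a toy model. Everything in it is an assumption made for illustration and is not taken from the analysis above: the "Hamiltonian" is a small symmetric matrix on \({\mathbb {R}}^4\), the reference determinant is the first unit vector \(e_0\), the excitation operators are the rank-one matrices \(X_\mu = e_\mu e_0^{\top }\) (which commute and satisfy \(X_\mu X_\nu = 0\), so that \(e^{T} = I + T\) exactly), and the duality pairings are replaced by Euclidean inner products. The Jacobian is assembled from the commutator expression that is derived later in Proposition 26.

```python
import numpy as np


def cluster_operator(t, index_set, dim):
    """T_h = sum over mu in I_h of t_mu X_mu, with toy excitations X_mu = e_mu e_0^T."""
    T = np.zeros((dim, dim))
    for amplitude, mu in zip(t, index_set):
        T[mu, 0] = amplitude
    return T


def residual_and_energy(H, t, index_set):
    """Left-hand sides of Eq. (15) for mu in I_h, and the energy of Eq. (16)."""
    dim = H.shape[0]
    T = cluster_operator(t, index_set, dim)
    identity = np.eye(dim)
    # T @ T = 0 in this toy model, so exp(T) = I + T and exp(-T) = I - T exactly.
    transformed = (identity - T) @ H @ (identity + T)
    return transformed[index_set, 0], transformed[0, 0]


def jacobian(H, t, index_set):
    """Entries <X_mu Psi_0, exp(-T) [H, X_nu] exp(T) Psi_0> for mu, nu in I_h."""
    dim = H.shape[0]
    T = cluster_operator(t, index_set, dim)
    identity = np.eye(dim)
    J = np.zeros((len(index_set), len(index_set)))
    for col, nu in enumerate(index_set):
        X_nu = np.zeros((dim, dim))
        X_nu[nu, 0] = 1.0
        commutator = H @ X_nu - X_nu @ H
        J[:, col] = ((identity - T) @ commutator @ (identity + T))[index_set, 0]
    return J


def newton_solve(H, index_set, tol=1e-12, max_iter=50):
    """Plain Newton iteration started from the zero amplitude vector."""
    t = np.zeros(len(index_set))
    for _ in range(max_iter):
        residual, _ = residual_and_energy(H, t, index_set)
        if np.linalg.norm(residual) < tol:
            break
        t = t - np.linalg.solve(jacobian(H, t, index_set), residual)
    return t


# Hypothetical toy Hamiltonian: well-separated diagonal plus weak coupling.
H = np.diag([0.0, 1.0, 2.0, 3.0]) + 0.1 * (np.ones((4, 4)) - np.eye(4))
full_set = [1, 2, 3]       # all excitation indices of the toy model
truncated_set = [1, 2]     # a finite subset I_h

t_full = newton_solve(H, full_set)
t_h = newton_solve(H, truncated_set)
E_full = residual_and_energy(H, t_full, full_set)[1]
E_h = residual_and_energy(H, t_h, truncated_set)[1]
print("full energy:", E_full, "truncated energy:", E_h)
```

In this toy model the starting point \({\varvec{t}}= 0\) is close to the ground-state zero because the coupling is weak; for a less benign matrix the same iteration may converge to a different zero or fail to converge, which is precisely the lack of guarantees discussed in the remark.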

Having introduced the continuous and discrete coupled cluster equations, the remainder of this article will be concerned with their (local) well-posedness analysis. We will first analyse the continuous coupled cluster equations (13) in Sect. 4, following which we will study a particular class of the discrete coupled cluster equations (15) in Sect. 5.

4 Well-posedness of the continuous coupled cluster equations

Throughout this section, we assume the setting of Sects. 2 and 3, and we recall in particular the notion of excitation operators and the continuous coupled cluster equations. We begin by defining the so-called coupled cluster function, which will be the main object of study in this section.

Definition 22

(Coupled Cluster function)

Let the excitation index set \({\mathcal {I}}\) be defined through Definition 5 and let the excitation operators \(\{{\mathcal {X}}_{\mu }\}_{\mu \in {\mathcal {I}}}\) be defined through Definition 6. We define the coupled cluster function \({\mathcal {f}} :{\mathbb {V}}\rightarrow {\mathbb {V}}^*\) as the mapping with the property that for all \({\varvec{t}}=\{{\varvec{t}}_{\mu }\}_{\mu \in {\mathcal {I}}}, {\varvec{s}}=\{{\varvec{s}}_{\mu }\}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}}\) it holds that

$$\begin{aligned} \big \langle {\varvec{s}}, {\mathcal {f}}({\varvec{t}}) \big \rangle _{{\mathbb {V}} \times {\mathbb {V}}^*}:=\left\langle \sum _{\mu \in {\mathcal {I}}}{\varvec{s}}_{\mu }{\mathcal {X}}_{\mu }\Psi _0, e^{-{\mathcal {T}}}He^{{\mathcal {T}}} \Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}, \qquad \text { where }~ {\mathcal {T}}=\sum _{\mu \in {\mathcal {I}}} {\varvec{t}}_\mu {\mathcal {X}}_\mu . \end{aligned}$$

Remark 23

(Justification of the Domain and Range of Coupled Cluster Function) Consider Definition 22 of the coupled cluster function. The fact that \({\mathcal {f}}\) is indeed a mapping from \({\mathbb {V}}\) to \({\mathbb {V}}^*\) is a direct consequence of the boundedness of the electronic Hamiltonian \(H:\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^{-1}\) and the exponential cluster operators \(e^{{\mathcal {T}}} :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1\) and \(e^{-{\mathcal {T}}} :\widehat{{\mathcal {H}}}^{-1}\rightarrow \widehat{{\mathcal {H}}}^{-1}\). Indeed, for all \({\varvec{s}}= \{{\varvec{s}}_{\mu }\}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}}\) and all \({\varvec{t}}= \{{\varvec{t}}_\mu \}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}}\) it holds that

$$\begin{aligned} \big \vert \big \langle {\varvec{s}}, {\mathcal {f}}({\varvec{t}})\big \rangle _{{\mathbb {V}} \times {\mathbb {V}}^*} \big \vert&= \left| \left\langle \sum _{\mu \in {\mathcal {I}}}{\varvec{s}}_{\mu }{\mathcal {X}}_{\mu }\Psi _0, e^{-{\mathcal {T}}}He^{{\mathcal {T}}} \Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}} \right| \\&\le \Big \Vert \sum _{\mu \in {\mathcal {I}}}{\varvec{s}}_{\mu } {\mathcal {X}}_{\mu } \Psi _0\Big \Vert _{\widehat{{\mathcal {H}}}^1} \Big \Vert e^{-{\mathcal {T}}}He^{{\mathcal {T}}} \Psi _0\Big \Vert _{\widehat{{\mathcal {H}}}^{-1}}\\&\le \Vert {\varvec{s}}\Vert _{{\mathbb {V}}} \Vert e^{-{\mathcal {T}}}\Vert _{\widehat{{\mathcal {H}}}^{-1} \rightarrow \widehat{{\mathcal {H}}}^{-1}} \Vert H\Vert _{\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{-1}} \Vert e^{{\mathcal {T}}} \Psi _0\Vert _{\widehat{{\mathcal {H}}}^{1}}. \end{aligned}$$

Equipped with Definition 22 of the coupled cluster function, let us point out that the continuous coupled cluster equations (13) can be re-written in the following weak form.

Weak Form of the Continuous Coupled Cluster equations:

Let the excitation index set \({\mathcal {I}}\) be defined through Definition 5, let the Hilbert space of sequences \({\mathbb {V}} \subset \ell ^2({\mathcal {I}})\) be defined through Definition 14, and let the coupled cluster function \({\mathcal {f}}:{\mathbb {V}} \rightarrow {\mathbb {V}}^*\) be defined through Definition 22. We seek a sequence \({\varvec{t}}=\{{\varvec{t}}_{\mu }\}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}}\) such that for all sequences \({\varvec{s}}=\{{\varvec{s}}_{\mu }\}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}}\) it holds that

$$\begin{aligned} \langle {\varvec{s}}, {\mathcal {f}}({\varvec{t}})\rangle _{{\mathbb {V}} \times {\mathbb {V}}^*}=0. \end{aligned}$$
(17)

As we shall see in Sect. 5, this point of view will allow us to interpret the truncated coupled cluster equations (15) as Galerkin discretisations of Equation (17), which will be useful for the purpose of the numerical analysis.

The following extremely significant result, proven in [42], establishes a precise relationship between zeros of the coupled cluster function defined through Definition 22 [i.e., solutions of the continuous CC equations (17)] and intermediately normalised eigenfunctions of the electronic Hamiltonian defined through Eq. (2).

Theorem 24

(Relation between Coupled Cluster Zeros and Eigenfunctions of Electronic Hamiltonian) Let the coupled cluster function \({\mathcal {f}} :{\mathbb {V}} \rightarrow {\mathbb {V}}^*\) be defined through Definition 22 and let the electronic Hamiltonian be given by Eq. (2). Then

  (1) For any zero \({\varvec{t}}^* = \{{\varvec{t}}_{\mu }^*\}_{\mu \in {\mathcal {I}}}\in {\mathbb {V}}\) of the CC function, the function \(\Psi ^*=e^{{\mathcal {T}}^*}\Psi _0 \in \widehat{{\mathcal {H}}}^1\) with \({\mathcal {T}}^*=\sum _{\mu \in {\mathcal {I}}} {\varvec{t}}^*_\mu {\mathcal {X}}_\mu \) is an intermediately normalised eigenfunction of the electronic Hamiltonian. Moreover, the eigenvalue corresponding to the eigenfunction \(\Psi ^*\) coincides with the CC energy \({\mathcal {E}}_{\textrm{CC}}^*\) generated by \({\varvec{t}}^*\) as defined through Eq. (14).

  (2) Conversely, for any intermediately normalised eigenfunction \(\Psi ^* \in \widehat{{\mathcal {H}}}^1\) of the electronic Hamiltonian, there exists \({\varvec{t}}^* = \{{\varvec{t}}_{\mu }^*\}_{\mu \in {\mathcal {I}}}\in {\mathbb {V}}\) such that \({\varvec{t}}^*\) is a zero of the CC function and \(\Psi ^*=e^{{\mathcal {T}}^*}\Psi _0 \in \widehat{{\mathcal {H}}}^1\) with \({\mathcal {T}}^*=\sum _{\mu \in {\mathcal {I}}} {\varvec{t}}^*_\mu {\mathcal {X}}_\mu \). Moreover, the CC energy \({\mathcal {E}}_{\textrm{CC}}^*\) generated by \({\varvec{t}}^*\) as defined through Eq. (14) coincides with the eigenvalue corresponding to the eigenfunction \(\Psi ^*\).
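The correspondence stated in Theorem 24 can be observed numerically in a finite-dimensional toy model. The sketch below is illustrative only and rests on assumptions that are not part of the theorem: the Hamiltonian is a hypothetical symmetric \(4 \times 4\) matrix, the reference is the unit vector \(e_0\), the excitation operators are \(X_\mu = e_\mu e_0^{\top }\) (so that \(e^{T} = I + T\)), and Euclidean inner products stand in for the duality pairings. For each eigenvector with non-zero overlap with \(e_0\), the amplitudes are read off from the intermediately normalised eigenvector and inserted into the toy coupled cluster function.

```python
import numpy as np


def similarity_transformed_column(H, t):
    """Return exp(-T) H exp(T) Psi_0 for the toy model with Psi_0 = e_0."""
    dim = H.shape[0]
    T = np.zeros((dim, dim))
    T[1:, 0] = t  # T = sum_mu t_mu X_mu with X_mu = e_mu e_0^T
    identity = np.eye(dim)
    # T @ T = 0 here, hence exp(T) = I + T and exp(-T) = I - T exactly.
    return ((identity - T) @ H @ (identity + T))[:, 0]


def cc_function(H, t):
    """Toy CC function: projections onto the excited vectors X_mu Psi_0 = e_mu."""
    return similarity_transformed_column(H, t)[1:]


def cc_energy(H, t):
    """Toy CC energy: projection onto the reference Psi_0 = e_0."""
    return similarity_transformed_column(H, t)[0]


# Hypothetical toy Hamiltonian (symmetric, weakly coupled).
H = np.diag([0.0, 1.0, 2.0, 3.0]) + 0.1 * (np.ones((4, 4)) - np.eye(4))
eigenvalues, eigenvectors = np.linalg.eigh(H)

zeros = []
for k in range(H.shape[0]):
    psi = eigenvectors[:, k]
    if abs(psi[0]) < 1e-12:
        continue  # not intermediately normalisable with respect to e_0
    t_star = psi[1:] / psi[0]  # amplitudes of the intermediately normalised eigenvector
    zeros.append((t_star, eigenvalues[k]))

for t_star, eigenvalue in zeros:
    print("residual norm:", np.linalg.norm(cc_function(H, t_star)),
          "energy mismatch:", cc_energy(H, t_star) - eigenvalue)
```

Every eigenvector of this particular matrix overlaps with \(e_0\), so the toy coupled cluster function has as many zeros as the matrix has eigenvalues, which illustrates why only local uniqueness can be expected.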

In other words every intermediately normalisable eigenfunction of the electronic Hamiltonian (2) corresponds to a zero of the coupled cluster function defined through Definition 22 and vice-versa. The goal of our analysis in this section is to study the nature of these zeros of the coupled cluster function and, in particular, to derive sufficient conditions that guarantee the simplicity of the zeros. Indeed, if we know that some \({\varvec{t}}^* \in {\mathbb {V}}\) is a simple zero of the coupled cluster function, then this will allow us to deduce local invertibility of the coupled cluster function at \({\varvec{t}}^* \in {\mathbb {V}}\) and thereby derive both local uniqueness and local residual-based error estimates for the CC equations (17). Arguments of this nature are standard in the literature on non-linear numerical analysis (see, e.g., [53, Proposition 2.1] or [5, Theorem 2.1]) and are usually based on the invertibility of the Fréchet derivative of the non-linear function being studied. The next step in our analysis therefore will be to study carefully the Fréchet derivative of the coupled cluster function. Before proceeding with this analysis however, let us comment on the existing numerical analysis of the coupled cluster equation (17).

Remark 25

(Existing Approaches in the Numerical Analysis of the CC equations (17)) The existing literature on the numerical analysis of coupled cluster methods is rather sparse. The first numerical analysis of the single reference coupled cluster method, carried out in the finite-dimensional setting, is due to Schneider [46]. The analysis carried out in [46] was then extended to the infinite-dimensional setting (as considered here) in the subsequent articles [42, 43]. The former article showed that the mathematical objects used to formulate the coupled cluster method (such as excitation operators) are bounded operators on appropriate infinite-dimensional Hilbert spaces so that the coupled cluster equations can be stated in infinite dimensions (prior to this article, the CC equations were always written in a finite-dimensional setting). The article [43] used these tools and the ideas developed in [46] to perform a numerical analysis of the infinite-dimensional coupled cluster equations. Additional articles on the mathematical analysis of CC methods have since appeared, including [31] which studies the so-called extended coupled cluster method, [17] which studies the so-called tailored coupled cluster method, [13] which studies the finite-dimensional CC equations using topological degree theory, and [18] which analyses the root structure of the CC equations using tools from algebraic geometry.

The aforementioned articles have two important features in common. First, they are concerned with the (local) analysis of the ‘ground-state’ zero of the coupled cluster function, i.e., with the zero \({\varvec{t}}^*_{\textrm{GS}}\) such that \(e^{{\mathcal {T}}^*_{\textrm{GS}}}\Psi _0= \Psi ^*_{\textrm{GS}}\). This of course makes sense since the vast majority of coupled cluster calculations are targeted at approximating the ground state energy of the electronic Hamiltonian.

Second and more importantly, the well-posedness analysis in all of the above articles is based on proving a local, strong monotonicity property of the coupled cluster function at \({\varvec{t}}^*_{\textrm{GS}}\). Taking the example of the article [43] whose notation closely aligns with ours, let the coupled cluster function \({\mathcal {f}} :{\mathbb {V}} \rightarrow {\mathbb {V}}^*\) be defined through Definition 22. Then it is shown in [43] that for \(\delta >0\) sufficiently small, there exists a constant \(\Gamma \) such that for all \({\varvec{w}}, {\varvec{s}}\in {\mathbb {B}}_{\delta }({\varvec{t}}^*_{\textrm{GS}}) \subset {\mathbb {V}}\) it holds that

$$\begin{aligned} \left\langle {\varvec{w}}- {\varvec{s}}, {\mathcal {f}}({\varvec{w}}) -{\mathcal {f}}({\varvec{s}})\right\rangle _{{\mathbb {V}} \times {\mathbb {V}}^*} \ge \Gamma \Vert {\varvec{w}}- {\varvec{s}}\Vert ^2_{{\mathbb {V}}}. \end{aligned}$$
(18)

If the constant \(\Gamma \) can be shown to be strictly positive, then the local monotonicity property (18) immediately yields local well-posedness of both the continuous coupled cluster equations and sufficiently rich Galerkin discretisations thereof (see Footnote 3). Quasi-optimal error estimates for the CC energy can then also be derived using the dual weighted residual approach developed by Rannacher et al. [2, Chapter 6].

The main drawback of the above approach is that the actual local monotonicity constant \(\Gamma \) derived from this analysis (see [43, Theorem 3.4]) is of the form:

$$\begin{aligned} \Gamma= & {} \omega \gamma - \left\| {\mathcal {T}}^*_{\textrm{GS}} - ({\mathcal {T}}^*_\textrm{GS})^{\dagger }\right\| _{\widehat{{\mathcal {H}}}^1\rightarrow \widehat{{\mathcal {H}}}^{1}}\; \left\| H- {\mathcal {E}}_\textrm{GS}^*\right\| _{\widehat{{\mathcal {H}}}^1\rightarrow \widehat{{\mathcal {H}}}^{-1}} - {\mathcal {O}}\left( \Vert {\varvec{t}}^*_\textrm{GS}\Vert ^2_{{\mathbb {V}}}\right) \end{aligned}$$
(19a)
$$\begin{aligned}\ge & {} \omega \gamma - \left( \beta +\beta ^{\dagger }\right) \left\| {\varvec{t}}^{*}_{\textrm{GS}}\right\| _{{\mathbb {V}}}\left\| H- {\mathcal {E}}_\textrm{GS}^*\right\| _{\widehat{{\mathcal {H}}}^1\rightarrow \widehat{{\mathcal {H}}}^{-1}} - {\mathcal {O}}\left( \Vert {\varvec{t}}^*_\textrm{GS}\Vert ^2_{{\mathbb {V}}}\right) \end{aligned}$$
(19b)

where \(\gamma >0\) denotes the coercivity constant of the shifted electronic Hamiltonian \(H- {\mathcal {E}}^*_{\textrm{GS}}\) on \(\{\Psi _\textrm{GS}^*\}^{\perp }\), the constant \(\omega \in (0, 1)\) is a prefactor depending on \(\Vert \Psi _0 - \Psi _{\textrm{GS}}^* \Vert _{\widehat{{\mathcal {L}}}^{\, 2}}\), and \(\beta , \beta ^{\dagger }\) are the continuity constants of the mappings \({\mathbb {V}}\ni {\varvec{t}}\mapsto {\mathcal {T}}:\widehat{{\mathcal {H}}}^1\rightarrow \widehat{{\mathcal {H}}}^{1}\) and \({\mathbb {V}}\ni {\varvec{t}}\mapsto {\mathcal {T}}^{\dagger }:\widehat{{\mathcal {H}}}^1\rightarrow \widehat{{\mathcal {H}}}^{1}\), respectively, as given in Theorem 19.

Consequently, the constant \(\Gamma \) is positive provided that \(\left\| {\varvec{t}}^{*}_{\textrm{GS}}\right\| _{{\mathbb {V}}}\) is small enough. However, according to the theoretical analysis in [42], the constants \(\beta , \beta ^{\dagger }\) grow combinatorially in the number of electrons N in the system, and thus as soon as \(N\approx 10\) or larger, the lower bound (19b) for the constant \(\Gamma \) is no longer positive. Similar issues arise in the local monotonicity constants derived in the other articles [17] and [31].

To make matters worse, even if we rely on the sharper Inequality (19a), numerical experiments involving small, relatively well-behaved molecules for which it is well-known (from numerical experience) that the coupled cluster method works well, reveal that (see Table 1 below)

$$\begin{aligned} \left\| {\mathcal {T}}^*_{\textrm{GS}} - ({\mathcal {T}}^*_\textrm{GS})^{\dagger }\right\| _{\widehat{{\mathcal {H}}}^1\rightarrow \widehat{{\mathcal {H}}}^{1}}\; \left\| H- {\mathcal {E}}_\textrm{GS}^*\right\| _{\widehat{{\mathcal {H}}}^1\rightarrow \widehat{{\mathcal {H}}}^{-1}} > \gamma . \end{aligned}$$

In other words, the assumption required to establish local strong monotonicity of the coupled cluster function, namely, smallness of the amplitude vector norm \(\left\| {\varvec{t}}^{*}_\textrm{GS}\right\| _{{\mathbb {V}}}\), seems restrictive and is not satisfied in many practical examples. As a consequence, obtaining quantitative a posteriori error estimates for the coupled cluster equations by this route appears difficult.

Table 1 Examples of numerically computed local monotonicity constants for a collection of small molecules at equilibrium geometries

We begin our analysis with the following proposition whose essence seems known (cf. [46, Theorem 4.16], [43, Lemma 3.1] and [12, Lemma 4.6]) but which has not been expressed in the current form in the existing literature.

Proposition 26

(Coupled Cluster Fréchet Derivative) Let the excitation index set \({\mathcal {I}}\) be defined through Definition 5, let the excitation operators \(\{{\mathcal {X}}_{\mu }\}_{\mu \in {\mathcal {I}}}\) be defined through Definition 6, and let the coupled cluster function \({\mathcal {f}} :{\mathbb {V}} \rightarrow {\mathbb {V}}^* \) be defined through Definition 22. Then,

  • For any \({\varvec{t}}=\{{\varvec{t}}_{\mu }\}_{\mu \in {\mathcal {I}}}\in {\mathbb {V}}\), the Fréchet derivative \(\textrm{D}{\mathcal {f}}({\varvec{t}}) :{\mathbb {V}}\rightarrow {\mathbb {V}}^*\) of the coupled cluster function \({\mathcal {f}} \) at \({\varvec{t}}\) is the mapping with the property that for all \({\varvec{s}}, {\varvec{w}}\in {\mathbb {V}}\) with \({\varvec{s}}=\{{\varvec{s}}_{\nu }\}_{\nu \in {\mathcal {I}}}\) and \({\varvec{w}}=\{{\varvec{w}}_{\mu }\}_{\mu \in {\mathcal {I}}}\) it holds that

    $$\begin{aligned} \left\langle {\varvec{w}}, \textrm{D}{\mathcal {f}}({\varvec{t}}) {\varvec{s}}\right\rangle _{{\mathbb {V}} \times {\mathbb {V}}^*} = \left\langle \sum _{\mu \in {\mathcal {I}}} {\varvec{w}}_{\mu } {\mathcal {X}}_\mu \Psi _0, e^{-{\mathcal {T}}}\left[ H, \sum _{\nu \in {\mathcal {I}}} {\varvec{s}}_{\nu } {\mathcal {X}}_\nu \right] e^{{\mathcal {T}}}\Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}, \end{aligned}$$
    (20)

    where \([\cdot , \cdot ]\) denotes the commutator and \({\mathcal {T}}:=\sum _{\mu \in {\mathcal {I}}} {\varvec{t}}_{\mu } {\mathcal {X}}_\mu \).

  • \({\mathcal {f}} :{\mathbb {V}} \rightarrow {\mathbb {V}}^*\) is a \({\mathscr {C}}^{\infty }\) mapping.

Proof

We start with the proof of the first assertion. This portion of the proof will proceed in two steps:

  • We will obtain an expression for the Gateaux derivative \(\textrm{D}{\mathcal {f}}({\varvec{t}}), ~{\varvec{t}}\in {\mathbb {V}}\) of the coupled cluster function \({\mathcal {f}}\), and we will show that this agrees with the expression offered by Eq. (20).

  • We will show that the Gateaux derivative is continuous as a function of \({\varvec{t}}\), i.e., the mapping \({\varvec{t}}\mapsto \textrm{D}{\mathcal {f}}({\varvec{t}}) :{\mathbb {V}} \rightarrow {\mathbb {V}}^*\) is continuous.

Let \({\varvec{t}}, {\varvec{s}}\in {\mathbb {V}}\) be arbitrary. Thanks to Remark 23, we observe that for any \(h \ge 0\) it holds that \({\mathcal {f}}({\varvec{t}}+ h {\varvec{s}}) \in {\mathbb {V}}^*\). It follows that for any \(h > 0\) and any \({\varvec{w}}\in {\mathbb {V}}\) we have that

$$\begin{aligned} \langle {\varvec{w}}, {\mathcal {f}}({\varvec{t}}+ h{\varvec{s}})- {\mathcal {f}}({\varvec{t}})\rangle _{{\mathbb {V}}\times {\mathbb {V}}^*}&= \left\langle \sum _{\mu \in {\mathcal {I}}}{\varvec{w}}_{\mu }{\mathcal {X}}_{\mu }\Psi _0, \left( e^{-{\mathcal {T}} -h {\mathcal {S}}}He^{{\mathcal {T}}+ h{\mathcal {S}}} -e^{-{\mathcal {T}}}He^{{\mathcal {T}}}\right) \Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}\\&= \left\langle \sum _{\mu \in {\mathcal {I}}}{\varvec{w}}_{\mu }{\mathcal {X}}_{\mu }\Psi _0, e^{-{\mathcal {T}}}\left( e^{-h{\mathcal {S}}}He^{h{\mathcal {S}}} -H\right) e^{{\mathcal {T}}}\Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}, \end{aligned}$$

where we have denoted \({\mathcal {S}}:= \sum _{\mu \in {\mathcal {I}}} {\varvec{s}}_{\mu } {\mathcal {X}}_\mu \) and we have used the fact that \({\mathcal {T}}\) and \({\mathcal {S}}\) commute (see the first assertion of Theorem 19).

As a consequence, using once again the commutativity of \({\mathcal {T}}\) and \({\mathcal {S}}\) together with the power series expansion of the exponential cluster operator, we deduce that

$$\begin{aligned} \lim _{h \rightarrow 0} \frac{\langle {\varvec{w}}, {\mathcal {f}}({\varvec{t}}+ h{\varvec{s}})- {\mathcal {f}}({\varvec{t}})\rangle _{{\mathbb {V}}\times {\mathbb {V}}^*}}{h}&= \left\langle \sum _{\mu \in {\mathcal {I}}}{\varvec{w}}_{\mu }{\mathcal {X}}_{\mu }\Psi _0, e^{-{\mathcal {T}}}\left( -{\mathcal {S}}H + H{\mathcal {S}}\right) e^{{\mathcal {T}}}\Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}\nonumber \\&=\left\langle \sum _{\mu \in {\mathcal {I}}}{\varvec{w}}_{\mu }{\mathcal {X}}_{\mu }\Psi _0, e^{-{\mathcal {T}}}\left[ H, \sum _{\nu \in {\mathcal {I}}} {\varvec{s}}_{\nu } {\mathcal {X}}_\nu \right] e^{{\mathcal {T}}}\Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}. \end{aligned}$$
(21)

In order to show that the expression offered by Eq. (21) defines the Gateaux derivative \(\textrm{D}{\mathcal {f}}({\varvec{t}}) :{\mathbb {V}} \rightarrow {\mathbb {V}}^*\), we must show that this operator is bounded. Recalling from Definition 11 the subspace \(\widetilde{{\mathcal {V}}} \subset \widehat{{\mathcal {H}}}^1\), defined as the orthogonal complement of \(\{\Psi _0\}\), let us therefore define \({\mathcal {A}}({\varvec{t}}) :\widetilde{{\mathcal {V}}} \rightarrow \widehat{{\mathcal {H}}}^{-1}\) as

$$\begin{aligned} \forall {\Phi } \in \widehat{{\mathcal {H}}}^{1}, ~~\forall {\mathcal {S}}\Psi _0 \in \widetilde{{\mathcal {V}}} ~ \text { with } {\mathcal {S}}&= \sum _{\mu }{{\varvec{s}}}_{\mu }{\mathcal {X}}_{\mu }:\quad \left\langle \Phi , {\mathcal {A}}({\varvec{t}}) {\mathcal {S}}\Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}\\&:=\left\langle \Phi , e^{-{\mathcal {T}}}\left[ H, {\mathcal {S}}\right] e^{{\mathcal {T}}}\Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}. \end{aligned}$$

We claim that \({\mathcal {A}}({\varvec{t}})\) defines a bounded linear operator. Indeed, a direct calculation shows that \(\forall {\Phi } \in \widehat{{\mathcal {H}}}^{1}\) and \(\forall {\mathcal {S}}\Psi _0 \in \widetilde{{\mathcal {V}}} ~ \text { with } {\mathcal {S}}= \sum _{\mu }{{\varvec{s}}}_{\mu }{\mathcal {X}}_{\mu }\) we have

$$\begin{aligned} \big \vert \left\langle \Phi , {\mathcal {A}}({\varvec{t}}) {\mathcal {S}}\Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}} \big \vert&= \left| \left\langle \Phi , e^{-{\mathcal {T}}}H{\mathcal {S}}e^{{\mathcal {T}}}\Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}} - \left\langle \Phi , e^{-{\mathcal {T}}}{\mathcal {S}}He^{{\mathcal {T}}}\Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}\right| \\&\le \left\| \Phi \right\| _{\widehat{{\mathcal {H}}}^{1}} \left\| e^{-{\mathcal {T}}}\right\| _{\widehat{{\mathcal {H}}}^{-1}\rightarrow \widehat{{\mathcal {H}}}^{-1}} \left\| H\right\| _{\widehat{{\mathcal {H}}}^{1}\rightarrow \widehat{{\mathcal {H}}}^{-1}}\left\| e^{{\mathcal {T}}}\right\| _{\widehat{{\mathcal {H}}}^{1}\rightarrow \widehat{{\mathcal {H}}}^{1}}\left\| {\mathcal {S}}\Psi _0\right\| _{\widehat{{\mathcal {H}}}^{1}} \\&\quad +\left\| {\mathcal {S}}^{\dagger }\Phi \right\| _{\widehat{{\mathcal {H}}}^{1}} \left\| e^{-{\mathcal {T}}}\right\| _{\widehat{{\mathcal {H}}}^{-1}\rightarrow \widehat{{\mathcal {H}}}^{-1}} \left\| H\right\| _{\widehat{{\mathcal {H}}}^{1}\rightarrow \widehat{{\mathcal {H}}}^{-1}}\left\| e^{{\mathcal {T}}}\Psi _0\right\| _{\widehat{{\mathcal {H}}}^{1}}, \end{aligned}$$

where we have used the fact that the cluster operators \(e^{{\mathcal {T}}}\) and \({\mathcal {S}}\) commute. Next, let us observe that by definition of the cluster operator \({\mathcal {S}}\) and the norm \(\Vert \cdot \Vert _{{\mathbb {V}}}\), it holds that \(\Vert {\mathcal {S}} \Psi _0 \Vert _{\widehat{{\mathcal {H}}}^{1}} = \Vert {\varvec{s}}\Vert _{{\mathbb {V}}}\). Consequently, recalling the continuity properties of cluster operators from Theorem 19 we deduce that

$$\begin{aligned} \left\| {\mathcal {S}}^{\dagger }\Phi \right\| _{\widehat{{\mathcal {H}}}^{1}} \le \beta ^{\dagger } \Vert {\varvec{s}}\Vert _{{\mathbb {V}}} \Vert \Phi \Vert _{\widehat{{\mathcal {H}}}^{1}}= \beta ^{\dagger } \Vert {\mathcal {S}}\Psi _0\Vert _{\widehat{{\mathcal {H}}}^{1}} \Vert \Phi \Vert _{\widehat{{\mathcal {H}}}^{1}}, \end{aligned}$$

where the constant \(\beta ^{\dagger } >0\) is independent of \({\mathcal {S}}\).

Collecting terms now shows that \({\mathcal {A}}({\varvec{t}}) :\widetilde{{\mathcal {V}}}\rightarrow \widehat{{\mathcal {H}}}^{-1}\) is indeed bounded, and therefore the Gateaux derivative \(\textrm{D}{\mathcal {f}}({\varvec{t}}) :{\mathbb {V}} \rightarrow {\mathbb {V}}^*\) is well-defined according to the expression offered by Eq. (21). Since \({\varvec{t}}\in {\mathbb {V}}\) was arbitrary, the coupled cluster function \({\mathcal {f}}\) is everywhere Gateaux differentiable.

It remains to prove that the Gateaux derivative \(\textrm{D}{\mathcal {f}}({\varvec{t}})\) is in fact a Fréchet derivative. To this end, it suffices to show that the mapping \({\mathbb {V}}\ni {\varvec{t}}\mapsto \textrm{D}{\mathcal {f}}({\varvec{t}}) :{\mathbb {V}} \rightarrow {\mathbb {V}}^*\) is continuous. To do so, let \(\{{\varvec{t}}_n\}_{n \in {\mathbb {N}}} \subset {\mathbb {V}}\) be a sequence that converges to \({\varvec{t}}\). It follows that

$$\begin{aligned}&\lim _{n \rightarrow \infty }\Vert \textrm{D}\mathcal {f}({\varvec{t}}) - \textrm{D}\mathcal {f}({\varvec{t}}_n)\Vert _{\mathbb {V} \rightarrow \mathbb {V}^*} = \lim _{n \rightarrow \infty } \sup _{\begin{array}{c} {\varvec{s}}\in \mathbb {V}\\ \Vert {\varvec{s}}\Vert _{\mathbb {V}} =1 \end{array}} \sup _{\begin{array}{c} {\varvec{w}}\in \mathbb {V}\\ \Vert {\varvec{w}}\Vert _{\mathbb {V}} =1 \end{array}} \vert \left\langle {\varvec{w}}, \textrm{D}\mathcal {f}({\varvec{t}}){\varvec{s}}- \textrm{D}\mathcal {f}({\varvec{t}}_n){\varvec{s}}\right\rangle _{\mathbb {V}\times \mathbb {V}^*} \vert \\&\quad = \lim _{n \rightarrow \infty } \sup _{\begin{array}{c} {\varvec{s}}\in \mathbb {V}\\ \Vert {\varvec{s}}\Vert _{\mathbb {V}} =1 \end{array}} \sup _{\begin{array}{c} {\varvec{w}}\in \mathbb {V}\\ \Vert {\varvec{w}}\Vert _{\mathbb {V}} =1 \end{array}}\left| \left\langle \sum _{\mu \in \mathcal {I}} {\varvec{w}}_{\mu } \mathcal {X}_\mu \Psi _0, e^{-{\mathcal {T}}}\left[ H, \sum _{\nu \in \mathcal {I}} {\varvec{s}}_{\nu } \mathcal {X}_\nu \right] e^{\mathcal {T}}\Psi _0\right. \right. \\&\qquad \left. \left. -e^{-\mathcal {T}_n}\left[ H, \sum _{\nu \in \mathcal {I}} {\varvec{s}}_{\nu } \mathcal {X}_\nu \right] e^{\mathcal {T}_n}\Psi _0\right\rangle _{\widehat{\mathcal {H}}^1 \times \widehat{\mathcal {H}}^{-1}}\right| , \end{aligned}$$

where for all \(n \in {\mathbb {N}}\) we denote \({\mathcal {T}}_n:= \sum _{\mu \in {\mathcal {I}}} ({\varvec{t}}_n)_{\mu } {\mathcal {X}}_\mu \). Adding and subtracting suitable terms yields the inequality

$$\begin{aligned}&\lim _{n \rightarrow \infty }\Vert \textrm{D}{\mathcal {f}}({\varvec{t}}) - \textrm{D}{\mathcal {f}}({\varvec{t}}_n)\Vert _{{\mathbb {V}} \rightarrow {\mathbb {V}}^*} \\&\quad \le \lim _{n \rightarrow \infty } \sup _{\begin{array}{c} {\varvec{s}}\in {\mathbb {V}}\\ \Vert {\varvec{s}}\Vert _{{\mathbb {V}}} =1 \end{array}} \sup _{\begin{array}{c} {\varvec{w}}\in {\mathbb {V}}\\ \Vert {\varvec{w}}\Vert _{{\mathbb {V}}} =1 \end{array}}\left| \left\langle \sum _{\mu \in {\mathcal {I}}} {\varvec{w}}_{\mu } {\mathcal {X}}_\mu \Psi _0, \left( e^{-{\mathcal {T}}} - e^{-{\mathcal {T}}_n}\right) \left[ H, \sum _{\nu \in {\mathcal {I}}} {\varvec{s}}_{\nu } {\mathcal {X}}_\nu \right] e^{{\mathcal {T}}}\Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}\right| \\&\qquad +\lim _{n \rightarrow \infty } \sup _{\begin{array}{c} {\varvec{s}}\in {\mathbb {V}}\\ \Vert {\varvec{s}}\Vert _{{\mathbb {V}}} =1 \end{array}} \sup _{\begin{array}{c} {\varvec{w}}\in {\mathbb {V}}\\ \Vert {\varvec{w}}\Vert _{{\mathbb {V}}} =1 \end{array}}\left| \left\langle \sum _{\mu \in {\mathcal {I}}} {\varvec{w}}_{\mu } {\mathcal {X}}_\mu \Psi _0, e^{-{\mathcal {T}}_n}\left[ H, \sum _{\nu \in {\mathcal {I}}} {\varvec{s}}_{\nu } {\mathcal {X}}_\nu \right] \left( e^{{\mathcal {T}}}-e^{{\mathcal {T}}_n}\right) \Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}\right| \\&\quad \le \lim _{n \rightarrow \infty } \sup _{\begin{array}{c} {\varvec{s}}\in {\mathbb {V}}\\ \Vert {\varvec{s}}\Vert _{{\mathbb {V}}} =1 \end{array}} \left\| \left( e^{-{\mathcal {T}}} - e^{-{\mathcal {T}}_n}\right) \left[ H, \sum _{\nu \in {\mathcal {I}}} {\varvec{s}}_{\nu } {\mathcal {X}}_\nu \right] e^{{\mathcal {T}}}\Psi _0\right\| _{\widehat{{\mathcal {H}}}^{-1} }\\&\qquad +\lim _{n \rightarrow \infty } \sup _{\begin{array}{c} {\varvec{s}}\in {\mathbb {V}}\\ \Vert {\varvec{s}}\Vert _{{\mathbb {V}}} =1 \end{array}} \left\| e^{-{\mathcal {T}}_n}\left[ H, \sum _{\nu \in {\mathcal {I}}} {\varvec{s}}_{\nu } {\mathcal {X}}_\nu \right] \left( e^{{\mathcal {T}}}-e^{{\mathcal {T}}_n}\right) \Psi _0\right\| _{\widehat{{\mathcal {H}}}^{-1}}, \end{aligned}$$

where we have used the fact that \(\Vert {\varvec{w}}\Vert _{{\mathbb {V}}}= \left\| \sum _{\mu \in {\mathcal {I}}} {\varvec{w}}_{\mu } {\mathcal {X}}_{\mu } \Psi _0\right\| _{\widehat{{\mathcal {H}}}^1}\) by definition.

We can now use the fact that the exponential cluster operator is a locally \({\mathscr {C}}^{\infty }\) mapping on the algebra of cluster operators (see Theorem 19) together with the boundedness properties of the Hamiltonian H and excitation operators to deduce that both of the above limits are zero. Thus, \(\lim _{n \rightarrow \infty }\Vert \textrm{D}{\mathcal {f}}({\varvec{t}}) - \textrm{D}{\mathcal {f}}({\varvec{t}}_n)\Vert _{{\mathbb {V}} \rightarrow {\mathbb {V}}^*}= 0,\) which shows that \(\textrm{D}{\mathcal {f}}({\varvec{t}}) :{\mathbb {V}} \rightarrow {\mathbb {V}}^*\) as defined through Eq. (20) is indeed the Fréchet derivative of the coupled cluster function \({\mathcal {f}} :{\mathbb {V}} \rightarrow {\mathbb {V}}^*\) at \({\varvec{t}}\in {\mathbb {V}}\).

In order to complete the proof of this proposition, we must demonstrate that the second assertion also holds, namely, that \({\mathcal {f}}\) is a \({\mathscr {C}}^{\infty }\) mapping from \({\mathbb {V}}\) to \({\mathbb {V}}^*\). To this end, it is sufficient to observe that higher order Gateaux derivatives of the coupled cluster function can be computed exactly as the first order Gateaux derivative given by Eq. (21), with the single commutator being replaced by nested commutators. The fact that these Gateaux derivatives are also Fréchet derivatives is deduced in an identical fashion by making use of the fact that the exponential cluster operator is a locally \({\mathscr {C}}^{\infty }\) map. This completes the proof. \(\square \)
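As a sanity check of the commutator formula (20), one may compare it against a finite-difference quotient in a small matrix model. The sketch below is only an illustration under assumptions that are not part of Proposition 26: the Hamiltonian is a random symmetric \(5 \times 5\) matrix, the reference is the unit vector \(e_0\), the excitation operators are the commuting rank-one matrices \(X_\mu = e_\mu e_0^{\top }\) (so that \(e^{T} = I + T\)), Euclidean inner products replace the duality pairings, and a central difference is used in place of the one-sided limit appearing in the proof.

```python
import numpy as np


def cluster_operator(t):
    """T = sum_mu t_mu X_mu with toy excitations X_mu = e_mu e_0^T."""
    dim = t.size + 1
    T = np.zeros((dim, dim))
    T[1:, 0] = t
    return T


def cc_function(H, t):
    """Toy CC function: components <e_mu, exp(-T) H exp(T) e_0>, mu = 1, ..., n."""
    T = cluster_operator(t)
    identity = np.eye(H.shape[0])
    # T @ T = 0 in this model, so exp(T) = I + T and exp(-T) = I - T exactly.
    return ((identity - T) @ H @ (identity + T))[1:, 0]


def cc_derivative(H, t, s):
    """Toy version of Eq. (20): components of exp(-T) [H, S] exp(T) e_0."""
    T = cluster_operator(t)
    S = cluster_operator(s)
    identity = np.eye(H.shape[0])
    commutator = H @ S - S @ H
    return ((identity - T) @ commutator @ (identity + T))[1:, 0]


rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
H = 0.5 * (A + A.T)            # hypothetical symmetric toy Hamiltonian
t = rng.standard_normal(4)     # point at which the derivative is evaluated
s = rng.standard_normal(4)     # direction of differentiation

step = 1e-6
finite_difference = (cc_function(H, t + step * s) - cc_function(H, t - step * s)) / (2 * step)
commutator_formula = cc_derivative(H, t, s)
print("max deviation:", np.max(np.abs(finite_difference - commutator_formula)))
```

Because the toy coupled cluster function is a quadratic polynomial in the amplitudes, the central difference carries no truncation error and the deviation printed above reflects floating-point rounding only.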

Proposition 26 has a number of important consequences that we now state. For the first result, let us recall from Theorem 24 that every zero \({\varvec{t}}^* \in {\mathbb {V}}\) of the coupled cluster function is associated with an intermediately normalised eigenfunction \(\Psi ^* \in \widehat{{\mathcal {H}}}^1\) of the electronic Hamiltonian \(H :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^{-1}\) defined through Eq. (2).

Corollary 27

(Coupled Cluster Fréchet Derivative at Zeros of the Coupled Cluster Function)

Let the excitation index set \({\mathcal {I}}\) be defined through Definition 5, let the excitation operators \(\{{\mathcal {X}}_{\mu }\}_{\mu \in {\mathcal {I}}}\) be defined through Definition 6, let the coupled cluster function \({\mathcal {f}} :{\mathbb {V}} \rightarrow {\mathbb {V}}^* \) be defined through Definition 22, for any \({\varvec{t}}\in {\mathbb {V}}\) let \(\textrm{D}{\mathcal {f}}({\varvec{t}})\) denote the Fréchet derivative of the coupled cluster function as defined through Eq. (20), and let \({\varvec{t}}^*= \{{\varvec{t}}^*_{\mu }\}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}}\) denote a zero of the CC function that generates the intermediately normalised eigenfunction \(\Psi ^* \in \widehat{{\mathcal {H}}}^1\) of the electronic Hamiltonian with corresponding eigenvalue \({\mathcal {E}}^*\). Then for all \({\varvec{s}}, {\varvec{w}}\in {\mathbb {V}} \) with \({\varvec{s}}=\{{\varvec{s}}_{\nu }\}_{\nu \in {\mathcal {I}}}\) and \( {\varvec{w}}=\{{\varvec{w}}_{\mu }\}_{\mu \in {\mathcal {I}}}\), it holds that

$$\begin{aligned} \left\langle {\varvec{w}}, \textrm{D}{\mathcal {f}}({\varvec{t}}^*) {\varvec{s}}\right\rangle _{{\mathbb {V}} \times {\mathbb {V}}^*}= & {} \left\langle \sum _{\mu \in {\mathcal {I}}} {\varvec{w}}_{\mu } {\mathcal {X}}_\mu \Psi _0, e^{-{\mathcal {T}}^*} \left( H - {\mathcal {E}}^*\right) e^{{\mathcal {T}}^*} \sum _{\nu \in {\mathcal {I}}} {\varvec{s}}_{\nu } {\mathcal {X}}_\nu \Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}\nonumber \\{} & {} \quad ~ \text { where } ~ {\mathcal {T}}^*:=\sum _{\mu \in {\mathcal {I}}} {\varvec{t}}^*_{\mu } {\mathcal {X}}_\mu . \end{aligned}$$
(22)

Proof

The proof follows by a direct calculation from Eq. (20) by expanding the commutator, making use of the fact that \(H\Psi ^* = {\mathcal {E}}^* \Psi ^*= {\mathcal {E}}^* e^{{\mathcal {T}}^*}\Psi _0\) by definition together with the commutativity of the cluster operators \({\mathcal {T}}^*\) and \({\mathcal {S}}= \sum _{\mu } {\varvec{s}}_{\mu }{\mathcal {X}}_{\mu }\). \(\square \)
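
For the reader's convenience, the direct calculation can be sketched as follows. Since Eq. (20) is not restated here, we assume for this sketch that it expresses the Fréchet derivative through the commutator \(e^{-{\mathcal {T}}}[H, {\mathcal {S}}]e^{{\mathcal {T}}}\Psi _0\) with \({\mathcal {S}}= \sum _{\nu \in {\mathcal {I}}} {\varvec{s}}_{\nu }{\mathcal {X}}_{\nu }\). Under this assumption, evaluating at \({\varvec{t}}={\varvec{t}}^*\) yields

$$\begin{aligned} e^{-{\mathcal {T}}^*}[H, {\mathcal {S}}]e^{{\mathcal {T}}^*}\Psi _0&= e^{-{\mathcal {T}}^*} H {\mathcal {S}} e^{{\mathcal {T}}^*}\Psi _0 - e^{-{\mathcal {T}}^*} {\mathcal {S}} H e^{{\mathcal {T}}^*}\Psi _0\\&= e^{-{\mathcal {T}}^*} H e^{{\mathcal {T}}^*} {\mathcal {S}} \Psi _0 - {\mathcal {E}}^* e^{-{\mathcal {T}}^*} {\mathcal {S}} e^{{\mathcal {T}}^*}\Psi _0\\&= e^{-{\mathcal {T}}^*} \left( H - {\mathcal {E}}^*\right) e^{{\mathcal {T}}^*} {\mathcal {S}} \Psi _0. \end{aligned}$$

Here the second equality uses the commutativity of \({\mathcal {S}}\) and \(e^{{\mathcal {T}}^*}\) in the first term and the eigenvalue relation \(He^{{\mathcal {T}}^*}\Psi _0 = {\mathcal {E}}^* e^{{\mathcal {T}}^*}\Psi _0\) in the second term, while the third equality uses the commutativity once more to write \({\mathcal {S}} e^{{\mathcal {T}}^*}\Psi _0= e^{{\mathcal {T}}^*}{\mathcal {S}}\Psi _0\). Testing against \(\sum _{\mu \in {\mathcal {I}}} {\varvec{w}}_{\mu } {\mathcal {X}}_\mu \Psi _0\) then gives the right-hand side of Eq. (22).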

Consider the setting of Corollary 27. Let us remark here that, thanks to Theorem 24, the eigenvalue \({\mathcal {E}}^*\) which appears in (22) coincides with the CC energy \({\mathcal {E}}^*_{\textrm{CC}}\) generated by \({\varvec{t}}^*\) through Eq. (14). Therefore, when considering expressions of the form (22) involving the CC Fréchet derivative, we may refer to \({\mathcal {E}}^*\) as simply the coupled cluster energy associated with \({\varvec{t}}^*\) without reference to the underlying eigenpair of the electronic Hamiltonian.

Corollary 28

(Local Lipschitz Continuity of Coupled Cluster Fréchet Derivative) Let the coupled cluster function \({\mathcal {f}} :{\mathbb {V}} \rightarrow {\mathbb {V}}^* \) be defined through Definition 22, and for any \({\varvec{t}}\in {\mathbb {V}}\) let \(\textrm{D}{\mathcal {f}}({\varvec{t}})\) denote the Fréchet derivative of the coupled cluster function as defined through Eq. (20). Then the mapping \({\mathbb {V}} \ni {\varvec{t}}\mapsto \textrm{D}{\mathcal {f}}({\varvec{t}}) :{\mathbb {V}} \rightarrow {\mathbb {V}}^*\) is Lipschitz continuous for bounded arguments, i.e., for any \({\varvec{t}}\in {\mathbb {V}}\) and any \(\delta > 0\), there exists a constant \(\textrm{L}_{{\varvec{t}}}(\delta ) >0\) such that

$$\begin{aligned} \sup _{{\varvec{t}}\ne {\varvec{s}}\in {\textrm{B}_{\delta }({\varvec{t}})}}\frac{\Vert \textrm{D}{\mathcal {f}}({\varvec{t}}) - \textrm{D}{\mathcal {f}}({\varvec{s}})\Vert _{{\mathbb {V}} \rightarrow {\mathbb {V}}^*}}{\Vert {\varvec{t}}- {\varvec{s}}\Vert _{{\mathbb {V}}}} := \textrm{L}_{{\varvec{t}}}(\delta ) < \infty . \end{aligned}$$

Corollary 28 follows immediately from the regularity of the coupled cluster function.

Having obtained an expression for the first Fréchet derivative \(\textrm{D}{\mathcal {f}}({\varvec{t}}), ~{\varvec{t}}\in {\mathbb {V}}\) of the coupled cluster function and studied some regularity properties of the mapping \({\mathbb {V}} \ni {\varvec{t}}\mapsto \textrm{D}{\mathcal {f}}({\varvec{t}})\), the next step in our analysis will be to study the invertibility of the Fréchet derivative \(\textrm{D}{\mathcal {f}}\) at any zero \({\varvec{t}}^* \in {\mathbb {V}}\) of the coupled cluster function. In order to proceed with this analysis, let us first notice that thanks to the expression offered by Eq. (22) in Corollary 27, the coupled cluster Fréchet derivative at any zero \({\varvec{t}}^*\in {\mathbb {V}}\) can be described in terms of an operator acting on a subspace of the infinite-dimensional N-particle space \(\widehat{{\mathcal {H}}}^1\). This observation motivates us to introduce the following operator acting on the space \(\widetilde{{\mathcal {V}}}=\{\Psi _0\}^{\perp } \subset \widehat{{\mathcal {H}}}^1\) (recall Definition 11).

Definition 29

(Operator Induced by Coupled Cluster Fréchet Derivative at \({\varvec{t}}^*\))

Let the excitation index set \({\mathcal {I}}\) be defined through Definition 5, let the excitation operators \(\{{\mathcal {X}}_{\mu }\}_{\mu \in {\mathcal {I}}}\) be defined through Definition 6, let \({\varvec{t}}^*=\{{\varvec{t}}^*_{\mu }\}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}}\) be any zero of the coupled cluster function defined through Definition 22, let \({\mathcal {E}}^*\) be the associated coupled cluster energy calculated through (14), and let the space \(\widetilde{{\mathcal {V}}} \subset \widehat{{\mathcal {H}}}^1\) be defined as in Definition 11. We define the operator \({\mathcal {A}}({\varvec{t}}^*) :\widetilde{{\mathcal {V}}}\rightarrow \widehat{{\mathcal {H}}}^{-1}\) as the mapping with the property that

$$\begin{aligned} \forall {\widetilde{\Psi }}\in \widetilde{{\mathcal {V}}}:\qquad {\mathcal {A}}({\varvec{t}}^*) {\widetilde{\Psi }}:= e^{-{\mathcal {T}}^*}\left( H- {\mathcal {E}}^*\right) e^{{\mathcal {T}}^*}{\widetilde{\Psi }}\qquad \text {where} \quad {\mathcal {T}}^*= \sum _{\mu \in {\mathcal {I}}} {\varvec{t}}_{\mu }^* {\mathcal {X}}_{\mu }.\nonumber \\ \end{aligned}$$
(23)

Notation 30

Let \({\varvec{t}}^* \in {\mathbb {V}}\) be any zero of the coupled cluster function defined through Definition 22 and let \({\mathcal {E}}^*\) be the associated coupled cluster energy calculated through Eq. (14).

  • We denote by \(\alpha _{{\varvec{t}}^*} > 0\) the constant defined as

    $$\begin{aligned} \alpha _{{\varvec{t}}^*}:= \Vert {\mathcal {A}}({\varvec{t}}^*) \Vert _{\widetilde{{\mathcal {V}}}\rightarrow \widetilde{{\mathcal {V}}}^*}:= \sup _{0\ne {\widetilde{\Phi }} \in \widetilde{{\mathcal {V}}}}\; \sup _{0\ne {\widetilde{\Psi }} \in \widetilde{{\mathcal {V}}}} \frac{\langle {\widetilde{\Phi }}, {\mathcal {A}}({\varvec{t}}^*) {\widetilde{\Psi }} \rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}} }{\Vert {\widetilde{\Phi }}\Vert _{\widehat{{\mathcal {H}}}^{1}} \Vert {\widetilde{\Psi }}\Vert _{\widehat{{\mathcal {H}}}^{1}}}, \end{aligned}$$

    with the existence of \(\alpha _{{\varvec{t}}^*}\) being guaranteed by Proposition 26.

  • For any \({\varvec{t}}\in {\mathbb {V}}\) we denote by \(\textrm{L}_{{\varvec{t}}} :{\mathbb {R}}_+ \rightarrow {\mathbb {R}}_+\) the so-called ‘Lipschitz continuity function’, defined as

    $$\begin{aligned} \forall \delta > 0 :\quad \textrm{L}_{{\varvec{t}}}(\delta ):= \sup _{{\varvec{t}}\ne {\varvec{s}}\in {\textrm{B}_{\delta }({\varvec{t}})}}\frac{\Vert \textrm{D}{\mathcal {f}}({\varvec{t}}) - \textrm{D}{\mathcal {f}}({\varvec{s}})\Vert _{{\mathbb {V}} \rightarrow {\mathbb {V}}^*}}{\Vert {\varvec{t}}- {\varvec{s}}\Vert _{{\mathbb {V}}}}, \end{aligned}$$

    with the existence of the function \(\textrm{L}_{{\varvec{t}}}\) being guaranteed by Corollary 28.

  • We denote by \({\mathcal {A}}({\varvec{t}}^*)^{\dagger } :\widetilde{{\mathcal {V}}} \rightarrow \widehat{{\mathcal {H}}}^{-1}\) the mapping with the property that for all \({\widetilde{\Psi }} \in \widetilde{{\mathcal {V}}}\) it holds that

    $$\begin{aligned} {\mathcal {A}}({\varvec{t}}^*)^\dagger {\widetilde{\Psi }}:= e^{({\mathcal {T}}^*)^{\dagger }}\left( H- {\mathcal {E}}^*\right) e^{-({\mathcal {T}}^*)^{\dagger }}{\widetilde{\Psi }}, \end{aligned}$$
    (24)

    and we emphasise that for all \({\widetilde{\Psi }}, {\widetilde{\Phi }} \in \widetilde{{\mathcal {V}}}\) it holds that

    $$\begin{aligned} \langle {\widetilde{\Phi }}, {\mathcal {A}}({\varvec{t}}^*)^{\dagger } {\widetilde{\Psi }} \rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}}= \langle {\mathcal {A}}({\varvec{t}}^*){\widetilde{\Phi }}, {\widetilde{\Psi }} \rangle _{\widehat{{\mathcal {H}}}^{-1} \times \widehat{{\mathcal {H}}}^{1}}, \end{aligned}$$

    so that in particular

    $$\begin{aligned} \Vert {\mathcal {A}}({\varvec{t}}^*)^{\dagger } \Vert _{\widetilde{{\mathcal {V}}}\rightarrow \widetilde{{\mathcal {V}}}^*} = \Vert {\mathcal {A}}({\varvec{t}}^*) \Vert _{\widetilde{{\mathcal {V}}}\rightarrow \widetilde{{\mathcal {V}}}^*}= \alpha _{{\varvec{t}}^*}. \end{aligned}$$

Consider now Definition 29 of the bounded linear operator \({\mathcal {A}}({\varvec{t}}^*) :\widetilde{{\mathcal {V}}} \rightarrow \widehat{{\mathcal {H}}}^{-1}\) for an arbitrary zero \({\varvec{t}}^*\in {\mathbb {V}}\) of the coupled cluster function. Since the coefficient space \({\mathbb {V}}\) inherits its inner product from the inner product on \(\widehat{{\mathcal {H}}}^1\), it immediately follows that

$$\begin{aligned} \textrm{D}{\mathcal {f}}({\varvec{t}}^*):{\mathbb {V}} \rightarrow {\mathbb {V}}^* \quad \text { is an isomorphism } \quad \iff \quad {\mathcal {A}}({\varvec{t}}^*) :\widetilde{{\mathcal {V}}} \rightarrow \widetilde{{\mathcal {V}}}^* \quad \text {is an isomorphism}. \end{aligned}$$

We claim that the mapping \({\mathcal {A}}({\varvec{t}}^*) :\widetilde{{\mathcal {V}}} \rightarrow \widetilde{{\mathcal {V}}}^*\) can indeed be shown to be an isomorphism provided that the zero \({\varvec{t}}^*\) is generated by an intermediately normalisable eigenfunction \(\Psi ^* \in \widehat{{\mathcal {H}}}^1\) of the electronic Hamiltonian that corresponds to a simple and isolated eigenvalue. The proof of this claim, which is the subject of the next theorem, is based on classical functional analysis arguments, and will proceed in the following steps: Assuming that the zero \({\varvec{t}}^*\) is generated by a non-degenerate, intermediately normalisable eigenfunction of the electronic Hamiltonian:

  1. We will first show that \({\mathcal {A}}({\varvec{t}}^*) :\widetilde{{\mathcal {V}}} \rightarrow \widetilde{{\mathcal {V}}}^*\) is injective. As a consequence of the Hahn-Banach theorem, we will deduce that the adjoint operator \({\mathcal {A}}({\varvec{t}}^*)^{\dagger } :\widetilde{{\mathcal {V}}} \rightarrow \widetilde{{\mathcal {V}}}^*\) has dense range.

  2. Next, we will show that the operator \({\mathcal {A}}({\varvec{t}}^*)^{\dagger } :\widetilde{{\mathcal {V}}} \rightarrow \widetilde{{\mathcal {V}}}^*\) is bounded below. This will imply that \({\mathcal {A}}({\varvec{t}}^*)^{\dagger } :\widetilde{{\mathcal {V}}} \rightarrow \widetilde{{\mathcal {V}}}^*\) is injective and has closed range.

Combining the above two steps will allow us to deduce that the adjoint operator \({\mathcal {A}}({\varvec{t}}^*)^{\dagger } :\widetilde{{\mathcal {V}}} \rightarrow \widetilde{{\mathcal {V}}}^*\) is an isomorphism, and therefore so too is the operator \({\mathcal {A}}({\varvec{t}}^*):\widetilde{{\mathcal {V}}} \rightarrow \widetilde{{\mathcal {V}}}^*\). Let us emphasise here that rather than attacking directly the operator \({\mathcal {A}}({\varvec{t}}^*)\) induced by the Fréchet derivative of the coupled cluster function at \({\varvec{t}}^* \in {\mathbb {V}}\), we are choosing to analyse its adjoint. This choice is motivated by practical reasons: there is a technical difficulty in proving directly the invertibility of \({\mathcal {A}}({\varvec{t}}^*)\) which is avoided if we study instead \({\mathcal {A}}({\varvec{t}}^*)^{\dagger }\).

Theorem 31

(Invertibility of Operator Induced by Coupled Cluster Fréchet Derivative at \({\varvec{t}}^*\)) Let \({\varvec{t}}^* =\{{\varvec{t}}^*_{\mu }\}_{\mu \in {\mathcal {I}}}\in {\mathbb {V}}\) be associated with a non-degenerate, intermediately normalisable eigenpair \(({\mathcal {E}}^*, \Psi ^*) \in {\mathbb {R}} \times \widehat{{\mathcal {H}}}^1\) of the electronic Hamiltonian \(H :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^{-1}\) defined through Eq. (2), i.e.,

$$\begin{aligned} H\Psi ^*&= {\mathcal {E}}^* \Psi ^*, \quad \text {with }~ {\mathcal {E}}^* ~\text { simple, isolated} \quad \text {and} \quad \Psi ^* = e^{{\mathcal {T}}^*}\Psi _0 \quad \text {where} \quad {\mathcal {T}}^* = \sum _{\mu \in {\mathcal {I}}}{\varvec{t}}_{\mu }^*{\mathcal {X}}_{\mu }. \end{aligned}$$

Then the operator \({\mathcal {A}}({\varvec{t}}^*) :\widetilde{{\mathcal {V}}} \rightarrow \widetilde{{\mathcal {V}}}^*\) defined through Definition 29 is an isomorphism.

Proof

The proof follows the aforementioned two steps. We begin with the injectivity of \({\mathcal {A}}({\varvec{t}}^*)\).

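Step 1: the operator \({\mathcal {A}}({\varvec{t}}^*) :\widetilde{{\mathcal {V}}} \rightarrow \widetilde{{\mathcal {V}}}^*\) is injective.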

Suppose there exists \(0 \ne {\widetilde{\Psi }} \in \widetilde{{\mathcal {V}}}\) such that \({\mathcal {A}}({\varvec{t}}^*){\widetilde{\Psi }}\equiv 0\) in \(\widetilde{{\mathcal {V}}}^*\), i.e., for all \({\widetilde{\Phi }} \in \widetilde{{\mathcal {V}}}\) it holds that

$$\begin{aligned} \left\langle {\widetilde{\Phi }}, {\mathcal {A}}({\varvec{t}}^*) {\widetilde{\Psi }} \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}=0. \end{aligned}$$
(25)

As a first step, we claim that from Eq. (25) it must follow that \({\mathcal {A}}({\varvec{t}}^*){\widetilde{\Psi }}\equiv 0\) in \(\widehat{{\mathcal {H}}}^{-1}\). Recalling the complementary decomposition of \(\widehat{{\mathcal {H}}}^1\) given by Definition 12, we see that it suffices to prove that

$$\begin{aligned} \left\langle \Psi _0, {\mathcal {A}}({\varvec{t}}^*) {\widetilde{\Psi }} \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}=0. \end{aligned}$$
(26)

Consider now the element \({\widehat{\Phi }} = e^{({\mathcal {T}}^*)^{\dagger }}e^{{\mathcal {T}}^*} \Psi _0 \in \widehat{{\mathcal {H}}}^1\) and recall that we denote by \({\mathbb {P}}_0 :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1\) the \(\widehat{{\mathcal {L}}}^{\, 2}\)-orthogonal projection operator onto \(\text {span} \{\Psi _0\}\) defined through Definition 12 and we have defined \({\mathbb {P}}^{\perp }_0:= {\mathbb {I}}-{\mathbb {P}}_0\). Clearly, we have that \({\mathbb {P}}_0 {\widehat{\Phi }} \ne 0\) since

$$\begin{aligned} ({\widehat{\Phi }}, \Psi _0)_{\widehat{{\mathcal {L}}}^{\, 2}}= \left( e^{({\mathcal {T}}^*)^{\dagger }}e^{{\mathcal {T}}^*} \Psi _0, \Psi _0\right) _{\widehat{{\mathcal {L}}}^{\, 2}}= \left( e^{{\mathcal {T}}^*} \Psi _0, e^{{\mathcal {T}}^*} \Psi _0\right) _{\widehat{{\mathcal {L}}}^{\, 2}}=\Vert \Psi ^*\Vert _{\widehat{{\mathcal {L}}}^{\, 2}}^2=:{\widehat{c}}_0 \ne 0. \end{aligned}$$
(27)

Since \(\Psi ^*= e^{{\mathcal {T}}^*} \Psi _0\in \widehat{{\mathcal {H}}}^1\) is by definition an eigenfunction of the electronic Hamiltonian with associated eigenvalue \({\mathcal {E}}^*\), a direct calculation also reveals that

$$\begin{aligned} \left\langle {\widehat{\Phi }}, {\mathcal {A}}({\varvec{t}}^*) {\widetilde{\Psi }} \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}&= \left\langle e^{({\mathcal {T}}^*)^{\dagger }}e^{{\mathcal {T}}^*} \Psi _0, e^{-{\mathcal {T}}^*}\left( H- {\mathcal {E}}^*\right) e^{{\mathcal {T}}^*}{\widetilde{\Psi }}\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}\nonumber \\&= \left\langle e^{{\mathcal {T}}^*} \Psi _0, \left( H- {\mathcal {E}}^*\right) e^{{\mathcal {T}}^*}{\widetilde{\Psi }}\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}\nonumber \\&=0. \end{aligned}$$
(28)

On the other hand, we also have

$$\begin{aligned} \left\langle {\widehat{\Phi }}, {\mathcal {A}}({\varvec{t}}^*) {\widetilde{\Psi }} \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}&= \left\langle {\mathbb {P}}_0{\widehat{\Phi }}, {\mathcal {A}}({\varvec{t}}^*) {\widetilde{\Psi }} \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}+\left\langle {\mathbb {P}}_0^{\perp }{\widehat{\Phi }}, {\mathcal {A}}({\varvec{t}}^*) {\widetilde{\Psi }} \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}\nonumber \\&=\left\langle {\mathbb {P}}_0{\widehat{\Phi }}, {\mathcal {A}}({\varvec{t}}^*) {\widetilde{\Psi }} \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}, \end{aligned}$$
(29)

where the second equality is due to the fact that \({\mathcal {A}}({\varvec{t}}^*){\widetilde{\Psi }}\equiv 0\) in \(\widetilde{{\mathcal {V}}}^*\) by assumption.

Combining therefore Eqs. (27)–(29), we deduce that

$$\begin{aligned} 0=\left\langle {\widehat{\Phi }}, {\mathcal {A}}({\varvec{t}}^*) {\widetilde{\Psi }} \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}=\left\langle {\mathbb {P}}_0{\widehat{\Phi }}, {\mathcal {A}}({\varvec{t}}^*) {\widetilde{\Psi }} \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}= {\widehat{c}}_0\left\langle \Psi _0, {\mathcal {A}}({\varvec{t}}^*) {\widetilde{\Psi }} \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}. \end{aligned}$$

Since \({\widehat{c}}_0\ne 0\), we immediately deduce that Eq. (26) holds and therefore \({\mathcal {A}}({\varvec{t}}^*){\widetilde{\Psi }}\equiv 0\) in \(\widehat{{\mathcal {H}}}^{-1}\) as claimed.

Since \(e^{-({\mathcal {T}}^*)^{\dagger }} :\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{1} \) is a bijection, we next deduce that for all \(\Phi \in \widehat{{\mathcal {H}}}^1\) it holds that

$$\begin{aligned} 0= \left\langle e^{({\mathcal {T}}^*)^{\dagger }}\Phi , {\mathcal {A}}({\varvec{t}}^*) {\widetilde{\Psi }} \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}=\left\langle {\Phi }, \left( H- {\mathcal {E}}^*\right) e^{{\mathcal {T}}^*}{\widetilde{\Psi }}\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}. \end{aligned}$$

The simplicity of the eigenvalue \({\mathcal {E}}^*\) now implies that we must have

$$\begin{aligned} e^{{\mathcal {T}}^*}{\widetilde{\Psi }} \in ~\text { span}\{\Psi ^{*}\}. \end{aligned}$$

Using again the fact that \(\Psi ^* = e^{{\mathcal {T}}^*} \Psi _0\) and that \(e^{{\mathcal {T}}^*} :\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{1} \) is a bijection, we obtain the existence of some constant \({\widetilde{c}}_{0} \in {\mathbb {R}}\) such that

$$\begin{aligned} {\widetilde{\Psi }}= {\widetilde{c}}_{0}\Psi _0. \end{aligned}$$

Recall however that \({\widetilde{\Psi }} \in \widetilde{{\mathcal {V}}}= \{\Psi _0\}^{\perp }\) by assumption, and therefore we must have \({\widetilde{c}}_{0}=0\) and thus \({\widetilde{\Psi }}=0\). This completes the proof of the first step.

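Step 2: the adjoint operator \({\mathcal {A}}({\varvec{t}}^*)^{\dagger } :\widetilde{{\mathcal {V}}} \rightarrow \widetilde{{\mathcal {V}}}^*\) is bounded below.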

Let \({\widetilde{\Psi }}\in \widetilde{{\mathcal {V}}}\) be arbitrary. For any \(\Psi ^*_{\perp } \in \{\Psi ^{*}\}^{\perp } \subset \widehat{{\mathcal {H}}}^1\), i.e., any wave-function \(\Psi ^*_{\perp }\) that is \(\widehat{{\mathcal {L}}}^{\, 2}\)-orthogonal to the eigenfunction \(\Psi ^*\) with associated eigenvalue \({\mathcal {E}}^*\), we define the function

$$\begin{aligned} {\widetilde{\Phi }}_{\Psi _{\perp }}:= {\mathbb {P}}_0^{\perp } e^{-{\mathcal {T}}^*} \Psi ^*_{\perp }\in \widetilde{{\mathcal {V}}}. \end{aligned}$$

It is straightforward to observe that for all such \({\widetilde{\Phi }}_{\Psi _{\perp }}\), it holds that

$$\begin{aligned} \begin{aligned} \left| \left\langle {\widetilde{\Phi }}_{\Psi _{\perp }}, {\mathcal {A}}({\varvec{t}}^*)^{\dagger } {\widetilde{\Psi }}\right\rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}}\right| = \Big \vert&\underbrace{\left\langle e^{-{\mathcal {T}}^*} \Psi ^*_{\perp }, e^{({\mathcal {T}}^*)^{\dagger }} \left( H- {\mathcal {E}}^*\right) e^{-({\mathcal {T}}^*)^{\dagger }} {\widetilde{\Psi }}\right\rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}}}_{:= \mathrm {(I)}}\\&\quad -\underbrace{\left\langle {\mathbb {P}}_0 e^{-{\mathcal {T}}^*} \Psi ^*_{\perp }, e^{({\mathcal {T}}^*)^{\dagger }} \left( H- {\mathcal {E}}^*\right) e^{-({\mathcal {T}}^*)^{\dagger }} {\widetilde{\Psi }}\right\rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}}}_{:= \mathrm {(II)}} \Big \vert . \end{aligned} \end{aligned}$$
(30)

We claim that the term (II) is identically zero for any choice of \(\Psi ^*_{\perp }\). To this end, observe that

$$\begin{aligned} {\mathbb {P}}_0 e^{-{\mathcal {T}}^*} \Psi ^*_{\perp }= \left( e^{-{\mathcal {T}}^*} \Psi ^*_{\perp }, \Psi _0\right) _{\widehat{{\mathcal {L}}}^{\, 2}}\Psi _0= \left( \Psi ^*_{\perp }, \Psi _0\right) _{\widehat{{\mathcal {L}}}^{\, 2}}\Psi _0= {\mathbb {P}}_0\Psi ^*_{\perp }. \end{aligned}$$

Here the second equality uses the fact that the de-excitation operator \(({\mathcal {T}}^*)^{\dagger }\) annihilates the reference determinant \(\Psi _0\), so that \(e^{-({\mathcal {T}}^*)^{\dagger }}\Psi _0 = \Psi _0\).

We therefore deduce that

$$\begin{aligned} \mathrm {(II)}&=\left\langle {\mathbb {P}}_0 \Psi ^*_{\perp }, e^{({\mathcal {T}}^*)^{\dagger }} \left( H- {\mathcal {E}}^*\right) e^{-({\mathcal {T}}^*)^{\dagger }} {\widetilde{\Psi }}\right\rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}}\\&= \left( \Psi _0, \Psi ^*_{\perp } \right) _{\widehat{{\mathcal {L}}}^{2} }\; \left\langle \Psi _0, e^{({\mathcal {T}}^*)^{\dagger }} \left( H- {\mathcal {E}}^*\right) e^{-({\mathcal {T}}^*)^{\dagger }} {\widetilde{\Psi }} \right\rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}}, \end{aligned}$$

where we have used the fact that \({\mathbb {P}}_0 \Psi ^*_{\perp } =\left( \Psi _0, \Psi ^*_{\perp } \right) _{\widehat{{\mathcal {L}}}^{2} } \Psi _0\).

Notice however that the second term in the product above satisfies

$$\begin{aligned} \left\langle \Psi _0, e^{({\mathcal {T}}^*)^{\dagger }} \left( H- {\mathcal {E}}^*\right) e^{-({\mathcal {T}}^*)^{\dagger }} {\widetilde{\Psi }}\right\rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}} = \left\langle \left( H- {\mathcal {E}}^*\right) e^{{\mathcal {T}}^*} \Psi _0, e^{-({\mathcal {T}}^*)^{\dagger }}{\widetilde{\Psi }}\right\rangle _{\widehat{{\mathcal {H}}}^{-1} \times \widehat{{\mathcal {H}}}^{1}} =0, \end{aligned}$$
(31)

where the last step follows from the fact that \(He^{{\mathcal {T}}^*} \Psi _0 = H\Psi ^*= {\mathcal {E}}^*\Psi ^*\) by assumption. Thus, the term (II) is identically zero for any choice of \(\Psi ^*_{\perp } \in \{\Psi ^{*}\}^{\perp } \subset \widehat{{\mathcal {H}}}^1\) as claimed, and we need only estimate the term (I).

An easy simplification reveals that

$$\begin{aligned} (\mathrm I)=\left\langle \Psi ^*_{\perp }, \left( H- {\mathcal {E}}^*\right) e^{-({\mathcal {T}}^*)^{\dagger }} {\widetilde{\Psi }}\right\rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}}. \end{aligned}$$
(32)

Thanks to the ellipticity of the electronic Hamiltonian \(H :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^{-1}\) and the simplicity of the eigenvalue \({\mathcal {E}}^*\), it is easy to deduce that the shifted Hamiltonian \(H - {\mathcal {E}}^* :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^{-1}\) satisfies an inf-sup condition on \(\{\Psi ^{*}\}^{\perp } \subset \widehat{{\mathcal {H}}}^1\) (see also Remark 34 for a detailed argument). In order to make use of this result and bound the term (I), we need only show that \(\Psi ^*_{\perp }\) and \( e^{-({\mathcal {T}}^*)^{\dagger }} {\widetilde{\Psi }}\) are both elements of \(\{\Psi ^*\}^{\perp }\). The former inclusion is true by definition of \(\Psi ^*_{\perp }\), and as for the latter, we see that

$$\begin{aligned} \left( \Psi ^*, e^{-({\mathcal {T}}^*)^{\dagger }} {\widetilde{\Psi }}\right) _{\widehat{{\mathcal {L}}}^{\, 2}} = \left( e^{{\mathcal {T}}^*}\Psi _0, e^{-({\mathcal {T}}^*)^{\dagger }} {\widetilde{\Psi }}\right) _{\widehat{{\mathcal {L}}}^{\, 2}}= \left( \Psi _0,{\widetilde{\Psi }}\right) _{\widehat{{\mathcal {L}}}^{\, 2}}= 0, \end{aligned}$$

where we have used the fact that \({\widetilde{\Psi }} \in \widetilde{{\mathcal {V}}}= \{\Psi _0\}^{\perp }\) by definition.

We can therefore deduce from Eq. (32) that

$$\begin{aligned} \sup _{\Psi ^*_{\perp } \in \{\Psi ^{*}\}^{\perp }} \frac{\left| \left\langle \Psi ^*_{\perp }, \left( H-{\mathcal {E}}^*\right) e^{-({\mathcal {T}}^*)^{\dagger }} {\widetilde{\Psi }} \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}} \right| }{\Vert \Psi ^*_{\perp }\Vert _{\widehat{{\mathcal {H}}}^1 } }\ge \gamma \left\| e^{-({\mathcal {T}}^*)^{\dagger }} {\widetilde{\Psi }}\right\| _{\widehat{{\mathcal {H}}}^{1}}, \end{aligned}$$
(33)

where \(\gamma > 0\) is the inf-sup constant of the shifted Hamiltonian \(H-{\mathcal {E}}^*\) on \(\{\Psi ^{*}\}^{\perp } \subset \widehat{{\mathcal {H}}}^1\).

Recalling now that \({\widetilde{\Psi }}\in \widetilde{{\mathcal {V}}}\) was arbitrary and combining the estimates (31)–(33) with Eq. (30) we obtain that for all \({\widetilde{\Psi }} \in \widetilde{{\mathcal {V}}}\) it holds that

$$\begin{aligned} \Vert {\mathcal {A}}({\varvec{t}}^*)^{\dagger } {\widetilde{\Psi }}\Vert _{\widetilde{{\mathcal {V}}}^*}&= \sup _{0\ne {\widetilde{\Phi }} \in \widetilde{{\mathcal {V}}}} \frac{\big \vert \left\langle {\widetilde{\Phi }}, {\mathcal {A}}({\varvec{t}}^*)^{\dagger }{\widetilde{\Psi }}\right\rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}} \big \vert }{\Vert {\widetilde{\Phi }} \Vert _{\widehat{{\mathcal {H}}}^{1}} }\\&\ge \sup _{0\ne {\Psi }_{\perp }^* \in \{\Psi ^{*}\}^{\perp }}\frac{\big \vert \left\langle {\widetilde{\Phi }}_{\Psi _{\perp }}, {\mathcal {A}}({\varvec{t}}^*)^{\dagger }{\widetilde{\Psi }}\right\rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}}\big \vert }{\Vert {\widetilde{\Phi }}_{\Psi _{\perp }} \Vert _{\widehat{{\mathcal {H}}}^{1}} }\\&= \sup _{0\ne {\Psi }^*_{\perp } \in \{\Psi ^{*}\}^{\perp }} \frac{\left| \left\langle \Psi ^*_{\perp }, \left( H-{\mathcal {E}}^*\right) e^{-({\mathcal {T}}^*)^{\dagger }} {\widetilde{\Psi }} \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}} \right| }{\Vert {\mathbb {P}}_0^{\perp } e^{-{\mathcal {T}}^*} \Psi ^*_{\perp }\Vert _{\widehat{{\mathcal {H}}}^1 } }\\&\ge \frac{1}{\Vert {\mathbb {P}}_0^{\perp } e^{-{\mathcal {T}}^*}\Vert _{\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1}}\; \sup _{0\ne \Psi ^*_{\perp } \in \{\Psi ^{*}\}^{\perp }} \frac{\left| \left\langle \Psi ^*_{\perp }, \left( H-{\mathcal {E}}^*\right) e^{-({\mathcal {T}}^*)^{\dagger }} {\widetilde{\Psi }}\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}} \right| }{\Vert \Psi ^*_{\perp }\Vert _{\widehat{{\mathcal {H}}}^1 } }\\&\ge \frac{\gamma }{\Vert {\mathbb {P}}_0^{\perp } e^{-{\mathcal {T}}^*}\Vert _{\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1}}\; \left\| e^{-({\mathcal {T}}^*)^{\dagger }} {\widetilde{\Psi }}\right\| _{\widehat{{\mathcal {H}}}^{1}}\\&\ge \frac{\gamma }{\Vert {\mathbb {P}}_0^{\perp } e^{-{\mathcal {T}}^*}\Vert _{\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1} \Vert e^{({\mathcal {T}}^*)^{\dagger }} \Vert _{\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1} }\; \big \Vert {\widetilde{\Psi }}\big \Vert _{\widehat{{\mathcal {H}}}^{1}}, \end{aligned}$$

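where the final step follows from the fact that \(e^{-({\mathcal {T}}^*)^{\dagger }} :\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{1}\) is a bijection with bounded inverse \(e^{({\mathcal {T}}^*)^{\dagger }}\), so that \(\Vert {\widetilde{\Psi }}\Vert _{\widehat{{\mathcal {H}}}^{1}} \le \Vert e^{({\mathcal {T}}^*)^{\dagger }}\Vert _{\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1} \Vert e^{-({\mathcal {T}}^*)^{\dagger }}{\widetilde{\Psi }}\Vert _{\widehat{{\mathcal {H}}}^{1}}\). Defining now the constant \(\Theta \in (0, \infty )\) as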

$$\begin{aligned} \Theta := \Vert e^{({\mathcal {T}}^*)^{\dagger }}\Vert _{\widehat{{\mathcal {H}}}^{1}\rightarrow \widehat{{\mathcal {H}}}^{1}} \Vert {\mathbb {P}}_0^{\perp } e^{-{\mathcal {T}}^*}\Vert _{\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{1}}, \end{aligned}$$
(34)

and recalling that \({\widetilde{\Psi }} \in \widetilde{{\mathcal {V}}}\) was arbitrary, we deduce that

$$\begin{aligned} \forall {\widetilde{\Psi }} \in \widetilde{{\mathcal {V}}} :\quad \Vert {\mathcal {A}}({\varvec{t}}^*)^{\dagger } {\widetilde{\Psi }}\Vert _{\widetilde{{\mathcal {V}}}^*} \ge \frac{\gamma }{\Theta } \Vert {\widetilde{\Psi }}\Vert _{\widehat{{\mathcal {H}}}^{1}}, \end{aligned}$$

which completes the proof of the second step.

Combining the conclusions of Step 1 and Step 2 we deduce that the adjoint operator \({\mathcal {A}}({\varvec{t}}^*)^{\dagger } :\widetilde{{\mathcal {V}}} \rightarrow \widetilde{{\mathcal {V}}}^*\) is an isomorphism, and from this it follows that the operator \({\mathcal {A}}({\varvec{t}}^*) :\widetilde{{\mathcal {V}}} \rightarrow \widetilde{{\mathcal {V}}}^*\) is also an isomorphism. \(\square \)

Equipped with Theorem 31 and recalling the discussion following Notation 30, we immediately obtain the desired invertibility result for the coupled cluster Fréchet derivative at any zero \({\varvec{t}}^* \in {\mathbb {V}}\) of the coupled cluster function that is associated with a non-degenerate, intermediately normalised eigenfunction \(\Psi ^* \in \widehat{{\mathcal {H}}}^1\) of the electronic Hamiltonian.

Corollary 32

(Invertibility of the Coupled Cluster Fréchet Derivative at \({\varvec{t}}^*\)) Let the coupled cluster function \({\mathcal {f}} :{\mathbb {V}} \rightarrow {\mathbb {V}}^* \) be defined through Definition 22, for any \({\varvec{t}}\in {\mathbb {V}}\) let \(\textrm{D}{\mathcal {f}}({\varvec{t}})\) denote the Fréchet derivative of the coupled cluster function as defined through Eq. (20), let \({\varvec{t}}^* \in {\mathbb {V}}\) denote a zero of the coupled cluster function corresponding to an intermediately normalised eigenfunction \(\Psi ^* \in \widehat{{\mathcal {H}}}^1\) of the electronic Hamiltonian \(H:\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^{-1}\) with simple, isolated eigenvalue \({\mathcal {E}}^*\), let \(\gamma >0\) denote the inf-sup constant of the shifted Hamiltonian \(H-{\mathcal {E}}^*\) on \( \{\Psi ^*\}^{\perp } \subset \widehat{{\mathcal {H}}}^1\), and let \(\Theta > 0\) denote the constant defined through Eq. (34). Then \(\textrm{D}{\mathcal {f}}({\varvec{t}}^*) :{\mathbb {V}} \rightarrow {\mathbb {V}}^*\) is an isomorphism and it holds that

$$\begin{aligned} \Vert \textrm{D}{\mathcal {f}}({\varvec{t}}^*)^{-1}\Vert _{{\mathbb {V}}^* \rightarrow {\mathbb {V}}} \le \frac{\Theta }{\gamma }. \end{aligned}$$
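
The bound of Corollary 32 can be observed numerically in a finite-dimensional toy analogue. The following Python sketch is an illustration only and is not part of the analysis. It assumes that NumPy is available, replaces \(\widehat{{\mathcal {H}}}^1\) by \({\mathbb {R}}^n\) with the Euclidean norm, takes a random symmetric matrix in place of \(H\) and the first unit vector in place of \(\Psi _0\), and uses the hypothetical excitation operators \({\mathcal {X}}_{\mu }= e_{\mu }e_0^{\top }\), so that \({\mathcal {T}}^2=0\) and \(e^{\pm {\mathcal {T}}}= {\mathbb {I}}\pm {\mathcal {T}}\). In this setting the inf-sup constant \(\gamma \) reduces to the spectral gap around \({\mathcal {E}}^*\). All function names are our own.

```python
import numpy as np


def toy_zero(n=6, seed=0):
    """Toy model (assumption): symmetric H on R^n, reference e_0, and the
    cluster operator T* generated by one intermediately normalised eigenvector."""
    rng = np.random.default_rng(seed)
    M = rng.standard_normal((n, n))
    H = (M + M.T) / 2.0                       # symmetric stand-in for H
    evals, evecs = np.linalg.eigh(H)
    k = int(np.argmax(np.abs(evecs[0])))      # eigenvector with largest overlap with e_0
    psi = evecs[:, k] / evecs[0, k]           # intermediate normalisation: psi[0] = 1
    e0 = np.zeros(n)
    e0[0] = 1.0
    T = np.outer(psi - e0, e0)                # T* = sum_mu t*_mu e_mu e_0^T, (T*)^2 = 0
    # Euclidean inf-sup constant of H - E* on {psi}^perp: the spectral gap
    gamma = np.min(np.abs(np.delete(evals, k) - evals[k]))
    return H, evals[k], T, gamma


def invertibility_bound(n=6, seed=0):
    """Return (||A(t*)^{-1}||, Theta / gamma) in the toy model."""
    H, E, T, gamma = toy_zero(n, seed)
    I = np.eye(n)
    P_perp = I.copy()
    P_perp[0, 0] = 0.0                        # projection onto {e_0}^perp
    A_full = (I - T) @ (H - E * I) @ (I + T)  # e^{-T*} (H - E*) e^{T*}
    A = A_full[1:, 1:]                        # acting on, and tested against, {e_0}^perp
    theta = np.linalg.norm(I + T.T, 2) * np.linalg.norm(P_perp @ (I - T), 2)
    inv_norm = np.linalg.norm(np.linalg.inv(A), 2)
    return inv_norm, theta / gamma


if __name__ == "__main__":
    inv_norm, bound = invertibility_bound()
    print(f"||A^-1|| = {inv_norm:.4f} <= Theta/gamma = {bound:.4f}")
```

The restriction `A_full[1:, 1:]` plays the role of \({\mathcal {A}}({\varvec{t}}^*) :\widetilde{{\mathcal {V}}} \rightarrow \widetilde{{\mathcal {V}}}^*\), and `theta` mirrors the constant \(\Theta \) of Eq. (34) with spectral norms in place of the \(\widehat{{\mathcal {H}}}^1\) operator norms.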

Having completed our study of the coupled cluster Fréchet derivative, we are now finally ready to state the main result of this section, namely the local well-posedness of the single reference coupled cluster equations. As mentioned at the beginning of this section, we will do so by appealing to a classical result from non-linear numerical analysis.

Theorem 33

(Local Uniqueness of the Coupled Cluster Solution \({\varvec{t}}^*\)) Let the coupled cluster function \({\mathcal {f}} :{\mathbb {V}} \rightarrow {\mathbb {V}}^* \) be defined through Definition 22, let \({\varvec{t}}^* \in {\mathbb {V}}\) denote a zero of the coupled cluster function corresponding to an intermediately normalised eigenfunction \(\Psi ^* \in \widehat{{\mathcal {H}}}^1\) of the electronic Hamiltonian \(H :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^{-1}\) with simple, isolated eigenvalue \({\mathcal {E}}^*\), let \(\gamma >0\) denote the inf-sup constant of the shifted Hamiltonian \(H-{\mathcal {E}}^*\) on \(\{\Psi ^{*}\}^{\perp } \subset \widehat{{\mathcal {H}}}^1\), let \(\Theta > 0\) denote the constant defined through Eq. (34), let the continuity constant \(\alpha _{{\varvec{t}}^*} >0\) and the Lipschitz continuity function \(\textrm{L}_{{\varvec{t}}^*} :{\mathbb {R}}_+ \rightarrow {\mathbb {R}}_+\) be defined according to Notation 30, and define the constant

$$\begin{aligned} \textrm{R}:= \min _{\delta >0} \left\{ \delta , ~\frac{\gamma }{\textrm{L}_{{\varvec{t}}^*}(\delta ) \Theta },~ 2\frac{\alpha _{{\varvec{t}}^*}}{\textrm{L}_{{\varvec{t}}^*}(\delta )} \right\} . \end{aligned}$$

Then \({\mathcal {f}}\big (\textrm{B}_{\textrm{R}}({\varvec{t}}^*)\big )\) is an open subset of \({\mathbb {V}}^*\), the restriction of \({\mathcal {f}}\) to \(\textrm{B}_{\textrm{R}}({\varvec{t}}^*)\) is a diffeomorphism, and for all \( {\varvec{s}}\in \textrm{B}_{\textrm{R}}({\varvec{t}}^*)\) we have the error estimate

$$\begin{aligned} \frac{1}{2} \frac{1}{\alpha _{{\varvec{t}}^*} } \Vert {\mathcal {f}}({\varvec{s}}) \Vert _{{\mathbb {V}}^*} \le \Vert {\varvec{t}}^* - {\varvec{s}}\Vert _{{\mathbb {V}}} \le 2 \frac{\Theta }{\gamma } \Vert {\mathcal {f}}({\varvec{s}}) \Vert _{{\mathbb {V}}^*}. \end{aligned}$$
(35)

In particular, \({\varvec{t}}^*\) is the unique solution of the continuous coupled cluster equations (13) in the open ball \(\textrm{B}_{\textrm{R}}({\varvec{t}}^*)\).

Proof

The fact that the image under \({\mathcal {f}}\) of the open ball \(\textrm{B}_{\textrm{R}}({\varvec{t}}^*)\) is itself open and that \({\mathcal {f}}\) is a local diffeomorphism is a direct consequence of the inverse function theorem for Banach spaces (see, e.g., [34, Chapter 9]) while the error estimate is a direct application of [53, Proposition 2.1]. The fact that the assumptions of both results are indeed fulfilled by the coupled cluster function \({\mathcal {f}}\) is a consequence of Proposition 26 and Corollaries 28 and 32. \(\square \)
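The two-sided structure of the error estimate (35) can be illustrated on a finite-dimensional toy problem. The following is a minimal sketch, assuming a hypothetical map `f` on \({\mathbb {R}}^2\) with zero \({\varvec{t}}^* = (0,0)\) and Jacobian \(\textrm{diag}(2,-1)\) at that zero; the names `ALPHA` and `BETA` are illustrative stand-ins for \(\alpha _{{\varvec{t}}^*}\) and \(\Theta /\gamma \), and nothing here is the coupled cluster function itself.

```python
import math

def f(s):
    """Hypothetical nonlinear map R^2 -> R^2 whose only zero is t* = (0, 0)."""
    s0, s1 = s
    return (2.0 * s0 + 0.5 * s1 ** 2, -s1 + 0.5 * s0 * s1)

def norm(v):
    """Euclidean norm of a 2-vector."""
    return math.hypot(v[0], v[1])

# Df(t*) = diag(2, -1), so both operator norms are explicit in this toy model.
ALPHA = 2.0  # ||Df(t*)||, stand-in for the continuity constant alpha_{t*}
BETA = 1.0   # ||Df(t*)^{-1}||, stand-in for the bound Theta / gamma

def two_sided_estimate_holds(s):
    """Check (1 / (2 ALPHA)) ||f(s)|| <= ||t* - s|| <= 2 BETA ||f(s)||."""
    residual = norm(f(s))
    error = norm(s)  # t* = 0, so the error is simply ||s||
    return residual / (2.0 * ALPHA) <= error <= 2.0 * BETA * residual

def check_circle(radius, samples=360):
    """Test the estimate at equispaced points on a circle around t*."""
    points = (
        (radius * math.cos(2.0 * math.pi * k / samples),
         radius * math.sin(2.0 * math.pi * k / samples))
        for k in range(samples)
    )
    return all(two_sided_estimate_holds(p) for p in points)
```

In this sketch the estimate holds on small circles around the zero, whereas it fails at the far-away point `(2.0, 10.0)`, which reflects the purely local nature of a result of this type.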

Next, let us comment on the constants that appear in the error estimate offered by Theorem 33.

Remark 34

(Interpretation of the Constants Appearing in Error Estimate (35)) Consider the setting of Theorem 33. From the point of view of a posteriori error quantification, it is important to gain a better understanding of the constants \(\gamma >0\) and \(\Theta >0\).

Let us recall that \(\gamma >0\) is the inf-sup constant of the shifted Hamiltonian \(H-{\mathcal {E}}^*\) on \(\{\Psi ^{*}\}^{\perp } \subset \widehat{{\mathcal {H}}}^1\). A crude lower bound for this constant can be obtained through the following procedure.

We begin by noting that the shifted Hamiltonian \(H- {\mathcal {E}}^*_{\textrm{GS}}+1\) defines a coercive operator on \(\widehat{{\mathcal {H}}}^1\). Since the electronic Hamiltonian is additionally self-adjoint, we can introduce a new norm on \(\widehat{{\mathcal {H}}}^1\) by setting

$$\begin{aligned} \forall \Phi \in \widehat{{\mathcal {H}}}^1 :\qquad ||| \Phi |||^2_{\widehat{{\mathcal {H}}}^1} := \left\langle \Phi , \left( H- {\mathcal {E}}^*_{\textrm{GS}}+1\right) \Phi \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}, \end{aligned}$$

and it is clear that this new norm is equivalent to the canonical \(\Vert \cdot \Vert _{\widehat{{\mathcal {H}}}^1}\) norm, i.e.,

$$\begin{aligned} \exists c_{\textrm{equiv}}>1 ~ \text { such that }~\forall \Phi \in \widehat{{\mathcal {H}}}^1 :\quad \frac{1}{c_{\textrm{equiv}}} ||| \Phi |||_{\widehat{{\mathcal {H}}}^1} \le \Vert \Phi \Vert _{\widehat{{\mathcal {H}}}^1}\le c_{\textrm{equiv}} ||| \Phi |||_{\widehat{{\mathcal {H}}}^1}. \end{aligned}$$

In particular, the ellipticity of the electronic Hamiltonian given by Inequality (5) also holds with respect to the new \(|||~\cdot ~|||_{\widehat{{\mathcal {H}}}^1}\) norm and we have

$$\begin{aligned}&\forall \Phi \in \widehat{{\mathcal {H}}}^1 :\quad \left\langle \Phi , \left( H-{\mathcal {E}}^*\right) \Phi \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}\nonumber \\&\quad \ge \frac{1}{4 c_{\textrm{equiv}}} ||| \Phi |||^2_{\widehat{{\mathcal {H}}}^1} -\left( 9NZ^2 -{\mathcal {E}}^*-\frac{1}{4}\right) \Vert \Phi \Vert ^2_{\widehat{{\mathcal {L}}}^{\, 2}}. \end{aligned}$$
(36)

Moreover, the norm \(|||~\cdot ~|||_{\widehat{{\mathcal {H}}}^1}\) also induces a new dual norm \(|||~\cdot ~|||_{\widehat{{\mathcal {H}}}^{-1}}\) on the space \(\widehat{{\mathcal {H}}}^{-1}\), and this new norm is also equivalent to the canonical dual norm \(\Vert \cdot \Vert _{\widehat{{\mathcal {H}}}^{-1}}\).

Next, we claim that for any \(\Phi \in \{\Psi ^*\}^{\perp } \subset \widehat{{\mathcal {H}}}^1\) there exists \({\Phi }_{\textrm{flip}} \in \{\Psi ^*\}^{\perp }\) such that

$$\begin{aligned}&\left\langle {\Phi }_{\textrm{flip}}, \left( H-{\mathcal {E}}^*\right) \Phi \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}} \ge \Lambda ^* \Vert \Phi \Vert ^2_{\widehat{{\mathcal {L}}}^{\, 2}}, ~ \text { where } ~ \Lambda ^*\nonumber \\&\quad := \underset{\begin{array}{c} \lambda \in \sigma (H)\\ \lambda \ne {\mathcal {E}}^* \end{array} }{\inf }\vert \lambda - {\mathcal {E}}^*\vert >0 ~ \text { is the spectral gap at } {\mathcal {E}}^*. \end{aligned}$$
(37)

To see this, assume that \(({\mathcal {E}}^*, \Psi ^*)\) is the \(J^\textrm{th}\) eigenpair of the electronic Hamiltonian, ordered non-decreasingly and counting multiplicity. Then we can write any \(\Phi \in \{\Psi ^*\}^{\perp }\) in the form

$$\begin{aligned} \Phi = \sum _{\ell =1}^{J+1} {\mathbb {P}}_{\ell }\Phi + \Phi ^{\perp }, \end{aligned}$$

where each \({\mathbb {P}}_{\ell }:\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1\) denotes the \(\widehat{{\mathcal {L}}}^{\, 2}\)-orthogonal projector onto the span of the \(\ell ^{\textrm{th}}\) eigenfunction, \(\Phi ^{\perp }:= \Phi - \sum _{\ell =1}^{J+1} {\mathbb {P}}_{\ell }\Phi \), and we emphasise that \({\mathbb {P}}_{J}\Phi =0\) since \(\Phi \in \{\Psi ^*\}^{\perp }\). Consequently, for any \(\Phi \in \{\Psi ^*\}^{\perp }\), we may define \(\Phi _{\textrm{flip}} \in \{\Psi ^*\}^{\perp }\) as

$$\begin{aligned} \Phi _{\textrm{flip}}&:= -\sum _{\ell =1}^{J-1} {\mathbb {P}}_{\ell }\Phi + {\mathbb {P}}_{J+1}\Phi +\Phi ^{\perp },\\ \end{aligned}$$

and a direct calculation shows that

$$\begin{aligned} \left\langle \Phi _{\textrm{flip}}, \left( H-{\mathcal {E}}^*\right) \Phi \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}&\ge \min \left\{ {\mathcal {E}}^*-{\mathcal {E}}_{J-1}, {\mathcal {E}}_{J+1}-{\mathcal {E}}^* \right\} \Vert \Phi \Vert ^2_{\widehat{{\mathcal {L}}}^{\, 2}}=:\Lambda ^* \Vert \Phi \Vert ^2_{\widehat{{\mathcal {L}}}^{\, 2}}, \end{aligned}$$

where we have used \({\mathcal {E}}_{J-1}, {\mathcal {E}}_{J+1}\) to denote the \((J-1)^{\textrm{th}}\) and \((J+1)^{\textrm{th}}\) eigenvalues of the electronic Hamiltonian. The claim now readily follows. Additionally, it is readily verified that for any \(\Phi \in \{\Psi ^*\}^{\perp }\) with \(\Phi _{\textrm{flip}}\) constructed according to the above procedure, it holds that \(||| \Phi _{\textrm{flip}} |||_{\widehat{{\mathcal {H}}}^1} = ||| \Phi |||_{\widehat{{\mathcal {H}}}^1} \).
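The sign-flip construction can be checked numerically in a finite-dimensional analogue. The following is a minimal sketch, assuming a hypothetical self-adjoint operator with purely discrete spectrum that is represented in its own eigenbasis, so that vectors are lists of eigen-coefficients and the remainder \(\Phi ^{\perp }\) is absent; the helper names `flip`, `shifted_form` and `energy_norm_sq` are illustrative only.

```python
def flip(coeffs, J):
    """Reverse the sign of every eigen-component with index below J (0-based)."""
    return [-c if ell < J else c for ell, c in enumerate(coeffs)]

def shifted_form(psi, phi, eigvals, shift):
    """Evaluate <psi, (H - shift) phi> for H diagonal in its eigenbasis."""
    return sum(p * (lam - shift) * c for p, lam, c in zip(psi, eigvals, phi))

def energy_norm_sq(phi, eigvals):
    """Evaluate |||phi|||^2 = <phi, (H - E_GS + 1) phi>, with E_GS = eigvals[0]."""
    return shifted_form(phi, phi, eigvals, eigvals[0] - 1.0)

# Hypothetical spectrum in non-decreasing order; E* is the eigenvalue at index J.
eigvals = [-3.0, -1.5, -1.0, 0.5, 2.0]
J = 2
gap = min(eigvals[J] - eigvals[J - 1], eigvals[J + 1] - eigvals[J])

# A vector orthogonal to the J-th eigenfunction (its J-th coefficient vanishes).
phi = [2.0, -1.2, 0.0, 0.4, 0.1]
phi_flip = flip(phi, J)
l2_norm_sq = sum(c * c for c in phi)

unflipped = shifted_form(phi, phi, eigvals, eigvals[J])     # may be negative
flipped = shifted_form(phi_flip, phi, eigvals, eigvals[J])  # bounded below by gap * l2_norm_sq
```

For this choice of `phi` the unflipped quadratic form is negative, which is why testing against \(\Phi \) itself does not suffice below the target eigenvalue, while the flipped pairing satisfies the lower bound (37) and the flip leaves the energy norm unchanged.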

Defining now the constant \(q:= \dfrac{\Lambda ^*}{\Lambda ^*+ \left( 9NZ^2 -{\mathcal {E}}^*-\frac{1}{4}\right) } \in (0, 1)\) and combining the Estimates (36) and (37), we deduce that for all \(\Phi \in \{\Psi ^*\}^{\perp }\) it holds that

$$\begin{aligned}&\sup _{0\ne \Psi \in \{\Psi ^*\}^{\perp }} \frac{\left| \left\langle \Psi , (H-{\mathcal {E}}^*)\Phi \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}} \right| }{||| \Psi |||_{\widehat{{\mathcal {H}}}^1}}= q\sup _{0\ne \Psi \in \{\Psi ^*\}^{\perp }} \frac{\left| \left\langle \Psi , (H-{\mathcal {E}}^*)\Phi \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}} \right| }{||| \Psi |||_{\widehat{{\mathcal {H}}}^1}}\\&\qquad + (1-q) \sup _{0\ne \Psi \in \{\Psi ^*\}^{\perp }} \frac{\left| \left\langle \Psi , (H-{\mathcal {E}}^*)\Phi \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}} \right| }{||| \Psi |||_{\widehat{{\mathcal {H}}}^1}}\\&\quad \ge q\frac{\left| \left\langle \Phi , (H-{\mathcal {E}}^*)\Phi \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}} \right| }{||| \Phi |||_{\widehat{{\mathcal {H}}}^1}}+ (1-q) \frac{\left| \left\langle \Phi _{\textrm{flip}}, (H-{\mathcal {E}}^*)\Phi \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}} \right| }{||| \Phi _{\textrm{flip}} |||_{\widehat{{\mathcal {H}}}^1}}\\&\quad \ge q\frac{1}{||| \Phi |||_{\widehat{{\mathcal {H}}}^1}}\left( \frac{1}{4 \;c_{\textrm{equiv}}} ||| \Phi |||^2_{\widehat{{\mathcal {H}}}^1} - \left( 9NZ^2 -{\mathcal {E}}^*-\frac{1}{4}\right) \Vert \Phi \Vert ^2_{\widehat{{\mathcal {L}}}^{\, 2}}\right) \\&\qquad +(1-q) \frac{1}{||| \Phi |||_{\widehat{{\mathcal {H}}}^1}}\Lambda ^* \Vert \Phi \Vert ^2_{\widehat{{\mathcal {L}}}^{\, 2}}\\&\quad = q\frac{1}{4\; c_{\textrm{equiv}}} ||| \Phi |||_{\widehat{{\mathcal {H}}}^1}, \end{aligned}$$

where the cancellations in the last step occur due to the definition of \(q \in (0,1)\).

Recalling now the definition of the constant q, we see that the inf-sup constant \(\gamma \) is lower bounded by

$$\begin{aligned} \gamma \ge \dfrac{\Lambda ^*}{4\; c_{\textrm{equiv}}\left( \Lambda ^*+ 9NZ^2 -{\mathcal {E}}^*-\frac{1}{4} \right) }. \end{aligned}$$
(38)
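The role of the constant \(q\) and the behaviour of the lower bound (38) can be made concrete with a few lines of arithmetic. The following is a minimal sketch; the function names are illustrative, and the numerical values used for \(N\), \(Z\), \({\mathcal {E}}^*\), \(\Lambda ^*\) and \(c_{\textrm{equiv}}\) in any call are hypothetical inputs rather than data from a real molecule.

```python
def l2_constant(N, Z, E_star):
    """The constant 9 N Z^2 - E* - 1/4 multiplying the L^2 norm in (36)."""
    return 9.0 * N * Z ** 2 - E_star - 0.25

def q_constant(gap, N, Z, E_star):
    """q = gap / (gap + 9 N Z^2 - E* - 1/4), which lies in (0, 1)."""
    C = l2_constant(N, Z, E_star)
    if gap <= 0.0 or C <= 0.0:
        raise ValueError("need a positive gap and a positive L^2 constant")
    return gap / (gap + C)

def cancellation_residual(gap, N, Z, E_star):
    """Coefficient of the L^2 term after combining (36) and (37).

    It equals -q * C + (1 - q) * gap and vanishes by the choice of q.
    """
    q = q_constant(gap, N, Z, E_star)
    return -q * l2_constant(N, Z, E_star) + (1.0 - q) * gap

def inf_sup_lower_bound(gap, c_equiv, N, Z, E_star):
    """Right-hand side of (38): q / (4 c_equiv)."""
    return q_constant(gap, N, Z, E_star) / (4.0 * c_equiv)
```

Evaluating `inf_sup_lower_bound` for a decreasing sequence of gaps with all other arguments fixed shows the bound decreasing monotonically towards zero.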

Two important comments are now in order. First, we expect the lower bound (38) to be rather coarse because of the appearance of the norm equivalence constant \(c_{\textrm{equiv}}\). Note also that the \(\widehat{{\mathcal {H}}}^1\)-norm associated with this equivalence constant is given by

$$\begin{aligned} \forall \Phi \in \widehat{{\mathcal {H}}}^1 :\qquad ||| \Phi |||^2_{\widehat{{\mathcal {H}}}^1} = \left\langle \Phi , \left( H- {\mathcal {E}}^*_{\textrm{GS}}+1\right) \Phi \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}, \end{aligned}$$

i.e., as the norm induced by the shifted Hamiltonian \(H- {\mathcal {E}}^*_{\textrm{GS}}+1\), and there is a priori no reason that a better equivalence constant cannot be obtained for a differently shifted Hamiltonian, i.e., for the operator \(H- {\mathcal {E}}^*_\textrm{GS}+\alpha \) with \(\alpha >0\) arbitrary.

Second, we observe that if \(\Lambda ^*\), i.e., the spectral gap at \({\mathcal {E}}^*\), approaches zero, then the lower bound (38) that we have derived also approaches zero. In fact, the same is true for the inf-sup constant \(\gamma \), i.e., \(\Lambda ^* \rightarrow 0\) implies that \(\gamma \rightarrow 0\). To see this, assume for simplicity that \(\Lambda ^* ={\mathcal {E}}^*-\widetilde{{\mathcal {E}}}\) with \(\widetilde{{\mathcal {E}}}\) denoting the eigenvalue associated with some eigenfunction \({\widetilde{\Psi }}\ne \Psi ^* \in \widehat{{\mathcal {H}}}^1\) of the electronic Hamiltonian. It then follows that

$$\begin{aligned} \gamma&= \inf _{0\ne \Phi \in \{\Psi ^*\}^{\perp }} \sup _{0\ne \Psi \in \{\Psi ^*\}^{\perp }} \frac{\left| \left\langle \Psi , (H-{\mathcal {E}}^*)\Phi \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}} \right| }{||| \Psi |||_{\widehat{{\mathcal {H}}}^1} ||| \Phi |||_{\widehat{{\mathcal {H}}}^1}}\\&\le \sup _{0\ne \Psi \in \{\Psi ^*\}^{\perp }} \frac{\left| \left\langle \Psi , (H-{\mathcal {E}}^*){\widetilde{\Psi }} \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}} \right| }{||| \Psi |||_{\widehat{{\mathcal {H}}}^1} ||| {\widetilde{\Psi }} |||_{\widehat{{\mathcal {H}}}^1 }}\\&= \Lambda ^* \sup _{0\ne \Psi \in \{\Psi ^*\}^{\perp }} \frac{\left| \left\langle \Psi ,{\widetilde{\Psi }} \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}} \right| }{||| \Psi |||_{\widehat{{\mathcal {H}}}^1} ||| {\widetilde{\Psi }} |||_{\widehat{{\mathcal {H}}}^1 }}, \end{aligned}$$

from which we deduce that \(\Lambda ^* \rightarrow 0\) indeed implies \(\gamma \rightarrow 0\).
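The degradation of the inf-sup constant with the spectral gap can be observed directly in a finite-dimensional analogue. The following is a minimal sketch, assuming a hypothetical operator with purely discrete spectrum written in its eigenbasis. In that basis both the shifted form and the energy norm induced by \(H-{\mathcal {E}}^*_{\textrm{GS}}+1\) are diagonal, so the inf-sup constant reduces to the smallest ratio \(|\lambda _\ell - {\mathcal {E}}^*| / (\lambda _\ell - {\mathcal {E}}^*_{\textrm{GS}} + 1)\) over \(\ell \ne J\); this closed form is a property of the toy model and is not claimed for the electronic Hamiltonian.

```python
def spectral_gap(eigvals, J):
    """Distance from eigvals[J] to the rest of the (discrete) spectrum."""
    return min(abs(lam - eigvals[J])
               for ell, lam in enumerate(eigvals) if ell != J)

def inf_sup_constant(eigvals, J):
    """Inf-sup constant of H - eigvals[J] on the complement of the J-th mode.

    Measured in the energy norm induced by H - E_GS + 1, for a diagonal toy
    operator; each mode contributes |lam - E*| / (lam - E_GS + 1).
    """
    e_gs = min(eigvals)
    return min(abs(lam - eigvals[J]) / (lam - e_gs + 1.0)
               for ell, lam in enumerate(eigvals) if ell != J)

def toy_spectrum(gap):
    """Hypothetical spectrum whose eigenvalue at index 2 has the given gap."""
    return [-3.0, -1.0 - gap, -1.0, 1.0]

# Track the inf-sup constant as the neighbouring eigenvalue approaches E* = -1.
gaps = [1.0, 0.1, 0.01, 0.001]
gammas = [inf_sup_constant(toy_spectrum(g), 2) for g in gaps]
```

Since every weight \(\lambda _\ell - {\mathcal {E}}^*_{\textrm{GS}} + 1\) is at least one, the computed constant never exceeds the spectral gap in this sketch, and it tends to zero along the sequence `gaps`.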

An important consequence of this observation is that the residual-based CC error estimate (35) that we have derived in this article will degrade as the spectral gap degrades. In particular, it will not hold for gapless systems for which a more elaborate theory must be developed. A possible starting point for such a theory could be the Lyapunov-Schmidt construction (see, e.g., [5, Chapter V]) which is used in non-linear numerical analysis to study problems that cannot be analysed using the inverse function theorem. Of course, the applicability of this approach to the coupled cluster equations is an open question.

Coming now to the constant \(\Theta \), we see that it is simply the product of two operator norms involving the exponential cluster operator and its adjoint. Thanks to the continuity of the mapping \({\mathbb {V}} \ni {\varvec{t}}\mapsto e^{-{\mathcal {T}}({\varvec{t}})} :\widehat{{\mathcal {H}}}^1\rightarrow \widehat{{\mathcal {H}}}^1\) (and its adjoint), we deduce that these operator norms will be large when \(\Vert {\varvec{t}}\Vert _{{\mathbb {V}}}\) is large, and therefore the residual-based CC error estimate (35) is expected to degrade if \(\Vert {\varvec{t}}\Vert _{{\mathbb {V}}}\) is large.

We conclude this section by emphasising, in particular, that if the ground state energy of the electronic Hamiltonian \(H :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^{-1} \) is simple, and the chosen reference determinant \(\Psi _0\) is not orthogonal to the corresponding ground state wave-function, then the continuous coupled cluster equations (17) are locally well-posed, and we have access to the residual-based error estimates given by Theorem 33.

5 Well-posedness of the Full Coupled Cluster equations in a finite basis

Having understood the local well-posedness of the continuous coupled cluster function, the next step in our analysis is to study the discrete coupled cluster equations (15). Unfortunately, obtaining a local well-posedness result for an arbitrary choice of excitation subset or N-particle basis set is a highly non-trivial exercise. Indeed, as the subsequent exposition will show (see Lemma 45 and Theorem 46 below), our discrete local well-posedness analysis depends on being able to demonstrate that certain discrete inf-sup conditions hold and the establishment of these conditions for arbitrary discretisations is not obvious. For the purpose of this article therefore, we will limit ourselves to an analysis of the so-called Full-Coupled Cluster equations in a finite basis. The extension of our analysis to more general discretisations (the so-called truncated CC equations [24, Chapter 13]) will be addressed in a forthcoming contribution.

Throughout this section, we assume the settings of Sects. 2–4. In particular, we will frequently refer to the notions of Sect. 2.3. Let \(\{\psi _{j}\}_{j \in {\mathbb {N}}}\) denote an \(\textrm{L}^2({\mathbb {R}}^3; {\mathbb {C}})\)-orthonormal basis for \(\textrm{H}^1({\mathbb {R}}^3;{\mathbb {C}})\). For any \(K \in {\mathbb {N}}\), we define \({\mathcal {B}}_K:= \left\{ \psi _j \right\} _{j=1}^K\) and \(\textrm{X}_K:= \text {span } {\mathcal {B}}_K\).

Recall that we denote by \(N\in {\mathbb {N}}\) the number of electrons in the system under study. Our goal now is to use the sets \(\{{\mathcal {B}}_K\}_{K\in {\mathbb {N}}}\) to construct a sequence of finite-dimensional, nested subspaces of the antisymmetric tensor product space \(\widehat{{\mathcal {H}}}^1\) whose union is dense in \(\widehat{{\mathcal {H}}}^1\). To avoid tedious notation in this construction, we will always assume that K is a natural number such that \(K \ge N\). Proceeding now, exactly as in Sect. 2.3, we first introduce for each such K the index set \({\mathcal {J}}_{K}^N \subset \{1, \ldots , K\}^N\) given by

$$\begin{aligned} {\mathcal {J}}_{K}^N := \Big \{\varvec{\ell } = (\ell _1, \ell _2, \ldots , \ell _N)\in \{1, \ldots , K\}^N :\ell _1< \ell _2< \cdots < \ell _N\Big \}. \end{aligned}$$

Next, we define for each K the set of \(\widehat{{\mathcal {L}}}^{\, 2}\)-orthonormal, N-particle determinants \({\mathcal {B}}_K^{N} \subset \widehat{{\mathcal {H}}}^1\) as

$$\begin{aligned} {\mathcal {B}}^{N}_K := \left\{ \Psi _{\textbf{k}}(\textbf{x}_1, \textbf{x}_2, \ldots , \textbf{x}_N)=\frac{1}{\sqrt{N!}}\text { det} \big (\psi _{k_i}(\textbf{x}_j)\big )_{i, j=1}^N :\hspace{1mm} \textbf{k}=(k_1, k_2, \ldots , k_N) \in {\mathcal {J}}_{K}^N \right\} , \end{aligned}$$

and we denote, as usual, \(\Psi _0(\textbf{x}_1, \ldots , \textbf{x}_N):= \frac{1}{\sqrt{N!}}\text {det}\big (\psi _{i}(\textbf{x}_j)\big )_{i, j=1}^N\).

It now follows that we can define the sequence \(\{{\mathcal {V}}_K\}_{K \ge N}\) of subspaces of \(\widehat{{\mathcal {H}}}^1\) as \({\mathcal {V}}_K:= \text { span } {\mathcal {B}}^N_K\), and it holds that

$$\begin{aligned}&\forall ~K\ge N:\quad \text { dim }{\mathcal {V}}_K ={{K}\atopwithdelims (){K-N}}, \qquad \forall ~ K_2 > K_1 \ge N :\quad {\mathcal {V}}_{K_1} \subset {\mathcal {V}}_{K_2} \quad \text {and} \\&\qquad \overline{\bigcup _{\begin{array}{c} K\ge N \end{array}} {\mathcal {V}}_K}^{\Vert \cdot \Vert _{\widehat{{\mathcal {H}}}^1}}= \widehat{{\mathcal {H}}}^1. \end{aligned}$$
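The dimension count and the nestedness of the spaces \({\mathcal {V}}_K\) follow from the combinatorics of the index sets \({\mathcal {J}}_K^N\), which can be enumerated explicitly. The following is a minimal sketch in which each determinant is represented only by its strictly increasing index tuple; the function name `determinant_indices` is illustrative.

```python
from itertools import combinations
from math import comb

def determinant_indices(K, N):
    """Enumerate J_K^N: strictly increasing N-tuples with entries in {1, ..., K}."""
    if K < N:
        raise ValueError("the construction assumes K >= N")
    # combinations() yields tuples in sorted order without repeated entries,
    # which is exactly the constraint l_1 < l_2 < ... < l_N.
    return list(combinations(range(1, K + 1), N))

def dimension(K, N):
    """dim V_K, i.e. the number of N-particle determinants built from B_K."""
    return len(determinant_indices(K, N))

def is_nested(K1, K2, N):
    """Check that every determinant index for K1 also appears for K2 > K1."""
    return set(determinant_indices(K1, N)) <= set(determinant_indices(K2, N))
```

For instance, `dimension(6, 3)` agrees with both `comb(6, 3)` and `comb(6, 6 - 3)`, and `is_nested(4, 6, 3)` confirms the inclusion \({\mathcal {V}}_{4} \subset {\mathcal {V}}_{6}\) at the level of index tuples.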

Equipped with the sequence of finite-dimensional subspaces \(\{{\mathcal {V}}_K\}_{K \ge N}\) whose union is dense in \(\widehat{{\mathcal {H}}}^1\), our next task is to introduce a corresponding sequence of finite-dimensional coefficient spaces \(\{{\mathbb {V}}_K\}_{K \ge N}\) whose union is dense in the Hilbert space of sequences \({\mathbb {V}}\) that was introduced through Definition 14. To this end, we require some definitions.

Definition 35

(Excitation Index Sets For Finite Bases)

For each K and each \(j \in \{1, \ldots , N\}\) we define the index set \({\mathcal {I}}_j^K\) as

$$\begin{aligned} {\mathcal {I}}_j^K := \left\{ {{i_1, \ldots , i_j}\atopwithdelims (){\ell _1, \ldots , \ell _j}} :i_1< \cdots< i_j \in \{1, \ldots , N\} \text { and } \ell _1< \cdots < \ell _j \in \{N+1, \ldots , K\} \right\} , \end{aligned}$$

we set

$$\begin{aligned} {\mathcal {I}}^K:= \bigcup _{j=1}^N {\mathcal {I}}^K_j, \end{aligned}$$

and we emphasise that \(\underset{K \ge N}{\bigcup }\ {\mathcal {I}}^K ={\mathcal {I}}\), i.e., the global excitation index set defined through Definition 5.

Consider Definition 35 of the excitation index sets \({\mathcal {I}}_j^K, ~j \in \{1, \ldots , N\}\). Since each \({\mathcal {I}}_j^K\) is a subset of the global excitation index set \({\mathcal {I}}\) defined through Definition 5, it follows that we can define for any \(\mu \in {\mathcal {I}}_j^K\), excitation and de-excitation operators \({\mathcal {X}}_{\mu } :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1\) and \({\mathcal {X}}^{\dagger }_{\mu }:\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1\) through Definitions 6 and 7 respectively. Moreover, the results of Theorem 19 can be applied to these excitation and de-excitation operators, and the following remark summarises some additional properties of these elementary excitation and de-excitation operators.

Remark 36

(Properties of Excitation and De-excitation Operators Related to the Index Set \({\mathcal {I}}^K\)) Let the excitation index set \({\mathcal {I}}^K\) be defined according to Definition 35. Then the finite-dimensional N-particle basis \({\mathcal {B}}_K^N\) and the finite-dimensional N-particle approximation space \({\mathcal {V}}_K\) have the decomposition

$$\begin{aligned} {\mathcal {B}}_K^N:=&\left\{ \Psi _0\right\} \cup \{{\mathcal {X}}_{\mu } \Psi _0 :~ \mu \in {\mathcal {I}}^K\},\\ {\mathcal {V}}_K:=&\text { span}\left\{ \Psi _0\right\} \oplus \underbrace{\text { span}\{{\mathcal {X}}_{\mu } \Psi _0 :~ \mu \in {\mathcal {I}}^K\}}_{:= \widetilde{{\mathcal {V}}}_K}. \end{aligned}$$

Additionally, for any \(\mu , \nu \in {\mathcal {I}}^K\) and \(\sigma \in {\mathcal {I}} {\setminus } {\mathcal {I}}^K\) it holds that

$$\begin{aligned} {\mathcal {X}}_{\mu } {\mathcal {X}}_{\nu }\Psi _0 \in \widetilde{{\mathcal {V}}}_K \qquad&\text {and} \qquad {\mathcal {X}}_{\mu }^{\dagger } {\mathcal {X}}_{\nu }\Psi _0 \in \widetilde{{\mathcal {V}}}_K\\ {\mathcal {X}}_{\mu } {\mathcal {X}}_{\sigma }\Psi _0 \notin {\mathcal {B}}_K^N \qquad&\text {and} \qquad {\mathcal {X}}_{\mu }^{\dagger } {\mathcal {X}}_{\sigma }\Psi _0 \notin {\mathcal {B}}_K^N,\\ {\mathcal {X}}_{\sigma } {\mathcal {X}}_{\nu }\Psi _0 \notin {\mathcal {B}}_K^N \qquad&\text {and} \qquad {\mathcal {X}}_{\sigma }^{\dagger } {\mathcal {X}}_{\nu }\Psi _0 =0. \end{aligned}$$

Finally, as in Sect. 4 we will denote \(\Psi _{\mu }:= {\mathcal {X}}_{\mu } \Psi _0\) for any \(\mu \in {\mathcal {I}}^K\).
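The decomposition of \({\mathcal {B}}_K^N\) into the reference determinant and its excitations can be verified by enumerating the index sets of Definition 35. The following is a minimal sketch that tracks only which orbitals are occupied, so signs of determinants are deliberately ignored; the representation of an excitation index as a pair of tuples and the helper names are illustrative.

```python
from itertools import combinations
from math import comb

def excitation_indices(K, N, j):
    """Enumerate I_j^K as pairs (occupied indices i, virtual indices l)."""
    occupied = range(1, N + 1)
    virtual = range(N + 1, K + 1)
    return [(i, l)
            for i in combinations(occupied, j)
            for l in combinations(virtual, j)]

def all_excitations(K, N):
    """Enumerate I^K, the union of I_j^K over j = 1, ..., N."""
    return [mu for j in range(1, N + 1) for mu in excitation_indices(K, N, j)]

def excite_reference(mu, N):
    """Occupied orbitals of X_mu Psi_0, up to sign.

    The orbitals i_1, ..., i_j are removed from the reference occupation
    {1, ..., N} and replaced by l_1, ..., l_j.
    """
    removed, added = mu
    return tuple(sorted((set(range(1, N + 1)) - set(removed)) | set(added)))

def excited_determinants(K, N):
    """Occupation tuples of all X_mu Psi_0 with mu in I^K."""
    return [excite_reference(mu, N) for mu in all_excitations(K, N)]
```

In this sketch the excited occupations are pairwise distinct and, together with the reference occupation \((1, \ldots , N)\), they exhaust all \({K \atopwithdelims ()N}\) strictly increasing index tuples, which is the combinatorial content of the decomposition stated in Remark 36.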

Next we will introduce subspaces of coefficient vectors corresponding to the excitation index sets \(\left\{ {\mathcal {I}}^K\right\} _{K\ge N}\). The following construction is essentially an adaptation of Definition 14 of the sequence space \({\mathbb {V}}\) to finite dimensions.

Definition 37

(Finite-Dimensional Coefficient Spaces)

Let the excitation index set \({\mathcal {I}}^K\) be defined through Definition 35 for \(K\ge N\), and let the Hilbert space of sequences \({\mathbb {V}}\) be defined according to Definition 14. We define the Hilbert subspace of coefficients \({\mathbb {V}}_{K} \subset {\mathbb {V}}\) as the set

$$\begin{aligned} {\mathbb {V}}_K:= \left\{ \textbf{t}:= ({\varvec{t}}_{\mu })_{\mu \in {\mathcal {I}}} \in {\mathbb {V}}:\quad {\varvec{t}}_{\mu }=0 ~\forall \mu \notin {\mathcal {I}}^K \right\} , \end{aligned}$$
(39)

equipped with the \((\cdot , \cdot )_{{\mathbb {V}}}\) inner product.

Notation 38

Consider Definition 37 of the Hilbert subspace of coefficients \({\mathbb {V}}_K\), and let \({\varvec{t}}\in {\mathbb {V}}_K\) denote an arbitrary element. In the sequel, for clarity of exposition we will frequently denote \({\varvec{t}}:= {\varvec{t}}_K:= \{{\varvec{t}}_{\mu }\}_{\mu \in {\mathcal {I}}^K}\). In other words, by an abuse of notation, we will identify \({\mathbb {V}}_K\) with the set \(\ell ^2\left( {\mathcal {I}}^K\right) \) but equipped with the \((\cdot , \cdot )_{{\mathbb {V}}}\) inner product.

As can be expected, the coefficient subspaces \(\{{\mathbb {V}}_K\}_{K\ge N}\) introduced through Definition 37 inherit many properties from the N-particle approximation spaces \(\{{\mathcal {V}}_K\}_{K\ge N}\). Indeed, we have the following lemma.

Lemma 39

(Density of Finite-Dimensional Coefficient Spaces) Let the infinite-dimensional Hilbert space of sequences \({\mathbb {V}} \subset \ell ^2({\mathcal {I}})\) be defined through Definition 14 and let the Hilbert subspace of coefficients \({\mathbb {V}}_{{K}} \subset {\mathbb {V}}\) be defined through Definition 37 for \(K\ge N\). Then it holds that

$$\begin{aligned} \forall ~ K_2 > K_1 \ge N :\quad {\mathbb {V}}_{K_1} \subset {\mathbb {V}}_{K_2} \qquad \text {and} \qquad \overline{\bigcup _{\begin{array}{c} K\ge N \end{array}} {\mathbb {V}}_K}^{\Vert \cdot \Vert _{{\mathbb {V}}}}= {\mathbb {V}}. \end{aligned}$$

Proof

The set inclusion is obvious so we focus on proving the density. Note that the density result would also be obvious had the sequence space \({\mathbb {V}}\) been equipped with the \(\Vert \cdot \Vert _{\ell ^2}\) norm. The density is slightly subtle precisely because we have equipped \({\mathbb {V}}\) with the non-standard \(\Vert \cdot \Vert _{{\mathbb {V}}}\) norm.

Recall that the union of the N-particle approximation spaces \(\left\{ {\mathcal {V}}_K\right\} _{K\ge N}\) is dense in \(\widehat{{\mathcal {H}}}^1\). From this we deduce that the union of the N-particle approximation subspaces \(\{\widetilde{{\mathcal {V}}}_K\}_{K\ge N}\) is dense in \(\text {span}\{\Psi _0\}^{\perp }\), where we remind the reader that \(\widetilde{{\mathcal {V}}}_K= \text { span} \{\Psi _{\mu }:~ \mu \in {\mathcal {I}}^K\} = \{\Psi \in {\mathcal {V}}_K :(\Psi , \Psi _0)_{\widehat{{\mathcal {L}}}^{\, 2}}=0\}\).

Let now \({\varvec{s}}= \{{\varvec{s}}_{\mu }\}_{\mu \in {\mathcal {I}}} \in {\mathbb {V}}\) be arbitrary and let \(\Psi _{{\varvec{s}}}^*:= \sum _{\mu \in {\mathcal {I}}} {\varvec{s}}_\mu \Psi _{\mu } \in \text {span}\{\Psi _0\}^{\perp }\) denote the associated function. By the above density, there exists a sequence of functions \(\{\Psi _K\}_{K\ge N}\) with each \(\Psi _K:= \sum _{\mu \in {\mathcal {I}}^K} {\varvec{t}}^K_\mu \Psi _{\mu } \in \widetilde{{\mathcal {V}}}_K\) such that \(\lim _{K \rightarrow \infty }\Vert \Psi _K -\Psi _{{\varvec{s}}}^* \Vert _{\widehat{{\mathcal {H}}}^1}=0\). Defining for each \(K \ge N\), the sequence \({\varvec{t}}_K \in {\mathbb {V}}_K \) as \({\varvec{t}}_K = \{{\varvec{t}}^K_{\mu }\}_{\mu \in {\mathcal {I}}^K}\), and using the definition of the \(\Vert \cdot \Vert _{{\mathbb {V}}}\) norm now yields the required density. \(\square \)

We are now ready to state the discrete coupled cluster equations corresponding to the approximation spaces we have introduced above. As mentioned at the beginning of this section, these equations are known in the quantum chemical literature as the Full-Coupled Cluster equations in a finite basis.

Full-Coupled Cluster Equations in a Finite Basis:

Let the excitation index set \({\mathcal {I}}^K\) be defined through Definition 35 for \(K\ge N\), let the Hilbert subspace of coefficients \({\mathbb {V}}_K \subset {\mathbb {V}}\) be defined through Definition 37, and let the coupled cluster function \({\mathcal {f}}:{\mathbb {V}} \rightarrow {\mathbb {V}}^*\) be defined through Definition 22. We seek a coefficient vector \({\varvec{t}}_K \in {\mathbb {V}}_K\) such that for all coefficient vectors \({\varvec{s}}_K \in {\mathbb {V}}_K\) it holds that

$$\begin{aligned} \left\langle {\varvec{s}}_K, {\mathcal {f}}({\varvec{t}}_K)\right\rangle _{{\mathbb {V}} \times {\mathbb {V}}^*}=0. \end{aligned}$$
(40)

The remainder of this section will be concerned with the (local) well-posedness analysis of Eq. (40). We begin with a definition.

Definition 40

(Restricted Coupled Cluster function on Full-CI spaces)

Let the excitation index set \({\mathcal {I}}^K\) be defined through Definition 35 for \(K\ge N\) and let the Hilbert subspace of coefficients \({\mathbb {V}}_K \subset {\mathbb {V}}\) be defined through Definition 37. We define the restricted coupled cluster function \({\mathcal {f}}_K :{\mathbb {V}}_K\rightarrow {\mathbb {V}}^*_K\) as the mapping with the property that for all \({\varvec{t}}_K, {\varvec{s}}_K \in {\mathbb {V}}_K\) it holds that

$$\begin{aligned} \langle {\varvec{s}}_K, {\mathcal {f}}_{K}({\varvec{t}}_K)\rangle _{{\mathbb {V}}_K \times {\mathbb {V}}_K^*} := \left\langle {\varvec{s}}_K, {\mathcal {f}}({\varvec{t}}_K)\right\rangle _{{\mathbb {V}}\times {\mathbb {V}}^*}. \end{aligned}$$

It is readily seen that solutions \({\varvec{t}}_K^* \in {\mathbb {V}}_K\) to the Full-CC equations in a finite basis (40) are nothing else than zeros of the restricted coupled cluster function \({\mathcal {f}}_K :{\mathbb {V}}_K \rightarrow {\mathbb {V}}_K^*\) defined through Definition 40. The following result, whose proof can, for instance, be found in [46], is essentially a finite-dimensional analogue of Theorem 24 and establishes a relationship between these zeros of the restricted coupled cluster function and intermediately normalised eigenfunctions of the Full-CI Hamiltonian \(H_{K} :{\mathcal {V}}_K \rightarrow {\mathcal {V}}_K^*\) defined through Eq.  (7).

Theorem 41

(Relation between Restricted Coupled Cluster Zeros and Full-CI Eigenfunctions) Let the restricted coupled cluster function \({\mathcal {f}}_K :{\mathbb {V}}_K \rightarrow {\mathbb {V}}_K^*\) be defined through Definition 40, and let the Full-CI Hamiltonian \(H_{K} :{\mathcal {V}}_K \rightarrow {\mathcal {V}}_K^*\) be defined through Eq. (7). Then

  (1) For any zero \({\varvec{t}}_K^* = \{{\varvec{t}}_{\mu }^*\}_{\mu \in {\mathcal {I}}^K}\in {\mathbb {V}}_K\) of the restricted CC function, the function \(\Psi _K^*=e^{{\mathcal {T}}_K^*}\Psi _0 \in {\mathcal {V}}_K\) with \({\mathcal {T}}_K^*=\sum _{\mu \in {\mathcal {I}}^K} {\varvec{t}}^*_\mu {\mathcal {X}}_\mu \) is an intermediately normalised eigenfunction of the Full-CI Hamiltonian. Moreover, the eigenvalue corresponding to the eigenfunction \(\Psi _K^*\) coincides with the discrete CC energy \({\mathcal {E}}_{K, \mathrm {CC}}^*\) generated by \({\varvec{t}}_K^*\) as defined through Eq. (16).

  (2) Conversely, for any intermediately normalised eigenfunction \(\Psi _K^* \in {\mathcal {V}}_K\) of the Full-CI Hamiltonian, there exists \({\varvec{t}}_K^* = \{{\varvec{t}}_{\mu }^*\}_{\mu \in {\mathcal {I}}^K}\in {\mathbb {V}}_K\) such that \({\varvec{t}}_K^*\) is a zero of the restricted CC function and \(\Psi _K^*=e^{{\mathcal {T}}_K^*}\Psi _0 \in {\mathcal {V}}_K\) with \({\mathcal {T}}_K^*=\sum _{\mu \in {\mathcal {I}}^K} {\varvec{t}}^*_\mu {\mathcal {X}}_\mu \). Moreover, the discrete CC energy \({\mathcal {E}}_{K, \mathrm {CC}}^*\) generated by \({\varvec{t}}_K^*\) through Eq. (16) coincides with the eigenvalue corresponding to the eigenfunction \(\Psi _K^*\).
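The correspondence between zeros of the restricted CC function and intermediately normalised eigenfunctions can be seen by hand in the smallest non-trivial case. The following is a minimal sketch, assuming a hypothetical model with \(N=1\) electron and \(K=2\) orbitals, in which the Full-CI Hamiltonian is a real symmetric matrix with entries `a`, `b`, `d` (with `b` non-zero, so that no eigenvector is orthogonal to \(\Psi _0\)). There is then a single amplitude \(t\), the cluster operator is nilpotent, and \(e^{\pm {\mathcal {T}}} = I \pm {\mathcal {T}}\).

```python
import math

def cc_energy(t, a, b):
    """CC energy <Psi_0, H e^T Psi_0> for H = [[a, b], [b, d]], e^T Psi_0 = (1, t)."""
    return a + b * t

def cc_function(t, a, b, d):
    """CC residual <Psi_1, e^{-T} H e^{T} Psi_0> in the two-level toy model."""
    h0 = a + b * t  # component of H e^T Psi_0 along Psi_0
    h1 = b + d * t  # component of H e^T Psi_0 along the excited determinant
    # e^{-T} = I - T subtracts t times the Psi_0 component from the excited one.
    return h1 - t * h0

def normalised_eigenpairs(a, b, d):
    """Eigenpairs (lam, t) of H with eigenvector (1, t), assuming b != 0."""
    if b == 0.0:
        raise ValueError("b must be non-zero for intermediate normalisation")
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    # The first row of H (1, t)^T = lam (1, t)^T gives a + b t = lam.
    return [(lam, (lam - a) / b) for lam in (mean - radius, mean + radius)]
```

In this toy model both eigenvectors of the matrix yield an amplitude at which `cc_function` vanishes up to rounding, and `cc_energy` reproduces the corresponding eigenvalue, in line with both directions of the theorem.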

In view of Theorem 41, the goal of our analysis in this section will be twofold: first, we would like to demonstrate, exactly as in the infinite-dimensional case, that solutions \({\varvec{t}}_K^* \in {\mathbb {V}}_K\) of the Full-CC equations (40) that correspond to non-degenerate eigenpairs of the Full-CI Hamiltonian are locally unique. Second, we wish to obtain a characterisation of the error between solutions \({\varvec{t}}_K^* \in {\mathbb {V}}_K\) of the Full-CC equations (40) and solutions \({\varvec{t}}^* \in {\mathbb {V}}\) of the continuous coupled cluster equations (17). For the latter analysis we will appeal to classical results from the numerical analysis of Galerkin discretisations of non-linear equations but the former task is essentially trivial since the Full-CC equations have the same structure as the continuous CC equations and hence our proofs from Sect. 4 can be copied with minor amendments. For the sake of brevity therefore, we simply state the final result on local uniqueness of solutions to the Full-CC equations in a finite basis (40).

Theorem 42

(Local Well-Posedness of the Full-Coupled Cluster Equations in a Finite Basis) Let \({\mathbb {V}}_K \subset {\mathbb {V}}\) denote the Hilbert subspace of coefficients as defined through Definition 37 for \(K \ge N\), let the restricted coupled cluster function \({\mathcal {f}}_K :{\mathbb {V}}_K \rightarrow {\mathbb {V}}_K^*\) be defined through Definition 40, let \({\varvec{t}}_K^*:= \{{\varvec{t}}^*_{\mu }\}_{\mu \in {\mathcal {I}}^K} \in {\mathbb {V}}_K\) denote a zero of the restricted coupled cluster function corresponding to any intermediately normalised eigenfunction \(\Psi _K^* \in {\mathcal {V}}_K\) of the Full-CI Hamiltonian \(H_{K} :{\mathcal {V}}_K \rightarrow {\mathcal {V}}_K^*\) with non-degenerate eigenvalue \({\mathcal {E}}_K^*\), let \(\gamma _K >0\) denote the inf-sup constant of the shifted Full-CI Hamiltonian \(H_K-{\mathcal {E}}_K^*\) on \(\{\Psi _K^*\}^{\perp } \subset {\mathcal {V}}_K\), let \(\Theta _K > 0\) be defined as \(\Theta _K:= \Vert e^{({\mathcal {T}}^*_K)^{\dagger }}\Vert _{{\mathcal {V}}_K\rightarrow {\mathcal {V}}_K} \Vert {\mathbb {P}}_0^{\perp } e^{-{\mathcal {T}}_K^*}\Vert _{{\mathcal {V}}_K \rightarrow {\mathcal {V}}_K}\) with \({\mathcal {T}}_K^*:= \sum _{\mu \in {\mathcal {I}}^K}{\varvec{t}}^*_{\mu } {\mathcal {X}}_{\mu } \), let the continuity constant \(\alpha _{{\varvec{t}}_K^*} >0\) and the Lipschitz continuity function \(\textrm{L}_{{\varvec{t}}_K^*} :{\mathbb {R}}_+ \rightarrow {\mathbb {R}}_+\) be defined according to Notation 30, and define the constant

$$\begin{aligned} \textrm{R}:= \textrm{R}(K):= \min _{\delta >0} \left\{ \delta , ~\frac{\gamma _K}{\textrm{L}_{{\varvec{t}}_K^*}(\delta ) \Theta _K},~ 2\frac{\alpha _{{\varvec{t}}_K^*}}{\textrm{L}_{{\varvec{t}}_K^*}(\delta )} \right\} . \end{aligned}$$

Then \({\mathcal {f}}_K\big (\textrm{B}_{\textrm{R}}({\varvec{t}}^*_K)\big )\) is an open subset of \({\mathbb {V}}_K^*\), the restriction of \({\mathcal {f}}_K\) to \(\textrm{B}_{\textrm{R}}({\varvec{t}}_K^*)\) is a diffeomorphism, and for all \( {\varvec{s}}_K \in \textrm{B}_{\textrm{R}}({\varvec{t}}_K^*)\) we have the error estimate

$$\begin{aligned} \frac{1}{2} \frac{1}{\alpha _{{\varvec{t}}_K^*} } \Vert {\mathcal {f}}_K({\varvec{s}}_K) \Vert _{{\mathbb {V}}_K^*} \le \Vert {\varvec{t}}_K^* - {\varvec{s}}_K \Vert _{{\mathbb {V}}_K} \le 2 \frac{\Theta _K}{\gamma _K} \Vert {\mathcal {f}}_K({\varvec{s}}_K) \Vert _{{\mathbb {V}}_K^*}. \end{aligned}$$
(41)

In particular, \({\varvec{t}}_K^*\) is the unique solution of the Full-Coupled Cluster equations in a finite basis (13) in the open ball \(\textrm{B}_{\textrm{R}}({\varvec{t}}_K^*)\).

Proof

The proof is essentially identical to the proof of Theorem 33 with some obvious modifications. We first obtain an expression for the Full-CC Jacobian \(\textrm{D}{\mathcal {f}}_K({\varvec{t}}_K) :{\mathbb {V}}_K \rightarrow {\mathbb {V}}_K^*\) at any \({\varvec{t}}_K \in {\mathbb {V}}_K\) exactly as in the infinite-dimensional case. Thanks to Theorem 41, we can deduce from this expression that the Jacobian \(\textrm{D}{\mathcal {f}}_K({\varvec{t}}_K^*)\) at any zero \({\varvec{t}}_K^* \in {\mathbb {V}}_K\) of the restricted CC function has the form

$$\begin{aligned}&\left\langle {\varvec{w}}_K, \textrm{D}{\mathcal {f}}_K({\varvec{t}}_K^*) {\varvec{s}}_K \right\rangle _{{\mathbb {V}}_K \times {\mathbb {V}}_K^*}\\&\quad = \left\langle \sum _{\mu \in {\mathcal {I}}^K} {\varvec{w}}_{\mu } {\mathcal {X}}_\mu \Psi _0, e^{-{\mathcal {T}}_K^*} \left( H - {\mathcal {E}}_K^*\right) e^{{\mathcal {T}}_K^*} \sum _{\nu \in {\mathcal {I}}^K} {\varvec{s}}_{\nu } {\mathcal {X}}_\nu \Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}, \end{aligned}$$

for all \({\varvec{s}}_K, {\varvec{w}}_K \in {\mathbb {V}}_K \) with \({\varvec{s}}_K =\{{\varvec{s}}_{\nu }\}_{\nu \in {\mathcal {I}}^K}\) and \( {\varvec{w}}_K=\{{\varvec{w}}_{\mu }\}_{\mu \in {\mathcal {I}}^K}\) where \({\mathcal {T}}_K^*:= \sum _{\mu \in {\mathcal {I}}^K}{\varvec{t}}^*_{\mu } {\mathcal {X}}_{\mu } \).

In analogy with Definition 29, we can then introduce an operator \({\mathcal {A}}_K({\varvec{t}}_K^*) :\widetilde{{\mathcal {V}}}_K \rightarrow \widehat{{\mathcal {H}}}^{-1}\) that characterises the action of the Full-CC derivative \(\textrm{D}{\mathcal {f}}_K({\varvec{t}}_K^*)\) and show that this operator is an isomorphism from \(\widetilde{{\mathcal {V}}}_K\) to \(\widetilde{{\mathcal {V}}}_K^*\) exactly as in Theorem 31. The local-uniqueness result then readily follows. \(\square \)

Consider the setting of Theorem 42. For very small molecules discretised in minimal basis sets, it is possible to perform Full-CI calculations and thereby gain access to the derivative \(\textrm{D}{\mathcal {f}}_K({\varvec{t}})\) of the restricted coupled cluster function at \({\varvec{t}}= {\varvec{t}}^*_{\textrm{FCI}} \in {\mathbb {V}}_K\), i.e., at the coefficient vector \({\varvec{t}}^*_{\textrm{FCI}}\) which generates the Full-CI ground state wave-function. It is natural to ask how the bounds that we have derived compare to the exact norm of the inverse \(\textrm{D}{\mathcal {f}}^{-1}_K({\varvec{t}}^*_{\textrm{FCI}})\). While a comprehensive numerical study involving state-of-the-art quantum chemistry basis sets for moderately large molecules is computationally infeasible, there is some hope that numerical experiments can be performed on certain very small molecules using so-called minimal basis sets. A detailed numerical study of this nature is left to a future contribution, but some preliminary numerical results are given in Table 2 and Figs. 1 and 2. Based on these results, the lower bounds that we have derived for the operator norm \(\Vert \textrm{D}{\mathcal {f}}^{-1}_K({\varvec{t}}^*_{\textrm{FCI}})\Vert ^{-1}_{{\mathbb {V}}_K^* \rightarrow {\mathbb {V}}_K}\) seem reasonable at equilibrium but tend to degrade in the bond dissociation regime.
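The kind of comparison described above can be illustrated on a purely synthetic example. The following is a minimal sketch and not the computation behind Table 2 or Figs. 1 and 2: it assumes a small symmetric matrix as a stand-in for the Full-CI Hamiltonian, a toy cluster operator of rank one (so that \({\mathcal {T}}^2=0\) and \(e^{\pm {\mathcal {T}}}={\mathbb {I}}\pm {\mathcal {T}}\)), and the Euclidean norm in place of the \({\mathbb {V}}\)-norm. The function name `toy_cc_jacobian_bounds` and all of its parameters are hypothetical.

```python
import numpy as np

def toy_cc_jacobian_bounds(K=6, coupling=0.05, seed=0):
    """Toy comparison of sigma_min(Jacobian) with the lower bound gamma / Theta.

    Assumptions (not from the paper): random symmetric H, Euclidean norms,
    rank-one nilpotent cluster operator.
    """
    rng = np.random.default_rng(seed)
    # Symmetric stand-in Hamiltonian: separated diagonal plus weak coupling.
    B = rng.standard_normal((K, K))
    H = np.diag(np.arange(K, dtype=float)) + coupling * (B + B.T) / 2.0

    evals, evecs = np.linalg.eigh(H)       # eigenvalues in ascending order
    E = evals[0]                           # ground state energy
    psi = evecs[:, 0] / evecs[0, 0]        # intermediate normalisation: psi[0] == 1

    # Toy cluster operator T = t e_0^T with t[0] = 0, hence T @ T = 0.
    T = np.zeros((K, K))
    T[1:, 0] = psi[1:]
    I = np.eye(K)
    expT, expmT = I + T, I - T             # exact exponentials since T is nilpotent

    # Jacobian at the zero: similarity-transformed shifted Hamiltonian,
    # restricted to the orthogonal complement of the reference vector e_0.
    J = (expmT @ (H - E * I) @ expT)[1:, 1:]
    sigma_min = np.linalg.svd(J, compute_uv=False)[-1]   # = 1 / ||J^{-1}||_2

    # gamma: inf-sup constant of H - E on psi^perp (the spectral gap here).
    gamma = evals[1] - E
    P0_perp = I.copy()
    P0_perp[0, 0] = 0.0
    Theta = np.linalg.norm(expT.T, 2) * np.linalg.norm(P0_perp @ expmT, 2)
    return sigma_min, gamma / Theta

sigma_min, lower_bound = toy_cc_jacobian_bounds()
print(f"sigma_min = {sigma_min:.4f}, gamma/Theta = {lower_bound:.4f}")
```

In this toy setting the ratio between the two printed numbers gives a rough indication of how pessimistic the lower bound is; for an actual molecule the matrices would have to come from a Full-CI calculation and the norms from the \({\mathbb {V}}\)-inner product.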

Table 2 Examples of numerically computed constants for a collection of small molecules at equilibrium geometries

We now turn to the second goal of this section, namely to a study of the error between solutions \({\varvec{t}}_K^* \in {\mathbb {V}}_K\) of the Full-CC equations (40) and solutions \({\varvec{t}}^* \in {\mathbb {V}}\) of the continuous coupled cluster equations (17). Since the Full-CC equations are simply Galerkin discretisations of the continuous CC equations, their local well-posedness can be deduced from classical results in non-linear numerical analysis. Indeed, we merely have to obtain an appropriate invertibility result for the coupled cluster Fréchet derivative restricted to the coefficient subspaces \(\{{\mathbb {V}}_K\}_{K \ge N}\) and we must establish that the subspaces \(\{{\mathbb {V}}_K\}_{K \ge N}\) have the approximation property with respect to \({\mathbb {V}}\). The latter demonstration is a simple consequence of the density of \(\underset{K\ge N}{\cup }{\mathbb {V}}_K\) in \({\mathbb {V}}\) which has already been proven in Lemma 39. We therefore focus on obtaining the required invertibility result.

We begin by defining projection operators corresponding to the various finite-dimensional approximation spaces we have introduced.

Definition 43

(Projection Operators)

Let \({\mathcal {V}}_K= \text { span}\{\Psi _0\} \cup \widetilde{{\mathcal {V}}}_K \subset \widehat{{\mathcal {H}}}^1\) denote the finite-dimensional N-particle approximation space for \(K \ge N\) and let \({\mathbb {V}}_K \subset {\mathbb {V}}\) denote the Hilbert subspace of coefficients as defined through Definition 37. Then

  • We denote by \({\mathbb {P}}_{K} :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1\), the \(\widehat{{\mathcal {H}}}^1\)-orthogonal projection operator onto \({\mathcal {V}}_K\) and by \({\mathbb {P}}_K^{\perp }\) its complement, i.e., \({\mathbb {P}}_K^{\perp }= {\mathbb {I}}-{\mathbb {P}}_K\).

  • We denote by \(\Pi _{K} :{\mathbb {V}} \rightarrow {\mathbb {V}}\), the \(( \cdot , \cdot )_{{\mathbb {V}}}\)-orthogonal projection operator onto \({\mathbb {V}}_K\) and by \(\Pi _K^{\perp }\) its complement, i.e., \(\Pi _K^{\perp }= {\mathbb {I}}-\Pi _K\).

Fig. 1 Numerically computed constants for the HF molecule at different bond lengths. The equilibrium bond length is 0.9168 Angstrom. The figure on the right uses a log scale on the y-axis

Fig. 2 Numerically computed constants for the LiH molecule at different bond lengths. The equilibrium bond length is 1.5949 Angstrom. The figure on the right uses a log scale on the y-axis

Notation 44

(Cluster Operators Involving Projections) Consider the setting of Definition 43 and let \({\varvec{t}}\in {\mathbb {V}}\). In the sequel, we will frequently consider cluster operators generated by \(\Pi _K {\varvec{t}}\) or \(\Pi _K^{\perp }{\varvec{t}}\). We will therefore use the notation \({\mathcal {T}}(\Pi _K)\) and \( {\mathcal {T}}(\Pi _K^{\perp })\) respectively to denote these cluster operators, i.e., we denote

$$\begin{aligned} {\mathcal {T}}(\Pi _K):=&\sum _{\mu \in {\mathcal {I}}^{K}} {\varvec{s}}_{\mu } {\mathcal {X}}_{\mu } \quad \text {where }~ \{{\varvec{s}}_{\mu }\}_{\mu \in {\mathcal {I}}^K} = \Pi _K {\varvec{t}}\in {\mathbb {V}}_K, \quad \text {and}\\ {\mathcal {T}}(\Pi _K^{\perp }):=&\sum _{\mu \in {\mathcal {I}}} {\varvec{r}}_{\mu } {\mathcal {X}}_{\mu } \quad \hspace{2mm}\text {where }~ \{{\varvec{r}}_{\mu }\}_{\mu \in {\mathcal {I}}} = \Pi _K^{\perp } {\varvec{t}}\in {\mathbb {V}}. \end{aligned}$$

We are now ready to state the main technical lemma of this section. We emphasise that the proof of this lemma assumes that any isolated, simple eigenpair of the electronic Hamiltonian can be approximated by a sequence of simple eigenpairs of the Full-CI Hamiltonian.

Lemma 45

(Invertibility of the coupled cluster Fréchet derivative on \({\mathbb {V}}_K\)) Let the coupled cluster function \({\mathcal {f}} :{\mathbb {V}} \rightarrow {\mathbb {V}}^* \) be defined through Definition 22, for any \({\varvec{t}}\in {\mathbb {V}}\) let \(\textrm{D}{\mathcal {f}}({\varvec{t}})\) denote the Fréchet derivative of the coupled cluster function as defined through Eq. (20), let \({\varvec{t}}^* \in {\mathbb {V}}\) denote a zero of the coupled cluster function corresponding to an intermediately normalised eigenfunction \(\Psi ^* \in \widehat{{\mathcal {H}}}^1\) of the electronic Hamiltonian \(H :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^{-1}\) with isolated, non-degenerate eigenvalue \({\mathcal {E}}^*\), let \({\mathbb {V}}_K \subset {\mathbb {V}}\) denote the Hilbert subspace of coefficients as defined through Definition 37 for \(K \ge N\), let the Full-CI Hamiltonian \({H}_{K} :{\mathcal {V}}_K \rightarrow {\mathcal {V}}_K^*\) be defined according to Equation 7 and assume that there exists a sequence of simple eigenpairs \((\Psi _K^*, {\mathcal {E}}_K^*) \in {\mathcal {V}}_K \times {\mathbb {R}}\) of the Full-CI Hamiltonians \(\{{H}_K\}_{K \ge N}\), i.e.,

$$\begin{aligned} \forall \Phi _K \in {\mathcal {V}}_K :\quad \langle \Phi _K, {H}_K \Psi _K^*\rangle _{{\mathcal {V}}_K \times {\mathcal {V}}_K^*}&= {\mathcal {E}}_K^* \langle \Phi _K, \Psi _K^*\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}} ~\text { with }~ {\mathcal {E}}_K^* \text { simple} \quad \text {and such that} \nonumber \\ \lim _{K \rightarrow \infty } \Vert \Psi ^* - \Psi _K^*\Vert _{\widehat{{\mathcal {H}}}^1} =0, \quad&\lim _{K \rightarrow \infty }\vert {\mathcal {E}}^* - {\mathcal {E}}_K^* \vert =0. \end{aligned}$$
(42)

Then for all K sufficiently large, there exist a constant \(\gamma _K >0\) uniformly bounded below in K, a constant \(\Theta _K >0\) uniformly bounded above in K, a constant \(\varepsilon _K> 0\) such that \(\underset{K\rightarrow \infty }{\lim }\ \varepsilon _K =0\), a constant \(\omega _K> 0\) such that \(\underset{K\rightarrow \infty }{\lim } \omega _K =1\), and we have the estimate

$$\begin{aligned} \inf _{0\ne {\varvec{w}}_K \in {\mathbb {V}}_K} \sup _{0 \ne {\varvec{s}}_K \in {\mathbb {V}}_K}\frac{ \left\langle {\varvec{w}}_K, \textrm{D}{\mathcal {f}}({\varvec{t}}^*){\varvec{s}}_K\right\rangle _{{\mathbb {V}} \times {\mathbb {V}}^*}}{\Vert {\varvec{w}}_K\Vert _{{\mathbb {V}}} \Vert {\varvec{s}}_K\Vert _{{\mathbb {V}}}} \ge \frac{{\gamma _{K}}/{\omega _K} - \varepsilon _K}{\Theta _K}. \end{aligned}$$

Proof

Let \({\varvec{w}}_K= \{{\varvec{w}}_{\mu }\}_{\mu \in {\mathcal {I}}^K}, {\varvec{s}}_K=\{{\varvec{s}}_{\mu }\}_{\mu \in {\mathcal {I}}^K} \in {\mathbb {V}}_K\) be arbitrary and let the bounded linear operator \({\mathcal {A}}({\varvec{t}}^*) :\widetilde{{\mathcal {V}}}\rightarrow \widetilde{{\mathcal {V}}}^*\) be defined according to Definition 29. It follows from Corollary 27 that

$$\begin{aligned} \left\langle {\varvec{w}}_K, \textrm{D}{\mathcal {f}}({\varvec{t}}^*){\varvec{s}}_K \right\rangle _{{\mathbb {V}} \times {\mathbb {V}}^*}&= \left\langle {\mathcal {W}}_K \Psi _0, {\mathcal {A}}({\varvec{t}}^*) {\mathcal {S}}_K\Psi _0 \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}\\&= \left\langle {\mathcal {W}}_K \Psi _0, e^{-{\mathcal {T}}^*} \left( {H}- {\mathcal {E}}^*\right) e^{{\mathcal {T}}^*} {\mathcal {S}}_K \Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}, \end{aligned}$$

where \({\mathcal {W}}_K:= \sum _{\mu \in {\mathcal {I}}^K} {\varvec{w}}_{\mu } {\mathcal {X}}_{\mu }\) and \({\mathcal {S}}_K:= \sum _{\mu \in {\mathcal {I}}^K} {\varvec{s}}_{\mu } {\mathcal {X}}_{\mu }\). To avoid tedious notation, let us define \(\Phi _{{\mathcal {W}}}:= {\mathcal {W}}_K \Psi _0 \in \widetilde{{\mathcal {V}}}_K\) and \(\Phi _{{\mathcal {S}}}:= {\mathcal {S}}_K \Psi _0\in \widetilde{{\mathcal {V}}}_K\). Obviously, we now have

$$\begin{aligned} \left\langle \Phi _{{\mathcal {W}}}, e^{-{\mathcal {T}}^*} \left( {H}- {\mathcal {E}}^*\right) e^{{\mathcal {T}}^*} \Phi _{{\mathcal {S}}} \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}= \left\langle e^{{\mathcal {T}}^*}\Phi _{{\mathcal {S}}}, \left( {H}- {\mathcal {E}}^*\right) e^{-({\mathcal {T}}^*)^{\dagger }} \Phi _{{\mathcal {W}}} \right\rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}}. \end{aligned}$$
(43)

Since \(\Psi ^*\) is intermediately normalisable by assumption, there exists \(\widetilde{K_0} \in {\mathbb {N}}\) such that for all \(K \ge \widetilde{K_0}\), the eigenfunction \(\Psi _K^* \in {\mathcal {V}}_K\) is intermediately normalisable. In the remainder of this proof, we assume that indeed \(K \ge \widetilde{K_0}\) and we denote by \({\varvec{t}}_K^*:= \{{\varvec{t}}^*_{\mu }\}_{\mu \in {\mathcal {I}}^K} \in {\mathbb {V}}_K\) the coefficient vector with the property that \(\Psi _K^*= e^{{\mathcal {T}}_K^*}\Psi _0\) where \({\mathcal {T}}_K^*:= \sum _{\mu \in {\mathcal {I}}^{K}} {\varvec{t}}^*_{\mu }{\mathcal {X}}_{\mu }\). Let us emphasise here that since \(\Psi _K^*\) is an eigenfunction of the Full-CI Hamiltonian, it follows from Theorem 41 that \({\varvec{t}}_K^*\) is a zero of the restricted coupled cluster function \({\mathcal {f}}_K :{\mathbb {V}}_K \rightarrow {\mathbb {V}}_K^*\) defined through Definition 40.

Recalling now that \(\Phi _{{\mathcal {S}}}\) is arbitrary (due to the fact that the sequence \({\varvec{s}}_K=\{{\varvec{s}}_{\mu }\}_{\mu \in {\mathcal {I}}^K} \in {\mathbb {V}}_K\) was chosen arbitrarily), we may in particular set for any \(\Phi ^*_{K, \perp }\in \{ \Psi _K^*\}^{\perp } \subset {\mathcal {V}}_K\):

$$\begin{aligned} \Phi _{{\mathcal {S}}}:= {\mathbb {P}}_0^{\perp } e^{-{\mathcal {T}}^*({\Pi _K})} \Phi ^*_{K, \perp }\in \widetilde{{\mathcal {V}}}_K, \end{aligned}$$

where \({\mathcal {T}}^*({\Pi _K})\) denotes the cluster operator generated by \(\Pi _K {\varvec{t}}^*\in {\mathbb {V}}_K\) (recall Notation 44) and we have used the fact that, thanks to the properties of the excitation operators \(\{{\mathcal {X}}_{\mu }\}_{\mu \in {\mathcal {I}}}\) given by Remark 36, it holds that \(e^{-{\mathcal {T}}^*(\Pi _K)} \Psi _K \in {\mathcal {V}}_K\) for any \(\Psi _K \in {\mathcal {V}}_K\).

Plugging in this choice of \(\Phi _{{\mathcal {S}}}\) in Eq. (43) now yields

$$\begin{aligned} \left\langle {\mathcal {W}}_K \Psi _0, {\mathcal {A}}({\varvec{t}}^*) {\mathcal {S}}_K\Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}&= \underbrace{\left\langle e^{{\mathcal {T}}^*}e^{-{\mathcal {T}}^*(\Pi _K)} \Phi ^*_{K, \perp }, \left( {H}- {\mathcal {E}}^*\right) e^{-({\mathcal {T}}^*)^{\dagger }}\Phi _{{\mathcal {W}}} \right\rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}}}_{:=\mathrm{(I)}}\nonumber \\&\quad - \underbrace{\left\langle e^{{\mathcal {T}}^*} {\mathbb {P}}_0 e^{-{\mathcal {T}}^*(\Pi _K)} \Phi ^*_{K, \perp }, \left( {H}- {\mathcal {E}}^*\right) e^{-({\mathcal {T}}^*)^{\dagger }}\Phi _{{\mathcal {W}}} \right\rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}}}_{:= \mathrm{(II)}}. \end{aligned}$$
(44)

We claim that the term (II) is identically zero. Indeed, using the fact that \(e^{{\mathcal {T}}^*} \Psi _0=\Psi ^*\) by assumption, a straightforward calculation shows that

$$\begin{aligned} \mathrm{(II)}&=\left( \Psi _0, e^{-{\mathcal {T}}^*(\Pi _K)} \Phi ^*_{K, \perp }\right) _{\widehat{{\mathcal {L}}}^{\, 2}} \left\langle e^{{\mathcal {T}}^*} \Psi _0, \left( {H}- {\mathcal {E}}^*\right) e^{-({\mathcal {T}}^*)^{\dagger }}\Phi _{{\mathcal {W}}} \right\rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}}, \end{aligned}$$

and the second term in the product above is zero as \({H}e^{{\mathcal {T}}^*} \Psi _0= {H}\Psi ^*= {\mathcal {E}}^*\Psi ^*\).

It therefore remains to simplify the term (I). To this end, we observe that we can write

$$\begin{aligned} \mathrm{(I)}&=\left\langle e^{{\mathcal {T}}^*(\Pi _K^{\perp })}\Phi ^*_{K, \perp }, \left( {H}- {\mathcal {E}}^*\right) e^{-({\mathcal {T}}^*)^{\dagger }}\Phi _{{\mathcal {W}}} \right\rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}}\\&=\underbrace{\left\langle \Phi ^*_{K, \perp }, \left( {H}- {\mathcal {E}}^*\right) e^{-({\mathcal {T}}^*)^{\dagger }}\Phi _{{\mathcal {W}}} \right\rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}}}_{:= \mathrm{(IA)}}\\&\quad +\underbrace{\left\langle \left( e^{{\mathcal {T}}^*(\Pi _K^{\perp })}-{\mathbb {I}}\right) \Phi ^*_{K, \perp }, \left( {H}- {\mathcal {E}}^*\right) e^{-({\mathcal {T}}^*)^{\dagger }}\Phi _{{\mathcal {W}}} \right\rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}}}_{:= \mathrm{(IB)}}. \end{aligned}$$

Focusing first on the term (IB) and using the Cauchy-Schwarz inequality, we may write

$$\begin{aligned} \mathrm{(IB)}&\ge -\left\| e^{{\mathcal {T}}^*(\Pi _K^{\perp })}-{\mathbb {I}}\right\| _{\widehat{{\mathcal {H}}}^{1} \rightarrow {\widehat{{\mathcal {H}}}^{1}}} \left\| {H}-{\mathcal {E}}^*\right\| _{\widehat{{\mathcal {H}}}^{1} \rightarrow {\widehat{{\mathcal {H}}}^{-1}}} \big \Vert \Phi ^*_{K, \perp }\big \Vert _{\widehat{{\mathcal {H}}}^{1}}\big \Vert e^{-({\mathcal {T}}^*)^{\dagger }}\Phi _{{\mathcal {W}}}\big \Vert _{\widehat{{\mathcal {H}}}^{1}}. \end{aligned}$$
(45)

We now claim that in fact

$$\begin{aligned} \lim _{K \rightarrow \infty } \left\| e^{{\mathcal {T}}^*(\Pi ^{\perp }_K)}-{\mathbb {I}} \right\| _{\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{1}}=0. \end{aligned}$$

Indeed, thanks to the boundedness properties of the excitation operators given in Theorem 19, it holds that

$$\begin{aligned} \Vert e^{-{\mathcal {T}}^*(\Pi _K)}\Vert _{\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{1}} \le e^{\Vert {\mathcal {T}}^*(\Pi _K)\Vert _{\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{1}}} \le e^{\beta \Vert \Pi _K{\varvec{t}}^*\Vert _{{\mathbb {V}}}} \le e^{\beta \Vert {\varvec{t}}^*\Vert _{{\mathbb {V}}}}, \end{aligned}$$

where the constant \(\beta >0\) depends only on N.

Therefore, we need only show that \(\lim _{K \rightarrow \infty } \Vert e^{{\mathcal {T}}^*} - e^{{\mathcal {T}}^*(\Pi _K)} \Vert _{\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{1}}=0\). Recall however, that the exponential function is of class \({\mathscr {C}}^{\infty }\) on the algebra of bounded operators on \(\widehat{{\mathcal {H}}}^1\), and thus it suffices to show that

$$\begin{aligned} \lim _{K \rightarrow \infty } \Vert {\mathcal {T}}^*- {\mathcal {T}}^*(\Pi _K)\Vert _{\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{1}}=0. \end{aligned}$$

But this is an obvious consequence of the density of the coefficient spaces \(\{{\mathbb {V}}_K\}_{K\ge N}\) in \({\mathbb {V}}\). Indeed,

$$\begin{aligned} \lim _{K \rightarrow \infty } \Vert {\mathcal {T}}^*- {\mathcal {T}}^*(\Pi _K)\Vert _{\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{1}} \le \beta \lim _{K \rightarrow \infty } \Vert {\varvec{t}}^* - \Pi _K {\varvec{t}}^*\Vert _{{\mathbb {V}}}=0. \end{aligned}$$
(46)
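For completeness, the reduction used in the preceding argument can be written out explicitly. The following is a brief sketch which assumes, as is standard for excitation operators, that the cluster operators \({\mathcal {T}}^*(\Pi _K)\) and \({\mathcal {T}}^*(\Pi _K^{\perp })\) commute, and which uses the elementary Lipschitz bound \(\Vert e^{A}-e^{B}\Vert \le \Vert A-B\Vert e^{\max \{\Vert A\Vert , \Vert B\Vert \}}\) for bounded operators A and B. Since \({\varvec{t}}^*= \Pi _K {\varvec{t}}^* + \Pi _K^{\perp } {\varvec{t}}^*\), we have \({\mathcal {T}}^*={\mathcal {T}}^*(\Pi _K)+{\mathcal {T}}^*(\Pi _K^{\perp })\), and therefore

$$\begin{aligned} e^{{\mathcal {T}}^*(\Pi _K^{\perp })}-{\mathbb {I}}&= e^{-{\mathcal {T}}^*(\Pi _K)}\left( e^{{\mathcal {T}}^*} - e^{{\mathcal {T}}^*(\Pi _K)}\right) ,\\ \left\| e^{{\mathcal {T}}^*} - e^{{\mathcal {T}}^*(\Pi _K)}\right\| _{\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{1}}&\le \Vert {\mathcal {T}}^*- {\mathcal {T}}^*(\Pi _K)\Vert _{\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{1}}\; e^{\beta \Vert {\varvec{t}}^*\Vert _{{\mathbb {V}}}} \le \beta \, e^{\beta \Vert {\varvec{t}}^*\Vert _{{\mathbb {V}}}}\, \Vert {\varvec{t}}^* - \Pi _K {\varvec{t}}^*\Vert _{{\mathbb {V}}}. \end{aligned}$$

Together with the uniform bound on \(\Vert e^{-{\mathcal {T}}^*(\Pi _K)}\Vert _{\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{1}}\), this would give \(\Vert e^{{\mathcal {T}}^*(\Pi _K^{\perp })}-{\mathbb {I}}\Vert _{\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{1}} \le \beta \, e^{2\beta \Vert {\varvec{t}}^*\Vert _{{\mathbb {V}}}}\, \Vert {\varvec{t}}^* - \Pi _K {\varvec{t}}^*\Vert _{{\mathbb {V}}}\), which tends to zero as \(K \rightarrow \infty \).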

Consequently, combining Eqs. (45) and (46), we obtain the existence of a constant \(\varepsilon _{1, K} >0\) with the property that \(\lim _{K \rightarrow \infty }\varepsilon _{1, K} =0\) and such that

$$\begin{aligned} \mathrm{(IB)}\ge -\varepsilon _{1, K} \big \Vert \Phi ^*_{K, \perp }\big \Vert _{\widehat{{\mathcal {H}}}^{1}}\big \Vert e^{-({\mathcal {T}}^*)^{\dagger }}\Phi _{{\mathcal {W}}}\big \Vert _{\widehat{{\mathcal {H}}}^{1}}. \end{aligned}$$
(47)

Let us now return to the term (IA). Notice that we may write

$$\begin{aligned} \mathrm{(IA)}&= \underbrace{\left\langle \Phi ^*_{K, \perp }, \left( {H}- {\mathcal {E}}^*\right) e^{-({\mathcal {T}}^*_K)^{\dagger }}\Phi _{{\mathcal {W}}} \right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}} }_{:= \mathrm{(IAA)}}\\&\quad + \underbrace{\left\langle \Phi ^*_{K, \perp }, \left( {H}- {\mathcal {E}}^*\right) \left( e^{-({\mathcal {T}}^*)^{\dagger }}-e^{-({\mathcal {T}}^*_K)^{\dagger }}\right) \Phi _{{\mathcal {W}}} \right\rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}}}_{:= \mathrm{(IAB)}}, \end{aligned}$$

where we recall from Eq. (42) that \({\varvec{t}}^*_K = \{{\varvec{t}}^*_{\mu }\}_{\mu \in {\mathcal {I}}^K} \in {\mathbb {V}}_K\) is the coefficient vector such that \(e^{{\mathcal {T}}_K^*}\Psi _0=\Psi _K^* \in {\mathcal {V}}_K\).

We first simplify the term (IAB). Thanks to the Cauchy-Schwarz inequality we may write

$$\begin{aligned} \mathrm{(IAB)}&\ge - \left\| e^{-({\mathcal {T}}^*)^{\dagger }}-e^{-({\mathcal {T}}^*_K)^{\dagger }}\right\| _{\widehat{{\mathcal {H}}}^{1} \rightarrow {\widehat{{\mathcal {H}}}^{1}}} \left\| e^{({\mathcal {T}}^*)^{\dagger }}\right\| _{\widehat{{\mathcal {H}}}^{1} \rightarrow {\widehat{{\mathcal {H}}}^{1}}}\nonumber \\&\quad \left\| {H}-{\mathcal {E}}^*\right\| _{\widehat{{\mathcal {H}}}^{1} \rightarrow {\widehat{{\mathcal {H}}}^{-1}}} \big \Vert \Phi ^*_{K, \perp }\big \Vert _{\widehat{{\mathcal {H}}}^{1}}\big \Vert e^{-({\mathcal {T}}^*)^{\dagger }}\Phi _{{\mathcal {W}}}\big \Vert _{\widehat{{\mathcal {H}}}^{1}}. \end{aligned}$$
(48)

We now claim that in fact \(\lim _{K \rightarrow \infty } \left\| e^{-({\mathcal {T}}^*)^{\dagger }}-e^{-({\mathcal {T}}_K^*)^{\dagger }}\right\| _{\widehat{{\mathcal {H}}}^{1} \rightarrow {\widehat{{\mathcal {H}}}^{1}}} =0\). Indeed, an easy calculation using the continuity properties of cluster operators given by Theorem 19, shows the existence of a constant \(\widetilde{\beta ^{\dagger }}>0\), depending only on N, such that for any \(K \ge N\) it holds that

$$\begin{aligned} \left\| e^{-({\mathcal {T}}^*)^{\dagger }}-e^{-({\mathcal {T}}_K^*)^{\dagger }}\right\| _{\widehat{{\mathcal {H}}}^{1} \rightarrow {\widehat{{\mathcal {H}}}^{1}}} \le \widetilde{\beta ^{\dagger }} \left\| e^{{\mathcal {T}}^*}\Psi _0 -e^{{\mathcal {T}}_K^*}\Psi _0 \right\| _{\widehat{{\mathcal {H}}}^{1}}= \widetilde{\beta ^{\dagger }} \left\| \Psi ^*-\Psi _K^* \right\| _{\widehat{{\mathcal {H}}}^{1}}. \end{aligned}$$
(49)

The claim now follows by using the convergence of the approximate eigenvector \(\Psi _K^* \in {\mathcal {V}}_K\) to \(\Psi ^* \in \widehat{{\mathcal {H}}}^1\) from Eq. (42). Consequently, we obtain the existence of a constant \(\varepsilon _{2, K} >0\) with the property that \(\lim _{K \rightarrow \infty }\varepsilon _{2, K} =0\) and such that

$$\begin{aligned} \mathrm{(IAB)}\ge -\varepsilon _{2, K} \big \Vert \Phi ^*_{K, \perp }\big \Vert _{\widehat{{\mathcal {H}}}^{1}}\big \Vert e^{-({\mathcal {T}}^*)^{\dagger }}\Phi _{{\mathcal {W}}}\big \Vert _{\widehat{{\mathcal {H}}}^{1}}. \end{aligned}$$
(50)

Focusing finally on the term (IAA), a simple calculation shows that for any \(\Phi _{{\mathcal {W}}} \in \widetilde{{\mathcal {V}}}_K\), we have that \(e^{-({\mathcal {T}}^*_K)^{\dagger }}\Phi _{{\mathcal {W}}} \in \{\Psi _K^*\}^{\perp } \subset {\mathcal {V}}_K\). Furthermore, \({\mathcal {E}}^*\) is a simple, isolated eigenvalue by assumption and \(\lim _{K \rightarrow \infty } {\mathcal {E}}_K^* = {\mathcal {E}}^*\). Since \( \Phi ^*_{K, \perp } \in \{\Psi _K^*\}^{\perp } \subset {\mathcal {V}}_K\) is arbitrary, we therefore deduce the existence of \(\widehat{K_0} \in {\mathbb {N}}\) sufficiently large such that for all \(K \ge \widehat{K_0}\) the shifted Full-CI Hamiltonian \({H}_K- {\mathcal {E}}^*\) satisfies an inf-sup condition on \(\{\Psi _K^*\}^{\perp } \subset {\mathcal {V}}_K\), and as a consequence,

$$\begin{aligned} \sup _{0 \ne \Phi ^*_{K, \perp } \in \{\Psi _K^*\}^{\perp }} \; \frac{{\mathrm{(IAA)}}}{\Vert \Phi ^*_{K, \perp }\Vert _{\widehat{{\mathcal {H}}}^1}}&= \sup _{0 \ne \Phi ^*_{K, \perp } \in \{\Psi _K^*\}^{\perp }} \frac{\big \langle \Phi ^*_{K, \perp }, \left( {H}- {\mathcal {E}}^*\right) e^{-({\mathcal {T}}^*_K)^{\dagger }}\Phi _{{\mathcal {W}}} \big \rangle _{\widehat{{\mathcal {H}}}^{1} \times {\widehat{{\mathcal {H}}}}^{-1}}}{\Vert \Phi ^*_{K, \perp }\Vert _{\widehat{{\mathcal {H}}}^1}}\nonumber \\&\ge \gamma _K \big \Vert e^{-({\mathcal {T}}^*_K)^{\dagger }}\Phi _{{\mathcal {W}}}\big \Vert _{\widehat{{\mathcal {H}}}^{1}}, \end{aligned}$$
(51)

where \(\gamma _K\) denotes the inf-sup constant of the shifted Full-CI Hamiltonian \({H}_K- {\mathcal {E}}^*\) on \(\{\Psi _K^*\}^{\perp } \subset {\mathcal {V}}_K\) for \(K \ge \widehat{K_0}\). For the remainder of this proof, we assume that indeed \(K \ge \widehat{K_0}\).

Notice that this last bound can be written as

$$\begin{aligned} \gamma _K \big \Vert e^{-({\mathcal {T}}^*_K)^{\dagger }}\Phi _{{\mathcal {W}}}\big \Vert _{\widehat{{\mathcal {H}}}^{1}}&= \gamma _K \big \Vert e^{-({\mathcal {T}}_K^*-{\mathcal {T}}^* )^{\dagger }} e^{-({\mathcal {T}}^*)^{\dagger }}\Phi _{{\mathcal {W}}}\big \Vert _{\widehat{{\mathcal {H}}}^{1}}\\&\ge \frac{\gamma _K}{\Vert e^{({\mathcal {T}}_K^*-{\mathcal {T}}^* )^{\dagger }}\Vert _{\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{1}}}\big \Vert e^{-({\mathcal {T}}^*)^{\dagger }}\Phi _{{\mathcal {W}}}\big \Vert _{\widehat{{\mathcal {H}}}^{1}}, \end{aligned}$$

where the inequality follows from the invertibility of the exponential map. Using now a similar calculation to the one used to obtain Inequality (49), it can easily be shown that \(\lim _{K \rightarrow \infty } \Vert e^{({\mathcal {T}}_K^*-{\mathcal {T}}^* )^{\dagger }}\Vert _{\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{1}}=1\). Consequently, we obtain the existence of a constant \(\omega _{K}> 0\) with the property that \(\lim _{K \rightarrow \infty } \omega _K =1\) and such that

$$\begin{aligned} \sup _{0 \ne \Phi ^*_{K, \perp } \in \{\Psi _K^*\}^{\perp }} \; \frac{{\mathrm{(IAA)}}}{\Vert \Phi ^*_{K, \perp }\Vert _{\widehat{{\mathcal {H}}}^1}}&= \sup _{0 \ne \Phi ^*_{K, \perp } \in \{\Psi _K^*\}^{\perp }} \frac{\left\langle \Phi ^*_{K, \perp }, \left( {H}- {\mathcal {E}}^*\right) e^{-({\mathcal {T}}^*_K)^{\dagger }}\Phi _{{\mathcal {W}}} \right\rangle _{\widehat{{\mathcal {H}}}^{1} \times \widehat{{\mathcal {H}}}^{-1}}}{\Vert \Phi ^*_{K, \perp }\Vert _{\widehat{{\mathcal {H}}}^1}}\nonumber \\&\ge \frac{\gamma _K}{\omega _K} \big \Vert e^{-({\mathcal {T}}^*)^{\dagger }}\Phi _{{\mathcal {W}}}\big \Vert _{\widehat{{\mathcal {H}}}^{1}}. \end{aligned}$$
(52)
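The inequality in the display preceding (52) is an instance of an elementary fact, which we record here as a brief sketch. If A is a bounded operator on \(\widehat{{\mathcal {H}}}^1\) with bounded inverse, then for every \(\Phi \in \widehat{{\mathcal {H}}}^1\),

$$\begin{aligned} \Vert \Phi \Vert _{\widehat{{\mathcal {H}}}^{1}} = \Vert A^{-1} A \Phi \Vert _{\widehat{{\mathcal {H}}}^{1}} \le \Vert A^{-1}\Vert _{\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{1}} \Vert A\Phi \Vert _{\widehat{{\mathcal {H}}}^{1}} \quad \Longrightarrow \quad \Vert A \Phi \Vert _{\widehat{{\mathcal {H}}}^{1}} \ge \frac{\Vert \Phi \Vert _{\widehat{{\mathcal {H}}}^{1}}}{\Vert A^{-1}\Vert _{\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{1}}}. \end{aligned}$$

Applied with \(A= e^{-({\mathcal {T}}_K^*-{\mathcal {T}}^* )^{\dagger }}\), whose inverse is \(e^{({\mathcal {T}}_K^*-{\mathcal {T}}^* )^{\dagger }}\), and with \(\Phi = e^{-({\mathcal {T}}^*)^{\dagger }}\Phi _{{\mathcal {W}}}\), this yields the stated bound. The factorisation \(e^{-({\mathcal {T}}^*_K)^{\dagger }} = e^{-({\mathcal {T}}_K^*-{\mathcal {T}}^* )^{\dagger }} e^{-({\mathcal {T}}^*)^{\dagger }}\) relies on the commutativity of cluster operators, and one natural reading of Inequality (52) is that \(\omega _K\) may be taken as \(\Vert e^{({\mathcal {T}}_K^*-{\mathcal {T}}^* )^{\dagger }}\Vert _{\widehat{{\mathcal {H}}}^{1} \rightarrow \widehat{{\mathcal {H}}}^{1}}\).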

Combining now the estimates (43)–(52) allows us to conclude that

$$\begin{aligned} \sup _{0 \ne {\varvec{s}}_K \in {\mathbb {V}}_K}\frac{ \left\langle {\varvec{w}}_K, \textrm{D}{\mathcal {f}}({\varvec{t}}^*){\varvec{s}}_K\right\rangle _{{\mathbb {V}} \times {\mathbb {V}}^*}}{ \Vert {\varvec{s}}_K\Vert _{{\mathbb {V}}}} =&\sup _{0 \ne \Phi _{{\mathcal {S}}} \in \widetilde{{\mathcal {V}}}_K} \frac{\left\langle {\mathcal {W}}_K \Psi _0, {\mathcal {A}}({\varvec{t}}^*) {\mathcal {S}}_K\Psi _0\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}}{ \Vert \Phi _{{\mathcal {S}}}\Vert _{\widehat{{\mathcal {H}}}^1}}\\ \ge&\sup _{0 \ne \Phi ^*_{K, \perp } \in \{\Psi _K^*\}^{\perp } }\frac{\mathrm{(IAA)} + \mathrm{(IAB)} + \mathrm{(IB)}}{ \Vert {\mathbb {P}}_0^{\perp } e^{-{\mathcal {T}}^*({\Pi _K})} \Phi ^*_{K, \perp }\Vert _{\widehat{{\mathcal {H}}}^1}}\\ \ge&\frac{{\gamma _K}/{\omega _K}-\varepsilon _{1, K} -\varepsilon _{2, K}}{\left\| {\mathbb {P}}_0^{\perp } e^{-{\mathcal {T}}^*({\Pi _K})}\right\| _{\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1} }\big \Vert e^{-({\mathcal {T}}^*)^{\dagger }}\Phi _{{\mathcal {W}}}\big \Vert _{\widehat{{\mathcal {H}}}^{1}}\\ \ge&\frac{{\gamma _K}/{\omega _K}-\varepsilon _{1, K} -\varepsilon _{2, K} }{ \Vert e^{({\mathcal {T}}^*)^{\dagger }}\Vert _{\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1} \Vert {\mathbb {P}}_0^{\perp } e^{-{\mathcal {T}}^*({\Pi _K})}\Vert _{\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1}} \big \Vert \Phi _{{\mathcal {W}}} \big \Vert _{\widehat{{\mathcal {H}}}^{1}}. \end{aligned}$$

Defining the constants \(\Theta _K:= \Vert e^{({\mathcal {T}}^*)^{\dagger }}\Vert _{\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1} \Vert {\mathbb {P}}_0^{\perp } e^{-{\mathcal {T}}^*({\Pi _K})}\Vert _{\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^1}\), and \(\varepsilon _K:= \varepsilon _{1, K} +\varepsilon _{2, K}\) and taking the infimum over all coefficient vectors \({\varvec{w}}_K\in {\mathbb {V}}_K\) now yields the required estimate. The fact that the constant \(\Theta _K\) is uniformly bounded above in K is a consequence of the continuity properties of exponential cluster operators together with the density of the union of subspaces \(\underset{K\ge N}{\cup }\ {\mathbb {V}}_K\) in \({\mathbb {V}}\). The fact that the inf-sup constant \(\gamma _K\) is uniformly bounded below in K is a consequence of the eigenvalue convergence \({\mathcal {E}}_K^* \rightarrow {\mathcal {E}}^*\) (see also the arguments in Remark 34). \(\square \)

Equipped with Lemma 45, we are now ready to state the final result of this section, which concerns the error between the ground state solution of the Full-CC equations in a finite basis (40) and the exact solutions of the continuous CC equations (17).

Theorem 46

(Error Estimates for Full-CC in a Finite Basis) Let the coupled cluster function \({\mathcal {f}} :{\mathbb {V}} \rightarrow {\mathbb {V}}^* \) be defined through Definition 22, for any \({\varvec{t}}\in {\mathbb {V}}\) let \(\textrm{D}{\mathcal {f}}({\varvec{t}})\) denote the Fréchet derivative of the coupled cluster function as defined through Eq. (20), let \({\varvec{t}}^* \in {\mathbb {V}}\) denote a zero of the coupled cluster function corresponding to an intermediately normalised eigenfunction \(\Psi ^* \in \widehat{{\mathcal {H}}}^1\) of the electronic Hamiltonian \({H} :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^{-1}\) with isolated, non-degenerate ground state eigenvalue \({\mathcal {E}}^*\), let \({\mathbb {V}}_K \subset {\mathbb {V}}\) denote the Hilbert subspace of coefficients as defined through Definition 37 for \(K\ge N\), let the Full-CI Hamiltonian \({H}_{K} :{\mathcal {V}}_K \rightarrow {\mathcal {V}}_K^*\) be defined according to Equation 7, assume that there exists a sequence of simple eigenpairs \((\Psi _K^*, {\mathcal {E}}_K^*) \in {\mathcal {V}}_K \times {\mathbb {R}}\) of the Full-CI Hamiltonians \(\{{H}_K\}_{K \ge N}\), i.e.,

$$\begin{aligned} \forall \Phi _K \in {\mathcal {V}}_K :\qquad \langle \Phi _K, {H}_K \Psi _K^*\rangle _{{\mathcal {V}}_K \times {\mathcal {V}}_K^*}&= {\mathcal {E}}_K^* \langle \Phi _K, \Psi _K^*\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}} ~\text { with }~ {\mathcal {E}}_K^* \text { simple} \quad \text {and such that}\nonumber \\ \lim _{K \rightarrow \infty } \Vert \Psi ^* - \Psi _K^*\Vert _{\widehat{{\mathcal {H}}}^1} =0, \quad&\lim _{K \rightarrow \infty }\vert {\mathcal {E}}^* - {\mathcal {E}}_K^* \vert =0, \end{aligned}$$

and let the constants \(\gamma _K, \Theta _K, \varepsilon _K, \omega _K >0\) be defined as in the proof of Lemma 45.

Then there exist \(K_0 \in {\mathbb {N}}\) and a constant \(\delta _0>0\) such that for all \(K \ge K_0\) there exists a unique solution \({\varvec{t}}_{K}^*\in {\mathbb {V}}_K\) to the Full-CC equations in a finite basis (40) in the closed ball \(\overline{{\mathbb {B}}_{\delta _K}({\varvec{t}}^*) }\) where \(\delta _K= \delta _0 \dfrac{{\gamma _K}/{\omega _K}-\varepsilon _K}{\Theta _K}\).

Moreover, there exists a constant \(\textrm{C}>0\) such that \(\forall K\ge K_0\) we have the quasi-optimality result

$$\begin{aligned} \Vert {\varvec{t}}^*_{K} -{\varvec{t}}^*\Vert _{{\mathbb {V}}} \le \textrm{C}\frac{\Theta _K}{{\gamma _K}/{\omega _K}-\varepsilon _K}\; \inf _{{\varvec{s}}_K \in {\mathbb {V}}_K}\Vert {\varvec{s}}_K - {\varvec{t}}^*\Vert _{{\mathbb {V}}}, \end{aligned}$$
(53)

and we have the residual-based error estimate

$$\begin{aligned} \Vert {\varvec{t}}^*_{K} -{\varvec{t}}^*\Vert _{{\mathbb {V}}} \le 2\left\| \textrm{D}{\mathcal {f}}\left( {\varvec{t}}^*_{K}\right) ^{-1}\right\| _{{\mathbb {V}}^* \rightarrow {\mathbb {V}}} \;\left\| {\mathcal {f}}\left( {\varvec{t}}^*_{K}\right) \right\| _{{\mathbb {V}}^*}. \end{aligned}$$
(54)

Proof

As mentioned at the beginning of this section, the Full-Coupled Cluster equations in a finite basis (40) are simply a Galerkin discretisation of the continuous coupled cluster equations (17). Galerkin discretisations of non-linear equations have been widely studied in the literature on non-linear numerical analysis. In particular, the proof of Theorem 46 is a direct application of [5, Theorem 7.1]. We merely have to confirm that the assumptions of [5, Theorem 7.1] hold, and this amounts to

  1. (1)

    Establishing that the coupled cluster Fréchet derivative at \({\varvec{t}}^* \in {\mathbb {V}}\), which we denote \(\textrm{D}{\mathcal {f}}({\varvec{t}}^*)\), satisfies the discrete inf-sup condition

    $$\begin{aligned} \exists \Upsilon _K >0:\qquad \inf _{0\ne {\varvec{w}}_K \in {\mathbb {V}}_K} \sup _{0 \ne {\varvec{s}}_K \in {\mathbb {V}}_K}\frac{ \left\langle {\varvec{w}}_K, \textrm{D}{\mathcal {f}}({\varvec{t}}^*){\varvec{s}}_K\right\rangle _{{\mathbb {V}} \times {\mathbb {V}}^*}}{\Vert {\varvec{w}}_K\Vert _{{\mathbb {V}}} \Vert {\varvec{s}}_K\Vert _{{\mathbb {V}}}} \ge \Upsilon _K; \end{aligned}$$
    (55)
  2. (2)

    Establishing that the coefficient subspaces \(\{{\mathbb {V}}_K\}_{K\ge N}\) satisfy the following approximability condition:

    $$\begin{aligned} \lim _{K \rightarrow \infty } \;\inf _{0\ne {\varvec{s}}_K \in {\mathbb {V}}_K}\; \frac{1}{\Upsilon _K^2}\;\Vert {\varvec{t}}^* - {\varvec{s}}_K\Vert _{{\mathbb {V}}} =0. \end{aligned}$$
    (56)

The discrete inf-sup condition (55) has been established in Lemma 45 with constant \(\Upsilon _K = \dfrac{{\gamma _K}/{\omega _K} - \varepsilon _K}{\Theta _K}\) which will obviously be positive for all K sufficiently large since \(\varepsilon _K \rightarrow 0\). It therefore remains to establish the approximability result (56) but this is a simple consequence of the previously exploited fact that the union of subspaces \(\underset{K\ge N}{\cup }\ {\mathbb {V}}_K\) is dense in \({\mathbb {V}}\) together with the fact that, as shown in the proof of Lemma 45, the constant \(\gamma _K\) is uniformly bounded below in K and the constant \(\Theta _K\) is uniformly bounded above in K. \(\square \)
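
The last step of the proof can be made explicit. The following is only a sketch under assumptions that we label here: the symbols \(\gamma , \Theta , \omega \) are introduced purely for illustration and denote a uniform lower bound \(\gamma _K \ge \gamma >0\) and a uniform upper bound \(\Theta _K \le \Theta <\infty \) (both taken from the proof of Lemma 45), together with a uniform upper bound \(\omega _K \le \omega <\infty \), which we assume since it is not restated above. Since \(\varepsilon _K \rightarrow 0\), we may take K large enough that \(\varepsilon _K \le \gamma /(2\omega )\), and then

$$\begin{aligned} \Upsilon _K = \frac{{\gamma _K}/{\omega _K} - \varepsilon _K}{\Theta _K} \ge \frac{{\gamma }/{\omega } - {\gamma }/{(2\omega )}}{\Theta } = \frac{\gamma }{2\omega \Theta } >0. \end{aligned}$$

Consequently, for all such K,

$$\begin{aligned} \frac{1}{\Upsilon _K^2}\;\inf _{{\varvec{s}}_K \in {\mathbb {V}}_K}\Vert {\varvec{t}}^* - {\varvec{s}}_K\Vert _{{\mathbb {V}}} \le \frac{4\omega ^2\Theta ^2}{\gamma ^2}\;\inf _{{\varvec{s}}_K \in {\mathbb {V}}_K}\Vert {\varvec{t}}^* - {\varvec{s}}_K\Vert _{{\mathbb {V}}}, \end{aligned}$$

and the right-hand side tends to zero as \(K \rightarrow \infty \) by the density of \(\underset{K\ge N}{\cup }\ {\mathbb {V}}_K\) in \({\mathbb {V}}\), provided the subspaces \({\mathbb {V}}_K\) are nested (which we also take as an assumption in this sketch).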

We conclude this section with several remarks.

Remark 47

(Necessity of Assumptions of Lemma 45 in Theorem 46)  Consider the setting of Lemma 45 and Theorem 46 and recall in particular the assumption that any isolated, simple eigenpair of the electronic Hamiltonian can be approximated by a sequence of simple eigenpairs of the Full-CI Hamiltonian as expressed through Eq. (42). It is readily seen from the proof of Lemma 45 that this assumption is not required if one considers invertibility of the CC Fréchet derivative \(\textrm{D}{\mathcal {f}}({\varvec{t}}^*)\) on \({\mathbb {V}}_K\) at \({\varvec{t}}^* = {\varvec{t}}^*_{\textrm{GS}}\). Indeed, in this special case, the discrete inf-sup condition for the shifted Full-CI Hamiltonian \({H}_K- {\mathcal {E}}^*\) on \(\{\Psi ^*_K\}^{\perp } \subset {\mathcal {V}}_K\) can be replaced with the coercivity of the shifted electronic Hamiltonian \({H}- {\mathcal {E}}^*_{\textrm{GS}}\) on \(\{\Psi ^*_{\textrm{GS}}\}^{\perp } \subset \widehat{{\mathcal {H}}}^1\). In this special case, therefore, the proof of Lemma 45 holds without any assumption beyond the simplicity of the ground state energy and the intermediate normalisability of the ground state wave-function. Thus we can deduce the asymptotic local well-posedness of the Full-CC equations in a finite basis (40) in a neighbourhood of \({\varvec{t}}_{\textrm{GS}}^* \in {\mathbb {V}}\) according to Theorem 46 without any additional assumptions.
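
The reason coercivity can stand in for a discrete inf-sup condition is that coercivity, unlike an inf-sup condition, is inherited by every subspace. As a brief sketch, let \(\gamma _{\textrm{GS}}>0\) denote the coercivity constant of \({H}- {\mathcal {E}}^*_{\textrm{GS}}\) on \(\{\Psi ^*_{\textrm{GS}}\}^{\perp }\) and let \({\mathcal {W}}_K \subset \{\Psi ^*_{\textrm{GS}}\}^{\perp }\) be any subspace (both \(\gamma _{\textrm{GS}}\) and \({\mathcal {W}}_K\) are notation introduced here for illustration only). Choosing the test function equal to \(\Phi _K\) itself yields, for every \(0 \ne \Phi _K \in {\mathcal {W}}_K\),

$$\begin{aligned} \sup _{0 \ne \Psi _K \in {\mathcal {W}}_K} \frac{\left\langle \Psi _K, ({H}- {\mathcal {E}}^*_{\textrm{GS}}) \Phi _K\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}}{\Vert \Psi _K\Vert _{\widehat{{\mathcal {H}}}^1}} \ge \frac{\left\langle \Phi _K, ({H}- {\mathcal {E}}^*_{\textrm{GS}}) \Phi _K\right\rangle _{\widehat{{\mathcal {H}}}^1 \times \widehat{{\mathcal {H}}}^{-1}}}{\Vert \Phi _K\Vert _{\widehat{{\mathcal {H}}}^1}} \ge \gamma _{\textrm{GS}} \Vert \Phi _K\Vert _{\widehat{{\mathcal {H}}}^1}, \end{aligned}$$

so that an inf-sup bound holds on \({\mathcal {W}}_K\) with a constant independent of K, and no approximation property of the discrete eigenpairs is needed for this step.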

Remark 48

(Comparing the Conclusions of Theorems 42 and 46) Consider the settings of Theorems 42 and 46. Let us emphasise here that, in contrast to Theorem 42, Theorem 46 does not explicitly require that the ground state wave-function in \({\mathcal {V}}_K\) of the Full-CI Hamiltonian be intermediately normalisable or that the associated ground state eigenvalue be simple. Instead these properties are inherited (for K large enough) from the properties of the exact electronic Hamiltonian \({H} :\widehat{{\mathcal {H}}}^1 \rightarrow \widehat{{\mathcal {H}}}^{-1}\) thanks to the density of the N-particle approximation spaces \(\{{\mathcal {V}}_K\}_{K\ge N}\) in \(\widehat{{\mathcal {H}}}^1\). More significantly, Theorem 46 provides error estimates for the Full-CC equations in a finite basis with respect to the zeros of the exact (infinite-dimensional) coupled cluster function.

Remark 49

(Error Estimates for the Discrete Coupled Cluster Energies) It is natural, at this point, to ask whether a priori and residual-based error estimates of the form (53) and (54) can be obtained for the Full-CC discrete energies. Quasi-optimal a priori error estimates for the discrete CC energies have been obtained by Schneider and Rohwedder [43, Theorem 4.5] using the dual weighted residual-based approach developed by Rannacher and coworkers [50]. The arguments of Schneider and Rohwedder can readily be seen to apply in our framework, and the proof and statement of [43, Theorem 4.5] can thus be copied nearly word-for-word, the only difference being that the local monotonicity constant that appears in the a priori error estimate in [43, Theorem 4.5] is replaced with the discrete inf-sup constant that we have derived in Lemma 45. The establishment of residual-based error estimates for the discrete CC energies, which requires considerably more work but can be achieved using the tools developed in this article, will be addressed in a forthcoming contribution.