1 Introduction

While Hartree–Fock theory describes some aspects of interacting fermionic systems very well, it utterly fails at others. The best known example is that Hartree–Fock theory predicts a vanishing density of states at the Fermi momentum, which is incompatible with measurements of the conductivity and specific heat in metals [30]. It is therefore important to develop a rigorous understanding of many-body corrections to Hartree–Fock theory. The simplest theory of many-body correlations is the random-phase approximation (RPA).

In this paper we show that the RPA is mathematically rigorous, insofar as the RPA correlation energy provides an upper bound on the ground state energy of interacting fermions in the mean-field scaling regime. Our approach also sheds some light on the emergence of bosonic collective modes in the Fermi gas, described by an effective quadratic Hamiltonian.

We consider a system of \(N\gg 1\) fermionic particles with mass \(m >0\) in the torus \(\mathbb {T}^3 = \mathbb {R}^3/(2\pi \mathbb {Z}^3)\), interacting via a two-body potential V, in the mean-field scaling regime. Setting

$$\begin{aligned} \hbar = N^{-1/3}, \end{aligned}$$

the Hamiltonian is defined as

$$\begin{aligned} H_N := -\frac{\hbar ^2}{2m} \sum _{i=1}^N \Delta _{x_i} + \frac{1}{N}\sum _{1\le i<j \le N} V(x_i - x_j)\;, \end{aligned}$$

and acts on the Hilbert space \(L^2_a\left( (\mathbb {T}^3)^N\right) \) consisting of square-integrable functions that are anti-symmetric under permutations of the N arguments. For simplicity we consider only the spinless case. The choice of \(\hbar = N^{-1/3}\) and coupling constant 1 / N defines the fermionic mean-field regime: it guarantees that both kinetic and potential energies are of order N, as \(N \rightarrow \infty \) (see [9] for a detailed introduction).

The ground state energy of the system is defined as

$$\begin{aligned} E_N := \inf _{\begin{array}{c} \psi \in L^2_a\left( (\mathbb {T}^3)^N\right) \\ ||\psi || = 1 \end{array}} \langle \psi , H_N \psi \rangle \;. \end{aligned}$$
(1.1)

In Hartree–Fock theory, one restricts attention to Slater determinants

$$\begin{aligned} \psi _\text {Slater} (x_1, \dots , x_N) = \frac{1}{\sqrt{N!}} \sum _{\sigma \in S_N} {\text {sgn}}(\sigma ) f_1 (x_{\sigma (1)}) f_2 (x_{\sigma (2)}) \dots f_N (x_{\sigma (N)}) \end{aligned}$$

with \(\{f_j\}_{j=1}^N\) an orthonormal set in \(L^2 (\mathbb {T}^3)\). Slater determinants are an example of quasi-free states: all reduced density matrices can be expressed in terms of the one-particle reduced density matrix \(\omega := N{\text {tr}}_{2,\ldots , N} |\psi \rangle \langle \psi |\). For a Slater determinant, one has \(\omega = \sum _{j=1}^N |f_j \rangle \langle f_j|\). In particular, the energy of a Slater determinant is given by the Hartree–Fock energy functional, depending only on \(\omega \):

$$\begin{aligned} \begin{aligned} \mathcal {E}_\text {HF}(\omega )&:= \langle \psi _\text {Slater}, H_N \psi _\text {Slater} \rangle \\&= {\text {tr}}\, \left( \frac{- \hbar ^2}{2m} \Delta \omega \right) + \frac{1}{2N} \int {{\mathrm{d}}}x {{\mathrm{d}}}y V(x-y) \omega (x,x) \omega (y,y) \\&\quad - \frac{1}{2N} \int {{\mathrm{d}}}x {{\mathrm{d}}}y V(x-y) |\omega (x,y) |^2. \end{aligned} \end{aligned}$$

(The first two summands are typically of order N and called the kinetic and direct term, respectively; the third summand is typically of order 1 and called the exchange term.) Thus, minimizing \(\mathcal {E}_\text {HF}(\omega )\) over all orthogonal projections \(\omega \) with \({\text {tr}}\, \omega = N\) gives an upper bound to the ground state energy \(E_N\). Actually, it turns out that Hartree–Fock theory provides more than an upper bound for the ground state energy: the method developed in [2, 3, 33] for the jellium model can also be applied to show that in the present mean-field scaling the Hartree–Fock minimum agrees with the many-body ground state energy up to an error of size o(1) for \(N \rightarrow \infty \). Moreover, by projection of the time-dependent Schrödinger equation onto the manifold of quasi-free states one obtains the time-dependent Hartree–Fock equation [10], which was proven to effectively approximate the many-body evolution of mean-field fermionic systems [5,6,7,8, 58, 59].

For N non-interacting particles on the torus, the ground state is given by the Slater determinant constructed from plane waves

$$\begin{aligned} f_k (x) = (2\pi )^{-3/2} e^{i k \cdot x}, \quad k \in \mathbb {Z}^3, \end{aligned}$$
(1.2)

where the momenta \(k_{1}, \ldots , k_{N}\in \mathbb {Z}^{3}\) are chosen to minimize the kinetic energy in a way compatible with the Pauli principle; i. e., by filling the Fermi ball, up to the Fermi momentum \(k_{\mathrm{F}}\). The energy \(E_{\mathrm{F}} := k_{\mathrm{F}}^2/(2m)\) is called the Fermi energy, and the sphere \(k_F \mathbb {S}^2\) of radius \(k_F\) is called the Fermi surface. (We assume that N is chosen so that this state is unique, no modes in the Fermi ball being left empty.) We shall denote by \(\omega _{{\mathrm{pw}}}\) the reduced one-particle density matrix of this state,

$$\begin{aligned} \omega _{{\mathrm{pw}}} = \sum _{i=1}^{N} |f_{k_{i}} \rangle \langle f_{k_{i}}|\;. \end{aligned}$$
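As a quick numerical illustration of the mean-field scaling (a sketch only, not part of the argument), one can fill the Fermi ball explicitly and check that, with \(\hbar = N^{-1/3}\) and \(m = 1\), the kinetic energy per particle stays of order one. By the continuum approximation \(\sum _{|k|\le k_{\mathrm{F}}} |k|^2 \approx 4\pi k_{\mathrm{F}}^5/5\) it should approach \(\frac{3}{10}\kappa _0^2\) with \(\kappa _0 = (3/4\pi )^{1/3}\). The function names `fermi_ball` and `kinetic_energy_per_particle` are hypothetical helpers introduced here for illustration.

```python
import math

def fermi_ball(k_f):
    """Return all lattice points k in Z^3 with |k| <= k_f."""
    r = int(math.floor(k_f))
    rng = range(-r, r + 1)
    return [(x, y, z) for x in rng for y in rng for z in rng
            if x * x + y * y + z * z <= k_f * k_f]

def kinetic_energy_per_particle(k_f, m=1.0):
    """hbar^2/(2m) * sum_{k in B_F} |k|^2 / N, with hbar = N^(-1/3)."""
    ball = fermi_ball(k_f)
    n = len(ball)
    hbar = n ** (-1.0 / 3.0)
    total = sum(x * x + y * y + z * z for (x, y, z) in ball)
    return hbar ** 2 * total / (2.0 * m * n)

# Continuum prediction for the limit of the kinetic energy per particle.
KAPPA_0 = (3.0 / (4.0 * math.pi)) ** (1.0 / 3.0)
CONTINUUM_LIMIT = 0.3 * KAPPA_0 ** 2

if __name__ == "__main__":
    for k_f in (5.0, 10.0, 20.0):
        print(k_f, len(fermi_ball(k_f)),
              kinetic_energy_per_particle(k_f), CONTINUUM_LIMIT)
```

The total kinetic energy is then N times a quantity of order one, in line with the statement that the kinetic energy is of order N in this scaling.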

It turns out that this simple state is a stationary state of the Hartree–Fock energy functional even with interactions, and in our setting provides a good approximation to the minimum of the Hartree–Fock functional. The focus of the present paper is to quantify the effect of correlations in the true many-body ground state: in particular, we shall be interested in the correlation energy, defined as the difference of the ground state energy and the Hartree–Fock energy of the plane wave state, \(E_{N} - \mathcal {E}_{\mathrm{HF}}(\omega _{\mathrm{pw}})\).

The quest to calculate the correlation energy has been a driving force in the early development of theoretical condensed matter physics. Let us discuss the case of the jellium model: that is, fermions interacting via Coulomb repulsion, exposed to a neutralizing background charge on the torus, in the large volume limit. Let us consider the ground state energy per volume of the system, in the high density regime. As noticed already by Wigner [66] and Heisenberg [40], the computation of the correlation energy is an intricate matter because perturbation theory with respect to the Coulomb potential becomes more and more infrared divergent at higher orders. It was however quickly understood that these divergences are an artefact of perturbation theory [53]; a partial resummation of the perturbative expansion makes it possible to capture the effect of screening, which ultimately trades the infrared divergence for a \(\rho \log \rho \) contribution (\(\rho \) being the density) to the ground state energy.

In their seminal work [13, 55] Bohm and Pines related the screening of the Coulomb potential to an auxiliary bosonic mode called the plasmon, and coined the name “random-phase approximation”; see also [27] for a reformulation of their result using Jastrow–type states. Gell-Mann and Brueckner showed that the RPA can be seen as a systematic resummation of the most divergent diagrams of perturbation theory [28], which has become the most popular point of view for physicists. Another interpretation of the RPA was given by Sawada et al. in [60, 61] as an effective theory of approximately bosonic particle–hole pairs. A systematic mapping of particle–hole pairs to bosonic operators was introduced by Usui in [64] but does not lead to a quadratic Hamiltonian. (In Usui’s approach there are parallels to bosonization in the Heisenberg model [18, 20, 21, 41], which also gives rise to interesting problems in the calculation of higher order corrections to the free energy [4].) Sawada’s approach has been systematically related to perturbation theory in [1]. Sawada’s effective Hamiltonian has proved useful for further investigations into diamagnetism and the Meissner effect [65]. While Sawada’s concept of bosonic pairs is very elegant, it remained unclear which parameter makes the error of the bosonic approximation small. This was clarified many years later, highlighting the role of collective excitations delocalized over many particle–hole pairs [15, 16, 24,25,26, 39, 42, 43, 45,46,47, 52, 54]; the main idea being that collective excitations of pairs of fermions do not experience the Pauli exclusion principle if they involve many fermionic modes of which only few are occupied.

Concerning rigorous works for the jellium model, the only available result for the correlation energy is the work of Graf and Solovej [33], which provided an upper and lower bound proportional to \(\rho ^{4/3 - \delta }\) for some \(\delta > 0\). This bound has been obtained using correlation inequalities for the many-body interaction together with semiclassical methods. Unfortunately, this is still far from the expected \(\rho \log \rho \) behavior: to improve on [33], new ideas are needed.

In the context of interacting fermions in the mean-field regime, the first rigorous result on the correlation energy has been recently obtained in [37], for small interaction potentials, via upper and lower bounds matching at leading order. One has:

$$\begin{aligned} \lim _{N \rightarrow \infty } \frac{E_N-\mathcal {E}_{\mathrm{HF}}(\omega _{\mathrm{pw}})}{\hbar } = - m \pi (1-\log (2)) \sum _{k \in \mathbb {Z}^3} |k|\hat{V}(k)^2 \big (1 + \mathcal {O}({\hat{V}}(k)) \big )\;. \end{aligned}$$
(1.3)

The strategy of [37] is based on a rigorous formulation of second order perturbation theory following [17, 35, 36, 38], combined with methods developed in the context of many-body quantum dynamics [5, 7, 8, 58]. For larger interaction potentials however, this method is limited to a lower bound of the right order in \(\hbar \) and N but not capturing the precise value.

Here we shall provide a rigorous upper bound on the correlation energy, without any smallness assumption on the size of the potential. It improves on the upper bound of [37], to which it reduces in the limit of small interactions. The method of the proof is inspired by a mapping of the particle–hole excitations around the Fermi surface to emergent bosonic degrees of freedom: this allows us to estimate the correlation energy in terms of the ground state energy of a quadratic, bosonic Hamiltonian. The expression we obtain, if formally extrapolated to the infinite volume limit, agrees with the Gell-Mann–Brueckner formula for the jellium model.

Our method can be seen as a rigorous version of the Haldane–Luther bosonization for interacting Fermi gases, a nonperturbative technique widely used in condensed matter physics; see [44] for a review. To our knowledge, this is the first time that this method is formulated in a mathematically rigorous setting. We believe that this method, possibly combined with [37], will be crucial to rigorously understand the correlation energy for a large class of high density Fermi gases, including the jellium model.

Correlation corrections to the ground state energy of interacting Bose gases have been studied to a much larger extent. Upper and lower bounds have been proven for the mean-field scaling regime in [19, 34, 49, 56, 57, 62], for the jellium model in [50, 51, 63], for the Gross–Pitaevskii scaling regime in [12], and in an intermediate scaling regime in [11, 14, 29]. The Lee–Huang–Yang formula for the low-density limit has been proven as an upper bound in [22] for small potential and in [67] for general potential, and only very recently as a lower bound [23].

2 Main Result

In this section we present our main result, Theorem 2.1. Our theorem provides an upper bound for the ground state energy, which is consistent with the Gell-Mann–Brueckner formula for the correlation energy.

Notice that for the interaction potential we normalize the Fourier transform such that \(\hat{V}(k) = (2\pi )^{-3} \int {{\mathrm{d}}}x\, e^{-ik\cdot x} V(x)\), whereas for wave functions we choose it unitary in \(L^2\).

Theorem 2.1

(Upper Bound for the Ground State Energy). Let \(\hat{V}: \mathbb {Z}^3 \rightarrow \mathbb {R}\) be non-negative and compactly supported. Let \(k_F > 0\) be the Fermi momentum and \(N := |\{ k \in \mathbb {Z}^3 : |k|\le k_F\}|\) the number of particles; recall that \(\hbar = N^{-1/3}\). Let \(\omega _{\mathrm{pw}} := \sum _{k \in \mathbb {Z}^3 : |k|\le k_F} |f_k \rangle \langle f_k |\) be the projection on the filled Fermi ball. Then, asymptotically for \(k_F \rightarrow \infty \), the ground state energy (1.1) satisfies the upper bound

$$\begin{aligned} E_N&\le \mathcal {E}_{\mathrm{HF}}(\omega _{\mathrm{pw}}) \nonumber \\&\quad + \frac{\hbar \kappa _0}{2m} \sum _{k \in \mathbb {Z}^3} |k|\left[ \frac{1}{\pi } \int _0^\infty \log \left( 1+4\pi \hat{V}(k) m\kappa _0 \left( 1 - \lambda \arctan \frac{1}{\lambda } \right) \right) {{\mathrm{d}}}\lambda - \hat{V}(k)m\kappa _0\pi \right] \nonumber \\&\quad + \hbar \, \mathcal {O}(N^{-1/27})\;, \end{aligned}$$
(2.1)

where \(\kappa _0 = (\frac{3}{4\pi })^{1/3}\).

Remarks

  1. (i)

    We conjecture that there is actually equality in (2.1); i. e., a corresponding lower bound, possibly with different error exponent, should hold.

  2. (ii)

    Recall that the Hartree–Fock energy \(\mathcal {E}_{\mathrm{HF}}(\omega _{\mathrm{pw}})\) consists of kinetic energy (order N), direct interaction energy (order N), and exchange interaction energy (order 1). Our many-body correction is of order \(\hbar = N^{-1/3}\). As expected, it is negative, so that it improves over \(\mathcal {E}_{\mathrm{HF}}(\omega _{\mathrm{pw}})\).

  3. (iii)

    Notice that already with regular interaction potential the correlation correction at order \(\hbar \) involves arbitrarily high powers of the interaction potential.

  4. (iv)

    If we formally extrapolate our formula to the jellium model it agrees with the correlation energy first obtained by Gell-Mann and Brueckner [28, Equation (19)] as a power series; see also [61, Equation (37)] for the first appearance of the explicit expression. Gell-Mann and Brueckner also obtain a contribution from a second order exchange-type term denoted \({\epsilon _b}^{(2)}\); for us, in mean-field scaling and with compactly supported \(\hat{V}\), this term is only of order \(\hbar ^2\). However, since our trial state captures the second order direct-type term correctly and can be expanded in powers of \(\hat{V}\), we expect that it would also capture the second order exchange-type term in models where it has a bigger contribution.

  5. (v)

    For small interaction potentials \({\hat{V}}\), we can expand

    $$\begin{aligned} \begin{aligned}&\frac{1}{\pi } \int _0^\infty \log \left( 1+4\pi \hat{V}(k) m\kappa _0 \left( 1 - \lambda \arctan \frac{1}{\lambda } \right) \right) {{\mathrm{d}}}\lambda - \hat{V}(k)m\kappa _0\pi \\&\quad = - \frac{8 \pi ^2}{3} \hat{V}(k)^2 m^2 \kappa ^2_0 \left( 1-\log (2)\right) + \mathcal {O}\left( \hat{V}(k)^3\right) . \end{aligned} \end{aligned}$$

    Therefore

    $$\begin{aligned} \frac{E_N - \mathcal {E}_{\mathrm{HF}}(\omega _{\mathrm{pw}})}{\hbar } \le - m \pi (1-\log (2)) \sum _{k \in \mathbb {Z}^3} |k|\hat{V}(k)^2 (1 + \mathcal {O}({\hat{V}}(k))) + \mathcal {O}(N^{-1/27}). \end{aligned}$$

    This is consistent with [37], see (1.3) (notice that [37] considered the Fermi gas in \([0,1]^3\) instead of \([0,2\pi ]^3\)). Whereas [37] uses rigorous second-order perturbation theory, here we use a non-perturbative bosonization method which directly yields a resummation of the dominant contributions of the perturbation series both of the ground state and the ground state energy to all orders in the potential.

  6. (vi)

    The assumption of \(\hat{V}\) being compactly supported is mainly used to control the number of particle–hole pairs that may be lost near the boundaries of patches (see Sect. 6) and to avoid interaction between different patches across the separating corridors (see Fig. 2). A sufficiently fast power law decay of \(\hat{V}(k)\) for large k should allow one to control such error terms, but to keep the article readable we do not follow up on this question here.
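The small-potential expansion in Remark (v) can be checked numerically. It rests on the two integrals \(\int _0^\infty (1 - \lambda \arctan \frac{1}{\lambda }) {{\mathrm{d}}}\lambda = \pi /4\) (which cancels the linear term) and \(\int _0^\infty (1 - \lambda \arctan \frac{1}{\lambda })^2 {{\mathrm{d}}}\lambda = \frac{\pi }{3}(1-\log 2)\). The following is a minimal sketch, assuming a plain midpoint rule after the substitution \(\lambda = t/(1-t)\); the names `g`, `integrate_half_line`, `rpa_bracket` and `second_order`, and the shorthand `x` for \(\hat{V}(k) m \kappa _0\), are introduced here for illustration only.

```python
import math

def g(lam):
    """The function 1 - lambda * arctan(1/lambda) appearing in (2.1)."""
    return 1.0 - lam * math.atan(1.0 / lam)

def integrate_half_line(f, n=20000):
    """Midpoint rule for int_0^infty f(lam) dlam via lam = t/(1-t), t in (0,1)."""
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        lam = t / (1.0 - t)
        total += f(lam) / (1.0 - t) ** 2  # Jacobian dlam/dt = 1/(1-t)^2
    return total / n

def rpa_bracket(x):
    """Square bracket of (2.1) as a function of x = V_hat(k) * m * kappa_0."""
    integral = integrate_half_line(
        lambda lam: math.log1p(4.0 * math.pi * x * g(lam)))
    return integral / math.pi - math.pi * x

def second_order(x):
    """Leading term of the expansion in Remark (v)."""
    return -(8.0 * math.pi ** 2 / 3.0) * x ** 2 * (1.0 - math.log(2.0))

if __name__ == "__main__":
    for x in (0.001, 0.01, 0.1, 1.0):
        print(x, rpa_bracket(x), second_order(x))
```

Since \(\log (1+y) < y\) for \(y > 0\), the bracket is negative for every positive value of `x`, in agreement with Remark (ii); for small `x` it approaches the second-order term.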

In the remaining part of the paper we prove Theorem 2.1. Our proof is based on a reorganization of the particle–hole excitations around the Fermi surface in terms of approximately bosonic collective degrees of freedom, which we will introduce in the next section. Notice that 1 / m can be factored out from the Hamiltonian, replacing the potential V by mV, so we consider only \(m=1\) and the dependence on m is easily restored at the end.

3 Collective Particle–Hole Pairs

In this section we represent the correlation energy in terms of particle–hole excitations around the Fermi surface. These excitations will be described by quadratic fermionic operators on the Fock space, that behave as almost bosonic operators. The advantage of this rewriting is that the correlation energy can thus be related to the ground state energy of a quadratic almost-bosonic Hamiltonian.

3.1 The correlation Hamiltonian

Here we shall introduce a Fock space representation of the model. We shall follow the notations of [9, Chapter 6], to which we refer for more details. Let \(\mathcal {F}:= \mathcal {F}(L^{2}(\mathbb {T}^{3}))\) be the fermionic Fock space built on the single-particle space \(L^{2}(\mathbb {T}^{3})\). Let us denote by \(\mathcal {H}_{N}\) the second quantization of \(H_{N}\). We have

$$\begin{aligned} \mathcal {H}_N = \frac{\hbar ^2}{2} \int {{\mathrm{d}}}x \nabla _x a^*_x \nabla _x a_x + \frac{1}{2N} \int {{\mathrm{d}}}x {{\mathrm{d}}}y\, V(x-y) a^*_x a^*_y a_y a_x, \end{aligned}$$

where \(a^{*}_{x}\), \(a_{x}\) are the creation and annihilation operators (more precisely, operator-valued distributions), creating or annihilating a fermionic particle at \(x \in \mathbb {T}^3\). They satisfy the usual canonical anticommutation relations (CAR)

$$\begin{aligned} \{a_x,a_y\} = 0 = \{a^*_x,a^*_y\}, \quad \{a_x,a^*_y\} = \delta (x-y). \end{aligned}$$
(3.1)
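For readers who prefer a concrete picture, the CAR can be verified in a finite mode basis, where (3.1) reads \(\{a_k, a^*_l\} = \delta _{k,l}\) and \(\{a_k, a_l\} = 0\). The sketch below is our own illustration, not the representation used in the proofs: it encodes Fock basis vectors as occupation bitmasks and uses the standard Jordan–Wigner sign convention; all function names are hypothetical.

```python
from itertools import product

def _sign(mode, state):
    """Jordan-Wigner sign: parity of occupied modes with index below `mode`."""
    return -1 if bin(state & ((1 << mode) - 1)).count("1") % 2 else 1

def apply_annihilator(mode, vec):
    """Apply a_mode to vec, a dict {occupation bitmask: coefficient}."""
    out = {}
    for state, coeff in vec.items():
        if (state >> mode) & 1:  # mode occupied, otherwise a_mode gives 0
            new = state ^ (1 << mode)
            out[new] = out.get(new, 0) + _sign(mode, state) * coeff
    return out

def apply_creator(mode, vec):
    """Apply a*_mode to vec; occupied modes are annihilated (Pauli principle)."""
    out = {}
    for state, coeff in vec.items():
        if not (state >> mode) & 1:
            new = state | (1 << mode)
            out[new] = out.get(new, 0) + _sign(mode, state) * coeff
    return out

def add(u, v):
    """Sum of two vectors, dropping zero coefficients."""
    out = dict(u)
    for state, coeff in v.items():
        out[state] = out.get(state, 0) + coeff
    return {s: c for s, c in out.items() if c != 0}

def check_car(n_modes):
    """Check {a_i, a*_j} = delta_ij and {a_i, a_j} = 0 on all basis vectors."""
    for state in range(1 << n_modes):
        vec = {state: 1}
        for i, j in product(range(n_modes), repeat=2):
            mixed = add(apply_annihilator(i, apply_creator(j, vec)),
                        apply_creator(j, apply_annihilator(i, vec)))
            if mixed != (vec if i == j else {}):
                return False
            same = add(apply_annihilator(i, apply_annihilator(j, vec)),
                       apply_annihilator(j, apply_annihilator(i, vec)))
            if same != {}:
                return False
    return True

if __name__ == "__main__":
    print(check_car(3))
```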

Given a function \(f \in L^{2}(\mathbb {T}^{3})\) we also define \(a(f) := \int {{\mathrm{d}}}x\, a_{x} \overline{f(x)}\) and \(a^{*}(f) = \left( a(f)\right) ^{*}\).

Let us define the Fermi ball

$$\begin{aligned} B_{{\mathrm{F}}} := \{ k\in \mathbb {Z}^{3} : |k|\le k_{{\mathrm{F}}} \}\;, \end{aligned}$$

where \(k_{{\mathrm{F}}}\) is the Fermi momentum. Let N be the number of points in the Fermi ball, \(N := |B_{{\mathrm{F}}} |\). Then, by Gauss’ classical counting argument,

$$\begin{aligned} \begin{aligned} k_{{\mathrm{F}}} = \kappa N^{1/3}\;,\qquad \kappa = \kappa (N)&= (3/4\pi )^{1/3} + \mathcal {O}(N^{-1/3})\\&=: \kappa _0 + \mathcal {O}(N^{-1/3}) \;. \end{aligned} \end{aligned}$$
(3.2)
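The relation (3.2) is easy to probe numerically by counting lattice points. The following sketch (with hypothetical helper names `count_fermi_ball` and `kappa`) computes \(N = |B_{\mathrm{F}}|\) by brute force for a given \(k_{\mathrm{F}}\) and returns \(\kappa = k_{\mathrm{F}} N^{-1/3}\), which can then be compared with \(\kappa _0\).

```python
import math

def count_fermi_ball(k_f):
    """Number of lattice points k in Z^3 with |k| <= k_f (brute force)."""
    r = int(math.floor(k_f))
    rng = range(-r, r + 1)
    return sum(1 for x in rng for y in rng for z in rng
               if x * x + y * y + z * z <= k_f * k_f)

def kappa(k_f):
    """Return (N, kappa) with N = |B_F| and kappa = k_f / N^(1/3)."""
    n = count_fermi_ball(k_f)
    return n, k_f / n ** (1.0 / 3.0)

# Limiting value kappa_0 = (3 / (4 pi))^(1/3) from (3.2).
KAPPA_0 = (3.0 / (4.0 * math.pi)) ** (1.0 / 3.0)

if __name__ == "__main__":
    for k_f in (2.0, 5.0, 10.0, 20.0):
        n, kap = kappa(k_f)
        print(k_f, n, kap, kap - KAPPA_0)
```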

We also introduce the complement of the Fermi ball,

$$\begin{aligned} B_{{\mathrm{F}}}^{c} = \mathbb {Z}^{3} \setminus B_{{\mathrm{F}}}. \end{aligned}$$

The filled Fermi ball is obtained by considering the Slater determinant \(\psi _{\mathrm{pw}}\) built from the plane waves \(f_{k_{i}}(x) = (2\pi )^{-3/2} e^{ik_{i}\cdot x}\), associated to the points \(k_{i} \in B_{{\mathrm{F}}}\), \(i = 1,\ldots , N\). Let \(\omega _{{\mathrm{pw}}}\) be the reduced one-particle density matrix associated to such states, \(\omega _{{\mathrm{pw}}} = \sum _{i=1}^{N} |f_{k_{i}} \rangle \langle f_{k_{i}} |\). With the plane waves \(f_k\) defined in (1.2), we define the unitary particle–hole transformation \(R_{\omega _{\mathrm{pw}}}: \mathcal {F}\rightarrow \mathcal {F}\) by setting

$$\begin{aligned} R_{\omega _{\mathrm{pw}}} a (f_k) R^*_{{\omega _{\mathrm{pw}}}} := \left\{ \begin{array}{cc} a(f_k) &{} \text {for }k \in B_{\mathrm{F}}^c\\ a^*(f_k) &{} \text {for }k \in B_{\mathrm{F}}\end{array} \right. \qquad \text {and} \qquad R_{\omega _{\mathrm{pw}}} \Omega := \psi _{\mathrm{pw}}. \end{aligned}$$

Here we introduced the vacuum vector \(\Omega = (1,0,0,\ldots ) \in \mathcal {F}\). Particle–hole transformations are a particular kind of fermionic Bogoliubov transformation. In fact, formally writing \(a_x = a(\delta (\cdot -x))\) and \(\delta (y-x) = \sum _{k \in \mathbb {Z}^3} f_k(y) \overline{f_k(x)}\) one can rewrite the previous relation in position space,

$$\begin{aligned} R_{\omega _{\mathrm{pw}}} a_x R^*_{\omega _{\mathrm{pw}}} = a(u_x) + a^*(\overline{v}_x), \qquad R_{\omega _{\mathrm{pw}}} a^*_x R^*_{\omega _{\mathrm{pw}}} = a^*(u_x) + a(\overline{v}_x), \end{aligned}$$
(3.3)

where \(u= \mathbb {I}- \omega _{\mathrm{pw}}\), \(v = \sum _{k \in B_{\mathrm{F}}} |\overline{f_k}\rangle \langle f_k |\) and where we also introduced the short-hand notation \(v_x(\cdot ) = v(\cdot ,x) = \sum _{k \in B_{\mathrm{F}}} \overline{f_k}(\cdot ) \overline{f_k}(x)\) and \(u_x(\cdot ) = u(\cdot ,x) = \delta (\cdot -x)-\sum _{k \in B_{\mathrm{F}}} {f_k}(\cdot ) \overline{f_k}(x)\).

The state \(R_{\omega _{\mathrm{pw}}} \Omega \) plays the role of the new vacuum for the model, on which the new fermionic operators \(R_{\omega _{\mathrm{pw}}} a (f_k) R^*_{{\omega _{\mathrm{pw}}}}\) act. We call momenta in \(B_{\mathrm{F}}\) hole modes, and momenta in \(B^{c}_{\mathrm{F}}\) particle modes. We will use the notation \(a^*_k := a^*(f_k)\). If we want to emphasize that the index is outside the Fermi ball we write \(a^*_p\), \(p \in B_{\mathrm{F}}^c\) (“p” like “particle”) and say that \(a^*_p\) creates a particle. Similarly we use \(a^*_h\), \(h \in B_{\mathrm{F}}\) (“h” like “hole”) and say that \(a^*_h\) creates a hole in the Fermi ball. We call \(\mathcal {N}_{\mathrm{p}} := \sum _{p \in B_{\mathrm{F}}^c} a^*_p a_p\) the number-of-particles operator and \(\mathcal {N}_{\mathrm{h}} := \sum _{h \in B_{\mathrm{F}}} a^*_h a_h\) the number-of-holes operator. If we do not want to distinguish between particles and holes we use the word “fermion”, for example calling \(\mathcal {N}= \mathcal {N}_{\mathrm{p}} + \mathcal {N}_{\mathrm{h}}\) the number-of-fermions operator.

Let us consider the conjugated Hamiltonian \(R^*_{{\omega _{\mathrm{pw}}}} \mathcal {H}_N R_{\omega _{\mathrm{pw}}}\). Using (3.3), and rewriting the result into a sum of normal-ordered contributions one gets (see [9, Chapter 6] for a similar computation in the context of many-body quantum dynamics):

$$\begin{aligned} R^*_{{\omega _{\mathrm{pw}}}} \mathcal {H}_N R_{\omega _{\mathrm{pw}}}&= \mathcal {E}_{\mathrm{HF}}(\omega _{\mathrm{pw}}) + {{\mathrm{d}}}\Gamma (uhu-\overline{v}\overline{h}v) + Q_N \end{aligned}$$
(3.4)

with \({{\mathrm{d}}}\Gamma (A)\) the second quantization of a one-particle operator A. The operator h is the one-particle Hartree–Fock Hamiltonian, given by

$$\begin{aligned} h = -\frac{\hbar ^2 \Delta }{2} + (2\pi )^3 \hat{V} (0) + X \, \end{aligned}$$
(3.5)

where X is the exchange operator, defined by its integral kernel \(X(x,y) = -N^{-1} V(x-y) \omega _{\mathrm{pw}}(x,y)\). As for the operator \(Q_N\) on the r. h. s. of (3.4), it contains all contributions that are quartic in creation and annihilation operators. It is given by

$$\begin{aligned} \begin{aligned} Q_N&= \frac{1}{2N} \int _{\mathbb {T}^3 \times \mathbb {T}^3} {{\mathrm{d}}}x{{\mathrm{d}}}y\, V(x-y) \bigg (\mathcal {E}_1(x,y) + 2 a^*(u_x) a^*(\overline{v}_x) a(\overline{v}_y) a(u_y)\\& + \Big [ a^*(u_x) a^*(\overline{v}_x) a^*(u_y) a^*(\overline{v}_y) + \mathcal {E}_2(x,y) + {\mathrm{h.c.}}\Big ] \bigg ) \end{aligned} \end{aligned}$$

where

$$\begin{aligned} \mathcal {E}_1(x,y)= & {} a^*(u_x)a^*(u_y) a(u_y)a(u_x)- 2 a^*(u_x) a^*(\overline{v}_y) a(\overline{v}_y) a(u_x) \nonumber \\&+\, a^*(\overline{v}_y)a^*(\overline{v}_x) a(\overline{v}_x) a(\overline{v}_y) \end{aligned}$$
(3.6)

and

$$\begin{aligned} \mathcal {E}_2(x,y) = - 2a^*(u_x) a^*(u_y)a^*(\overline{v}_x) a(u_y) + 2 a^*(u_x) a^*(\overline{v}_y) a^*(\overline{v}_x) a(\overline{v}_y). \end{aligned}$$
(3.7)

As we shall see, both \(\mathcal {E}_{1}\) and \(\mathcal {E}_{2}\) will provide subleading corrections to the correlation energy, as \(N\rightarrow \infty \). The operator \(R^*_{{\omega _{\mathrm{pw}}}} \mathcal {H}_N R_{\omega _{\mathrm{pw}}} - \mathcal {E}_{\mathrm{HF}}(\omega _{\mathrm{pw}})\) is called the correlation Hamiltonian,

$$\begin{aligned} \mathcal {H}_{{\mathrm{corr}}} := {{\mathrm{d}}}\Gamma (uhu-\overline{v}\overline{h}v) + Q_N\;. \end{aligned}$$
(3.8)

Let \(\psi \in \mathcal {F}\) be a normalized N-particle state in the fermionic Fock space, that is \(\psi = (0, 0, \ldots , 0, \psi ^{(N)}, 0, \ldots )\). By the variational principle, we have

$$\begin{aligned} E_{N} \le \langle \psi , \mathcal {H}_{N} \psi \rangle = \mathcal {E}_{\mathrm{HF}}(\omega _{\mathrm{pw}}) + \langle \xi , \mathcal {H}_{{\mathrm{corr}}} \xi \rangle \;, \end{aligned}$$

where \(\xi = R_{\omega _{\mathrm{pw}}}^* \psi \). The last step follows from the identity (3.4).

We are going to construct an N-particle state \(\psi _{\mathrm{trial}} = R_{\omega _{\mathrm{pw}}} \xi \) such that \(\langle \xi , \mathcal {H}_{\mathrm{corr}} \xi \rangle \) is given by the Gell-Mann–Brueckner formula

$$\begin{aligned} \frac{\hbar \kappa _0}{2} \sum _{k \in \mathbb {Z}^3} |k|\left[ \frac{1}{\pi } \int _0^\infty \log \left( 1+4\pi \hat{V}(k) \kappa _0 \left( 1 - \lambda \arctan \frac{1}{\lambda } \right) \right) {{\mathrm{d}}}\lambda - \hat{V}(k)\kappa _0\pi \right] , \end{aligned}$$

up to errors that are of smaller order as \(N \rightarrow \infty \). To construct this state, we shall represent \(\mathcal {H}_{{\mathrm{corr}}}\) in terms of suitable almost-bosonic operators, obtained by combining fermionic particle–hole excitations. As we shall see, the resulting expression will be quadratic in terms of these new operators; the state \(\xi \) will be chosen to minimize the bosonic energy.

3.2 Particle–hole excitations

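We start by rewriting the quartic contribution to the correlation Hamiltonian as

$$\begin{aligned} \begin{aligned} Q_N&= Q_N^{{\mathrm{B}}} + \frac{1}{2N} \int _{\mathbb {T}^3 \times \mathbb {T}^3} {{\mathrm{d}}}x{{\mathrm{d}}}y\, V(x-y) \Big (\mathcal {E}_1(x,y) + \big [ \mathcal {E}_2(x,y) + {\mathrm{h.c.}}\big ] \Big )\;, \\ Q_N^{{\mathrm{B}}}&:= \frac{1}{2N} \int _{\mathbb {T}^3 \times \mathbb {T}^3} {{\mathrm{d}}}x{{\mathrm{d}}}y\, V(x-y) \Big ( 2 a^*(u_x) a^*(\overline{v}_x) a(\overline{v}_y) a(u_y) + \big [ a^*(u_x) a^*(\overline{v}_x) a^*(u_y) a^*(\overline{v}_y) + {\mathrm{h.c.}}\big ] \Big )\;. \end{aligned} \end{aligned}$$
(3.9)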

The main contribution to \(Q_N\) is \(Q_{N}^{{\mathrm{B}}}\), which, as we shall see, can be represented as a quadratic operator in terms of collective particle–hole pair operators. These operators behave approximately like bosonic creation and annihilation operators.

Let us define the (unnormalized) particle–hole operator as

$$\begin{aligned} {\tilde{b}}^*_k := \int _{\mathbb {T}^3} {{\mathrm{d}}}x\, a^*(u_x) e^{ikx} a^*(\overline{v}_x). \end{aligned}$$

Notice that \(\tilde{b}^*_0 = 0\) since \(u\overline{v} =0\). Writing this operator in momentum representation,

$$\begin{aligned} {\tilde{b}}^*_k = \sum _{\begin{array}{c} p\in B_{\mathrm{F}}^c\\ h \in B_{\mathrm{F}} \end{array}} a^*_p a^*_h \delta _{p-h,k}, \end{aligned}$$
(3.10)

we can think of it as creating a particle–hole pair of momentum k, delocalized over all the Fermi surface. In terms of these operators

$$\begin{aligned} Q_N^{{\mathrm{B}}} = \frac{1}{2N} \sum _{k\in \mathbb {Z}^3 \setminus \{0\}} \hat{V}(k) \left( 2 {\tilde{b}}^*_k {\tilde{b}}_k + {\tilde{b}}^*_k {\tilde{b}}^*_{-k} + {\tilde{b}}_{-k} {\tilde{b}}_k \right) . \end{aligned}$$
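For orientation, the passage from position space to this momentum-space form can be sketched as follows. We assume here that \(Q_N^{{\mathrm{B}}}\) collects exactly those terms of \(Q_N\) that are built only from the products \(a^*(u_x) a^*(\overline{v}_x)\) and their adjoints, and we use the Fourier series \(V(x-y) = \sum _{k \in \mathbb {Z}^3} \hat{V}(k) e^{ik\cdot (x-y)}\) implied by the normalization of \(\hat{V}\) in Sect. 2. With \({\tilde{b}}_k = ({\tilde{b}}^*_k)^* = \int {{\mathrm{d}}}y\, a(\overline{v}_y) e^{-iky} a(u_y)\) one finds

$$\begin{aligned} \begin{aligned} \frac{1}{2N} \int {{\mathrm{d}}}x {{\mathrm{d}}}y\, V(x-y)\, 2 a^*(u_x) a^*(\overline{v}_x) a(\overline{v}_y) a(u_y)&= \frac{1}{2N} \sum _{k \in \mathbb {Z}^3} \hat{V}(k)\, 2 \left( \int {{\mathrm{d}}}x\, a^*(u_x) e^{ikx} a^*(\overline{v}_x) \right) \left( \int {{\mathrm{d}}}y\, a(\overline{v}_y) e^{-iky} a(u_y) \right) \\&= \frac{1}{2N} \sum _{k \in \mathbb {Z}^3} \hat{V}(k)\, 2 {\tilde{b}}^*_k {\tilde{b}}_k , \\ \frac{1}{2N} \int {{\mathrm{d}}}x {{\mathrm{d}}}y\, V(x-y)\, a^*(u_x) a^*(\overline{v}_x) a^*(u_y) a^*(\overline{v}_y)&= \frac{1}{2N} \sum _{k \in \mathbb {Z}^3} \hat{V}(k)\, {\tilde{b}}^*_k {\tilde{b}}^*_{-k} , \end{aligned} \end{aligned}$$

and the hermitian conjugate of the second line yields the term \({\tilde{b}}_{-k} {\tilde{b}}_k\). The mode \(k=0\) drops out of the sums because \({\tilde{b}}^*_0 = 0\).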

Recall that \(\hat{V}\) has compact support by assumption, so there exists

$$\begin{aligned} R> 0 \text { such that } \hat{V} (k) = 0 \text { for all } |k|> R\;. \end{aligned}$$

It is convenient to group together k and \(-k\) modes, as follows. Define

$$\begin{aligned} \Gamma ^{{\mathrm{nor}}}\subset \mathbb {Z}^3 \end{aligned}$$
(3.11)

as the set of all \(k \in \mathbb {Z}^3 \cap B_R(0)\) with \(k_3 >0\) and additionally half of the k-vectors with \(k_3 =0\), such that for every \(k \in \Gamma ^{{\mathrm{nor}}}\) we have \(-k \not \in \Gamma ^{{\mathrm{nor}}}\). We then rewrite \(Q_N^{{\mathrm{B}}}\) as

$$\begin{aligned} Q_N^{{\mathrm{B}}} = \frac{1}{2N} \sum _{k \in \Gamma ^{{\mathrm{nor}}}} \hat{V}(k) \left( 2 {\tilde{b}}^*_k {\tilde{b}}_k + {\tilde{b}}^*_k {\tilde{b}}^*_{-k} + {\tilde{b}}_{-k} {\tilde{b}}_k + 2 {\tilde{b}}^*_{-k} {\tilde{b}}_{-k} + {\tilde{b}}^*_{-k} {\tilde{b}}^*_{k} + {\tilde{b}}_{k} {\tilde{b}}_{-k}\right) . \end{aligned}$$
(3.12)

It turns out that the operators \({\tilde{b}}_k\) behave as approximate bosonic operators, whenever acting on vectors of \(\mathcal {F}\) with only a few particles; the Pauli principle is relaxed by summing over a large number of momenta of which typically only few are occupied.
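The relaxation of the Pauli principle can be made quantitative in a simple special case. By (3.10), \({\tilde{b}}^*_k\) is a sum of \(n_k\) pair operators \(a^*_p a^*_h\) with \(p = h + k\); distinct pairs involve disjoint modes, and each pair operator squares to zero. Hence \(\Vert {\tilde{b}}^*_k \Omega \Vert ^2 = n_k\) and \(\Vert ({\tilde{b}}^*_k)^2 \Omega \Vert ^2 = 2 n_k (n_k - 1)\), whereas exactly bosonic operators with \([b, b^*] = n_k\) would give \(2 n_k^2\). The ratio \(1 - 1/n_k\) is close to one when many modes contribute. The sketch below (helper names `pair_count` and `two_pair_ratio` are ours) counts \(n_k\) by brute force.

```python
import math

def pair_count(k, k_f):
    """n_k: number of pairs (p, h) with h in B_F and p = h + k outside B_F."""
    r = int(math.floor(k_f))
    rng = range(-r, r + 1)
    kf2 = k_f * k_f
    count = 0
    for hx in rng:
        for hy in rng:
            for hz in rng:
                if hx * hx + hy * hy + hz * hz > kf2:
                    continue  # h must lie inside the Fermi ball
                px, py, pz = hx + k[0], hy + k[1], hz + k[2]
                if px * px + py * py + pz * pz > kf2:
                    count += 1  # p = h + k lies outside the Fermi ball
    return count

def two_pair_ratio(n):
    """||(b*_k)^2 Omega||^2 = 2n(n-1), divided by the bosonic value 2n^2."""
    return (2.0 * n * (n - 1)) / (2.0 * n * n)

if __name__ == "__main__":
    for k_f in (1.0, 5.0, 10.0):
        n = pair_count((1, 0, 0), k_f)
        print(k_f, n, two_pair_ratio(n))
```

For a single pair (\(n_k = 1\)) the ratio vanishes, which is the complete failure of bosonic behavior expressed by \((a^*_p a^*_h)^2 = 0\).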

The main problem, however, is that the term \({{\mathrm{d}}}\Gamma (uhu-\overline{v}\overline{h}v)\) in (3.4) cannot be represented as a quadratic operator in terms of \({\tilde{b}}_k\) and \({\tilde{b}}_k^{*}\). To circumvent this issue we shall split the operators \(\tilde{b}_{k}\), \(\tilde{b}^{*}_{k}\) into partially localized particle–hole operators \(\tilde{b}_{\alpha ,k}\), \(\tilde{b}^*_{\alpha ,k}\) involving only modes of one patch of a decomposition (indexed by \(\alpha \)) of the Fermi surface. This allows us to linearize the kinetic energy around the centers of patches, so that states of the form

$$\begin{aligned} \tilde{b}^*_{\alpha _1,k_1} \tilde{b}^*_{\alpha _2,k_2} \cdots \tilde{b}^*_{\alpha _m,k_m} \Omega \end{aligned}$$

become approximate eigenvectors of \({{\mathrm{d}}}\Gamma (uhu-\overline{v}\overline{h}v)\).

The non-trivial question is whether we can localize (3.10) sufficiently to control the linearization of the kinetic energy, while at the same time keeping it sufficiently delocalized so that \(\tilde{b}^*_{\alpha ,k}\) involves many fermionic modes, thus relaxing the Pauli principle—complete localization would of course destroy the bosonic behavior since \((a^*_p a^*_h)^2 =0\). We are going to find that this can be achieved by decomposing the Fermi sphere into \(M = M(N)\) diameter-bounded equal-area patches if \(N^{1/3} \ll M \ll N^{2/3}\).

Patch decomposition of the Fermi sphere. We construct a partition of the Fermi sphere \(k_F \mathbb {S}^2\) into M diameter-bounded equal-area patches following [48], see Fig. 1. Let

$$\begin{aligned} M = M(N) := N^{1/3 + \epsilon } \quad \text {for an } 0< \epsilon < 1/3, \end{aligned}$$

or more precisely, this number rounded to the nearest even integer. Our goal is to first decompose the unit sphere \(\mathbb {S}^{2}\) as

$$\begin{aligned} \mathbb {S}^2 = \left( \bigcup _{\alpha =1}^{M} p_\alpha \right) \cup p_{\mathrm{corri}}, \end{aligned}$$

where \(p_{\alpha }\) are suitable pairwise disjoint sets, to be defined below, and \(p_{\mathrm{corri}}\) has small surface measure, \(\sigma (p_{\mathrm{corri}}) = \mathcal {O}(M^{1/2} N^{-1/3}) \rightarrow 0\) as \(N \rightarrow \infty \). The error \(p_{\mathrm{corri}}\) is due to the introduction of a positive distance (“corridors”) separating neighboring patches. The important properties to be ensured in the construction are that all patches \(p_\alpha \) have area of order 1 / M and that they do not degenerate into very long, thin shapes as M becomes large.

Fig. 1

Diameter-bounded partition of the northern half sphere following [48]: a spherical cap is placed at the pole; then collars along the latitudes are introduced and split into patches, separated by corridors. The vectors \(\hat{\omega }_\alpha \) are picked as centers of the patches, marked in black. The patches will be reflected by the origin to cover also the southern half sphere

We use standard spherical coordinates: for \({\hat{\omega }} \in \mathbb {S}^2\), denote by \(\theta \) the inclination angle (measured between \({\hat{\omega }}\) and \(e_3 = (0,0,1)\)) and by \(\varphi \) the azimuth angle (measured between \(e_1 = (1 , 0 , 0)\) and the projection of \(\hat{\omega }\) onto the plane orthogonal to \(e_3\)). We write \(\hat{\omega }(\theta ,\varphi )\) to specify a vector on the unit sphere in terms of its inclination and azimuth angles.

The construction starts by placing a spherical cap centered at \(e_3\), with opening angle \(\Delta \theta _0 := D /\sqrt{M}\), with \(D \in \mathbb {R}\) chosen so that the area of the spherical cap equals \(4\pi / M\). Next, we decompose the remaining part of the half sphere, i. e., the set of all \(\hat{\omega } (\theta , \varphi )\) with \(D/{\sqrt{M}} \le \theta \le \pi /2\), into \(\sqrt{M}/2\) (rounded to the next integer) collars; the i-th collar consists of all \(\hat{\omega }(\theta ,\varphi )\) with \(\theta \in [\theta _i - \Delta \theta _i,\theta _i+\Delta \theta _i)\) and arbitrary azimuth \(\varphi \). The inclination of every collar will extend over a range \(\Delta \theta _i \sim 1/{\sqrt{M}}\); the proportionality constant is adjusted so that the number of collars on the half sphere is an integer.

Observe that the circle \(\left\{ {\hat{\omega }}(\theta _i,\varphi ): \varphi \in [0,2\pi ) \right\} \) has circumference proportional to \(\sin (\theta _i)\); therefore we split the i-th collar into \(\sqrt{M}\sin (\theta _i)\) (rounded to the next integer) patches. This implies that the j-th patch in the i-th collar covers the azimuth angles \(\varphi \in [\varphi _{i,j}-\Delta \varphi _{i,j},\varphi _{i,j} + \Delta \varphi _{i,j})\), where

$$\begin{aligned}\Delta \varphi _{i,j} \sim \frac{1}{{\sin (\theta _i)\sqrt{M}}}.\end{aligned}$$

We fix the proportionality constants by demanding that all patches have area \(4\pi /M\) (this is not strictly necessary; it would suffice that all patches have area of order 1/M).
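The collar construction can be illustrated numerically. The following Python sketch is a simplification and not the exact partition of [48]: it assumes collars of equal inclination width and picks the number of patches in each collar by rounding the collar area divided by \(4\pi /M\), so that the patch areas are only approximately \(4\pi /M\) (which, as noted above, is sufficient). The function name `half_sphere_partition` is hypothetical, and corridors are ignored.

```python
import math

def half_sphere_partition(M):
    """Toy partition of the northern half sphere into roughly M/2 patches.

    Returns the cap opening angle and a list of collars, each given as
    (theta_lo, theta_hi, n_patches, patch_area). Corridors are ignored.
    """
    target = 4.0 * math.pi / M
    # A cap of opening angle theta0 has area 2*pi*(1 - cos(theta0)); set it to 4*pi/M.
    theta0 = math.acos(1.0 - 2.0 / M)
    n_collars = max(1, round(math.sqrt(M) / 2.0))
    width = (math.pi / 2.0 - theta0) / n_collars  # assumption: equal widths
    collars = []
    for i in range(n_collars):
        lo = theta0 + i * width
        hi = theta0 + (i + 1) * width
        area = 2.0 * math.pi * (math.cos(lo) - math.cos(hi))
        # Assumption: choose the patch number from the collar area.
        n_patches = max(1, round(area / target))
        collars.append((lo, hi, n_patches, area / n_patches))
    return theta0, collars

M = 400
theta0, collars = half_sphere_partition(M)
n_total = 1 + sum(c[2] for c in collars)  # cap plus all collar patches
print(theta0 * math.sqrt(M))  # the constant D of the cap
print(n_total)                # should be close to M/2
```

In this toy version the patch number per collar grows like \(\sqrt{M}\sin (\theta _i)\) up to a constant, in line with the circumference argument above.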

The last step is to define \(\Delta \widetilde{\theta }_i := \Delta \theta _i - \tilde{D} R N^{-1/3}\) and \(\Delta \widetilde{\varphi }_{i,j} := \Delta \varphi _{i,j} - \tilde{D} R N^{-1/3}/\sin (\theta _i)\), with \(\tilde{D} >0\) to be fixed below. We then define \(p_1\) as the spherical cap centered at \(e_3\) with opening angle \(\Delta \widetilde{\theta }_0\) and the other \(M/2 - 1\) patches as

$$\begin{aligned} p_{i,j} := \big \{ \hat{\omega }(\theta ,\varphi ): \theta \in [\theta _i-\Delta \widetilde{\theta }_i,\theta _i+\Delta \widetilde{\theta }_i) \text { and } \varphi \in [\varphi _{i,j}-\Delta \widetilde{\varphi }_{i,j},\varphi _{i,j} + \Delta \widetilde{\varphi }_{i,j})\big \}. \end{aligned}$$

The constant \(\tilde{D}\) is chosen such that, when patches are scaled up to the Fermi sphere, there are corridors of width at least 2R between adjacent patches (i. e., \(\tilde{D}\) has to be slightly larger than \(\kappa _0^{-1}\)). Having concluded the construction on the northern half sphere, we define the patches on the southern half sphere through reflection by the origin, \(k \mapsto -k\). Finally, we switch from enumeration by i and j to enumeration with a single index \(\alpha \in \{1, \dots , M\}\). From the construction it is clear that the patches \(p_\alpha \) have the following three properties:

  1. (i)

    The area of every patch is

    $$\begin{aligned} \sigma (p_\alpha ) = \frac{4\pi }{M} + \mathcal {O}\big (N^{-1/3} M^{-1/2}\big )\;. \end{aligned}$$
  2. (ii)

    The family of decompositions is diameter bounded, i. e., there exists a constant \(C_0\) independent of N and M such that, for the decomposition into M patches, the diameter (Footnote 5) of every patch is bounded by \(C_0/\sqrt{M}\).

  3. (iii)

    Point reflection at the origin maps \(p_\alpha \) to \(-p_\alpha = p_{\alpha +\frac{M}{2}}\) for all \(\alpha = 1,\ldots ,\frac{M}{2}\).

Next, we scale the patches from the unit sphere up to the Fermi surface \(k_{\mathrm{F}} \mathbb {S}^2\) by setting

$$\begin{aligned} P_\alpha := k_{\mathrm{F}} p_\alpha \end{aligned}$$

for all \(\alpha = 1, \dots , M\). The patches \(P_\alpha \) then have the following properties.

  1. (i)

    The area of every patch is \(\sigma (P_{\alpha }) = \frac{4\pi }{M} k_{\mathrm{F}}^2+ \mathcal {O}\left( N^{1/3} M^{-1/2}\right) \).

  2. (ii)

    There exists a constant \(C_1\) independent of N and M such that, for the decomposition in M patches, we have \({\text {diam}}(P_\alpha ) \le C_1 N^{1/3}/\sqrt{M}\).

Finally, we shall introduce a “fattening” of the patch decomposition, which will be used to decompose the operators \(b_{k}\) as sums of operators corresponding to particle–hole excitations around the patches. This is motivated by the fact that the only modes affected by the interaction are those in a shell around the Fermi sphere, where the thickness of the shell is given by the radius of the support of \(\hat{V}\). Recalling again that \(R > 0\) is chosen such that \(\hat{V} (k) = 0\) for \(|k|> R\), we define the fattened Fermi surface as

$$\begin{aligned} \partial B_{\mathrm{F}}^R := \left\{ q \in \mathbb {Z}^3 : k_{\mathrm{F}} - R \le |q |\le k_{\mathrm{F}} +R \right\} . \end{aligned}$$

We lift the partition of the unit sphere to a partition of \(\partial B_{\mathrm{F}}^R\),

$$\begin{aligned} \partial B_{\mathrm{F}}^R = \left( \bigcup _{\alpha =1}^M B_\alpha \right) \cup B_{\mathrm{corri}}, \end{aligned}$$

by introducing the cones \(\mathcal {C}_\alpha := \bigcup _{r \in (0,\infty )} r p_\alpha \) and defining

$$\begin{aligned} B_\alpha := \partial B_{\mathrm{F}}^R \cap \mathcal {C}_\alpha . \end{aligned}$$

(The set \(B_{\mathrm{corri}}\) consists of all the remaining modes in the similarly fattened corridors.) To every patch \(B_\alpha \) we assign a vector \(\omega _\alpha \in B_\alpha \) as the center of \(P_\alpha \) on the Fermi surface; in particular \(|\omega _\alpha |=k_{\mathrm{F}}\). The vectors \(\omega _{\alpha }\) inherit the reflection symmetry of the patches, \(\omega _{\alpha + M/2} = -\omega _\alpha \) for all \(\alpha = 1, \dots , M/2\).

Localization on the Fermi surface. We recall (3.10) in momentum representation,

$$\begin{aligned} {\tilde{b}}^*_k = \sum _{\begin{array}{c} p\in B_{\mathrm{F}}^c\\ h \in B_{\mathrm{F}} \end{array}} a^*_p a^*_h \delta _{p-h,k}. \end{aligned}$$
(3.13)

Since \(\hat{V} (k) = 0\) if \(|k|> R\), we are only interested in the case \(|k|\le R\); hence, the sum in (3.13) effectively runs only over p and h at most at distance R from the Fermi sphere \(k_{\mathrm{F}} \mathbb {S}^2\). In other words,

$$\begin{aligned} {\tilde{b}}^*_k = \sum _{\begin{array}{c} p\in B_{\mathrm{F}}^c \cap \partial B_{\mathrm{F}}^R\\ h \in B_{\mathrm{F}} \cap \partial B_{\mathrm{F}}^R \end{array}} a^*_p a^*_h \delta _{p-h,k}. \end{aligned}$$
(3.14)

Next, we decompose the sum on the r. h. s. of (3.14) into contributions associated with different patches. If \(k \cdot \omega _\alpha < 0\), there will be few or no particle–hole pairs \((p,h)\) in the patch \(B_\alpha \) satisfying \(p-h = k\); geometrically, k is approximately pointing from outside to inside of the Fermi ball, which is incompatible with the requirements \(p \in B_{\mathrm{F}}^c\) and \(h \in B_{\mathrm{F}}\). Also if \(k \cdot \omega _\alpha \) is positive but small, there are only few particle–hole pairs \((p,h)\) with \(p-h = k\). For this reason, for any \(k \in \mathbb {Z}^3\), we define the index set (Footnote 6)

$$\begin{aligned} \mathcal {I}_{k}^{+}:= \big \{\alpha =1,\ldots , M: {\hat{\omega }}_\alpha \cdot \hat{k} \ge N^{-\delta }\big \} \end{aligned}$$

for a parameter \(\delta > 0\) to be chosen later. We then write

$$\begin{aligned} {\tilde{b}}^*_k = \sum _{\alpha \in \mathcal {I}_{k}^{+}} {\tilde{b}}^*_{\alpha ,k} + \mathfrak {r}^*_k\;, \end{aligned}$$
(3.15)

where

$$\begin{aligned} \tilde{b}^*_{\alpha ,k} := \sum _{\begin{array}{c} p\in B_{\mathrm{F}}^c\cap B_\alpha \\ h \in B_{\mathrm{F}}\cap B_\alpha \end{array}} a^*_p a^*_h \delta _{p-h,k}\;. \end{aligned}$$

The operator \(\mathfrak {r}^*_k\) contains all particle–hole pairs that are not included in \(\sum _{\alpha \in \mathcal {I}_{k}^{+}} \tilde{b}^*_{\alpha ,k}\). This can happen for two reasons: because an index \(\alpha \) is not included in \(\mathcal {I}_{k}^{+}\), or because one or both momenta of a pair \((p,h)\) belong to a corridor between patches. As we shall see, this operator can be understood as a small error, due to the fact that the number of pairs \((p,h)\) not included in the first sum is small.

Normalization of particle–hole pair operators. We still have to normalize the pair operators so that they can be seen as an approximation of bosonic operators. The normalized operators are defined by

$$\begin{aligned} b^*_{\alpha ,k} := \frac{1}{n_{\alpha ,k}} {\tilde{b}}^*_{\alpha ,k}, \qquad n_{\alpha ,k} := ||{\tilde{b}}^*_{\alpha ,k} \Omega ||. \end{aligned}$$
(3.16)

We call these operators the pair creation operators; their adjoints are called pair annihilation operators. The normalization constant can be calculated as follows:

$$\begin{aligned} \begin{aligned} ||{\tilde{b}}^*_{\alpha ,k}\Omega ||^2&= \langle \Omega , \Bigg [\sum _{\begin{array}{c} p_1 \in B_{\mathrm{F}}^c \cap B_\alpha \\ h_1 \in B_{\mathrm{F}} \cap B_\alpha \end{array}} a^*_{p_1} a^*_{h_1} \delta _{p_1 - h_1,k}\Bigg ]^* \Bigg [\sum _{\begin{array}{c} p_2 \in B_{\mathrm{F}}^c \cap B_\alpha \\ h_2 \in B_{\mathrm{F}} \cap B_\alpha \end{array}} a^*_{p_2} a^*_{h_2} \delta _{p_2 - h_2,k}\Bigg ] \Omega \rangle \\&= \sum _{\begin{array}{c} p \in B_{\mathrm{F}}^c \cap B_\alpha \\ h \in B_{\mathrm{F}} \cap B_\alpha \end{array}}\! \delta _{p - h,k}\;. \end{aligned} \end{aligned}$$

This shows that \(n_{\alpha ,k}^2\) is the number of particle–hole pairs with momentum \(k = p-h\) that lie in the patch \(B_\alpha \). Due to the symmetry of the partition under point reflection at the origin we have \(n_{\alpha ,k} = n_{\alpha +M/2,-k}\). We define \(v_\alpha (k) \ge 0\) by setting

$$\begin{aligned} n_{\alpha ,k}^2 =: k_{\mathrm{F}}^2 |k|v_\alpha (k)^2\;. \end{aligned}$$
(3.17)
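The scaling \(k_{\mathrm{F}}^2 |k|\) factored out in (3.17) can be made plausible by a direct count. The following Python sketch ignores the patch decomposition, the corridors and any cutoff: it counts all lattice pairs with \(p - h = k\), \(h\) inside and \(p\) outside the Fermi ball (assumed here to be \(\{|q|\le k_{\mathrm{F}}\}\)), and compares the result with the volume \(\pi k_{\mathrm{F}}^2 |k| - \pi |k|^3/12\) of the region of admissible \(h\). The function names and the toy value \(k_{\mathrm{F}} = 10\) are choices made for illustration only.

```python
import itertools
import math

def count_pairs(kF_sq, k):
    """Count lattice pairs (p, h) with |h|^2 <= kF_sq < |p|^2 and p - h = k."""
    r = math.isqrt(kF_sq)  # all holes lie in the cube [-r, r]^3
    count = 0
    for h in itertools.product(range(-r, r + 1), repeat=3):
        if sum(c * c for c in h) > kF_sq:
            continue  # h is not inside the Fermi ball
        p = tuple(hc + kc for hc, kc in zip(h, k))
        if sum(c * c for c in p) > kF_sq:
            count += 1  # p = h + k lies outside the Fermi ball
    return count

def lune_volume(kF, k_norm):
    """Volume of {h : |h| <= kF < |h + k|}, valid for |k| <= 2 kF."""
    return math.pi * kF**2 * k_norm - math.pi * k_norm**3 / 12.0

kF_sq = 100  # toy value, k_F = 10
for k in [(1, 0, 0), (0, 0, 2)]:
    k_norm = math.sqrt(sum(c * c for c in k))
    print(k, count_pairs(kF_sq, k), lune_volume(math.sqrt(kF_sq), k_norm))
```

The count agrees with the volume up to a lattice-point error, so the total number of pairs with momentum \(k\) is of order \(k_{\mathrm{F}}^2 |k|\), and \(v_\alpha (k)^2\) measures the share of the patch \(B_\alpha \).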

In the next proposition, whose proof is deferred to Sect. 6, we estimate the normalization constants.

Proposition 3.1

Let \(k \in \mathbb {Z}^3 \backslash \{0 \}\), \(M = N^{1/3 + \epsilon }\) for some \(0< \epsilon < 1/3\). Then, for \(0<\delta < 1/6- \epsilon /2\) and for all \(\alpha \in \mathcal {I}_{k}^{+}\), we have

$$\begin{aligned} v_{\alpha } (k)^2 = \sigma (p_\alpha ) \, | \hat{k} \cdot {\hat{\omega }}_\alpha | \, \left( 1 + \mathcal {O}\left( \sqrt{M}N^{-\frac{1}{3}+\delta }\right) \right) , \end{aligned}$$

where \(\sigma (p_\alpha ) = \frac{4\pi }{M} + \mathcal {O}(N^{-1/3}M^{-1/2})\) is the surface area of the patch \(p_\alpha \) on the unit sphere.

Due to the cutoff \(\hat{\omega }_\alpha \cdot \hat{k} \ge N^{-\delta }\) imposed through the index set \(\mathcal {I}_{k}^{+}\), it immediately follows that there exists a constant C (Footnote 7) such that

$$\begin{aligned} n_{\alpha , k} \ge C \mathfrak {n}, \quad \text {where } \mathfrak {n}(N,M) := \frac{N^{1/3-\delta /2}}{\sqrt{M}} \;. \end{aligned}$$
(3.18)
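The step from Proposition 3.1 to (3.18) can be spelled out as follows. This is only a sketch: we assume \(k_{\mathrm{F}} = \kappa N^{1/3}\) as in (3.2) and take N large enough that the relative error terms are at most 1/2; the constant \(c > 0\) is introduced here for illustration.

$$\begin{aligned} n_{\alpha ,k}^2 = k_{\mathrm{F}}^2 |k| \, v_\alpha (k)^2 \ge \frac{1}{2} \kappa ^2 N^{2/3} |k| \, \sigma (p_\alpha ) \, \hat{k} \cdot \hat{\omega }_\alpha \ge c \, \frac{N^{2/3 - \delta }}{M} = c \, \mathfrak {n}^2 . \end{aligned}$$

Here we used \(|k| \ge 1\) for \(k \in \mathbb {Z}^3 \backslash \{0\}\), \(\sigma (p_\alpha ) \ge 2\pi /M\) for large N, and \(\hat{k} \cdot \hat{\omega }_\alpha \ge N^{-\delta }\) for \(\alpha \in \mathcal {I}_{k}^{+}\). With \(M = N^{1/3+\epsilon }\), the relative error in Proposition 3.1 is \(\sqrt{M} N^{-1/3+\delta } = N^{-1/6 + \epsilon /2 + \delta }\), which tends to zero exactly because \(\delta < 1/6 - \epsilon /2\).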

4 Construction of the Trial State

In this section we shall introduce the trial state that will produce the upper bound in our main result, Theorem 2.1. To begin, let us show that the particle–hole operators \(b_{\alpha , k}\) defined in (3.16) behave as almost-bosonic operators when acting on Fock space vectors containing only few fermions.

4.1 Particle–hole creation via almost-bosonic operators

Recall the definition of \(\Gamma ^{{\mathrm{nor}}}\) given after (3.11). For \(k \in \Gamma ^{{\mathrm{nor}}}\), let

$$\begin{aligned} \mathcal {I}_{k}^{-}:= \mathcal {I}_{-k}^+ = \big \{ \alpha = 1, \ldots , M : \hat{\omega }_\alpha \cdot \hat{k} \le - N^{-\delta } \big \}\;. \end{aligned}$$

We shall also set \(\mathcal {I}_{k} = \mathcal {I}_{k}^{+} \cup \mathcal {I}_{k}^{-}\). To unify notation, we define

$$\begin{aligned} c^*_\alpha (k) := \left\{ \begin{array}{lr} b^*_{\alpha ,k} &{} \text { for } \alpha \in \mathcal {I}_{k}^{+}\\ b^*_{\alpha ,-k} &{} \text { for } \alpha \in \mathcal {I}_{k}^{-}\end{array}\right. \;. \end{aligned}$$
(4.1)
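The index sets are easy to compute once the patch centers are given. The following Python sketch uses hypothetical toy data (random directions on the northern half sphere, completed by point reflection) and a parameter `cutoff` that plays the role of \(N^{-\delta }\); indices are zero-based, so the reflected partner of \(\alpha \) is \((\alpha + M/2) \bmod M\). It illustrates that \(\mathcal {I}_{k}^{-} = \mathcal {I}_{-k}^{+}\), that the two sets are disjoint, and that they have the same cardinality.

```python
import numpy as np

def index_sets(centers, k, cutoff):
    """Return (I_plus, I_minus) as sets of patch indices for the momentum k.

    centers: array of shape (M, 3) of unit vectors (the patch centers).
    cutoff:  positive number playing the role of N^(-delta).
    """
    k_hat = np.asarray(k, dtype=float)
    k_hat = k_hat / np.linalg.norm(k_hat)
    # Elementwise dot products, so that reflected centers give exactly negated values.
    cosines = (centers[:, 0] * k_hat[0]
               + centers[:, 1] * k_hat[1]
               + centers[:, 2] * k_hat[2])
    I_plus = {int(a) for a in np.flatnonzero(cosines >= cutoff)}
    I_minus = {int(a) for a in np.flatnonzero(cosines <= -cutoff)}
    return I_plus, I_minus

# Toy patch centers (assumption): 20 random directions with non-negative third
# component, followed by their reflections, so centers[a + M/2] = -centers[a].
rng = np.random.default_rng(0)
north = rng.normal(size=(20, 3))
north /= np.linalg.norm(north, axis=1, keepdims=True)
north[:, 2] = np.abs(north[:, 2])
centers = np.vstack([north, -north])
M = len(centers)

k = (1, 2, -1)
I_plus, I_minus = index_sets(centers, k, cutoff=0.1)
I_plus_of_minus_k, _ = index_sets(centers, tuple(-c for c in k), cutoff=0.1)
print(sorted(I_plus), sorted(I_minus))
```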

Lemma 4.1

(Approximate CCR). Let \(k,l \in \Gamma ^{{\mathrm{nor}}}\). Let \(\alpha \in \mathcal {I}_{k}\) and \(\beta \in \mathcal {I}_{l}\). Then

$$\begin{aligned} \begin{aligned} {[}c_\alpha (k),c_\beta (l)]&= 0 = [c^*_\alpha (k),c^*_\beta (l)],\\ {[}c_\alpha (k),c^*_\beta (l)]&= \delta _{\alpha ,\beta }\left( \delta _{k,l} + \mathcal {E}_\alpha (k,l) \right) . \end{aligned} \end{aligned}$$
(4.2)

The operator \(\mathcal {E}_\alpha (k,l)\) commutes with \(\mathcal {N}\), and satisfies the bound

$$\begin{aligned} || \mathcal {E}_\alpha (k,l) \psi || \le \frac{2}{n_{\alpha ,k} n_{\alpha ,l}} ||\mathcal {N}\psi ||\;, \qquad \forall \psi \in \mathcal {F}. \end{aligned}$$
(4.3)

The same estimate holds for \(\mathcal {E}_\alpha ^*(k,l) = \mathcal {E}_\alpha (l,k)\).

Proof

The two identities on the first line of (4.2) are obvious. We prove the second line.

First case: \(\alpha \in \mathcal {I}_{k}^{+}\) and \(\beta \in \mathcal {I}_l^{+}\). We have

$$\begin{aligned}{}[c_\alpha (k),c^*_\beta (l)] = [b_{\alpha ,k},b^*_{\beta ,l}]. \end{aligned}$$
(4.4)

From the definition it is clear that b and \(b^*\) operators belonging to different patches commute, explaining the \(\delta _{\alpha ,\beta }\)-factor. Thus, from now on \(\alpha =\beta \). By the CAR,

$$\begin{aligned}{}[a_{h_1} a_{p_1},a^*_{p_2} a^*_{h_2}] = \delta _{h_1,h_2}\delta _{p_1,p_2} - a^*_{p_2} a_{p_1} \delta _{h_1,h_2} - a^*_{h_2} a_{h_1} \delta _{p_1,p_2}. \end{aligned}$$
(4.5)

The first term in (4.5) gives the following contribution to the commutator (4.4):

$$\begin{aligned} \begin{aligned}&n_{\alpha ,k}^{-1} n_{\alpha ,l}^{-1} \sum _{\begin{array}{c} p_1 \in B_{\mathrm{F}}^c \cap B_\alpha \\ h_1 \in B_{\mathrm{F}} \cap B_\alpha \end{array}} \sum _{\begin{array}{c} p_2 \in B_{\mathrm{F}}^c \cap B_\alpha \\ h_2 \in B_{\mathrm{F}} \cap B_\alpha \end{array}} \delta _{h_1,h_2}\delta _{p_1,p_2} \delta _{p_1-h_1,k}\delta _{p_2-h_2,l}\\&\quad = n_{\alpha ,k}^{-1} n_{\alpha ,l}^{-1} \sum _{\begin{array}{c} p_1 \in B_{\mathrm{F}}^c \cap B_\alpha \\ h_1 \in B_{\mathrm{F}} \cap B_\alpha \end{array}} \delta _{p_1-h_1,k} \delta _{p_1-h_1,l} = n_{\alpha ,k}^{-2} \delta _{k,l} \sum _{\begin{array}{c} p \in B_{\mathrm{F}}^c \cap B_\alpha \\ h \in B_{\mathrm{F}} \cap B_\alpha \end{array}} \delta _{p-h,k} = \delta _{k,l}. \end{aligned} \end{aligned}$$

The two remaining terms in (4.5) produce the error term

$$\begin{aligned} \begin{aligned}&- \sum _{\begin{array}{c} h_1, h_2 \in B_{\mathrm{F}} \cap B_\alpha \\ p \in B_{\mathrm{F}}^c \cap B_\alpha \end{array}} \frac{\delta _{p - h_1,k} \delta _{p-h_2,l}}{n_{\alpha ,k} n_{\alpha ,l}} a^*_{h_2} a_{h_1} - \sum _{\begin{array}{c} p_1, p_2 \in B_{\mathrm{F}}^c \cap B_\alpha \\ h\in B_{\mathrm{F}}\cap B_\alpha \end{array}} \frac{\delta _{p_1-h,k}\delta _{p_2-h,l}}{n_{\alpha ,k} n_{\alpha ,l}}a^*_{p_2} a_{p_1}\\&\quad =: \mathcal {E}_{1}(\alpha ,k,l) + \mathcal {E}_{2}(\alpha ,k,l) =: \mathcal {E}(\alpha ,k,l). \end{aligned} \end{aligned}$$
(4.6)

In the present case, the error term in the lemma is \(\mathcal {E}_\alpha (k,l) := \mathcal {E}(\alpha ,k,l)\). Let us only consider the second term on the left-hand side of (4.6); the first can be controlled in the same way. Setting \(\omega ^{(\alpha )} := \sum _{h \in B_{\mathrm{F}}\cap B_\alpha } |f_h\rangle \langle f_h|\) and \(u^{(\alpha )} := \sum _{p \in B_{\mathrm{F}}^c \cap B_\alpha } |f_p\rangle \langle f_p |\), we have

$$\begin{aligned}&{{\mathrm{d}}}\Gamma \Big ( u^{(\alpha )} e^{ilx} \omega ^{(\alpha )} e^{-ikx} u^{(\alpha )} \Big ) \\&\quad = \sum _{p_1,p_2 \in B_{\mathrm{F}}^c \cap B_\alpha } a^*_{p_2} a_{p_1} \Big \langle f_{p_2}, e^{ilx} \Big [\sum _{h\in B_{\mathrm{F}}\cap B_\alpha } |f_{h}\rangle \langle f_{h}|\Big ] e^{-ikx} f_{p_1} \Big \rangle \\&\quad = \sum _{\begin{array}{c} p_1,p_2\in B_{\mathrm{F}}^c\cap B_\alpha \\ h \in B_{\mathrm{F}} \cap B_\alpha \end{array}} a^*_{p_2} a_{p_1} \delta _{p_2-h,l} \delta _{p_1-h,k}. \end{aligned}$$

Recall also, for the second quantization of any bounded one-particle operator, the standard bound \(||{{\mathrm{d}}}\Gamma (A)\psi || \le ||A||_{{\mathrm{op}}}||\mathcal {N}\psi ||\) for all \(\psi \in \mathcal {F}\), with \(||A||_{{\mathrm{op}}}\) the operator norm. Consequently

$$\begin{aligned} || \mathcal {E}_2(\alpha ,k,l) \psi || = \Big \Vert \frac{1}{n_{\alpha ,k} n_{\alpha ,l}}{{\mathrm{d}}}\Gamma \Big ( u^{(\alpha )} e^{ilx} \omega ^{(\alpha )} e^{-ikx} u^{(\alpha )} \Big ) \psi \Big \Vert \le \frac{1}{n_{\alpha ,k} n_{\alpha ,l}} || \mathcal {N}\psi || \end{aligned}$$

since \(||u^{(\alpha )} e^{ilx} \omega ^{(\alpha )} e^{-ikx} u^{(\alpha )}||_{{\mathrm{op}}}\le 1\).

Second case: \(\alpha \in \mathcal {I}_{k}^{-}\), \(\beta \in \mathcal {I}_l^{-}\). This case is treated like the first case, recalling that

$$\begin{aligned}{}[c_\alpha (k),c^*_\beta (l)] = [b_{\alpha ,-k},b^*_{\beta ,-l}]. \end{aligned}$$

In this case \(\mathcal {E}_\alpha (k,l) := \mathcal {E}(\alpha ,-k,-l)\), with the same bound as before.

Third case: \(\alpha \in \mathcal {I}_{k}^{+}\) and \(\beta \in \mathcal {I}_l^{-}\), and vice versa. For \(\alpha \ne \beta \) the commutator vanishes, just like in the previous cases. So consider \(\alpha \in \mathcal {I}_{k}^{+}\) and \(\beta = \alpha \in \mathcal {I}^{-}_l = \mathcal {I}_{-l}^+\). We find

$$\begin{aligned}{}[c_\alpha (k),c^*_\alpha (l)] = [b_{\alpha ,k},b^*_{\alpha ,-l}] = \delta _{k,-l} + \mathcal {E}(\alpha ,k,-l). \end{aligned}$$
(4.7)

Since \(\mathcal {I}_{k}^{+}\cap \mathcal {I}_k^{-} = \emptyset \), \(\alpha = \beta \) is possible only for \(k \ne l\). Also \(k = -l\) is excluded since \(k,l \in \Gamma ^{{\mathrm{nor}}}\). Consequently \(\delta _{k,l} = 0 = \delta _{k,-l}\), so (4.7) agrees with the statement of the Lemma (if we set \(\mathcal {E}_\alpha (k,l) := \mathcal {E}(\alpha ,k,-l)\)). The estimate of the error term remains the same.

It is obvious that \(\mathcal {E}(\alpha ,k,l)\) commutes with \(\mathcal {N}\). This completes the proof of the lemma.

\(\square \)
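Lemma 4.1 can be checked by hand in a small toy model. The following Python sketch is an illustration under simplifying assumptions, not part of the proof: we take m disjoint particle–hole pairs, all with the same momentum difference (so that m plays the role of \(n_{\alpha ,k}^2\) and every mode of the toy patch belongs to exactly one pair), represent the fermionic operators by Jordan–Wigner matrices, and compute \([b,b^*]\) for \(k = l\). The helper name `annihilation_operators` is hypothetical.

```python
import numpy as np

def annihilation_operators(n_modes):
    """Jordan-Wigner matrices a_0, ..., a_{n-1} satisfying the CAR."""
    identity = np.eye(2)
    Z = np.diag([1.0, -1.0])
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])  # maps |1> to |0>
    ops = []
    for j in range(n_modes):
        op = np.eye(1)
        for factor in [Z] * j + [lower] + [identity] * (n_modes - j - 1):
            op = np.kron(op, factor)
        ops.append(op)
    return ops

m = 3                                   # number of pairs, stands for n_{alpha,k}^2
a = annihilation_operators(2 * m)
particles, holes = a[:m], a[m:]         # pair i consists of particles[i], holes[i]

# b = n^{-1} sum a_h a_p; all matrices are real, so the adjoint is the transpose.
b = sum(h @ p for p, h in zip(particles, holes)) / np.sqrt(m)
number = sum(x.T @ x for x in a)        # number operator N of the toy model
commutator = b @ b.T - b.T @ b          # [b, b^*]
error = commutator - np.eye(2 ** (2 * m))  # the operator E_alpha(k, k)

print(np.allclose(error, -number / m))  # E_alpha(k, k) = -N / n^2 in this toy model
```

In this toy model the error operator equals \(-\mathcal {N}/n_{\alpha ,k}^2\): it is negative semidefinite, commutes with \(\mathcal {N}\), vanishes on the vacuum, and satisfies (4.3) with room to spare.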

The next lemma provides bounds for the \(c_{\alpha }(k)\), \(c_{\alpha }^{*}(k)\) operators that are similar to the usual bounds valid for bosonic creation and annihilation operators.

Lemma 4.2

(Bounds for Pair Operators). Let \(k \in \Gamma ^{{\mathrm{nor}}}\) and \(\alpha \in \mathcal {I}_{k}\). Then,

$$\begin{aligned} ||c_{\alpha }(k)\psi || \le ||\mathcal {N}(B_{\mathrm{F}}\cap B_\alpha )^{1/2} \psi ||\qquad \forall \psi \in \mathcal {F}\;, \end{aligned}$$
(4.8)

where \(\mathcal {N}(B) := \sum _{i \in B} a^*_i a_i\) for any set of momenta \(B\subset \mathbb {Z}^{3}\). Furthermore, for \(f \in \ell ^2(\mathcal {I}_{k})\) and \(\psi \in \mathcal {F}\), we have

$$\begin{aligned} \begin{aligned} ||\sum _{\alpha \in \mathcal {I}_{k}} f(\alpha ) c_\alpha (k) \psi ||&\le \Big ( \sum _{\alpha \in \mathcal {I}_{k}} |f(\alpha )|^2 \Big )^{1/2} ||\mathcal {N}^{1/2} \psi ||\\ ||\sum _{\alpha \in \mathcal {I}_{k}} f(\alpha ) c^*_\alpha (k) \psi ||&\le \Big ( \sum _{\alpha \in \mathcal {I}_{k}} |f(\alpha )|^2 \Big )^{1/2} ||(\mathcal {N}+1)^{1/2} \psi ||. \end{aligned} \end{aligned}$$
(4.9)

Proof

Using \(|| a_{q} ||_{{\mathrm{op}}}= 1\) we have

$$\begin{aligned} ||b_{\alpha ,k} \psi ||&\le \frac{1}{n_{\alpha ,k}} \sum _{\begin{array}{c} p\in B_{\mathrm{F}}^c\cap B_\alpha \\ h \in B_{\mathrm{F}}\cap B_\alpha \end{array}}\delta _{p-h,k} ||a_p a_h \psi || \le \frac{1}{n_{\alpha ,k}} \sum _{\begin{array}{c} p\in B_{\mathrm{F}}^c\cap B_\alpha \\ h \in B_{\mathrm{F}}\cap B_\alpha \end{array}}\delta _{p-h,k} ||a_h \psi || \\&\le \frac{1}{n_{\alpha ,k}} \Big [ \sum _{\begin{array}{c} p\in B_{\mathrm{F}}^c\cap B_\alpha \\ h \in B_{\mathrm{F}}\cap B_\alpha \end{array}} \delta _{p-h,k} \Big ]^{1/2} \Big [ \sum _{\begin{array}{c} p\in B_{\mathrm{F}}^c\cap B_\alpha \\ h \in B_{\mathrm{F}}\cap B_\alpha \end{array}} \delta _{p-h,k} ||a_h \psi ||^2 \Big ]^{1/2} \\&\le \langle \psi , \mathcal {N}(B_{\mathrm{F}}\cap B_\alpha ) \psi \rangle ^{1/2}, \end{aligned}$$

recalling that by definition \(\sum _{\begin{array}{c} p\in B_{\mathrm{F}}^c\cap B_\alpha \\ h \in B_{\mathrm{F}}\cap B_\alpha \end{array}} \delta _{p-h,k} = n_{\alpha ,k}^2\). This proves (4.8). To prove the first inequality from (4.9), we use (4.8) together with Cauchy–Schwarz,

$$\begin{aligned} \Big \Vert \sum _{\alpha \in \mathcal {I}_{k}} \overline{f(\alpha )} c_\alpha (k) \psi \Big \Vert ^2&\le \sum _{\alpha \in \mathcal {I}_{k}} |f(\alpha ) |^2 \sum _{\alpha '\in \mathcal {I}_{k}} ||c_{\alpha '}(k) \psi ||^2\\&\le \sum _{\alpha \in \mathcal {I}_{k}} |f(\alpha ) |^2 \sum _{\alpha '\in \mathcal {I}_{k}} ||\mathcal {N}(B_{\mathrm{F}} \cap B_{\alpha '})^{1/2} \psi ||^2 \\&\le \sum _{\alpha \in \mathcal {I}_{k}} |f(\alpha ) |^2 \langle \psi , \mathcal {N}\psi \rangle . \end{aligned}$$

We now prove the second inequality from (4.9). By Lemma 4.1, we have

$$\begin{aligned}&\Big \Vert \sum _{\alpha \in \mathcal {I}_{k}} f(\alpha ) c^*_\alpha (k) \psi \Big \Vert ^2 \nonumber \\&\quad = \sum _{\alpha ,\beta \in \mathcal {I}_{k}} \overline{f(\alpha )}f(\beta ) \langle \psi , c^*_\beta (k) c_\alpha (k) \psi \rangle + \sum _{\alpha ,\beta \in \mathcal {I}_{k}} \overline{f(\alpha )}f(\beta ) \langle \psi , [ c_\alpha (k), c^*_\beta (k)]\psi \rangle \nonumber \\&\quad = \Big \Vert \sum _{\alpha \in \mathcal {I}_{k}} \overline{f(\alpha )} c_\alpha (k) \psi \Big \Vert ^2 + \sum _{\alpha ,\beta \in \mathcal {I}_{k}} \overline{f(\alpha )}f(\beta ) \langle \psi , \delta _{\alpha ,\beta }\left( 1 +\mathcal {E}_\alpha (k,k) \right) \psi \rangle \nonumber \\&\quad \le \sum _{\alpha \in \mathcal {I}_{k}} |f(\alpha ) |^2 \sum _{\alpha '\in \mathcal {I}_{k}} ||c_{\alpha '}(k) \psi ||^2 + \sum _{\alpha \in \mathcal {I}_{k}} |f(\alpha )|^2 ||\psi ||^2 + \sum _{\alpha \in \mathcal {I}_{k}} |f(\alpha )|^2 \langle \psi , \mathcal {E}_\alpha (k,k) \psi \rangle . \end{aligned}$$
(4.10)

Consider the last term on the r. h. s. Recall from (4.6) that for \(\alpha \in \mathcal {I}_{k}^{+}\) we have

$$\begin{aligned} \mathcal {E}_\alpha (k,k) = - \frac{1}{n_{\alpha ,k}^2} \sum _{\begin{array}{c} p\in B_{\mathrm{F}}^c \cap B_\alpha \\ h \in B_{\mathrm{F}} \cap B_\alpha \end{array}} \delta _{p-h,k} a^*_h a_h - \frac{1}{n_{\alpha ,k}^2} \sum _{\begin{array}{c} p\in B_{\mathrm{F}}^c \cap B_\alpha \\ h \in B_{\mathrm{F}} \cap B_\alpha \end{array}} \delta _{p-h,k} a^*_p a_p. \end{aligned}$$

Obviously \(\langle \psi , \mathcal {E}_\alpha (k,k) \psi \rangle \le 0\). For \(\alpha \in \mathcal {I}_{k}^{-}\) we have \(-k\) replacing k on the r. h. s., again producing a negative semidefinite operator. Hence in (4.10) we can drop the last summand for the purpose of an upper bound. Together with the first bound from (4.9) this implies

$$\begin{aligned} \Big \Vert \sum _{\alpha \in \mathcal {I}_{k}} f(\alpha ) c^*_\alpha (k) \psi \Big \Vert ^2&\le \sum _{\alpha \in \mathcal {I}_{k}} |f(\alpha ) |^2 \langle \psi , \mathcal {N}\psi \rangle + \sum _{\alpha \in \mathcal {I}_{k}} |f(\alpha ) |^2 \langle \psi ,\psi \rangle . \end{aligned}$$

\(\square \)

4.2 The trial state

In order to motivate the definition of the trial state, let us formally rewrite the correlation Hamiltonian \(\mathcal {H}_{{\mathrm{corr}}}\) in terms of the almost-bosonic pair operators \(c^{*}_{\alpha }(k)\) and \(c_{\alpha }(k)\).

Bosonization of the correlation Hamiltonian. Inserting the decomposition (3.15) into (3.12) we find

$$\begin{aligned} \begin{aligned} Q_N^{{\mathrm{B}}} = \frac{1}{2N} \sum _{k\in \mathbb {Z}^3 \setminus \{0\}} \hat{V}(k)&\Big [ 2 \sum _{\alpha \in \mathcal {I}_{k}^{+}} \sum _{\beta \in \mathcal {I}_{k}^{+}} n_{\alpha ,k} n_{\beta ,k} b_{\alpha ,k}^* b_{\beta ,k} + \sum _{\alpha \in \mathcal {I}_{k}^{+}} \sum _{\beta \in \mathcal {I}^{+}_{-k}} n_{\alpha ,k} n_{\beta , -k} b_{\alpha , k}^* b_{\beta , -k}^* \\&\, + \sum _{\alpha \in \mathcal {I}^{+}_{-k}} \sum _{\beta \in \mathcal {I}_{k}^{+}} n_{\alpha ,-k} n_{\beta , k} b_{\alpha , -k} b_{\beta , k} \Big ] + \text {error terms,} \end{aligned} \end{aligned}$$

where the error terms contain at least one \(\mathfrak {r}_k\)–operator (see the discussion following (5.7) for the rigorous proof of their smallness). Recalling the definition (3.17) of \(v_{\alpha }(k)\) and the definition of the c and \(c^*\) operators (4.1), we get

$$\begin{aligned} \begin{aligned} Q_N^{{\mathrm{B}}} = \hbar \kappa ^2 \sum _{k \in \Gamma ^{{\mathrm{nor}}}} |k|\, \hat{V}(k)&\Big [ \sum _{\alpha \in \mathcal {I}_{k}^{+}} \sum _{\beta \in \mathcal {I}_{k}^{+}} v_{\alpha } (k) v_\beta (k) c_\alpha ^* (k) c_\beta (k) \\&+ \sum _{\alpha \in \mathcal {I}_{k}^{-}} \sum _{\beta \in \mathcal {I}_{k}^{-}} v_\alpha (-k) v_\beta (-k) c_\alpha ^* (k) c_\beta (k) \\&+ \sum _{\alpha \in \mathcal {I}_{k}^{+}} \sum _{\beta \in \mathcal {I}_{k}^{-}} \left( v_\alpha (k) v_\beta (-k) c_\alpha ^*(k) c_\beta ^*(k) + \text {h.c.} \right) \Big ] + \text {error terms} \end{aligned} \end{aligned}$$
(4.11)

where \(\kappa = (3/4\pi )^{1/3} + \mathcal {O}(N^{-1/3})\) is defined as in (3.2).

Let us now consider the operator \({{\mathrm{d}}}\Gamma (uhu-\overline{v}\overline{h}v)\) appearing in the definition of the correlation Hamiltonian (3.8). To express \({{\mathrm{d}}}\Gamma (uhu - \bar{v} \bar{h} v)\) in terms of \(c_\alpha (k), c_\alpha ^* (k)\), we observe that, for \(\alpha \in \mathcal {I}_{k}^{+}\) and neglecting the contribution of the constant direct term and of the exchange operator X on the r. h. s. of (3.5) (they will be proven to be small)

$$\begin{aligned} {{\mathrm{d}}}\Gamma (uhu-\overline{v}\overline{h}v) c^*_{\alpha } (k) \Omega&\simeq \frac{1}{n_{\alpha ,k}}\sum _{\begin{array}{c} p\in B_{\mathrm{F}}^c \cap B_\alpha \\ h \in B_{\mathrm{F}} \cap B_\alpha \end{array}} \frac{\hbar ^2 (|p|^2 - |h|^2)}{2} a^*_p a^*_h \delta _{p-h,k} \Omega \\&= \frac{1}{n_{\alpha ,k}}\sum _{\begin{array}{c} p\in B_{\mathrm{F}}^c \cap B_\alpha \\ h \in B_{\mathrm{F}} \cap B_\alpha \end{array}} \frac{\hbar ^2 (p-h)\cdot (p+h)}{2} a^*_p a^*_h \delta _{p-h,k} \Omega \\&\simeq \frac{1}{n_{\alpha ,k}}\sum _{\begin{array}{c} p\in B_{\mathrm{F}}^c \cap B_\alpha \\ h \in B_{\mathrm{F}} \cap B_\alpha \end{array}} \frac{\hbar ^2 k\cdot 2\omega _\alpha }{2} a^*_p a^*_h \delta _{p-h,k} \Omega = \hbar ^2 k\cdot \omega _\alpha c^*_{\alpha } (k)\Omega , \end{aligned}$$

where we used the fact that, for \(p,h \in B_\alpha \), \(p \simeq \omega _\alpha \simeq h\). A similar computation for \(\alpha \in \mathcal {I}_{k}^{-}\) shows that

$$\begin{aligned} {{\mathrm{d}}}\Gamma (uhu-\overline{v}\overline{h}v) c^*_{\alpha } (k) \Omega \simeq {\hbar ^2} |k \cdot \omega _\alpha | c^*_\alpha (k) \Omega \end{aligned}$$
(4.12)

for all \(\alpha \in \mathcal {I}_k = \mathcal {I}_{k}^{+}\cup \mathcal {I}_{k}^{-}\).

If the operators \(c^*_\alpha (k), c_\alpha (k)\) were bosonic creation and annihilation operators, satisfying canonical commutation relations, and if \({{\mathrm{d}}}\Gamma (uhu-\overline{v}\overline{h}v)\) were quadratic in these operators, (4.12) would lead us to

$$\begin{aligned} {{\mathrm{d}}}\Gamma (uhu-\overline{v}\overline{h}v) \simeq \hbar ^2 \sum _{k \in \Gamma ^{{\mathrm{nor}}}} \sum _{\alpha \in \mathcal {I}_k} |k \cdot \omega _\alpha | c_\alpha ^* (k) c_\alpha (k). \end{aligned}$$

Thus, Equations (4.11) and (4.12) suggest that, if restricted to states with few particles, the correlation Hamiltonian should be approximated by the Sawada-type effective Hamiltonian

$$\begin{aligned} \mathcal {H}_{\mathrm{eff}} = \hbar \kappa \sum _{k \in \Gamma ^{{\mathrm{nor}}}} |k|h_{\mathrm{eff}} (k) \end{aligned}$$
(4.13)

with

$$\begin{aligned} \begin{aligned} h_{\mathrm{eff}}(k) = \;&\sum _{\alpha \in \mathcal {I}_k} u_\alpha ^2 (k) \, c^*_{\alpha } (k) c_{\alpha } (k) + g (k) \bigg [ \Big ( \sum _{\alpha \in \mathcal {I}_{k}^{+}}\sum _{\beta \in \mathcal {I}_{k}^{-}} v_\alpha (k) v_\beta (-k) c^*_{\alpha } (k) c^*_{\beta } (k) + {\mathrm{h.c.}}\Big ) \\&+\sum _{\alpha \in \mathcal {I}_{k}^{+}}\sum _{\beta \in \mathcal {I}_{k}^{+}} v_\alpha (k) v_\beta (k) c^*_{\alpha } (k) c_{\beta } (k) + \sum _{\alpha \in \mathcal {I}_{k}^{-}}\sum _{\beta \in \mathcal {I}_{k}^{-}} v_\alpha (-k) v_\beta (-k) c^*_{\alpha } (k) c_{\beta } (k) \bigg ]. \end{aligned} \end{aligned}$$
(4.14)

We defined

$$\begin{aligned} u_\alpha (k) := |\hat{k} \cdot \hat{\omega }_\alpha |^{1/2}, \qquad g(k) := \kappa \hat{V} (k). \end{aligned}$$
(4.15)
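The prefactors in (4.13)–(4.15) can be verified by a short computation. The following sketch only uses \(\hbar = N^{-1/3}\), \(|\omega _\alpha | = k_{\mathrm{F}}\), the definition (3.17), and the relation \(k_{\mathrm{F}} = \kappa N^{1/3}\), which we assume from (3.2); hence \(\hbar k_{\mathrm{F}} = \kappa \) and \(k_{\mathrm{F}}^2/N = \hbar \kappa ^2\).

$$\begin{aligned} \hbar ^2 |k \cdot \omega _\alpha |&= \hbar ^2 k_{\mathrm{F}} \, |k| \, |\hat{k} \cdot \hat{\omega }_\alpha | = \hbar \kappa \, |k| \, u_\alpha (k)^2 , \\ \frac{\hat{V}(k)}{N} \, n_{\alpha ,k} n_{\beta ,k}&= \frac{k_{\mathrm{F}}^2}{N} \, |k| \, \hat{V}(k) \, v_\alpha (k) v_\beta (k) = \hbar \kappa \, |k| \, g(k) \, v_\alpha (k) v_\beta (k) . \end{aligned}$$

The first line turns the kinetic term into the diagonal part of \(h_{\mathrm{eff}}(k)\); the second line turns the coefficients of (4.11) into those of the terms multiplied by \(g(k)\) in (4.14).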

(The main difference to Sawada’s original Hamiltonian is that he treated pairs \(a^*_p a^*_h\) as bosonic; our pair operators instead are delocalized over large patches, thus relaxing the Pauli principle and allowing a controlled bosonic approximation.) If the operators \(c_{\alpha }(k)\), \(c_{\alpha }^{*}(k)\) were exactly bosonic, the effective Hamiltonian \(\mathcal {H}_{{\mathrm{eff}}}\) could be diagonalized via a bosonic Bogoliubov transformation. We provide the details of this computation in Appendix A. The ground state of (4.13) would be given by

$$\begin{aligned} \xi = \exp \Big [ \frac{1}{2} \sum _{k \in \Gamma ^{{\mathrm{nor}}}} \sum _{\alpha , \beta \in \mathcal {I}_k} K_{\alpha , \beta } (k) c_\alpha ^* (k) c_\beta ^* (k) - \text {h.c.} \Big ] \Omega \;, \end{aligned}$$
(4.16)

where, for every \(k \in \Gamma ^{{\mathrm{nor}}}\), K(k) is the \(2I_k \times 2I_k\) matrix (with \(I_k := |\mathcal {I}_{k}^{+}|= |\mathcal {I}_{k}^{-}|\)) defined by

$$\begin{aligned} K(k) := \log |{S_1 (k)}^T|, \end{aligned}$$
(4.17)

(the superscript T denoting the transpose of the matrix) with

$$\begin{aligned} S_1 (k) := \left( D(k) +W(k) -\tilde{W}(k) \right) ^{1/2} E(k)^{-1/2}, \end{aligned}$$
(4.18)

and

$$\begin{aligned} E (k) := \left( (D(k) +W(k) -\tilde{W}(k) )^{1/2} (D(k) +W(k) +\tilde{W}(k)) (D(k)+W(k)- \tilde{W}(k) )^{1/2} \right) ^{1/2} \end{aligned}$$

and, recalling the definition (3.17) of \(v_\alpha (k)\),

$$\begin{aligned} \begin{aligned} D (k)&:= {\text {diag}}(u_\alpha ^2 (k) : \alpha \in \mathcal {I}_{k}), \\ W(k)_{\alpha ,\beta }&:= \left\{ \begin{array}{cl} g (k) v_\alpha (k) v_\beta (k) &{}\quad \text {for } \alpha ,\beta \in \mathcal {I}_{k}^{+}\\ g (k) v_\alpha (-k) v_\beta (-k) &{}\quad \text {for } \alpha ,\beta \in \mathcal {I}_{k}^{-}\\ 0 &{}\quad \text {for } \alpha \in \mathcal {I}_{k}^{+}, \beta \in \mathcal {I}_{k}^{-}\text { or } \alpha \in \mathcal {I}_{k}^{-}, \beta \in \mathcal {I}_{k}^{+},\end{array} \right. \\ {\tilde{W}}(k)_{\alpha ,\beta }&:= \left\{ \begin{array}{cl} g (k) v_\alpha (k) v_\beta (-k) &{}\quad \text {for } \alpha \in \mathcal {I}_{k}^{+}, \beta \in \mathcal {I}_{k}^{-}\\ g (k) v_\alpha (-k) v_\beta (k) &{}\quad \text {for } \alpha \in \mathcal {I}_{k}^{-},\beta \in \mathcal {I}_{k}^{+}\\ 0 &{}\quad \text {for } \alpha ,\beta \in \mathcal {I}_{k}^{+}\text { or } \alpha ,\beta \in \mathcal {I}_{k}^{-}.\end{array} \right. \end{aligned} \end{aligned}$$
(4.19)
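The definitions (4.17)–(4.19) can be evaluated numerically for small matrices. The following Python sketch uses hypothetical toy values for \(u_\alpha (k)^2\) and \(v_\alpha (\pm k)\) (three patches in each of \(\mathcal {I}_{k}^{+}\) and \(\mathcal {I}_{k}^{-}\)) and assumes \(g(k) \ge 0\), so that \(D+W\pm \tilde{W}\) are positive definite. It checks two identities that follow from the definitions by direct algebra, \(S_1^T (D+W+\tilde{W}) S_1 = E = S_1^{-1} (D+W-\tilde{W}) (S_1^{-1})^T\), and that K vanishes when \(g = 0\). The function names are ours.

```python
import numpy as np

def sym_apply(A, f):
    """Apply a scalar function f to a real symmetric matrix A via eigh."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.T

def bogoliubov_kernel(u_sq, v_plus, v_minus, g):
    """Toy evaluation of (4.17)-(4.19).

    u_sq:    u_alpha(k)^2 for alpha in I_k^+ followed by alpha in I_k^-.
    v_plus:  v_alpha(k) for alpha in I_k^+.
    v_minus: v_alpha(-k) for alpha in I_k^-.
    Returns K, S_1, E, A = D + W - W~ and B = D + W + W~.
    """
    D = np.diag(u_sq)
    w_sum = np.concatenate([v_plus, v_minus])    # W + W~ = g * w_sum w_sum^T
    w_diff = np.concatenate([v_plus, -v_minus])  # W - W~ = g * w_diff w_diff^T
    B = D + g * np.outer(w_sum, w_sum)
    A = D + g * np.outer(w_diff, w_diff)
    sqrt_A = sym_apply(A, np.sqrt)
    E = sym_apply(sqrt_A @ B @ sqrt_A, np.sqrt)
    S1 = sqrt_A @ sym_apply(E, lambda w: w ** -0.5)
    # |S_1^T| = (S_1 S_1^T)^{1/2}, hence log|S_1^T| = (1/2) log(S_1 S_1^T).
    P = S1 @ S1.T
    K = 0.5 * sym_apply((P + P.T) / 2.0, np.log)
    return K, S1, E, A, B

cosines = np.array([0.9, 0.6, 0.3])  # toy values of |k_hat . omega_hat_alpha|
sigma = 0.05                         # toy patch area
u_sq = np.concatenate([cosines, cosines])
v = np.sqrt(sigma * cosines)
K, S1, E, A, B = bogoliubov_kernel(u_sq, v, v, g=0.5)
K0 = bogoliubov_kernel(u_sq, v, v, g=0.0)[0]
print(np.round(K, 4))
```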

However, the particle–hole pair operators \(c_\alpha ^* (k), c_\alpha (k)\) are not exactly bosonic, and thus the ground state vector of (4.13) is not given by (4.16). Nevertheless, by Lemma 4.1, it is reasonable to expect that the true ground state of \(\mathcal {H}_{{\mathrm{corr}}}\) will be energetically close to \(\xi \), provided that the number of fermions in \(\xi \) is small. This last fact is proven in Sect. 4.3.

Motivated by the above heuristic discussion, we define as trial state for the full many-body problem the fermionic Fock space vector

$$\begin{aligned} \psi _{\mathrm{trial}} := R_{\omega _{{\mathrm{pw}}}} T \Omega \;, \qquad T := e^{B}\;,\qquad B := \frac{1}{2}\sum _{k\in \Gamma ^{{\mathrm{nor}}}}\sum _{\alpha ,\beta \in \mathcal {I}_{k}} K(k)_{\alpha ,\beta } c^*_\alpha (k) c^*_\beta (k) - {\mathrm{h.c.}}\end{aligned}$$
(4.20)

Notice that \(B^* = -B\), so T is unitary and hence \(||\psi _{\mathrm{trial}}|| = 1\). We have to check that \(\psi _{\mathrm{trial}}\) is an N-particle state. In fact, writing \(\xi := R^*_{\omega _{\mathrm{pw}}} \psi _{\mathrm{trial}}\), we have

$$\begin{aligned} \mathcal {N}\psi _{\mathrm{trial}} = R_{\omega _{\mathrm{pw}}} \Big [\sum _{p\in B_{\mathrm{F}}^c} a^*_p a_p + \sum _{h \in B_{\mathrm{F}}} a_h a^*_h \Big ] \xi = R_{\omega _{\mathrm{pw}}}(\mathcal {N}_{\mathrm{p}}-\mathcal {N}_{\mathrm{h}}) \xi + N\psi _{\mathrm{trial}}, \end{aligned}$$

which shows that \(\psi _{\mathrm{trial}}\) is an eigenvector of \(\mathcal {N}\) with eigenvalue N if and only if \(\xi \) is an eigenvector of \(\mathcal {N}_{\mathrm{p}}-\mathcal {N}_{\mathrm{h}}\) with eigenvalue 0. This is the content of the next lemma.

Lemma 4.3

(Particle–Hole Symmetry). For \(\xi \) as in (4.16) we have \((\mathcal {N}_{\mathrm{p}} - \mathcal {N}_{\mathrm{h}}) \xi = 0\).

Proof

Let \(\xi _\lambda = T_\lambda \Omega \), with \( T_{\lambda } = e^{\lambda B}\) for \(\lambda \in [0,1]\). Then \(\xi _1 = \xi \), \(\xi _0 = \Omega \), and thus

$$\begin{aligned} || (\mathcal {N}_{\mathrm{p}} - \mathcal {N}_{\mathrm{h}}) \xi ||^2= & {} \int _0^1 {{\mathrm{d}}}\lambda \, \frac{{{\mathrm{d}}}}{{{\mathrm{d}}}\lambda } \langle \xi _\lambda , ( \mathcal {N}_{\mathrm{p}} - \mathcal {N}_{\mathrm{h}})^2 \xi _\lambda \rangle \\= & {} \int _0^1 {{\mathrm{d}}}\lambda \, \langle \xi _\lambda , \left[ ( \mathcal {N}_{\mathrm{p}} - \mathcal {N}_{\mathrm{h}})^2 , B \right] \xi _\lambda \rangle = 0 \end{aligned}$$

because \([ \mathcal {N}_{\mathrm{p}} - \mathcal {N}_{\mathrm{h}} , c_\alpha ^* (k)] = 0 = [ \mathcal {N}_{\mathrm{p}} - \mathcal {N}_{\mathrm{h}} , c_\alpha (k) ]\) implies \([\mathcal {N}_{\mathrm{p}} - \mathcal {N}_{\mathrm{h}} , B] = 0\). \(\quad \square \)
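Two elementary steps are used implicitly in this proof. The following sketch spells them out, writing \(X := \mathcal {N}_{\mathrm{p}} - \mathcal {N}_{\mathrm{h}}\) (a shorthand introduced only for this remark). The boundary term at \(\lambda = 0\) vanishes because \(X \Omega = 0\), and the derivative is a commutator because \(B^* = -B\):

$$\begin{aligned} \frac{{{\mathrm{d}}}}{{{\mathrm{d}}}\lambda } \langle \xi _\lambda , X^2 \xi _\lambda \rangle&= \langle B \xi _\lambda , X^2 \xi _\lambda \rangle + \langle \xi _\lambda , X^2 B \xi _\lambda \rangle = \langle \xi _\lambda , [X^2, B] \xi _\lambda \rangle , \\ [X^2, B]&= X [X,B] + [X,B] X = 0. \end{aligned}$$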

4.3 Approximate bosonic Bogoliubov transformations

Our next task is to evaluate the energy of the fermionic many-body trial state \(\psi _{\mathrm{trial}} = R_{\omega _{{\mathrm{pw}}}} \xi = R_{\omega _{{\mathrm{pw}}}} T \Omega \), which by (3.4) and (3.8) reduces to calculating \(\langle \xi , \mathcal {H}_{{\mathrm{corr}}} \xi \rangle \). To do so, we will need some properties of the operator T, which are going to be proven in this section. More generally, we shall consider the one-parameter family of unitaries \(T_{\lambda } = e^{\lambda B}\), with B defined in (4.20).

The next proposition establishes that the action of \(T_\lambda \) approximates a bosonic Bogoliubov transformation.

Proposition 4.4

(Approximate Bogoliubov Transformation). Let \(\lambda \in [0,1]\). Let \(l \in \Gamma ^{{\mathrm{nor}}}\) and \(\gamma \in \mathcal {I}_{l}= \mathcal {I}_l^+ \cup \mathcal {I}_l^-\). Then

$$\begin{aligned} T^*_\lambda c_\gamma (l) T_\lambda&= \sum _{\alpha \in \mathcal {I}_{l}}\cosh (\lambda K(l))_{\alpha ,\gamma } c_\alpha (l) + \sum _{\alpha \in \mathcal {I}_{l}} \sinh (\lambda K(l))_{\alpha ,\gamma } c^*_\alpha (l) + \mathfrak {E}_\gamma (\lambda ,l), \end{aligned}$$

where the error operator \(\mathfrak {E}_\gamma (\lambda ,l)\) satisfies, for all \(\psi \in \mathcal {F}\), the bound

$$\begin{aligned} \Big [ \sum _{\gamma \in \mathcal {I}_{l}} ||\mathfrak {E}_\gamma (\lambda ,l)\psi ||^2 \Big ]^{1/2} \le \frac{C}{\mathfrak {n}^2} \sup _{\tau \in [0,\lambda ]}|| (\mathcal {N}+2)^{3/2} T_\tau \psi || \, e^{\lambda ||K(l)||_{{\mathrm{HS}}}} \sum _{k\in \Gamma ^{{\mathrm{nor}}}} ||K(k)||_{{\mathrm{HS}}}. \end{aligned}$$
(4.21)

Here \(\mathfrak {n}= N^{1/3-\delta /2} M^{-1/2}\) as defined in (3.18), and \(||K(k)||_{{\mathrm{HS}}}\) denotes the Hilbert–Schmidt norm of the matrix K(k). The same estimate holds for \(\mathfrak {E}^*_\gamma (\lambda ,l)\).

Proof

We start from the Duhamel formula

$$\begin{aligned} T^*_\lambda c_\gamma (l) T_\lambda = c_\gamma (l) + \int _0^\lambda {{\mathrm{d}}}\tau \, T^*_\tau [c_\gamma (l),B]T_\tau . \end{aligned}$$
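The Duhamel formula is the fundamental theorem of calculus applied to \(\tau \mapsto T^*_\tau c_\gamma (l) T_\tau \). As a brief check, using \(T_\tau = e^{\tau B}\), \(T^*_\tau = e^{-\tau B}\) and \(T_0 = \mathbb {I}\):

$$\begin{aligned} \frac{{{\mathrm{d}}}}{{{\mathrm{d}}}\tau } \left( T^*_\tau c_\gamma (l) T_\tau \right) = T^*_\tau \left( - B c_\gamma (l) + c_\gamma (l) B \right) T_\tau = T^*_\tau [c_\gamma (l), B] T_\tau . \end{aligned}$$

Integrating from 0 to \(\lambda \) gives the stated formula.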

From Lemma 4.1, the commutator is given by

$$\begin{aligned}{}[c_\gamma (l),B] = \sum _{k\in \Gamma ^{{\mathrm{nor}}}}\frac{1}{2} \sum _{\alpha ,\beta \in \mathcal {I}_{k}} K(k)_{\alpha ,\beta } [c_\gamma (l),c^*_\alpha (k) c^*_\beta (k)] = \sum _{\alpha \in \mathcal {I}_{l}} K(l)_{\gamma ,\alpha } c^*_\alpha (l) + \mathfrak {e}_\gamma (l) \end{aligned}$$

where the error term is

$$\begin{aligned} \mathfrak {e}_\gamma (l) := \sum _{k \in \Gamma ^{{\mathrm{nor}}}}\frac{\chi _{\mathcal {I}_{k}}(\gamma )}{2} \sum _{\alpha \in \mathcal {I}_{k}} K(k)_{\gamma ,\alpha } \left( \mathcal {E}_\gamma (k,l) c^*_\alpha (k) + c^*_\alpha (k) \mathcal {E}_\gamma (k,l)\right) , \end{aligned}$$
(4.22)

with \(\chi _{\mathcal {I}_{k}}\) the indicator function of the set \(\mathcal {I}_{k}= \mathcal {I}_{k}^{+}\cup \mathcal {I}_{k}^{-}\) and \(\mathcal {E}_\gamma (k,l)\) bounded as in (4.3). Thus

$$\begin{aligned} \begin{aligned} T^*_\lambda c_\gamma (l) T_\lambda&= c_\gamma (l) + \sum _{\alpha \in \mathcal {I}_{l}} K(l)_{\gamma ,\alpha } \int _0^\lambda {{\mathrm{d}}}\tau T^*_\tau c^*_\alpha (l) T_\tau + \int _0^\lambda {{\mathrm{d}}}\tau T^*_\tau \mathfrak {e}_\gamma (l) T_\tau , \\ T^*_\lambda c^*_\gamma (l) T_\lambda&= c^*_\gamma (l) + \sum _{\alpha \in \mathcal {I}_{l}} K(l)_{\gamma ,\alpha } \int _0^\lambda {{\mathrm{d}}}\tau T^*_\tau c_\alpha (l) T_\tau + \int _0^\lambda {{\mathrm{d}}}\tau T^*_\tau \mathfrak {e}^*_\gamma (l) T_\tau . \end{aligned} \end{aligned}$$

We iterate \(n_0\) times by plugging the second equation into the second summand on the r.h.s. of the first equation, and so forth. The simplex integrals produce factors 1 / n!, so we obtain

$$\begin{aligned}&T^*_\lambda c_\gamma (l) T_\lambda \\&\quad = c_\gamma (l) \\&\qquad + \sum _{\alpha \in \mathcal {I}_{l}} \lambda K(l)_{\gamma ,\alpha } c^*_\alpha (l) + \int _0^\lambda {{\mathrm{d}}}\tau _1 T^*_{\tau _1} \mathfrak {e}_\gamma (l) T_{\tau _1} \\&\qquad + \frac{1}{2!}\sum _{\alpha \in \mathcal {I}_{l}} \left( \lambda ^2 K(l)^2 \right) _{\gamma ,\alpha } c_\alpha (l) + \sum _{\alpha \in \mathcal {I}_{l}} K(l)_{\gamma ,\alpha } \int _0^\lambda {{\mathrm{d}}}\tau _1\int _0^{\tau _1} {{\mathrm{d}}}\tau _2 T^*_{\tau _2} \mathfrak {e}^*_\alpha (l) T_{\tau _2} \\&\qquad + \frac{1}{3!}\!\sum _{\alpha \in \mathcal {I}_{l}} \left( \lambda ^3 K(l)^3\right) _{\gamma ,\alpha } c^*_\alpha (l) + \sum _{\alpha \in \mathcal {I}_{l}} \left( K(l)^2\right) _{\gamma ,\alpha } \int _0^\lambda {{\mathrm{d}}}\tau _1 \int _0^{\tau _1} {{\mathrm{d}}}\tau _2 \int _0^{\tau _2} {{\mathrm{d}}}\tau _3 T^*_{\tau _3} \mathfrak {e}_{\alpha }(l) T_{\tau _3}\\&\qquad + \ldots \\&\qquad + \sum _{\alpha \in \mathcal {I}_{l}} \left( K(l)^{n_0} \right) _{\gamma ,\alpha } \int _0^\lambda {{\mathrm{d}}}\tau _1 \int _0^{\tau _1} {{\mathrm{d}}}\tau _2 \ldots \int _0^{\tau _{n_0-1}} {{\mathrm{d}}}\tau _{n_0} T^*_{\tau _{n_0}} c^\natural _\alpha (l) T_{\tau _{n_0}}. \end{aligned}$$

Here we introduced the notation \(c^\natural _\alpha (l)\), which in this formula means \(c^*_\alpha (l)\) for \(n_0\) odd, and \(c_\alpha (l)\) for \(n_0\) even. The left term on every line is the leading term, the right term on every line is an error term which will be controlled later. The very last line is the ‘head’ of the iteration after \(n_0\) steps; we are going to control the expansion as \(n_0 \rightarrow \infty \), showing that the head vanishes.
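For orientation, consider the idealized case in which the error terms \(\mathfrak {e}_\gamma (l)\) vanish identically and K(l) is real and symmetric (both are assumptions made only for this illustration). Writing \(f_\gamma (\lambda ) := T^*_\lambda c_\gamma (l) T_\lambda \), the two Duhamel equations then form a closed linear system that can be solved explicitly:

$$\begin{aligned} \frac{{{\mathrm{d}}}}{{{\mathrm{d}}}\lambda } f_\gamma (\lambda )&= \sum _{\alpha \in \mathcal {I}_{l}} K(l)_{\gamma ,\alpha } f^*_\alpha (\lambda ), \qquad f_\gamma (0) = c_\gamma (l), \\ f_\gamma (\lambda )&= \sum _{\alpha \in \mathcal {I}_{l}} \cosh (\lambda K(l))_{\gamma ,\alpha } c_\alpha (l) + \sum _{\alpha \in \mathcal {I}_{l}} \sinh (\lambda K(l))_{\gamma ,\alpha } c^*_\alpha (l), \end{aligned}$$

as one verifies using \(\frac{{{\mathrm{d}}}}{{{\mathrm{d}}}\lambda } \cosh (\lambda K) = K \sinh (\lambda K)\) and \(\frac{{{\mathrm{d}}}}{{{\mathrm{d}}}\lambda } \sinh (\lambda K) = K \cosh (\lambda K)\). This is the form of a bosonic Bogoliubov transformation; the actual proof has to keep track of the error terms in addition.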

Notice that the leading terms have the form of an exponential series \(\lambda ^n K(l)^n/n!\), but with c (for even n) and \(c^*\) (for odd n) alternating. Separating creation and annihilation operators, we reconstruct \(\cosh (\lambda K(l))\) and \(\sinh (\lambda K(l))\). We find

$$\begin{aligned} T^*_\lambda c_\gamma (l) T_\lambda&= \sum _{\alpha \in \mathcal {I}_{l}}\cosh (\lambda K(l))_{\gamma ,\alpha } c_\alpha (l) + \sum _{\alpha \in \mathcal {I}_{l}} \sinh (\lambda K(l))_{\gamma ,\alpha } c^*_\alpha (l) + \mathfrak {E}_\gamma (\lambda ,l) \end{aligned}$$

where, for an arbitrary \(n_0 \in \mathbb {N}\),

$$\begin{aligned} \mathfrak {E}_\gamma (\lambda ,l)&:= \sum _{\alpha \in \mathcal {I}_{l}} \sum _{n=0}^{n_0-1} \left( K(l)^n\right) _{\gamma ,\alpha } \int _0^\lambda {{\mathrm{d}}}\tau _1\cdots \int _0^{\tau _{n}} {{\mathrm{d}}}\tau _{n+1} T^*_{\tau _{n+1}} \mathfrak {e}^{\natural }_{\alpha }(l) T_{\tau _{n+1}}\\&\qquad + \sum _{\alpha \in \mathcal {I}_{l}} \left( K(l)^{n_0} \right) _{\gamma ,\alpha } \int _0^\lambda {{\mathrm{d}}}\tau _1 \int _0^{\tau _1} {{\mathrm{d}}}\tau _2 \ldots \int _0^{\tau _{n_0-1}} {{\mathrm{d}}}\tau _{n_0} T^*_{\tau _{n_0}} c^\natural _\alpha (l) T_{\tau _{n_0}} \\&\qquad - \sum _{\alpha \in \mathcal {I}_{l}} \sum _{n = n_0}^\infty \frac{\lambda ^n (K(l)^n)_{\gamma ,\alpha }}{n!} c^\natural _{\alpha } (l). \end{aligned}$$

(In every summand, \(\mathfrak {e}_\alpha (l)\) and \(c_\alpha (l)\) appear for even n or \(n_0\), \(\mathfrak {e}_\alpha ^*(l)\) and \(c_\alpha ^* (l)\) for odd n or \(n_0\).) Notice that for any function \(f: \mathbb {R}\rightarrow \mathbb {R}\) the simplex integration simplifies to

$$\begin{aligned} \int _0^\lambda {{\mathrm{d}}}\tau _1 \int _0^{\tau _1} {{\mathrm{d}}}\tau _2 \cdots \int _0^{\tau _{n}} {{\mathrm{d}}}\tau _{n+1} f(\tau _{n+1}) = \int _0^\lambda \frac{(\lambda -\tau )^n}{n!} f(\tau ) {{\mathrm{d}}}\tau . \end{aligned}$$
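The identity can be checked by exchanging the order of integration. For \(n = 1\), as a sketch:

$$\begin{aligned} \int _0^\lambda {{\mathrm{d}}}\tau _1 \int _0^{\tau _1} {{\mathrm{d}}}\tau _2\, f(\tau _2) = \int _0^\lambda {{\mathrm{d}}}\tau _2\, f(\tau _2) \int _{\tau _2}^\lambda {{\mathrm{d}}}\tau _1 = \int _0^\lambda (\lambda - \tau _2) f(\tau _2)\, {{\mathrm{d}}}\tau _2 , \end{aligned}$$

and the general case follows by induction on n. For \(f \equiv 1\) it yields \(\int _0^\lambda \frac{(\lambda -\tau )^n}{n!} {{\mathrm{d}}}\tau = \frac{\lambda ^{n+1}}{(n+1)!}\), which is the factor appearing in the estimates of the error terms.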

Therefore, for all \(\psi \in \mathcal {F}\) we have

$$\begin{aligned} ||\mathfrak {E}_\gamma (\lambda ,l)\psi ||&\le \Big \Vert \sum _{\alpha \in \mathcal {I}_{l}} \sum _{n=0}^{n_0-1} \left( K(l)^n \right) _{\gamma ,\alpha } \int _0^\lambda {{\mathrm{d}}}\tau \frac{(\lambda -\tau )^n}{n!} T^*_\tau \mathfrak {e}_\alpha ^\natural (l) T_\tau \psi \Big \Vert \\&\qquad + \Big \Vert \sum _{\alpha \in \mathcal {I}_{l}} \left( K(l)^{n_0} \right) _{\gamma ,\alpha } \int _0^\lambda {{\mathrm{d}}}\tau \frac{(\lambda -\tau )^{n_0-1}}{(n_0-1)!} T^*_{\tau } c^\natural _\alpha (l) T_{\tau } \psi \Big \Vert \\&\qquad + \Big \Vert \sum _{\alpha \in \mathcal {I}_{l}} \sum _{n=n_0}^{\infty } \frac{\lambda ^n (K(l)^n)_{\gamma ,\alpha }}{n!} c^\natural _\alpha (l) \psi \Big \Vert \;; \end{aligned}$$

using the explicit expression (4.22) for \(\mathfrak {e}_\alpha ^\natural (l)\) we have

$$\begin{aligned}&\le \Big \Vert \sum _{\alpha \in \mathcal {I}_{l}} \sum _{n=0}^{n_0-1} \left( K(l)^n \right) _{\gamma ,\alpha } \int _0^\lambda {{\mathrm{d}}}\tau \frac{(\lambda -\tau )^n}{n!} \sum _{k\in \Gamma ^{{\mathrm{nor}}}} \frac{\chi _{\mathcal {I}_{k}}(\alpha )}{2} \\&\qquad \times \sum _{\delta \in \mathcal {I}_{k}} K(k)_{\alpha ,\delta } T^*_\tau (\mathcal {E}_\alpha (k,l) c^*_\delta (k) )^{\natural } T_\tau \psi \Big \Vert \\&\qquad + \Big \Vert \sum _{\alpha \in \mathcal {I}_{l}} \sum _{n=0}^{n_0-1} \left( K(l)^n \right) _{\gamma ,\alpha } \int _0^\lambda {{\mathrm{d}}}\tau \frac{(\lambda -\tau )^n}{n!} \sum _{k\in \Gamma ^{{\mathrm{nor}}}} \frac{\chi _{\mathcal {I}_{k}}(\alpha )}{2} \\&\qquad \quad \quad \quad \quad \times \sum _{\delta \in \mathcal {I}_{k}} K(k)_{\alpha ,\delta } T^*_\tau c^*_\delta (k) \mathcal {E}_\alpha (k,l)^{\natural } T_\tau \psi \Big \Vert \\&\qquad + \Big \Vert \sum _{\alpha \in \mathcal {I}_{l}} \left( K(l)^{n_0} \right) _{\gamma ,\alpha } \int _0^\lambda {{\mathrm{d}}}\tau \frac{(\lambda -\tau )^{n_0-1}}{(n_0-1)!} T^*_{\tau } c^\natural _\alpha (l) T_{\tau } \psi \Big \Vert \\&\qquad + \Big \Vert \sum _{\alpha \in \mathcal {I}_{l}} \sum _{n=n_0}^{\infty } \frac{\lambda ^n (K(l)^n)_{\gamma ,\alpha }}{n!} c^\natural _\alpha (l) \psi \Big \Vert \\&=: \text {A}_\gamma + \text {B}_\gamma + \text {C}_\gamma + \text {D}_\gamma . \end{aligned}$$

Let us start by estimating \(\text {B}_\gamma \). We shall neglect the symbol \(\natural \); the bounds are the same whether for the operator or its adjoint. Using also (4.9), we get

$$\begin{aligned} \text {B}_\gamma&\le \sum _{\alpha \in \mathcal {I}_{l}}\sum _{n=0}^\infty |\left( K(l)^n\right) _{\gamma ,\alpha } |\int _0^\lambda {{\mathrm{d}}}\tau \frac{(\lambda -\tau )^n}{n!} \sum _{k\in \Gamma ^{{\mathrm{nor}}}} \frac{\chi _{\mathcal {I}_{k}}(\alpha )}{2} \\&\quad \times \Big \Vert \sum _{\delta \in \mathcal {I}_{k}} K(k)_{\alpha ,\delta } c^*_\delta (k) \mathcal {E}_\alpha (k,l)T_\tau \psi \Big \Vert \\&\le \sum _{\alpha \in \mathcal {I}_{l}}\sum _{n=0}^\infty |\left( K(l)^n\right) _{\gamma ,\alpha } |\int _0^\lambda {{\mathrm{d}}}\tau \frac{(\lambda -\tau )^n}{n!} \\&\quad \times \sum _{k\in \Gamma ^{{\mathrm{nor}}}} \frac{\chi _{\mathcal {I}_{k}}(\alpha )}{2} \Big [ \sum _{\delta \in \mathcal {I}_{k}} |K(k)_{\alpha ,\delta } |^2 \Big ]^{1/2} || (\mathcal {N}+1)^{1/2} \mathcal {E}_\alpha (k,l)T_\tau \psi ||; \end{aligned}$$

pulling \(\mathcal {E}_\alpha (k,l)\) through \((\mathcal {N}+1)^{1/2}\) to the front and then using (4.3), we get

$$\begin{aligned} \text {B}_\gamma&\le \sum _{\alpha \in \mathcal {I}_{l}}\sum _{n=0}^\infty |\left( K(l)^n\right) _{\gamma ,\alpha } |\\&\quad \times \int _0^\lambda \! {{\mathrm{d}}}\tau \frac{(\lambda -\tau )^n}{n!}\!\! \sum _{k\in \Gamma ^{{\mathrm{nor}}}} \chi _{\mathcal {I}_{k}}(\alpha ) \Big [ \sum _{\delta \in \mathcal {I}_{k}} |K(k)_{\alpha ,\delta } |^2 \Big ]^{1/2} \frac{|| \mathcal {N}(\mathcal {N}+1)^{1/2} T_\tau \psi ||}{n_{\alpha ,k} n_{\alpha ,l}}\\&\le \frac{\sup _{\tau \in [0,\lambda ]} || (\mathcal {N}+1)^{3/2} T_\tau \psi ||}{\mathfrak {n}^2} \sum _{n=0}^\infty \frac{\lambda ^{n+1}}{(n+1)!} \\&\quad \times \sum _{k\in \Gamma ^{{\mathrm{nor}}}} \sum _{\alpha \in \mathcal {I}_{l}\cap \mathcal {I}_{k}}|\left( K(l)^n\right) _{\gamma ,\alpha } |\Big [ \sum _{\delta \in \mathcal {I}_{k}} |K(k)_{\alpha ,\delta } |^2 \Big ]^{1/2} \\&\le \frac{\sup _{\tau \in [0,\lambda ]} || (\mathcal {N}+1)^{3/2} T_\tau \psi ||}{\mathfrak {n}^2}\\&\quad \times \Bigg [ \lambda \sum _{k \in \Gamma ^{{\mathrm{nor}}}} \Big [ \sum _{\delta \in \mathcal {I}_{k}} |K(k)_{\gamma ,\delta } |^2 \Big ]^{1/2} \\&\qquad \quad + \sum _{n= 1}^\infty \frac{\lambda ^{n+1}}{(n+1)!} \Big [\! \sum _{\alpha \in \mathcal {I}_{l}\cap \mathcal {I}_{k}}\!\! |\left( K(l)^n\right) _{\gamma ,\alpha } |^2 \Big ]^{1/2} \sum _{k\in \Gamma ^{{\mathrm{nor}}}} \Vert K (k) \Vert _{{\mathrm{HS}}}\Bigg ] \end{aligned}$$

where we used \(n_{\alpha ,k} \ge \mathfrak {n}\) as established in (3.18), we separated the term with \(n=0\) and, for \(n \ge 1\), we applied Cauchy–Schwarz to the sum over \(\alpha \). Again by Cauchy–Schwarz, we obtain

$$\begin{aligned} \Big [ \sum _{\gamma \in \mathcal {I}_{l}} \text {B}_\gamma ^2 \Big ]^{1/2} \le C \, \frac{\sup _{\tau \in [0,\lambda ]} || (\mathcal {N}+1)^{3/2} T_\tau \psi ||}{\mathfrak {n}^2} \, e^{\lambda \Vert K(l) \Vert _{{\mathrm{HS}}}} \sum _{k\in \Gamma ^{{\mathrm{nor}}}} \Vert K (k) \Vert _{{\mathrm{HS}}}. \end{aligned}$$

The error term \(\text {A}_\gamma \) can be treated similarly (applying first (4.3) and then (4.9)). We find

$$\begin{aligned} \Big [ \sum _{\gamma \in \mathcal {I}_{l}} \text {A}_\gamma ^2 \Big ]^{1/2} \le C \, \frac{\sup _{\tau \in [0,\lambda ]} || (\mathcal {N}+1)^{3/2} T_\tau \psi ||}{\mathfrak {n}^2} \, e^{\lambda \Vert K(l) \Vert _{{\mathrm{HS}}}} \sum _{k\in \Gamma ^{{\mathrm{nor}}}} \Vert K (k) \Vert _{{\mathrm{HS}}}. \end{aligned}$$

As for the term \(\text {C}_\gamma \), it is controlled with (4.9) by

$$\begin{aligned} \text {C}_\gamma&\le \Big \Vert \sum _{\alpha \in \mathcal {I}_{l}} \left( K(l)^{n_0} \right) _{\gamma ,\alpha } \int _0^\lambda {{\mathrm{d}}}\tau \frac{(\lambda -\tau )^{n_0-1}}{(n_0-1)!} T^*_{\tau } c^\natural _\alpha (l) T_{\tau } \psi \Big \Vert \\&\le \int _0^\lambda {{\mathrm{d}}}\tau \frac{(\lambda -\tau )^{n_0-1}}{(n_0-1)!} \Big \Vert \sum _{\alpha \in \mathcal {I}_{l}} \left( K(l)^{n_0} \right) _{\gamma ,\alpha } c^\natural _\alpha (l) T_\tau \psi \Big \Vert \\&\le \frac{\lambda ^{n_0}}{n_0!} \Big ( \sum _{\alpha \in \mathcal {I}_{l}} \Big | \Big ( K(l)^{n_0} \Big )_{\gamma ,\alpha } \Big |^2 \Big )^{1/2} \sup _{\tau \in [0,\lambda ]} || (\mathcal {N}+1)^{1/2} T_\tau \psi ||. \end{aligned}$$

This implies that

$$\begin{aligned} \Big [ \sum _{\gamma \in \mathcal {I}_{l}} \text {C}_\gamma ^2 \Big ]^{1/2} \le C \, \frac{\lambda ^{n_0} \Vert K(l) \Vert _{{\mathrm{HS}}}^{n_0}}{n_0!} \, \sup _{\tau \in [0,\lambda ]} || (\mathcal {N}+1)^{1/2} T_\tau \psi ||. \end{aligned}$$

Finally, the term \(\text {D}_\gamma \) can be bounded by

$$\begin{aligned} \begin{aligned} \text {D}_\gamma&\le \sum _{n \ge n_0} \frac{\lambda ^n}{n!} \Big \Vert \sum _{\alpha \in \mathcal {I}_{l}} (K(l)^n)_{\gamma ,\alpha } c_\alpha ^\natural (l) \psi \Big \Vert \\&\le \sum _{n \ge n_0} \frac{\lambda ^n}{n!} \Big [ \sum _{\alpha \in \mathcal {I}_{l}} | (K(l)^n)_{\gamma ,\alpha }|^2 \Big ]^{1/2} \Vert (\mathcal {N}+1)^{1/2} \psi \Vert \end{aligned} \end{aligned}$$

which leads us to

$$\begin{aligned} \Big [ \sum _{\gamma \in \mathcal {I}_{l}} \text {D}_\gamma ^2 \Big ]^{1/2} \le C \sum _{n \ge n_0} \frac{\lambda ^n \Vert K(l) \Vert _{{\mathrm{HS}}}^n}{n!} \, \Vert (\mathcal {N}+ 1)^{1/2} \psi \Vert . \end{aligned}$$

Since all these bounds hold for any \(n_0 \in \mathbb {N}\), and the bounds on \(\text {C}_\gamma \) and \(\text {D}_\gamma \) tend to zero as \(n_0 \rightarrow \infty \) (they are, respectively, a single term and the tail of a convergent exponential series), we obtain

$$\begin{aligned} \left[ \sum _{\gamma \in \mathcal {I}_{l}} ||\mathfrak {E}_\gamma (\lambda ,l)\psi ||^2 \right] ^{1/2} \le C \, \frac{\sup _{\tau \in [0,\lambda ]} || (\mathcal {N}+1)^{3/2} T_\tau \psi ||}{\mathfrak {n}^2} \, e^{\lambda \Vert K(l) \Vert _{{\mathrm{HS}}}} \sum _{k\in \Gamma ^{{\mathrm{nor}}}} \Vert K (k) \Vert _{{\mathrm{HS}}}. \end{aligned}$$

\(\square \)

The next lemma provides the required bounds for the matrix K(k), defined in (4.17). For later estimates it is important that this bound implies \(K(k) = 0\) outside the support of \(\hat{V}\). (Actually the constant C here may be chosen independent of V.)

Lemma 4.5

(Bound on the Bogoliubov Kernel). Let \(k \in \Gamma ^{{\mathrm{nor}}}\). Then the matrices E(k), \(D(k)+W(k)-\tilde{W}(k)\), and \(D(k)+W(k)+\tilde{W}(k)\), all defined in (4.19), are strictly positive. Let K(k) be defined by (4.17). Then we have

$$\begin{aligned} || K (k)||_{{\mathrm{HS}}}\le || K (k)||_{{\mathrm{tr}}}\le C \hat{V} (k), \end{aligned}$$
(4.23)

where \(|| K (k)||_{{\mathrm{tr}}}\) denotes the trace norm of the matrix K(k).

Proof

Recall that \(I_k = |\mathcal {I}_{k}^{+}| = |\mathcal {I}_{k}^{-}|\) and that the matrix K(k) is symmetric and has size \(2I_k \times 2I_k\). All quantities in this proof depend on the same k, so we simplify notation by dropping this dependence where there is no risk of confusion, writing, e.g., \(I\) for \(I_k\).

To exhibit the block structure of the matrices, we map the indices \(\mathcal {I}_{k}^{+}\) to \(\{1,\ldots , I\}\), and the indices \(\mathcal {I}_{k}^{-}\) to \(\{I+1,\ldots , 2I\}\). There are many such mappings, but due to the reflection symmetry of the patches (\(B_{\alpha +M/2} = - B_\alpha \) and \(\omega _{\alpha +M/2} = - \omega _\alpha \) in the original numbering), we can choose one such that \(v_\alpha (k) = v_{\alpha +I}(-k)\). This implies that

$$\begin{aligned} W= \begin{pmatrix} b &{} 0 \\ 0 &{} b \end{pmatrix}, \quad \tilde{W}= \begin{pmatrix} 0&{} b\\ b &{} 0\end{pmatrix}, \end{aligned}$$
(4.24)

where \(b_{\alpha ,\beta } = g v_\alpha (k) v_\beta (k)\) defines an \(I\times I\)-matrix. We drop the k-dependence from the notation and just write \(v_\alpha = v_\alpha (k)\). In Dirac notation, where \(|v\rangle \langle v|\) is the orthogonal projection onto \(v = (v_1, \cdots , v_{I})\), we have \(b= g |v\rangle \langle v|\in \mathbb {C}^{I\times I}\).

Also \(D_{\alpha ,\alpha } = |\hat{k}\cdot \hat{\omega }_\alpha |\) is invariant under reflection at the origin and so \(D\) simplifies to

$$\begin{aligned} D= \begin{pmatrix} d &{} 0 \\ 0 &{} d \end{pmatrix}, \quad d = {\text {diag}}(u_\alpha ^2, \alpha = 1,\ldots ,I). \end{aligned}$$

Recalling the definition of the index set \(\mathcal {I}_{k}^{+}\) we notice that \(u_{\alpha }^2 \ge N^{-\delta }\) for all \(\alpha \in \{1,\ldots ,I\}\), and thus d is invertible. Since \(b \ge 0\) (because \(g \ge 0\)), we find \(d + 2b \ge d > 0\); hence also \(d+2b\) is invertible.

To simplify the computation further, let

$$\begin{aligned} U = \frac{1}{\sqrt{2}}\begin{pmatrix} \mathbb {I}&{} \mathbb {I}\\ \mathbb {I}&{} -\mathbb {I}\end{pmatrix}, \end{aligned}$$

where \(\mathbb {I}\) is the \(I\times I\)-identity matrix. Obviously \(U^T = U = U^{-1}\), and it simultaneously blockdiagonalizes

$$\begin{aligned} U^T (D+W+\tilde{W}) U = \begin{pmatrix} d+2b &{} 0 \\ 0 &{} d \end{pmatrix}, \quad U^T (D+W-\tilde{W}) U = \begin{pmatrix} d &{} 0\\ 0 &{} d+2b \end{pmatrix}. \end{aligned}$$
(4.25)
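The block diagonalization (4.25) can be verified by direct multiplication. As a sketch for the first matrix, using (4.24) and the block form of D:

$$\begin{aligned} (D+W+\tilde{W})\, U&= \frac{1}{\sqrt{2}} \begin{pmatrix} d+b & b \\ b & d+b \end{pmatrix} \begin{pmatrix} \mathbb {I} & \mathbb {I} \\ \mathbb {I} & -\mathbb {I} \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} d+2b & d \\ d+2b & -d \end{pmatrix}, \\ U^T (D+W+\tilde{W})\, U&= \frac{1}{2} \begin{pmatrix} \mathbb {I} & \mathbb {I} \\ \mathbb {I} & -\mathbb {I} \end{pmatrix} \begin{pmatrix} d+2b & d \\ d+2b & -d \end{pmatrix} = \begin{pmatrix} d+2b & 0 \\ 0 & d \end{pmatrix}. \end{aligned}$$

For \(D+W-\tilde{W}\) the off-diagonal blocks change sign, which exchanges the roles of d and \(d+2b\) in the result.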

This shows that \(D+W+\tilde{W}\) and \(D+W-\tilde{W}\) are strictly positive, thus invertible, and have a positive square root. We also find

$$\begin{aligned} U^T E U = \begin{pmatrix} \left[ d^{1/2}(d+2b) d^{1/2} \right] ^{1/2} &{} 0 \\ 0 &{} \left[ (d+2b)^{1/2} d (d+2b)^{1/2}\right] ^{1/2} \end{pmatrix}. \end{aligned}$$

Both blocks are strictly positive; E is therefore invertible and has a strictly positive operator square root.

Now consider

$$\begin{aligned} |S_1|^2 = S_1^T S_1 = E^{-1/2} (D+W-\tilde{W}) E^{-1/2}. \end{aligned}$$

We find

$$\begin{aligned} U^T |S_1 |^2 U = \begin{pmatrix} A_1 &{} 0 \\ 0 &{} A_2 \end{pmatrix} \end{aligned}$$
(4.26)

with

$$\begin{aligned} \begin{aligned} A_1&:= \left[ d^{1/2} (d+2b) d^{1/2} \right] ^{-1/4} d \left[ d^{1/2} (d+2b) d^{1/2} \right] ^{-1/4}, \\ A_2&:= \left[ (d+2b)^{1/2} d (d+2b)^{1/2} \right] ^{-1/4} (d+2b) \left[ (d+2b)^{1/2} d (d+2b)^{1/2} \right] ^{-1/4}. \end{aligned} \end{aligned}$$
(4.27)

Since b is a positive operator, using operator monotonicity of the inverse and the square root, we find

$$\begin{aligned} d^{1/2} \left[ d^{1/2}(d+2b)d^{1/2}\right] ^{-1/2} d^{1/2} \le \mathbb {I}. \end{aligned}$$
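The inequality can be traced through the following chain, a sketch that uses only \(b \ge 0\), \(d > 0\), the operator monotonicity of the square root, and the fact that inversion reverses the order of strictly positive operators:

$$\begin{aligned} d^{1/2}(d+2b)d^{1/2} = d^2 + 2 d^{1/2} b\, d^{1/2} \ge d^2 \quad&\Longrightarrow \quad \left[ d^{1/2}(d+2b)d^{1/2}\right] ^{1/2} \ge d \\&\Longrightarrow \quad \left[ d^{1/2}(d+2b)d^{1/2}\right] ^{-1/2} \le d^{-1} \\&\Longrightarrow \quad d^{1/2} \left[ d^{1/2}(d+2b)d^{1/2}\right] ^{-1/2} d^{1/2} \le \mathbb {I}. \end{aligned}$$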

Using the equality of the spectra \(\sigma (AB) = \sigma (BA)\) for positive operators A and B, we conclude that

$$\begin{aligned} \sigma (A_1)&=\sigma \left( d \left[ d^{1/2}(d+2b)d^{1/2}\right] ^{-1/2}\right) = \sigma \left( d^{1/2} \left[ d^{1/2}(d+2b)d^{1/2}\right] ^{-1/2} d^{1/2} \right) \end{aligned}$$

and therefore that

$$\begin{aligned} A_1 \le \mathbb {I}. \end{aligned}$$
(4.28)

Arguing similarly, we find that

$$\begin{aligned} \mathbb {I}\le A_2. \end{aligned}$$
(4.29)

We introduce the polar decomposition \(S_1 = O |S_1 |\); a priori O is a partial isometry, but since \(S_1\) is invertible, O is actually an orthogonal matrix. Then \(|S_1^T|^2 = S_1 S_1^T = O |S_1||S_1|^T O^T = O|S_1|^2 O^T\) because \(|S_1|^T = |S_1|\). This implies

$$\begin{aligned} || K ||_{{\mathrm{tr}}}= || \log |S_1^T|||_{{\mathrm{tr}}}= \frac{1}{2} || \log |S_1^T|^2 ||_{{\mathrm{tr}}}= \frac{1}{2} || \log O |S_1|^2 O^T||_{{\mathrm{tr}}}= \frac{1}{2} ||\log |S_1|^2||_{{\mathrm{tr}}}. \end{aligned}$$

Using furthermore the blockdiagonalization (4.26), we find

$$\begin{aligned} || K ||_{{\mathrm{tr}}}= \frac{1}{2} ||U^T\log |S_1|^2 U||_{{\mathrm{tr}}}= \frac{1}{2} || \log \begin{pmatrix} A_1 &{} 0 \\ 0 &{} A_2 \end{pmatrix} ||_{{\mathrm{tr}}}= \frac{1}{2} || \log A_1 ||_{{\mathrm{tr}}}+ \frac{1}{2} || \log A_2 ||_{{\mathrm{tr}}}. \end{aligned}$$

Equations (4.28) and (4.29) imply that \(\log A_1 \le 0\) and \(\log A_2 \ge 0\). Hence

$$\begin{aligned} || K ||_{{\mathrm{tr}}}= \frac{1}{2} \left( - {\text {tr}}\log A_1 + {\text {tr}}\log A_2 \right) = \frac{1}{2} \left( - \log \det A_1 + \log \det A_2 \right) . \end{aligned}$$

From the definition (4.27), which gives \(\det A_1 = \left( \det d / \det (d+2b) \right) ^{1/2} = (\det A_2)^{-1}\), we arrive at

$$\begin{aligned} \begin{aligned} || K ||_{{\mathrm{tr}}}&= \frac{1}{2} \left( \log \det (d+2b) - \log \det d \right) = \frac{1}{2} \log \det \left( \mathbb {I}+ 2 d^{-1/2} b d^{-1/2}\right) \\&\le {\text {tr}}d^{-1/2} b d^{-1/2} = g \langle v , d^{-1} v \rangle = g \sum _{\alpha = 1}^{I} \frac{v_\alpha ^2}{u_\alpha ^2} \le C g = C\kappa \hat{V}(k) \end{aligned} \end{aligned}$$

where we used Proposition 3.1, which implies \(v_\alpha ^2 \le C M^{-1} u_\alpha ^2\). (Recall also \(I\le M/2\).)
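The inequality between the log-determinant and the trace rests on \(\log (1+x) \le x\) for \(x \ge 0\). As a sketch, for a positive semidefinite matrix X with eigenvalues \(x_j \ge 0\) (here X is a multiple of \(d^{-1/2} b\, d^{-1/2}\)):

$$\begin{aligned} \log \det (\mathbb {I}+ X) = \sum _j \log (1 + x_j) \le \sum _j x_j = {\text {tr}}X, \qquad {\text {tr}}\, d^{-1/2} |v\rangle \langle v| d^{-1/2} = \langle v, d^{-1} v \rangle . \end{aligned}$$

The final sum is then bounded uniformly in M, since \(\sum _{\alpha =1}^{I} v_\alpha ^2/u_\alpha ^2 \le C M^{-1} I \le C/2\).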

\(\square \)

We are now ready to estimate the expectation of \(\mathcal {N}^{n}\) in the state \(\xi \), defined in (4.16). We follow a strategy similar to the one developed in the dynamical setting in [8] for the control of the growth of many-body fluctuations around Hartree–Fock dynamics.

Proposition 4.6

(Bound on the Number of Fermions). For all \(n \in \mathbb {N}\) and for all \(\psi \in \mathcal {F}\) we have (for a constant C that does not depend on n)

$$\begin{aligned} \sup _{\lambda \in [0,1]} \langle T_\lambda \psi , (\mathcal {N}+1)^n T_\lambda \psi \rangle \le e^{C n} \langle \psi , (\mathcal {N}+5)^n \psi \rangle . \end{aligned}$$

Proof

From the CAR (3.1) we get

$$\begin{aligned}{}[\mathcal {N},c^*_{\alpha }(k)] = 2c^*_{\alpha }(k) \quad \text {and} \quad c^*_{\alpha }(k) c^*_{\beta }(l) (\mathcal {N}+4) = \mathcal {N}c^*_{\alpha }(k) c^*_{\beta }(l). \end{aligned}$$
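Both identities express that \(c^*_\alpha (k)\) raises the number of fermions by two. As a sketch of how the second identity and its functional-calculus version follow from the first:

$$\begin{aligned} \mathcal {N}c^*_\alpha (k)&= c^*_\alpha (k) (\mathcal {N}+2), \qquad \mathcal {N}c^*_\alpha (k) c^*_\beta (l) = c^*_\alpha (k) (\mathcal {N}+2) c^*_\beta (l) = c^*_\alpha (k) c^*_\beta (l) (\mathcal {N}+4), \\ f(\mathcal {N})\, c^*_\alpha (k) c^*_\beta (l)&= c^*_\alpha (k) c^*_\beta (l)\, f(\mathcal {N}+4) \end{aligned}$$

for any function f on the spectrum of \(\mathcal {N}\). With \(f(t) = (t+1)^s\) this reads \((\mathcal {N}+1)^s c^*_\alpha (k) c^*_\beta (l) = c^*_\alpha (k) c^*_\beta (l) (\mathcal {N}+5)^s\), which explains the shift by 5 in the estimates of this proof.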

We calculate the derivative w.r.t. \(\lambda \) of the expectation value of \((\mathcal {N}+5)^n\):

$$\begin{aligned}&\left| \frac{{{\mathrm{d}}}}{{{\mathrm{d}}}\lambda } \langle T_\lambda \psi , (\mathcal {N}+5)^n T_\lambda \psi \rangle \right| \\&\quad = \Big | \langle T_\lambda \psi , \sum _{j=0}^{n-1} (\mathcal {N}+5)^j [\mathcal {N},B](\mathcal {N}+5)^{n-j-1} T_\lambda \psi \rangle \Big | \\&\quad = \Big | 4{\text {Re}}\sum _{k\in \Gamma ^{{\mathrm{nor}}}} \sum _{\alpha ,\beta \in \mathcal {I}_{k}} K(k)_{\alpha ,\beta } \sum _{j=0}^{n-1} \langle T_\lambda \psi , (\mathcal {N}+5)^j c^*_\alpha (k) c^*_\beta (k) (\mathcal {N}+5)^{n-j-1} T_\lambda \psi \rangle \Big |\;. \end{aligned}$$

To distribute the powers of the number operator equally to both arguments of the inner product, we insert \(\mathbb {I}= (\mathcal {N}+1)^{\frac{n}{2}-1-j} (\mathcal {N}+1)^{j+1-\frac{n}{2}}\) between \((\mathcal {N}+5)^j\) and \(c^*_\alpha (k)\) and then pull \((\mathcal {N}+1)^{j+1-\frac{n}{2}}\) through \(c^*_\alpha (k)c^*_\beta (k)\) to the right. Thus

$$\begin{aligned} \left| \frac{{{\mathrm{d}}}}{{{\mathrm{d}}}\lambda } \langle T_\lambda \psi , (\mathcal {N}+5)^n T_\lambda \psi \rangle \right| = \Big | 4{\text {Re}}\sum _{k \in \Gamma ^{{\mathrm{nor}}}} \sum _{\alpha ,\beta \in \mathcal {I}_{k}} K(k)_{\alpha ,\beta } \sum _{j=0}^{n-1} \langle \xi _j, c^*_\alpha (k) c^*_\beta (k) {\tilde{\xi }} \rangle \Big | \end{aligned}$$

where we have introduced \(\xi _j := (\mathcal {N}+1)^{\frac{n}{2}-1-j}(\mathcal {N}+5)^j T_\lambda \psi \) and \(\tilde{\xi } := (\mathcal {N}+5)^{\frac{n}{2}} T_\lambda \psi \). By Cauchy–Schwarz

$$\begin{aligned}&\left| \frac{{{\mathrm{d}}}}{{{\mathrm{d}}}\lambda } \langle T_\lambda \psi , (\mathcal {N}+5)^n T_\lambda \psi \rangle \right| \\&\le 4\sum _{k \in \Gamma ^{{\mathrm{nor}}}} \sum _{j=0}^{n-1} \sum _{\alpha ,\beta \in \mathcal {I}_{k}} \left|K(k)_{\alpha ,\beta }\right|||c_\beta (k) c_\alpha (k) \xi _j|| ||{\tilde{\xi }}|| \\&\le 4\sum _{k \in \Gamma ^{{\mathrm{nor}}}} \sum _{j=0}^{n-1} \Big ( \sum _{\alpha ,\beta \in \mathcal {I}_{k}} \Big | K(k)_{\alpha ,\beta } \Big |^2 \Big )^{1/2} \Big ( \sum _{\alpha ,\beta \in \mathcal {I}_{k}} ||c_\beta (k) c_\alpha (k) \xi _j||^2 \Big )^{1/2} ||{\tilde{\xi }}|| \end{aligned}$$

using the first bound from Lemma 4.2

$$\begin{aligned}&\le 4 \sum _{k\in \Gamma ^{{\mathrm{nor}}}} \sum _{j=0}^{n-1} ||K(k)||_{{\mathrm{HS}}}\Big ( \sum _{\alpha ,\beta \in \mathcal {I}_{k}} ||\mathcal {N}(B_{\mathrm{F}} \cap B_\beta )^{1/2} c_\alpha (k) \xi _j||^2 \Big )^{1/2} ||{\tilde{\xi }}||\\&= 4 \sum _{k\in \Gamma ^{{\mathrm{nor}}}} \sum _{j=0}^{n-1} ||K(k)||_{{\mathrm{HS}}}\Big ( \sum _{\alpha \in \mathcal {I}_{k}} \langle c_\alpha (k) \xi _j, \sum _{\beta \in \mathcal {I}_{k}} \mathcal {N}(B_{\mathrm{F}}\cap B_\beta ) c_\alpha (k) \xi _j\rangle \Big )^{1/2} ||{\tilde{\xi }}|| \end{aligned}$$

with the trivial estimate \(\sum _{\beta \in \mathcal {I}_{k}} \mathcal {N}(B_{\mathrm{F}}\cap B_\beta ) \le \mathcal {N}\) then

$$\begin{aligned}&\le 4 \sum _{k\in \Gamma ^{{\mathrm{nor}}}} \sum _{j=0}^{n-1} ||K(k)||_{{\mathrm{HS}}}\Big ( \sum _{\alpha \in \mathcal {I}_{k}} \langle c_\alpha (k) \xi _j, \mathcal {N}c_\alpha (k) \xi _j\rangle \Big )^{1/2} ||{\tilde{\xi }}||\\&\le 4 \sum _{k\in \Gamma ^{{\mathrm{nor}}}} \sum _{j=0}^{n-1} ||K(k)||_{{\mathrm{HS}}}\Big ( \sum _{\alpha \in \mathcal {I}_{k}} || (\mathcal {N}+2)^{1/2} c_\alpha (k) \xi _j||^2 \Big )^{1/2} ||{\tilde{\xi }}|| \\&= 4 \sum _{k\in \Gamma ^{{\mathrm{nor}}}} \sum _{j=0}^{n-1} ||K(k)||_{{\mathrm{HS}}}\Big ( \sum _{\alpha \in \mathcal {I}_{k}} || c_\alpha (k) \mathcal {N}^{1/2} \xi _j||^2 \Big )^{1/2} ||{\tilde{\xi }}|| \end{aligned}$$

and, estimating \(c_\alpha (k)\) by the first bound from Lemma 4.2,

$$\begin{aligned}&\le 4 \sum _{k\in \Gamma ^{{\mathrm{nor}}}} \sum _{j=0}^{n-1} ||K(k)||_{{\mathrm{HS}}}|| \mathcal {N}\xi _j|| ||{\tilde{\xi }}|| \le 4\sum _{k \in \Gamma ^{{\mathrm{nor}}}} n ||K(k)||_{{\mathrm{HS}}}\langle T_\lambda \psi , (\mathcal {N}+5)^n T_\lambda \psi \rangle . \end{aligned}$$
(4.30)

From the differential inequality (4.30), using Grönwall’s Lemma, we conclude that

$$\begin{aligned} \langle T_\lambda \psi , (\mathcal {N}+5)^n T_\lambda \psi \rangle\le & {} \exp \Big ( 4 n \lambda \sum _{k \in \Gamma ^{{\mathrm{nor}}}} || K (k) ||_{{\mathrm{HS}}}\Big ) \langle \psi , (\mathcal {N}+5)^n \psi \rangle \\\le & {} e^{C n \lambda } \langle \psi , (\mathcal {N}+5)^n \psi \rangle \end{aligned}$$

where in the last inequality we used (4.23) and the assumptions on V. \(\quad \square \)
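The Grönwall step used above can be spelled out as follows, a sketch with the shorthands \(F(\lambda ) := \langle T_\lambda \psi , (\mathcal {N}+5)^n T_\lambda \psi \rangle \) and \(a := 4n \sum _{k \in \Gamma ^{{\mathrm{nor}}}} ||K(k)||_{{\mathrm{HS}}}\) (introduced only for this remark):

$$\begin{aligned} F'(\lambda ) \le a F(\lambda ) \quad \Longrightarrow \quad \frac{{{\mathrm{d}}}}{{{\mathrm{d}}}\lambda } \left( e^{-a\lambda } F(\lambda ) \right) = e^{-a \lambda } \left( F'(\lambda ) - a F(\lambda ) \right) \le 0 \quad \Longrightarrow \quad F(\lambda ) \le e^{a\lambda } F(0). \end{aligned}$$

The statement of the proposition then follows from \((\mathcal {N}+1)^n \le (\mathcal {N}+5)^n\) and \(\lambda \le 1\).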

5 Evaluating the Energy of the Trial State

In this section we calculate the expectation value \(\langle \xi , \mathcal {H}_{{\mathrm{corr}}} \xi \rangle \), for the trial state \(\xi \) defined in (4.16) and \(\mathcal {H}_{\mathrm{corr}}\) defined in (3.8). We start with some simple estimates for the non-bosonizable terms. Afterwards we linearize the kinetic energy and calculate its contribution to the expectation value, before we eventually turn to the main part of the interaction.

5.1 Getting rid of non-bosonizable terms

In the next lemma, we show that the contribution of the terms in (3.6) to the expectation \(\langle \xi , \mathcal {H}_{{\mathrm{corr}}} \xi \rangle \) is negligible for \(N \rightarrow \infty \).

Lemma 5.1

(Non-Bosonizable Interaction Terms). Let \(\mathcal {E}_1 (x,y)\) be defined as in (3.6). Let \(\xi \) be the trial state defined as in (4.16). Then we have

$$\begin{aligned} \Big | \Big \langle \xi , \frac{1}{2N}\int _{\mathbb {T}^3\times \mathbb {T}^3} {{\mathrm{d}}}x{{\mathrm{d}}}y\, V(x-y) \,\mathcal {E}_1(x,y)\, \xi \Big \rangle \Big | \le C N^{-1}. \end{aligned}$$

Proof

We are going to show that for all \(\psi \in \mathcal {F}\) we have

$$\begin{aligned} \Big | \Big \langle \psi , \frac{1}{2N}\int _{\mathbb {T}^3\times \mathbb {T}^3} {{\mathrm{d}}}x{{\mathrm{d}}}y\, V(x-y) \,\mathcal {E}_1(x,y)\, \psi \Big \rangle \Big | \le \frac{2}{N} \sum _{k \in \mathbb {Z}^3} |\hat{V}(k)|\, \langle \psi , (\mathcal {N}+1)^2\psi \rangle . \end{aligned}$$
(5.1)

The final claim then follows using Proposition 4.6. To prove (5.1), let us rewrite the first term on the r.h.s. of (3.6) by using the CAR and \(\langle u_x,u_y\rangle = u(x,y)\), yielding

$$\begin{aligned}&\frac{1}{2N} \int _{\mathbb {T}^3\times \mathbb {T}^3} {{\mathrm{d}}}x{{\mathrm{d}}}y\, V(x-y) a^*(u_x) a^*(u_y) a(u_y) a(u_x)\nonumber \\&\quad = \frac{1}{2N} \int _{\mathbb {T}^3\times \mathbb {T}^3} {{\mathrm{d}}}x{{\mathrm{d}}}y\, V(x-y) \Big ( a^*(u_x) a(u_x) a^*(u_y) a(u_y) - a^*(u_x) \langle u_x,u_y\rangle a(u_y) \Big )\nonumber \\&\quad = \frac{1}{2N} \sum _{k\in \mathbb {Z}^3} \hat{V}(k) \Big ( {{\mathrm{d}}}\Gamma (u e^{ikx}u){{\mathrm{d}}}\Gamma (u e^{-ikx} u) - {{\mathrm{d}}}\Gamma (u e^{ikx} u e^{-ikx} u) \Big )\;. \end{aligned}$$
(5.2)

Recall the two bounds \(||{{\mathrm{d}}}\Gamma (A) \psi || \le ||A||_{{\mathrm{op}}}||\mathcal {N}\psi ||\) and \(|\langle \psi , {{\mathrm{d}}}\Gamma (A) \psi \rangle |\le ||A||_{{\mathrm{op}}}\langle \psi , \mathcal {N}\psi \rangle \) for any bounded one-particle operator A and any \(\psi \in \mathcal {F}\). Thus, using that \(\Vert u \Vert _{\text {op}} \le 1\),

$$\begin{aligned} \Big | \Big \langle \psi , \frac{1}{2N} \sum _{k\in \mathbb {Z}^3} \hat{V}(k) {{\mathrm{d}}}\Gamma (u e^{ikx}u){{\mathrm{d}}}\Gamma (u e^{-ikx} u) \psi \Big \rangle \Big | \le \frac{1}{2N} \sum _{k\in \mathbb {Z}^3} |\hat{V}(k)|||\mathcal {N}\psi ||^2. \end{aligned}$$

The second summand in (5.2) can be estimated in the same way. The same holds true for the other two terms in (3.6). \(\quad \square \)
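To see how (5.1) yields the claim of the lemma, one may apply Proposition 4.6 with \(\psi = \Omega \), \(\lambda = 1\) and \(n = 2\). A sketch, using \(\mathcal {N}\Omega = 0\) and assuming that \(\sum _{k} |\hat{V}(k)| < \infty \) under the assumptions on V:

$$\begin{aligned} \langle \xi , (\mathcal {N}+1)^2 \xi \rangle \le e^{2C} \langle \Omega , (\mathcal {N}+5)^2 \Omega \rangle = 25\, e^{2C}, \qquad \frac{2}{N} \sum _{k \in \mathbb {Z}^3} |\hat{V}(k)|\, \langle \xi , (\mathcal {N}+1)^2 \xi \rangle \le \frac{50\, e^{2C}}{N} \sum _{k \in \mathbb {Z}^3} |\hat{V}(k)| . \end{aligned}$$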

Let us now consider the error term \(\mathcal {E}_{2}\), defined in (3.7). We prove that this term vanishes in our trial state \(\xi \).

Lemma 5.2

(Interaction Terms of Wrong Parity). Let \(\mathcal {E}_{2}(x,y)\) be defined as in (3.7). Let \(\xi \) be the trial state defined in (4.16). Then we have

$$\begin{aligned} \Big \langle \xi , \frac{1}{2N}\int _{\mathbb {T}^3\times \mathbb {T}^3} {{\mathrm{d}}}x{{\mathrm{d}}}y\, V(x-y) \big ( \mathcal {E}_2(x,y) + {\mathrm{h.c.}}\big ) \xi \Big \rangle =0. \end{aligned}$$

Proof

Since terms in \(\mathcal {E}_2(x,y)\) create exactly two fermions, we have

$$\begin{aligned} i^\mathcal {N}\mathcal {E}_2(x,y) = \mathcal {E}_2(x,y) i^{\mathcal {N}+2} = -\mathcal {E}_2(x,y) i^\mathcal {N}. \end{aligned}$$

Recall that \(\xi = T \Omega \), with \(T= \exp (B)\) and B as in (4.20). We have \([i^\mathcal {N},B] =0\), since B creates or annihilates particles four at a time. This implies \(T i^\mathcal {N}= i^\mathcal {N}T\). Using \(i^\mathcal {N}\Omega = \Omega \), we get

$$\begin{aligned} \langle T\Omega , \mathcal {E}_2 T \Omega \rangle&= \langle T \Omega , \mathcal {E}_2 T i^{\mathcal {N}} \Omega \rangle = -\langle T \Omega , i^\mathcal {N}\mathcal {E}_2 T\Omega \rangle = - \langle (-i)^\mathcal {N}T\Omega , \mathcal {E}_2 T\Omega \rangle \\&= - \langle T (-i)^\mathcal {N}\Omega , \mathcal {E}_2 T\Omega \rangle = - \langle T\Omega , \mathcal {E}_2 T\Omega \rangle \;, \end{aligned}$$

which thus vanishes. \(\quad \square \)

5.2 Estimating direct and exchange operators

In this section we estimate the contribution of the direct and exchange terms to \({{\mathrm{d}}}\Gamma (uhu-\overline{v}\overline{h}v)\). Recall that

$$\begin{aligned} h = -\frac{\hbar ^2 \Delta }{2} + (2\pi )^3 \hat{V} (0) + X \end{aligned}$$

where X has the integral kernel \(X(x,y) = -N^{-1} V(x-y) \omega _{{\mathrm{pw}}}(x,y)\). The contribution of the constant direct term \((2\pi )^3 \hat{V} (0)\) is

$$\begin{aligned} (2\pi )^3 \hat{V} (0) \, {{\mathrm{d}}}\Gamma (u^2 - \overline{v} v) = (2\pi )^3 \hat{V} (0) {{\mathrm{d}}}\Gamma (\mathbb {I}-2 \omega _{\mathrm{pw}}) = (2\pi )^3 \hat{V} (0) (\mathcal {N}_{\mathrm{p}} - \mathcal {N}_{\mathrm{h}}) \end{aligned}$$

and therefore it vanishes on \(\xi \) by Lemma 4.3. The next lemma allows us to control the contribution of the exchange term X.

Lemma 5.3

(Bound for the Exchange Term). Let \(\xi \) be the trial state defined as in (4.16). Then we have

$$\begin{aligned} |\langle \xi , {{\mathrm{d}}}\Gamma (uXu-\overline{v}\overline{X}v) \xi \rangle |\le C N^{-1}. \end{aligned}$$

Proof

Notice that

$$\begin{aligned} \omega _{\mathrm{pw}}(x,y) = \frac{1}{(2\pi )^3}\sum _{h \in B_{\mathrm{F}}} e^{ih\cdot (x-y)} =: f (x-y). \end{aligned}$$

Thus X is translation invariant, and hence

$$\begin{aligned} ||X||_\text {op} = N^{-1} || \widehat{Vf}||_{L^\infty } \le N^{-1} || \hat{f} ||_{L^\infty } \sum _{k \in \mathbb {Z}^3} |\hat{V}(k)|\, \le C N^{-1}. \end{aligned}$$

Using that \(||u||_{{\mathrm{op}}}= 1= ||v||_{{\mathrm{op}}}\), we get, by Proposition 4.6:

$$\begin{aligned} |\langle \xi , {{\mathrm{d}}}\Gamma (uXu-\overline{v}\overline{X}v) \xi \rangle |\le ||uXu-\overline{v}\overline{X}v||_{{\mathrm{op}}}\langle \xi ,\mathcal {N}\xi \rangle \le C N^{-1}. \end{aligned}$$
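As a sketch of the two norm estimates just used (Fourier normalization constants are absorbed into C here, which is an assumption about conventions rather than a statement taken from the text): since \(\hat{f}\) is, up to normalization, the indicator function of \(B_{\mathrm{F}}\), it is bounded, and

$$\begin{aligned} |\widehat{Vf}(q)|&\le C \sum _{k\in \mathbb {Z}^3} |\hat{V}(k)|\, |\hat{f}(q-k)| \le C\, ||\hat{f}||_{L^\infty } \sum _{k\in \mathbb {Z}^3} |\hat{V}(k)|, \\ ||uXu-\overline{v}\overline{X}v||_{{\mathrm{op}}}&\le \big ( ||u||_{{\mathrm{op}}}^2 + ||v||_{{\mathrm{op}}}^2 \big ) ||X||_{{\mathrm{op}}}\le 2\, ||X||_{{\mathrm{op}}}. \end{aligned}$$

Together with the bound on \(\langle \xi ,\mathcal {N}\xi \rangle \) from Proposition 4.6, this gives the claimed estimate.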

\(\square \)

5.3 Expectation value of the kinetic energy

In this section we evaluate the contribution of the Laplacian to the expectation value of the correlation Hamiltonian in the trial state \(\xi \) defined as in (4.16). We start by linearizing in Fourier space,

$$\begin{aligned} -\frac{\hbar ^2}{2} \langle \xi , {{\mathrm{d}}}\Gamma \left( u \Delta u - \overline{v} \Delta v\right) \xi \rangle&= \frac{\hbar ^2}{2}\Bigg \langle \xi , \Big [ \sum _{p \in B_{\mathrm{F}}^c} p^2 a^*_p a_p - \sum _{h \in B_{\mathrm{F}}} h^2 a^*_h a_h \Big ] \xi \Bigg \rangle \\&= \frac{\hbar ^2}{2}\Bigg \langle \xi , \sum _{\alpha =1}^M \Big [ \sum _{p \in B_{\mathrm{F}}^c \cap B_\alpha } \left( (p-\omega _\alpha )^2 + 2p\cdot \omega _\alpha - \omega _\alpha ^2\right) a^*_p a_p\\&\qquad \quad \qquad - \sum _{h \in B_{\mathrm{F}} \cap B_\alpha } \left( (h-\omega _\alpha )^2+ 2h \cdot \omega _\alpha -\omega _\alpha ^2\right) a^*_h a_h \Big ] \xi \Bigg \rangle . \end{aligned}$$

Notice that from the first to the second line, momenta p and h that lie in the corridors or are more than a distance R away from the Fermi surface have disappeared from the sums; this is justified since such modes are never occupied in the trial state, i. e., \(a_p \xi = 0\) and \(a_h\xi =0\). Furthermore, thanks to Lemma 4.3, we have

$$\begin{aligned} \left\langle \xi , \sum _{\alpha =1}^M \left[ \sum _{p \in B_{\mathrm{F}}^c \cap B_\alpha } \omega _\alpha ^2 a^*_p a_p- \sum _{h \in B_{\mathrm{F}} \cap B_\alpha } \omega _\alpha ^2 a^*_h a_h\right] \xi \right\rangle = k_{\mathrm{F}}^2 \, \left\langle \xi , \left[ \mathcal {N}_{\mathrm{p}}- \mathcal {N}_{\mathrm{h}}\right] \xi \right\rangle = 0 \end{aligned}$$

where we used that \(|\omega _\alpha |= k_{\mathrm{F}}\) for all \(\alpha \). To estimate \((p-\omega _\alpha )^2\) and \((h-\omega _\alpha )^2\), we recall that the diameter of the patches is bounded by \(C \sqrt{N^{2/3}/M}\) (since the diameter of the patch on the Fermi surface is bounded by \(\sqrt{N^{2/3}/M}\) which is large compared to its thickness of order R). Therefore

$$\begin{aligned} \begin{aligned}&-\frac{\hbar ^2}{2} \langle \xi , {{\mathrm{d}}}\Gamma \left( u \Delta u - \overline{v} \Delta v\right) \xi \rangle = \langle \xi , \mathbb {H}_{\mathrm{kin}} \xi \rangle +\mathfrak {E}_{\mathrm{lin}} \end{aligned} \end{aligned}$$

where we introduced

$$\begin{aligned} \mathbb {H}_{\mathrm{kin}} := {\hbar ^2} \sum _{\alpha =1}^M \Big [ \sum _{p \in B_{\mathrm{F}}^c \cap B_\alpha }p\cdot \omega _\alpha \, a^*_p a_p - \sum _{h \in B_{\mathrm{F}} \cap B_\alpha } h\cdot \omega _\alpha \, a^*_h a_h \Big ] \end{aligned}$$

and where the error \(\mathfrak {E}_{\mathrm{lin}}\) is bounded by

$$\begin{aligned} \begin{aligned} \left|\mathfrak {E}_{\mathrm{lin}} \right|&= \Big | \frac{\hbar ^2}{2} \Bigg \langle \xi , \sum _{\alpha =1}^M \Big [ \sum _{p \in B_{\mathrm{F}}^c \cap B_\alpha } (p - \omega _\alpha )^2 a^*_p a_p- \sum _{h \in B_{\mathrm{F}} \cap B_\alpha } (h - \omega _\alpha )^2 a^*_h a_h \Big ] \xi \Bigg \rangle \Big |\\&\le C \frac{\hbar ^2}{2} \frac{N^{2/3}}{M} \langle \xi , \mathcal {N}\xi \rangle \le \frac{C}{M} \end{aligned} \end{aligned}$$

where in the last step we used Proposition 4.6 to bound the expectation value of the number operator and \(\hbar = N^{-1/3}\).
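The power counting behind this bound can be sketched as follows, writing q for either a particle momentum p or a hole momentum h in the patch \(B_\alpha \) (the letter q is introduced here only for illustration):

$$\begin{aligned} q^2 = (q-\omega _\alpha )^2 + 2 q\cdot \omega _\alpha - \omega _\alpha ^2, \qquad \frac{\hbar ^2}{2} (q-\omega _\alpha )^2 \le C \hbar ^2 \frac{N^{2/3}}{M} = \frac{C}{M}, \end{aligned}$$

where the inequality uses \(|q-\omega _\alpha |\le C\sqrt{N^{2/3}/M}\) and \(\hbar = N^{-1/3}\). The term linear in q produces \(\mathbb {H}_{\mathrm{kin}}\), the constant term cancels by Lemma 4.3, and the quadratic remainder is what is collected in \(\mathfrak {E}_{\mathrm{lin}}\).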

To compute the expectation of the linearized kinetic energy operator \(\mathbb {H}_{\mathrm{kin}}\), we will make use of the following lemma.

Lemma 5.4

(Kinetic Energy of Particle–Hole Pairs). For all \(k\in \Gamma ^{{\mathrm{nor}}}\) and \(\alpha \in \mathcal {I}_{k}\) we have

$$\begin{aligned}{}[\mathbb {H}_{\mathrm{kin}}, c^*_\alpha (k)] = {\hbar ^2} |k\cdot \omega _\alpha |c^*_\alpha (k). \end{aligned}$$

Proof

We first treat the case \(\alpha \in \mathcal {I}_{k}^{+}\), for which \(k\cdot \omega _\alpha >0\). Using the CAR we calculate

$$\begin{aligned}&[ \mathbb {H}_{\mathrm{kin}}, c^*_{\alpha }(k)]\\&\quad = [ \mathbb {H}_{\mathrm{kin}}, b^*_{\alpha ,k} ] = [\mathbb {H}_{\mathrm{kin}}, \frac{1}{n_{\alpha ,k}} \sum _{\begin{array}{c} p \in B_{\mathrm{F}}^c \cap B_\alpha \\ h \in B_{\mathrm{F}} \cap B_\alpha \end{array}} \delta _{p-h,k} a^*_p a^*_h]\\&\quad = {\hbar ^2} \sum _{\beta =1}^M \frac{1}{n_{\alpha ,k}} \sum _{\begin{array}{c} p \in B_{\mathrm{F}}^c \cap B_\alpha \\ h \in B_{\mathrm{F}} \cap B_\alpha \end{array}} \delta _{p-h,k} \sum _{{\tilde{p}} \in B_{\mathrm{F}}^c \cap B_\beta } {\tilde{p}} \cdot \omega _\beta [a^*_{{\tilde{p}}} a_{{\tilde{p}}}, a^*_p a^*_h] \\&\qquad - {\hbar ^2} \sum _{\beta =1}^M \frac{1}{n_{\alpha ,k}} \sum _{\begin{array}{c} p \in B_{\mathrm{F}}^c \cap B_\alpha \\ h \in B_{\mathrm{F}} \cap B_\alpha \end{array}} \delta _{p-h,k} \sum _{{\tilde{h}} \in B_{\mathrm{F}} \cap B_\beta } {\tilde{h}} \cdot \omega _\beta [a^*_{{\tilde{h}}} a_{{\tilde{h}}}, a^*_p a^*_h] \\&\quad = {\hbar ^2} \sum _{\beta =1}^M \frac{1}{n_{\alpha ,k}} \sum _{\begin{array}{c} p \in B_{\mathrm{F}}^c \cap B_\alpha \\ h \in B_{\mathrm{F}}\cap B_\alpha \end{array}} \delta _{p-h,k} \Big ( \sum _{{\tilde{p}} \in B_{\mathrm{F}}^c \cap B_\beta } {\tilde{p}} \cdot \omega _\beta \delta _{p, {\tilde{p}}} - \sum _{{\tilde{h}} \in B_{\mathrm{F}} \cap B_\beta } {\tilde{h}}\cdot \omega _\beta \delta _{h,{\tilde{h}}}\Big ) a^*_p a^*_h; \end{aligned}$$

notice that the Kronecker deltas \(\delta _{p,{\tilde{p}}}\) and \(\delta _{h,{\tilde{h}}}\) imply \(\beta =\alpha \), so we find

$$\begin{aligned}&= {\hbar ^2} \frac{1}{n_{\alpha ,k}} \sum _{\begin{array}{c} p \in B_{\mathrm{F}}^c \cap B_\alpha \\ h \in B_{\mathrm{F}} \cap B_\alpha \end{array}} \delta _{p-h,k} (p-h)\cdot \omega _\alpha a^*_p a^*_h = {\hbar ^2} |k \cdot \omega _\alpha |c^*_\alpha (k). \end{aligned}$$

Introducing the absolute value is harmless here, since \(k\cdot \omega _\alpha >0\) in this case. For \(k\cdot \omega _\alpha < 0\), recall that \(c^*_\alpha (k) = b^*_{\alpha ,-k}\); the calculation then proceeds in the same way, but in the second-to-last line we use \((p-h)\cdot \omega _\alpha = (-k)\cdot \omega _\alpha = |k \cdot \omega _\alpha |\). \(\quad \square \)
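The commutators entering the proof of Lemma 5.4 follow directly from the CAR. As a brief sketch, with q denoting a generic momentum (a label introduced here for illustration) and using \([AB,C] = A\{B,C\} - \{A,C\}B\):

$$\begin{aligned}{}[a^*_q a_q, a^*_p]&= a^*_q \{a_q, a^*_p\} - \{a^*_q, a^*_p\} a_q = \delta _{q,p}\, a^*_p, \\ [a^*_q a_q, a^*_p a^*_h]&= [a^*_q a_q, a^*_p]\, a^*_h + a^*_p\, [a^*_q a_q, a^*_h] = \left( \delta _{q,p} + \delta _{q,h}\right) a^*_p a^*_h. \end{aligned}$$

For \(q = {\tilde{p}} \in B_{\mathrm{F}}^c\) only \(\delta _{{\tilde{p}},p}\) can be non-zero, since \(h \in B_{\mathrm{F}}\); for \(q = {\tilde{h}} \in B_{\mathrm{F}}\) only \(\delta _{{\tilde{h}},h}\) survives.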

We are now ready to calculate the kinetic energy of our trial state.

Proposition 5.5

(Kinetic Energy). Let \(\xi \) be defined as in (4.16). Then

$$\begin{aligned} \langle \xi , \mathbb {H}_{\mathrm{kin}} \xi \rangle = {\hbar \kappa }\sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|{\text {tr}}D(k)\sinh ^2(K(k)) + \mathfrak {E}_{\mathrm{kin}}, \end{aligned}$$

where \(D(k)\) is defined in (4.19) and the error term is such that \(|\mathfrak {E}_{\mathrm{kin}}|\le C \hbar / \mathfrak {n}^2\) with \(\mathfrak {n}= N^{1/3-\delta /2} M^{-1/2}\) as in (3.18).

Proof

We write \(T_\lambda = \exp (\lambda B)\), with B as in (4.20), and \(\xi = T \Omega \). Hence

$$\begin{aligned}&\langle \xi , \mathbb {H}_{\mathrm{kin}} \xi \rangle \\&\quad = \int _0^1 {{\mathrm{d}}}\lambda \langle \Omega , T^*_\lambda [\mathbb {H}_{\mathrm{kin}}, B] T_\lambda \Omega \rangle \\&\quad = \int _0^1 {{\mathrm{d}}}\lambda \langle \Omega , T^*_\lambda \Big [ \mathbb {H}_{\mathrm{kin}}, \sum _{k\in \Gamma ^{{\mathrm{nor}}}} \frac{1}{2} \sum _{\alpha ,\beta \in \mathcal {I}_{k}} K(k)_{\alpha ,\beta } c^*_\alpha (k) c^*_\beta (k) - {\mathrm{h.c.}}\Big ] T_\lambda \Omega \rangle \\&\quad = {\text {Re}}\int _0^1 {{\mathrm{d}}}\lambda \sum _{k\in \Gamma ^{{\mathrm{nor}}}} \sum _{\alpha ,\beta \in \mathcal {I}_{k}} K(k)_{\alpha ,\beta } \langle \Omega , T^*_\lambda \left( [\mathbb {H}_{\mathrm{kin}}, c^*_\alpha (k)]c^*_\beta (k) + c^*_\alpha (k) [\mathbb {H}_{\mathrm{kin}},c^*_\beta (k)]\right) T_\lambda \Omega \rangle . \end{aligned}$$
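The first equality can be read as the fundamental theorem of calculus applied to \(\lambda \mapsto \langle \Omega , T^*_\lambda \mathbb {H}_{\mathrm{kin}} T_\lambda \Omega \rangle \). A sketch, under the assumptions that \(B^* = -B\) (so that \(T^*_\lambda = \exp (-\lambda B)\)) and that \(\mathbb {H}_{\mathrm{kin}} \Omega = 0\) because \(\Omega \) is annihilated by all \(a_p\) and \(a_h\):

$$\begin{aligned} \frac{{{\mathrm{d}}}}{{{\mathrm{d}}}\lambda } T^*_\lambda \mathbb {H}_{\mathrm{kin}} T_\lambda&= T^*_\lambda \left( \mathbb {H}_{\mathrm{kin}} B - B\, \mathbb {H}_{\mathrm{kin}} \right) T_\lambda = T^*_\lambda [\mathbb {H}_{\mathrm{kin}}, B] T_\lambda , \\ \langle \xi , \mathbb {H}_{\mathrm{kin}} \xi \rangle&= \langle \Omega , \mathbb {H}_{\mathrm{kin}} \Omega \rangle + \int _0^1 {{\mathrm{d}}}\lambda \, \langle \Omega , T^*_\lambda [\mathbb {H}_{\mathrm{kin}}, B] T_\lambda \Omega \rangle = \int _0^1 {{\mathrm{d}}}\lambda \, \langle \Omega , T^*_\lambda [\mathbb {H}_{\mathrm{kin}}, B] T_\lambda \Omega \rangle . \end{aligned}$$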

From Lemma 5.4

$$\begin{aligned}&\langle \xi , \mathbb {H}_{\mathrm{kin}} \xi \rangle \\&\quad = {\text {Re}}\int _0^1{{\mathrm{d}}}\lambda \sum _{k\in \Gamma ^{{\mathrm{nor}}}} {\hbar ^2} \sum _{\alpha ,\beta \in \mathcal {I}_{k}} K(k)_{\alpha ,\beta } \left( |k\cdot \omega _\alpha |+ |k\cdot \omega _\beta |\right) \langle \Omega , T^*_\lambda c^*_\alpha (k) c^*_\beta (k) T_\lambda \Omega \rangle . \end{aligned}$$

Recall that \(|k \cdot \omega _\alpha |= |k|\kappa \hbar ^{-1} u_\alpha (k)^2\) with \(u_\alpha (k)\) defined in (4.15). Using Proposition 4.4, we then obtain

$$\begin{aligned}&\langle \xi , \mathbb {H}_{\mathrm{kin}} \xi \rangle \\&\quad = {\text {Re}}\int _0^1{{\mathrm{d}}}\lambda \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|{\hbar \kappa } \sum _{\alpha ,\beta \in \mathcal {I}_{k}} K(k)_{\alpha ,\beta } \left( u_\alpha (k)^2 + u_\beta (k)^2 \right) \\&\qquad \times \left\langle \Omega ,\left( \sum _{\delta \in \mathcal {I}_{k}} \cosh (\lambda K(k))_{\alpha ,\delta } c^*_\delta (k) + \sum _{\delta \in \mathcal {I}_{k}} \sinh (\lambda K(k))_{\alpha ,\delta } c_\delta (k) + \mathfrak {E}^*_\alpha (\lambda ,k) \right) \right. \\&\qquad \times \left. \left( \sum _{\gamma \in \mathcal {I}_{k}} \cosh (\lambda K(k))_{\beta ,\gamma } c^*_\gamma (k) + \sum _{\gamma \in \mathcal {I}_{k}} \sinh (\lambda K(k))_{\beta ,\gamma } c_\gamma (k) + \mathfrak {E}^*_\beta (\lambda ,k) \right) \Omega \right\rangle . \end{aligned}$$

Finally, using \(c_\delta (k)\Omega = 0\) and \(\langle \Omega , c_\delta (k) c^*_\gamma (k)\Omega \rangle = \delta _{\delta ,\gamma }\), we get

$$\begin{aligned} \langle \xi , \mathbb {H}_{\mathrm{kin}} \xi \rangle&= {\text {Re}}\int _0^1 {{\mathrm{d}}}\lambda \sum _{k \in \Gamma ^{{\mathrm{nor}}}} |k|{\hbar \kappa } \sum _{\alpha ,\beta \in \mathcal {I}_{k}} K(k)_{\alpha ,\beta } \left( u_\alpha (k)^2 + u_\beta (k)^2 \right) \nonumber \\&\quad \times \sum _{\delta \in \mathcal {I}_{k}} \sinh (\lambda K(k))_{\alpha ,\delta } \sum _{\gamma \in \mathcal {I}_{k}} \cosh (\lambda K(k))_{\beta ,\gamma } \delta _{\delta ,\gamma } + \mathfrak {E}_{\mathrm{kin}} \nonumber \\&= \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|{\hbar \kappa } \sum _{\alpha \in \mathcal {I}_{k}} u_\alpha (k)^2 \int _0^1 {{\mathrm{d}}}\lambda \Big ( \sinh \big (2\lambda K(k)\big ) K(k) \Big )_{\alpha ,\alpha } + \mathfrak {E}_{\mathrm{kin}} \end{aligned}$$
(5.3)

where we defined

$$\begin{aligned} \mathfrak {E}_{\mathrm{kin}}&:= {\text {Re}}\int _0^1 {{\mathrm{d}}}\lambda \sum _{k \in \Gamma ^{{\mathrm{nor}}}} |k|{\hbar \kappa } \sum _{\alpha ,\beta \in \mathcal {I}_{k}} K(k)_{\alpha ,\beta } \left( u_\alpha (k)^2 + u_\beta (k)^2 \right) \\&\quad \times \Big ( \langle \Omega , \mathfrak {E}^*_\alpha (\lambda ,k) \mathfrak {E}^*_\beta (\lambda ,k) \Omega \rangle + \sum _{\delta \in \mathcal {I}_{k}} \sinh (\lambda K(k))_{\alpha ,\delta } \langle \Omega ,c_\delta (k) \mathfrak {E}^*_\beta (\lambda ,k)\Omega \rangle \\&\qquad \qquad + \sum _{\gamma \in \mathcal {I}_{k}} \cosh (\lambda K(k))_{\beta ,\gamma } \langle \Omega , \mathfrak {E}_\alpha ^*(\lambda ,k) c^*_\gamma (k) \Omega \rangle \Big )\\&=: \mathfrak {E}^{(1)}_{\mathrm{kin}} + \mathfrak {E}^{(2)}_{\mathrm{kin}}+ \mathfrak {E}^{(3)}_{\mathrm{kin}}. \end{aligned}$$

Computing the integral in (5.3), we get

$$\begin{aligned} \langle \xi , \mathbb {H}_{\mathrm{kin}} \xi \rangle = {\hbar \kappa }\sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|{\text {tr}}D(k)\sinh ^2(K(k))+ \mathfrak {E}_{\mathrm{kin}}. \end{aligned}$$
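The matrix algebra behind (5.3) and its integral can be sketched as follows, assuming that \(K = K(k)\) is real and symmetric and that \(D(k)\) from (4.19) is the diagonal matrix with entries \(u_\alpha (k)^2\) (the latter is read off here from the form of the result). Using \(\sinh (\lambda K)\cosh (\lambda K) = \frac{1}{2}\sinh (2\lambda K)\) and the symmetry in \(\alpha \leftrightarrow \beta \),

$$\begin{aligned} \sum _{\alpha ,\beta \in \mathcal {I}_{k}} K_{\alpha ,\beta } \left( u_\alpha (k)^2 + u_\beta (k)^2 \right) \big ( \sinh (\lambda K)\cosh (\lambda K) \big )_{\alpha ,\beta }&= \sum _{\alpha \in \mathcal {I}_{k}} u_\alpha (k)^2 \big ( \sinh (2\lambda K) K \big )_{\alpha ,\alpha }, \\ \int _0^1 {{\mathrm{d}}}\lambda \, \sinh (2\lambda K) K&= \frac{1}{2} \big ( \cosh (2K) - \mathbb {I}\big ) = \sinh ^2(K), \end{aligned}$$

so that \(\sum _{\alpha \in \mathcal {I}_{k}} u_\alpha (k)^2 \big ( \sinh ^2(K(k)) \big )_{\alpha ,\alpha } = {\text {tr}}D(k) \sinh ^2(K(k))\).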

Using that \(u_\alpha (k)^2 = |\hat{k}\cdot \hat{\omega }_\alpha |\le 1\), we bound the first error term by

$$\begin{aligned} | \mathfrak {E}^{(1)}_{\mathrm{kin}}|&\le \Big | \int _0^1 {{\mathrm{d}}}\lambda \sum _{k \in \Gamma ^{{\mathrm{nor}}}} |k|{\hbar \kappa } \sum _{\alpha ,\beta \in \mathcal {I}_{k}} K(k)_{\alpha ,\beta } \left( u_\alpha (k)^2 + u_\beta (k)^2 \right) \\&\quad \times \langle \Omega , \mathfrak {E}^*_\alpha (\lambda ,k) \mathfrak {E}^*_\beta (\lambda ,k) \Omega \rangle \Big |\\&\le 2\int _0^1{{\mathrm{d}}}\lambda \sum _{k \in \Gamma ^{{\mathrm{nor}}}} |k|{\hbar \kappa }\sum _{\alpha ,\beta \in \mathcal {I}_{k}} |K(k)_{\alpha ,\beta }|||\mathfrak {E}_\alpha (\lambda ,k)\Omega || ||\mathfrak {E}^*_\beta (\lambda ,k)\Omega || \\&\le 2\int _0^1{{\mathrm{d}}}\lambda \sum _{k \in \Gamma ^{{\mathrm{nor}}}} |k|{\hbar \kappa } \Big [ \sum _{\alpha ,\beta \in \mathcal {I}_{k}} |K(k)_{\alpha ,\beta }|^2 \Big ]^{1/2} \\&\quad \times \Big [ \sum _{\alpha \in \mathcal {I}_{k}} ||\mathfrak {E}_\alpha (\lambda ,k)\Omega ||^2 \sum _{\beta \in \mathcal {I}_{k}} ||\mathfrak {E}^*_\beta (\lambda ,k)\Omega ||^2 \Big ]^{1/2}; \end{aligned}$$

and finally using (4.21)

$$\begin{aligned} | \mathfrak {E}^{(1)}_{\mathrm{kin}}|&\le \frac{C\hbar }{\mathfrak {n}^4} \sum _{k \in \Gamma ^{{\mathrm{nor}}}} |k|||K(k)||_{{\mathrm{HS}}}e^{2||K(k)||_{{\mathrm{HS}}}} \sup _{\lambda \in [0,1]} \langle T_\lambda \Omega , (\mathcal {N}+2)^{3} T_\lambda \Omega \rangle \\&\quad \times \Big ( \sum _{l\in \Gamma ^{{\mathrm{nor}}}} ||K(l)||_{{\mathrm{HS}}}\Big )^{2}. \end{aligned}$$

From Proposition 4.6 and Lemma 4.5, we conclude that \(|\mathfrak {E}^{(1)}_{\mathrm{kin}}| \le C \hbar / \mathfrak {n}^4\). The third error term \(\mathfrak {E}^{(3)}_{\mathrm{kin}}\) can be controlled similarly, using Lemma 4.2:

$$\begin{aligned} |\mathfrak {E}^{(3)}_{\mathrm{kin}}|&\le C\hbar \int _0^1 {{\mathrm{d}}}\lambda ||(\mathcal {N}+1)^{1/2} \Omega || \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|\sum _{\alpha ,\beta \in \mathcal {I}_{k}} |K(k)_{\alpha ,\beta } |||\mathfrak {E}_\alpha (\lambda ,k)\Omega || \\&\quad \times \Big [ \sum _{\gamma \in \mathcal {I}_{k}} |\cosh (\lambda K(k))_{\beta ,\gamma } |^2 \Big ]^{1/2} \\&\le \frac{C\hbar }{\mathfrak {n}^2} ||(\mathcal {N}+1)^{1/2} \Omega || \sup _{\lambda \in [0,1]} ||(\mathcal {N}+2)^{3/2}T_\lambda \Omega || \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|||K(k)||_{{\mathrm{HS}}}\, e^{||K(k)||_{{\mathrm{HS}}}} \\&\quad \times \int _0^1 {{\mathrm{d}}}\lambda \, ||\cosh (\lambda K(k))||_{{\mathrm{HS}}}\sum _{l\in \Gamma ^{{\mathrm{nor}}}} ||K(l)||_{{\mathrm{HS}}}\\&\le \frac{C\hbar }{\mathfrak {n}^2} ||(\mathcal {N}+1)^{1/2} \Omega || \sup _{\lambda \in [0,1]} ||(\mathcal {N}+2)^{3/2}T_\lambda \Omega ||\\&\quad \times \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|||K(k)||_{{\mathrm{HS}}}\, e^{2||K(k)||_{{\mathrm{HS}}}} \sum _{l\in \Gamma ^{{\mathrm{nor}}}} ||K(l)||_{{\mathrm{HS}}}\\&\le \frac{\hbar }{\mathfrak {n}^2} C. \end{aligned}$$

The second error term \(\mathfrak {E}_{\mathrm{kin}}^{(2)}\) can be controlled in the same way. \(\quad \square \)

5.4 Expectation value of the interaction energy

We now evaluate the main contribution (3.9) to the interaction energy. This is the content of the next proposition.

Proposition 5.6

(Interaction Energy). Let \(\xi \) be the trial state defined as in (4.16), and let \(Q_{N}^{{\mathrm{B}}}\) be given by (3.9). Then

$$\begin{aligned} \begin{aligned} \langle \xi , Q_N^{{\mathrm{B}}} \xi \rangle&= {\hbar \kappa } \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|{\text {tr}}\left( W(k) \sinh ^2 (K(k)) + \tilde{W}(k) \sinh (K(k)) \cosh (K(k)) \right) \\&\quad + \mathfrak {E}_{\mathrm{int}} + \mathcal {O}(\hbar N^{-\delta /2}) \end{aligned} \end{aligned}$$

where \(W(k)\) and \(\tilde{W}(k)\) are defined in (4.19). The error term is bounded by \(|\mathfrak {E}_{\mathrm{int}}|\le C\hbar / \mathfrak {n}^2\), with \(\mathfrak {n}= N^{1/3-\delta /2} M^{-1/2}\) as in (3.18).

Proof

We start by writing the \(\tilde{b}_{k}\)-operators in the interaction Hamiltonian (3.12) in terms of their patch decomposition (3.15),

$$\begin{aligned} \tilde{b}_k = \sum _{\alpha \in \mathcal {I}_{k}^{+}} \tilde{b}_{\alpha ,k} + \mathfrak {r}_k. \end{aligned}$$
(5.4)

We recall that the error terms \(\mathfrak {r}_k\) collect modes in the corridors and close to the equator:

$$\begin{aligned} \mathfrak {r}_k = \tilde{\mathfrak {r}}_k + \sum _{\alpha \not \in \mathcal {I}_{k}} {\tilde{b}}_{\alpha ,k}, \end{aligned}$$
(5.5)

where \(\tilde{\mathfrak {r}}_k\) is a linear combination of products \(a_h a_p\) such that at least one of the two momenta is in the corridors \(B_{\mathrm{corri}}\) (see Fig. 2), and the second term collects the contributions coming from the patches close to the equator. We are going to show that \(\mathfrak {r}_k\) gives a negligible contribution to \(\langle \xi , Q_{N}^{{\mathrm{B}}} \xi \rangle \).

Fig. 2. Fermi surface in bold; two patches separated by a corridor of width 2R. Bold arrows represent particle–hole pairs (ph) that contribute to the expectation value of the interaction Hamiltonian. Dashed arrows represent particle–hole pairs of which mode p or h (or both) is not occupied in the trial state \(\xi \). Since \(|k|\le R\), pairs connecting the patches across the corridor do not exist in \(Q_N^{{\mathrm{B}}}\).

Contribution of corridors. We claim that the error operators \(\tilde{\mathfrak {r}}_k\) do not contribute to \(\langle \xi , Q_{N}^{{\mathrm{B}}} \xi \rangle \). To see this, recall that \(T = e^{B}\), with B not containing any mode \(q\in B_{\mathrm{corri}}\), see (4.20). Since at least one of the two momenta p and h appearing in \(\tilde{\mathfrak {r}}_k\) is in the corridor \(B_{\mathrm{corri}}\), we have \(\tilde{\mathfrak {r}}_k \xi = 0\). Plugging the decomposition (5.4) into \(Q^{{\mathrm{B}}}_{N}\) from (3.12) and taking the expectation value on \(\xi \), we realize that all terms containing at least one error operator \(\tilde{\mathfrak {r}}^\natural _k\) are zero, due to the fact that there is at least one error operator \(\tilde{\mathfrak {r}}_k\) directly acting on \(T \Omega \).

Contribution of patches near the equator. We claim that the contribution to \(\langle \xi , Q^{{\mathrm{B}}}_{N} \xi \rangle \) coming from patches \(\beta \not \in \mathcal {I}_{k}\) is subleading as \(N\rightarrow \infty \). These are the patches \(\beta \) in the collar where \(|\hat{k}\cdot \hat{\omega }_\beta |< N^{-\delta }\). The width of this collar is bounded above by \(C k_{\mathrm{F}} N^{-\delta }\), and its length, approximately equal to the circumference of the equator, is bounded above by \(C k_{\mathrm{F}}\); we conclude that the surface area of the collar is of order \(k_{\mathrm{F}}^2 N^{-\delta }\).

Recall that \(n^2_{\beta ,k}\) is the number of particle–hole pairs with relative momentum k in patch \(\beta \); thus, adding in the corridors for an upper bound, \(\sum _{\beta \not \in \mathcal {I}_{k}} n^2_{\beta ,k}\) is bounded by the number of particle–hole pairs with relative momentum k in the collar. This number is bounded above by the number of hole momenta \(h \in B_{\mathrm{F}}\) that are at most a distance R from the collar (since \(|k|\le R\)). The number of such points of the lattice \(\mathbb {Z}^3\) can be counted by Gauss’ classical argument: assign to each lattice point k the cube \([k_1,k_1+1]\times [k_2,k_2+1]\times [k_3,k_3+1]\). Then the number of cubes belonging to lattice points near the collar is bounded by the Lebesgue measure of the collar “fattened” to a thickness R; i. e.,

$$\begin{aligned} \sum _{\beta \not \in \mathcal {I}_{k}} n^2_{\beta ,k} \le C k_{\mathrm{F}}^2 N^{-\delta } \times R = \mathcal {O}(N^{2/3-\delta }). \end{aligned}$$
(5.6)
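The exponent in (5.6) can be checked by a short power counting, sketched here under the assumption that \(\kappa \) and R are constants independent of N and using \(k_{\mathrm{F}} = \kappa \hbar ^{-1} = \kappa N^{1/3}\):

$$\begin{aligned} C k_{\mathrm{F}}^2 N^{-\delta } \times R = C \kappa ^2 R \, N^{2/3} N^{-\delta } = \mathcal {O}(N^{2/3-\delta }). \end{aligned}$$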

We are now ready to estimate the contribution to \(\langle \xi , Q_{N}^{{\mathrm{B}}} \xi \rangle \) coming from the modes close to the equator. Consider, e. g., the term \(\frac{1}{2N} \sum _{k \in \Gamma ^{{\mathrm{nor}}}} \hat{V}(k) 2{\tilde{b}}^*_k {\tilde{b}}_k\) (all the other terms can be dealt with similarly). We get three contributions from (5.5), namely

$$\begin{aligned} \begin{aligned}&\frac{1}{N} \sum _{k \in \Gamma ^{{\mathrm{nor}}}}\hat{V}(k) \sum _{\beta \not \in \mathcal {I}_{k}} \tilde{b}^*_{\beta ,k} \sum _{\alpha \in \mathcal {I}_{k}^{+}} \tilde{b}_{\alpha ,k},\\&\frac{1}{N} \sum _{k \in \Gamma ^{{\mathrm{nor}}}}\hat{V}(k) \sum _{\beta \in \mathcal {I}_{k}^{+}} \tilde{b}^*_{\beta ,k} \sum _{\alpha \not \in \mathcal {I}_{k}} \tilde{b}_{\alpha ,k},\\&\frac{1}{N} \sum _{k \in \Gamma ^{{\mathrm{nor}}}}\hat{V}(k) \sum _{\beta \not \in \mathcal {I}_{k}} \tilde{b}^*_{\beta ,k} \sum _{\alpha \not \in \mathcal {I}_{k}} \tilde{b}_{\alpha ,k}. \end{aligned} \end{aligned}$$
(5.7)

We give the detailed estimate for the first term in the list (the other two terms can be controlled similarly)

$$\begin{aligned}&\frac{1}{N} \sum _{k \in \Gamma ^{{\mathrm{nor}}}}\hat{V}(k) \Big \langle \xi , \sum _{\beta \not \in \mathcal {I}_{k}} \tilde{b}^*_{\beta ,k} \sum _{\alpha \in \mathcal {I}_{k}^{+}} \tilde{b}_{\alpha ,k} \xi \Big \rangle \\&\quad =\frac{1}{N} \sum _{k \in \Gamma ^{{\mathrm{nor}}}}\hat{V}(k) \Big \langle \xi , \sum _{\beta \not \in \mathcal {I}_{k}} n_{\beta ,k} b^*_{\beta ,k} \sum _{\alpha \in \mathcal {I}_{k}^{+}} n_{\alpha ,k} {b}_{\alpha ,k} \xi \Big \rangle \\&\quad \le \frac{1}{N} \sum _{k\in \Gamma ^{{\mathrm{nor}}}}\hat{V}(k) \Big ( \sum _{\beta \not \in \mathcal {I}_{k}} n_{\beta ,k}^2 \Big )^{1/2} \Big ( \sum _{\alpha \in \mathcal {I}_{k}^{+}} n_{\alpha ,k}^2\Big )^{1/2} ||\mathcal {N}^{1/2} \xi ||^2 \\&\quad \le \frac{1}{N} \sum _{k\in \Gamma ^{{\mathrm{nor}}}}\hat{V}(k) \Big ( C N^{2/3-\delta } \Big )^{1/2} \Big ( M \frac{C k_{\mathrm{F}}^2}{M}\Big )^{1/2} \langle \xi ,\mathcal {N}\xi \rangle = \mathcal {O}(\hbar N^{-\delta /2}), \end{aligned}$$

where we used (5.6) to control the sum over \(\beta \not \in \mathcal {I}_{k}\), \(n_{\alpha ,k}^2 \le C k_{\mathrm{F}}^2/M\) due to Proposition 3.1 for the sum over \(\alpha \in \mathcal {I}_{k}^{+}\), and the bound on the number of fermions \(\mathcal {N}\) from Proposition 4.6.
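For orientation, the powers of N in the last step combine as follows (a sketch, using \(k_{\mathrm{F}} = \kappa N^{1/3}\), the summability of \(\hat{V}\), and the N-independent bound on \(\langle \xi ,\mathcal {N}\xi \rangle \) from Proposition 4.6):

$$\begin{aligned} \frac{1}{N} \Big ( C N^{2/3-\delta } \Big )^{1/2} \Big ( C k_{\mathrm{F}}^2 \Big )^{1/2} \le C N^{-1} N^{1/3-\delta /2} N^{1/3} = C N^{-1/3} N^{-\delta /2} = C \hbar N^{-\delta /2}. \end{aligned}$$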

Approximate Bogoliubov diagonalization of the effective interaction. By the discussion of the previous paragraph

$$\begin{aligned} \begin{aligned}&\langle \xi , Q_N^{{\mathrm{B}}} \xi \rangle \\&\quad = \frac{1}{N} \left\langle \xi , \sum _{k \in \Gamma ^{{\mathrm{nor}}}} \hat{V}(k) \Big ( \sum _{\alpha ,\beta \in \mathcal {I}_{k}^{+}}{\tilde{b}}^*_{\alpha ,k} \tilde{b}_{\beta ,k} + \sum _{\alpha ,\beta \in \mathcal {I}_{k}^{-}}{\tilde{b}}^*_{\alpha ,-k} {\tilde{b}}_{\beta ,-k} \right. \\&~~~~~~~~~~~~~~~~~~~~~~~~~\qquad \qquad + \left. \Big [ \sum _{\alpha \in \mathcal {I}_{k}^{+},\, \beta \in \mathcal {I}_{k}^{-}} {\tilde{b}}^*_{\alpha ,k} {\tilde{b}}^*_{\beta ,-k} + {\mathrm{h.c.}}\Big ]\Big ) \xi \right\rangle + \mathcal {O}(\hbar N^{-\delta /2}). \end{aligned} \end{aligned}$$

Introducing the normalization factors \(n_{\alpha ,k} = k_{\mathrm{F}} \sqrt{|k|} v_\alpha (k)\) and combining the \(b^*_{\alpha ,k}\) and \(b^*_{\alpha ,-k}\) operators to \(c^*_\alpha (k)\) operators as in (4.1), we get

$$\begin{aligned} \langle \xi , Q_N^{{\mathrm{B}}} \xi \rangle = \langle \xi , \mathbb {H}_{\mathrm{int}} \xi \rangle + \mathcal {O}(\hbar N^{-\delta /2}), \qquad \mathbb {H}_{\mathrm{int}} := \mathbb {H}_{\mathrm{int}}^{(1)} + \mathbb {H}_{\mathrm{int}}^{(2)} + \mathbb {H}_{\mathrm{int}}^{(3)}, \end{aligned}$$
(5.8)

where, recalling that \(g(k) = \kappa \hat{V}(k)\),

$$\begin{aligned} \mathbb {H}_{\mathrm{int}}^{(1)}&:= {\hbar \kappa } \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|g(k)\sum _{\alpha ,\beta \in \mathcal {I}_{k}^{+}} v_\alpha (k) v_\beta (k) c^*_\alpha (k) c_\beta (k),\\ \mathbb {H}_{\mathrm{int}}^{(2)}&:= \hbar \kappa \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|g(k)\sum _{\alpha ,\beta \in \mathcal {I}_{k}^{-}} v_\alpha (k) v_\beta (k) c^*_\alpha (k) c_\beta (k),\\ \mathbb {H}_{\mathrm{int}}^{(3)}&:= \hbar \kappa \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|g(k) \sum _{\alpha \in \mathcal {I}_{k}^{+}} \sum _{\beta \in \mathcal {I}_{k}^{-}} v_\alpha (k) v_\beta (k) c^*_\alpha (k) c^*_\beta (k) + {\mathrm{h.c.}}\end{aligned}$$
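The prefactor \(\hbar \kappa |k|g(k)\) arises as follows; this is a sketch using only \(n_{\alpha ,k} = k_{\mathrm{F}} \sqrt{|k|} v_\alpha (k)\), \(k_{\mathrm{F}} = \kappa \hbar ^{-1}\), \(N = \hbar ^{-3}\) and \(g(k) = \kappa \hat{V}(k)\):

$$\begin{aligned} \frac{1}{N} \hat{V}(k)\, n_{\alpha ,k} n_{\beta ,k} = \frac{k_{\mathrm{F}}^2}{N} |k|\hat{V}(k) v_\alpha (k) v_\beta (k) = \hbar \kappa ^2 |k|\hat{V}(k) v_\alpha (k) v_\beta (k) = \hbar \kappa |k|g(k) v_\alpha (k) v_\beta (k). \end{aligned}$$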

We shall evaluate \(\langle \xi , \mathbb {H}^{(i)}_{\text {int}} \xi \rangle \), \(i=1,2,3\), with \(\xi = T\Omega \), using the fact that the T operator behaves as an approximate bosonic Bogoliubov transformation, recall Proposition 4.4. Using also \(\langle \Omega , c_\delta (k) c^*_\gamma (k)\Omega \rangle = \delta _{\delta ,\gamma }\), we have

$$\begin{aligned} \langle \xi , \mathbb {H}_{\mathrm{int}}^{(1)} \xi \rangle = \hbar \kappa \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|{\text {tr}}W^{++}(k) \sinh ^2(K(k)) + \mathfrak {E}_{\mathrm{int}}^{(1)}, \end{aligned}$$
(5.9)

where

$$\begin{aligned} W^{++}(k)_{\alpha ,\beta } = \left\{ \begin{array}{cl} g(k) v_\alpha (k) v_\beta (k) &{} \text {for }\alpha ,\beta \in \mathcal {I}_{k}^{+}\\ 0 &{} \text {otherwise} \end{array}\right. \end{aligned}$$

and the error term is

$$\begin{aligned} \mathfrak {E}_{\mathrm{int}}^{(1)}&= \hbar \kappa \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|g(k) \sum _{\alpha ,\beta \in \mathcal {I}_{k}^{+}} v_\alpha (k) v_\beta (k) \Bigg [ \sum _{\gamma \in \mathcal {I}_{k}} \sinh (K(k))_{\alpha ,\gamma } \langle \Omega , c_\gamma (k) \mathfrak {E}_\beta (1,k) \Omega \rangle \\&\quad +\sum _{\gamma \in \mathcal {I}_{k}} \sinh (K(k))_{\beta ,\gamma } \langle \Omega , \mathfrak {E}^*_\alpha (1,k) c^*_\gamma (k) \Omega \rangle +\langle \Omega , \mathfrak {E}^*_\alpha (1,k) \mathfrak {E}_\beta (1,k) \Omega \rangle \Bigg ]. \end{aligned}$$
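The leading term in (5.9) can be traced as follows; this is a sketch assuming, as in Proposition 4.4 at \(\lambda = 1\), that \(T^* c_\beta (k) T = \sum _{\delta } \cosh (K(k))_{\beta ,\delta } c_\delta (k) + \sum _{\delta } \sinh (K(k))_{\beta ,\delta } c^*_\delta (k) + \mathfrak {E}_\beta (1,k)\) with \(K(k)\) real and symmetric. Since \(c_\delta (k)\Omega = 0\), only the \(\sinh \)-terms survive in the vacuum expectation:

$$\begin{aligned} \sum _{\gamma ,\delta \in \mathcal {I}_{k}} \sinh (K(k))_{\alpha ,\gamma } \sinh (K(k))_{\beta ,\delta } \langle \Omega , c_\gamma (k) c^*_\delta (k) \Omega \rangle&= \big ( \sinh ^2(K(k)) \big )_{\alpha ,\beta }, \\ g(k) \sum _{\alpha ,\beta \in \mathcal {I}_{k}^{+}} v_\alpha (k) v_\beta (k) \big ( \sinh ^2(K(k)) \big )_{\alpha ,\beta }&= {\text {tr}}W^{++}(k) \sinh ^2(K(k)). \end{aligned}$$

All remaining contributions contain at least one error operator and are collected in \(\mathfrak {E}_{\mathrm{int}}^{(1)}\).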

For the second part of the interaction we find

$$\begin{aligned} \langle \xi , \mathbb {H}_{\mathrm{int}}^{(2)} \xi \rangle = \hbar \kappa \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|{\text {tr}}W^{--}(k) \sinh ^2(K(k)) + \mathfrak {E}_{\mathrm{int}}^{(2)}, \end{aligned}$$
(5.10)

where

$$\begin{aligned} W^{--}(k)_{\alpha ,\beta } = \left\{ \begin{array}{cl} g(k) v_\alpha (k) v_\beta (k) &{}\quad \text {for }\alpha ,\beta \in \mathcal {I}_{k}^{-}\\ 0 &{}\quad \text {otherwise} \end{array}\right. \end{aligned}$$

and

$$\begin{aligned} \mathfrak {E}_{\mathrm{int}}^{(2)} = \hbar \kappa \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|g(k)&\sum _{\alpha ,\beta \in \mathcal {I}_{k}^{-}} v_\alpha (k) v_\beta (k) \Bigg [ \sum _{\gamma \in \mathcal {I}_{k}} \sinh (K(k))_{\alpha ,\gamma } \langle \Omega , c_\gamma (k) \mathfrak {E}_\beta (1,k) \Omega \rangle \\&+\sum _{\gamma \in \mathcal {I}_{k}} \sinh (K(k))_{\beta ,\gamma } \langle \Omega , \mathfrak {E} ^*_\alpha (1,k) c^*_\gamma (k) \Omega \rangle +\langle \Omega , \mathfrak {E}^*_\alpha (1,k) \mathfrak {E}_\beta (1,k) \Omega \rangle \Bigg ]. \end{aligned}$$

Finally, for the third interaction term we find

$$\begin{aligned} \begin{aligned} \langle \xi , \mathbb {H}_{\mathrm{int}}^{(3)} \xi \rangle&= 2 \hbar \kappa {\text {Re}}\sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|{\text {tr}}W^{+-}(k) \sinh (K(k)) \cosh (K(k)) + \mathfrak {E}_{\mathrm{int}}^{(3)} \\&= \hbar \kappa \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|{\text {tr}}\tilde{W}(k) \sinh (K(k)) \cosh (K(k)) + \mathfrak {E}_{\mathrm{int}}^{(3)}, \end{aligned} \end{aligned}$$
(5.11)

where

$$\begin{aligned} W^{+-}(k)_{\alpha ,\beta } = \left\{ \begin{array}{cl} g(k) v_\alpha (k) v_\beta (k) &{} \text {for }\alpha \in \mathcal {I}_{k}^{+}\text { and } \beta \in \mathcal {I}_{k}^{-}\\ 0 &{} \text {otherwise} \end{array}\right. \;; \end{aligned}$$

we used the fact that all terms are real to write the more symmetric expression in terms of \(\tilde{W}(k) = W^{+-}(k) + W^{-+}(k)\) (the latter is defined by exchanging the role of \(\mathcal {I}_{k}^{+}\) and \(\mathcal {I}_{k}^{-}\) in the former). The error term \(\mathfrak {E}_{\mathrm{int}}^{(3)}\) is given by

$$\begin{aligned} \mathfrak {E}_{\mathrm{int}}^{(3)}&= 2\hbar \kappa {\text {Re}}\sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|g(k) \sum _{\alpha \in \mathcal {I}_{k}^{+}} \sum _{\beta \in \mathcal {I}_{k}^{-}} v_\alpha (k) v_\beta (k)\\&\quad \times \Bigg [ \sum _{\gamma \in \mathcal {I}_{k}} \sinh (K(k))_{\gamma ,\alpha } \langle \Omega , c_\gamma (k) \mathfrak {E}^*_\beta (1,k) \Omega \rangle \\&\quad \qquad + \sum _{\gamma \in \mathcal {I}_{k}} \cosh (K(k))_{\gamma ,\beta } \langle \Omega ,\mathfrak {E}^*_\alpha (1,k) c^*_\gamma (k) \Omega \rangle + \langle \Omega , \mathfrak {E}^*_\alpha (1,k) \mathfrak {E}^*_\beta (1,k) \Omega \rangle \Bigg ]. \end{aligned}$$

Combining (5.9), (5.10), (5.11) and (5.8), we conclude that

$$\begin{aligned} \langle \xi , Q_N^{{\mathrm{B}}} \xi \rangle = \hbar \kappa \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|{\text {tr}}\left( W(k) \sinh ^2 (K(k)) + \tilde{W}(k) \sinh (K(k)) \cosh (K(k)) \right) + \mathfrak {E}_{\mathrm{int}} + \mathcal {O}(\hbar N^{-\delta /2}) \end{aligned}$$

with \( \mathfrak {E}_{\mathrm{int}} = \mathfrak {E}^{(1)}_{\mathrm{int}} + \mathfrak {E}^{(2)}_{\mathrm{int}} + \mathfrak {E}^{(3)}_{\mathrm{int}}\). To control the error term \(\mathfrak {E}^{(1)}_{\mathrm{int}}\), we decompose it as

$$\begin{aligned} \mathfrak {E}_{\mathrm{int}}^{(1)}&= \hbar \kappa \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|g(k) \sum _{\alpha ,\beta \in \mathcal {I}_{k}^{+}} v_\alpha (k) v_\beta (k) \Bigg [ \sum _{\gamma \in \mathcal {I}_{k}} \sinh (K(k))_{\alpha ,\gamma } \langle \Omega , c_\gamma (k) \mathfrak {E}_\beta (1,k) \Omega \rangle \\&\quad +\sum _{\gamma \in \mathcal {I}_{k}} \sinh (K(k))_{\beta ,\gamma } \langle \Omega , \mathfrak {E}^*_\alpha (1,k) c^*_\gamma (k) \Omega \rangle +\langle \Omega , \mathfrak {E}^*_\alpha (1,k) \mathfrak {E}_\beta (1,k) \Omega \rangle \Bigg ] \\&=: \mathfrak {E}_{\mathrm{int}}^{(1,1)} + \mathfrak {E}_{\mathrm{int}}^{(1,2)} + \mathfrak {E}_{\mathrm{int}}^{(1,3)}. \end{aligned}$$

Recall that \(u_\alpha (k)^2 = |\hat{k}\cdot \hat{\omega }_\alpha |\le 1\) and hence, by Proposition 3.1, \(v_\alpha (k) \le \sqrt{\frac{C}{M}} u_\alpha (k) \le \sqrt{\frac{C}{M}}\). Thus, using Proposition 4.4 and Cauchy–Schwarz, we find

$$\begin{aligned} | \mathfrak {E}_{\mathrm{int}}^{(1,3)} |&\le \Big | \hbar \kappa \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|g(k) \sum _{\alpha ,\beta \in \mathcal {I}_{k}^{+}} v_\alpha (k) v_\beta (k) \langle \Omega , \mathfrak {E}^*_\alpha (1,k) \mathfrak {E}_\beta (1,k) \Omega \rangle \Big | \\&\le \hbar \kappa \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|g(k) \sum _{\alpha \in \mathcal {I}_{k}^{+}} \sqrt{\frac{C}{M}} ||\mathfrak {E}_\alpha (1,k) \Omega || \sum _{\beta \in \mathcal {I}_{k}^{+}} \sqrt{\frac{C}{M}} ||\mathfrak {E}_\beta (1,k) \Omega ||\\&\le C {\hbar } \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|\hat{V}(k) \sum _{\alpha \in \mathcal {I}_{k}^{+}} ||\mathfrak {E}_\alpha (1,k)\Omega ||^2. \end{aligned}$$
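The last inequality is Cauchy–Schwarz in the patch index combined with \(|\mathcal {I}_{k}^{+}| \le M/2\); as a sketch,

$$\begin{aligned} \frac{C}{M} \Big ( \sum _{\alpha \in \mathcal {I}_{k}^{+}} ||\mathfrak {E}_\alpha (1,k)\Omega || \Big )^2 \le \frac{C}{M} |\mathcal {I}_{k}^{+}| \sum _{\alpha \in \mathcal {I}_{k}^{+}} ||\mathfrak {E}_\alpha (1,k)\Omega ||^2 \le \frac{C}{2} \sum _{\alpha \in \mathcal {I}_{k}^{+}} ||\mathfrak {E}_\alpha (1,k)\Omega ||^2, \end{aligned}$$

with the factor \(\kappa g(k) = \kappa ^2 \hat{V}(k)\) absorbed into the constant multiplying \(\hat{V}(k)\).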

(Recall that \(|\mathcal {I}_{k}^{+}| = I_k \le M/2\).) With (4.21), we get

$$\begin{aligned} | \mathfrak {E}_{\mathrm{int}}^{(1,3)} |&\le \hbar \frac{C}{\mathfrak {n}^4} \sup _{\lambda \in [0,1]} \langle T_\lambda \Omega , (\mathcal {N}+2)^{3} T_\lambda \Omega \rangle \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|\hat{V}(k) e^{2||K(k)||_{{\mathrm{HS}}}} \Big [ \sum _{l\in \Gamma ^{{\mathrm{nor}}}} ||K(l)||_{{\mathrm{HS}}}\Big ]^2. \end{aligned}$$

Lemma 4.5 and Proposition 4.6 imply that \(|\mathfrak {E}_{\mathrm{int}}^{(1,3)}| \le C \hbar / \mathfrak {n}^4\). The term \(\mathfrak {E}_{\mathrm{int}}^{(1,1)}\) can be controlled similarly:

$$\begin{aligned} |\mathfrak {E}_{\mathrm{int}}^{(1,1)}|&\le \Big |\hbar \kappa \sum _{k \in \Gamma ^{{\mathrm{nor}}}} |k|g(k) \sum _{\alpha ,\beta \in \mathcal {I}_{k}^{+}} v_\alpha (k) v_\beta (k) \sum _{\gamma \in \mathcal {I}_{k}} \sinh (K(k))_{\alpha ,\gamma } \langle \Omega , c_\gamma (k) \mathfrak {E}_\beta (1,k) \Omega \rangle \Big |\\&\le \frac{C \hbar }{M} \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|\hat{V}(k) \sum _{\alpha ,\beta \in \mathcal {I}_{k}^{+}} \Big \Vert \sum _{\gamma \in \mathcal {I}_{k}} \sinh (K(k))_{\alpha ,\gamma } c^*_\gamma (k) \Omega \Big \Vert ||\mathfrak {E}_\beta (1,k)\Omega || \\&\le \frac{C \hbar }{M} \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|\hat{V}(k) \sum _{\alpha ,\beta \in \mathcal {I}_{k}^{+}} \Big ( \sum _{\gamma \in \mathcal {I}_{k}} |\sinh (K(k))_{\alpha ,\gamma }|^2 \Big )^{1/2} \\&\quad \times || (\mathcal {N}+1)^{1/2} \Omega || ||\mathfrak {E}_\beta (1,k)\Omega ||\;; \end{aligned}$$

applying Cauchy–Schwarz in \(\alpha \) and in \(\beta \), using \(|\mathcal {I}_{k}^{+}| = I_k \le M/2\) and (4.21) we arrive at

$$\begin{aligned} |\mathfrak {E}_{\mathrm{int}}^{(1,1)}|&\le \frac{C\hbar }{\mathfrak {n}^2} \sup _{\lambda \in [0,1]}|| (\mathcal {N}+2)^{3/2} T_\lambda \Omega || \sum _{k \in \Gamma ^{{\mathrm{nor}}}} |k|\hat{V}(k) e^{2||K(k)||_{{\mathrm{HS}}}} \sum _{l\in \Gamma ^{{\mathrm{nor}}}} ||K(l)||_{{\mathrm{HS}}}. \end{aligned}$$

Again, Lemma 4.5 and Proposition 4.6 show that \(|\mathfrak {E}_{\mathrm{int}}^{(1,1)}| \le C \hbar / \mathfrak {n}^2\). Analogously, we obtain also \(|\mathfrak {E}_{\mathrm{int}}^{(1,2)}| \le C \hbar / \mathfrak {n}^2\). Hence \(|\mathfrak {E}_{\mathrm{int}}^{(1)}| \le C \hbar / \mathfrak {n}^2\).

The error term \(\mathfrak {E}_{\mathrm{int}}^{(2)}\) differs from \(\mathfrak {E}_{\mathrm{int}}^{(1)}\) only in the replacement of the index set \(\mathcal {I}_{k}^{+}\) by \(\mathcal {I}_{k}^{-}\). Therefore, we find \(|\mathfrak {E}_{\mathrm{int}}^{(2)}|\le C \hbar / \mathfrak {n}^2\). As for the error term \(\mathfrak {E}_{\mathrm{int}}^{(3)}\), it also differs from \(\mathfrak {E}_{\mathrm{int}}^{(1)}\) in the index set, some hermitian conjugations, and the appearance of a \(\cosh \) instead of a \(\sinh \). The estimates however remain valid and we also obtain \(|\mathfrak {E}_{\mathrm{int}}^{(3)}| \le C \hbar / \mathfrak {n}^2\). \(\quad \square \)

5.5 Proof of the main theorem

Proof of Theorem 2.1

Recall the definition (3.8) of the correlation Hamiltonian and the decomposition (3.9) of the quartic interaction \(Q_N\). Combining the results of Sects. 5.1, 5.2, and 5.3 with Propositions 5.5 and 5.6, we conclude that

$$\begin{aligned} \begin{aligned}&\langle \xi , \mathcal {H}_{{\mathrm{corr}}} \xi \rangle \\&\quad = \hbar \kappa \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|{\text {tr}}\left( (D(k)+W(k)) \sinh ^2 (K(k)) + \tilde{W}(k) \sinh (K(k)) \cosh (K(k)) \right) + \mathfrak {E} \end{aligned} \end{aligned}$$

for an error \(\mathfrak {E}\) such that

$$\begin{aligned} |\mathfrak {E}|\le C \Big [ \frac{1}{N} + \frac{1}{M} + \frac{\hbar }{\mathfrak {n}^2} + \hbar N^{-\delta /2} \Big ] \end{aligned}$$

with \(\hbar = N^{-1/3}\) and \(\mathfrak {n}= N^{1/3-\delta /2} M^{-1/2}\).

To evaluate the main part of the expectation value explicitly, notice that by definition (4.17) of K(k) we have

$$\begin{aligned} \sinh (K(k)) \,{=}\, \frac{1}{2} \left( |S_1(k)^T|- |S_1(k)^T|^{-1} \right) , \quad \cosh (K(k)) \,{=}\, \frac{1}{2} \left( |S_1(k)^T|+ |S_1(k)^T|^{-1} \right) . \end{aligned}$$

Notice also that \(S_1(k) S_1(k)^T = |S_1(k)^T|^2\) and \( \left( |S_1(k)^T|^{-1} \right) ^2 = S_2(k) S_2(k)^T\), where

$$\begin{aligned}S_2(k) = \left( D(k) + W(k) - \tilde{W}(k)\right) ^{-1/2} E(k)^{1/2}.\end{aligned}$$

Consequently

$$\begin{aligned} \begin{aligned} \sinh (K(k)) \cosh (K(k))&= \frac{1}{4} \left( |S_1(k)^T|- |S_1(k)^T|^{-1} \right) ^T \left( |S_1(k)^T|+ |S_1(k)^T|^{-1} \right) \\&= \frac{1}{4} \left( S_1(k) S_1(k)^T - S_2(k) S_2(k)^T \right) .\end{aligned} \end{aligned}$$

Likewise

$$\begin{aligned} \begin{aligned} \sinh ^2(K(k))&= \frac{1}{4} \left( |S_1(k)^T|- |S_1(k)^T|^{-1} \right) ^T \left( |S_1(k)^T|- |S_1(k)^T|^{-1} \right) \\&= \frac{1}{4} \left( S_1(k) S_1(k)^T + S_2(k) S_2(k)^T - 2 \mathbb {I}\right) . \end{aligned} \end{aligned}$$

Now using the explicit form (4.18) of \(S_1(k)\), E(k), and \(S_2(k)\), this can be simplified to yield

$$\begin{aligned} \langle \xi , \mathcal {H}_{{\mathrm{corr}}} \xi \rangle&= \frac{\hbar \kappa }{4} \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|\Big ( {\text {tr}}\left( D(k)+W(k)+\tilde{W}(k)\right) S_1(k) S_1(k)^T \nonumber \\&\quad + {\text {tr}}\left( D(k)+W(k)-\tilde{W}(k)\right) S_2(k) S_2(k)^T \Big ) \nonumber \\&\quad - \frac{\hbar \kappa }{2} \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|{\text {tr}}\big ( D(k)+W(k) \big ) + \mathfrak {E}\nonumber \\&= \hbar \kappa \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|\left( \frac{1}{2} {\text {tr}}E(k) - \frac{1}{2} {\text {tr}}\big (D(k)+W(k)\big )\right) + \mathfrak {E}. \end{aligned}$$
(5.12)
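As a sanity check of the algebra leading to (5.12), the simplest case \(I = 1\) can be evaluated by hand or numerically: then d and b are positive numbers, all matrices commute, and the block diagonalization gives two channels with eigenvalues \((d, d+2b)\) and \((d+2b, d)\) for \(D+W-\tilde{W}\) and \(D+W+\tilde{W}\). The following Python sketch is only an illustration under these assumptions; the function names and the sample values of d and b are hypothetical and not part of the proof.

```python
import math

def channel(a, c):
    """One eigen-channel: a, c are the eigenvalues of D+W-W~ and D+W+W~."""
    e = math.sqrt(a * c)               # eigenvalue of E
    s1 = a / e                         # eigenvalue of S_1 S_1^T
    s2 = e / a                         # eigenvalue of S_2 S_2^T = (S_1 S_1^T)^{-1}
    sinh_sq = 0.25 * (s1 + s2 - 2.0)   # sinh^2(K)
    sinh_cosh = 0.25 * (s1 - s2)       # sinh(K) cosh(K)
    dw = 0.5 * (a + c)                 # eigenvalue of D + W
    wt = 0.5 * (c - a)                 # eigenvalue of W~
    return dw * sinh_sq + wt * sinh_cosh, e, dw

def both_sides(d, b):
    """Return the trace before simplification and (1/2) tr E - (1/2) tr(D+W)."""
    lhs = tr_e = tr_dw = 0.0
    for a, c in [(d, d + 2.0 * b), (d + 2.0 * b, d)]:
        term, e, dw = channel(a, c)
        lhs += term
        tr_e += e
        tr_dw += dw
    return lhs, 0.5 * tr_e - 0.5 * tr_dw

# Hypothetical sample values, chosen only for illustration.
lhs, rhs = both_sides(0.7, 0.3)
print(lhs, rhs)
```

In this scalar case both sides reduce to \(\sqrt{d(d+2b)} - (d+b)\), which is negative for \(b > 0\), in line with a negative correlation energy.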

We are left with evaluating the traces in (5.12).

Evaluation of the traces. For simplicity, we shall drop the k-dependence in the notation (we will restore it in (5.14)). Recall the block diagonalization (4.25), by which

$$\begin{aligned} \begin{aligned} \frac{1}{2}{\text {tr}}E&= \frac{1}{2}{\text {tr}}\left[ \begin{pmatrix} d &{} 0 \\ 0 &{} d+2b \end{pmatrix}^{1/2} \begin{pmatrix} d+2b &{} 0 \\ 0 &{} d \end{pmatrix} \begin{pmatrix} d &{} 0 \\ 0 &{} d+2b \end{pmatrix}^{1/2}\right] ^{1/2}\\&= \frac{1}{2}{\text {tr}}\left[ d^{1/2} (d+2b) d^{1/2}\right] ^{1/2} + \frac{1}{2}{\text {tr}}\left[ (d+2b)^{1/2} d (d+2b)^{1/2}\right] ^{1/2}\\&= {\text {tr}}\left[ d^{1/2} (d+2b) d^{1/2}\right] ^{1/2}, \end{aligned} \end{aligned}$$
(5.13)

since \(d^{1/2} (d+2b) d^{1/2}\) and \((d+2b)^{1/2} d (d+2b)^{1/2}\) have the same spectrum. To calculate this trace, notice that

$$\begin{aligned} d^{1/2}(d+2b)d^{1/2} = d^2 + 2g |{\tilde{u}}\rangle \langle {\tilde{u}} |\end{aligned}$$

is a rank-one perturbation of a diagonal operator, with diagonal part \(d^2 = {\text {diag}}(u_\alpha ^4: \alpha =1,\ldots , I)\) and with \({\tilde{u}} = ( v_1 u_1, \ldots , v_{I} u_{I} ) \in \mathbb {R}^{I}\).

The resolvent of a matrix with rank-one perturbation can easily be calculated: For any invertible matrix \(A \in \mathbb {C}^{n\times n}\), and \(x,y \in \mathbb {C}^n\),

$$\begin{aligned} (A+|x\rangle \langle y|)^{-1} = A^{-1} - \frac{A^{-1}|x\rangle \langle y|A^{-1}}{1+\langle y,A^{-1} x\rangle } \end{aligned}$$

whenever the right-hand side is well-defined. So for \(\lambda \in [0,\infty )\) we find

$$\begin{aligned} \left( d^2 + 2g|{\tilde{u}}\rangle \langle {\tilde{u}}|+\lambda ^2\right) ^{-1} = \left( d^2 + \lambda ^2 \right) ^{-1} - \frac{2g}{1+2g \sum _{\alpha =1}^{I} \frac{u_\alpha ^2 v_\alpha ^2}{u_\alpha ^4 + \lambda ^2}} \left|w \right\rangle \left\langle w\right|\;, \end{aligned}$$

with \(w \in \mathbb {R}^{I}\) defined by \(w_\alpha = u_\alpha v_\alpha (u_\alpha ^4 + \lambda ^2)^{-1}\). By functional calculus, for any non-negative operator A we have the identity

$$\begin{aligned} \sqrt{A}= \frac{2}{\pi }\int _0^\infty \left( \mathbb {I}- \frac{\lambda ^2}{A+\lambda ^2} \right) {{\mathrm{d}}}\lambda . \end{aligned}$$

Using the integral identity twice we find

$$\begin{aligned} {\text {tr}}\left[ d^{1/2} (d+2b) d^{1/2}\right] ^{1/2}&= \frac{2}{\pi } \int _0^{\infty }{\text {tr}}\left( \mathbb {I}- \frac{\lambda ^2}{d^2 + \lambda ^2}\right) {{\mathrm{d}}}\lambda \\&\quad + \frac{2}{\pi } \int _0^{\infty } \frac{\lambda ^2\, 2g}{1+2g\sum _{\alpha =1}^{I} \frac{u_\alpha ^2 v_\alpha ^2}{u_\alpha ^4+\lambda ^2}}||w||^2{{\mathrm{d}}}\lambda \\&= {\text {tr}}d + \frac{2}{\pi } \int _0^\infty \frac{\lambda ^2}{1+2g\sum _{\alpha =1}^{I} \frac{u_\alpha ^2 v_\alpha ^2}{u_\alpha ^4+\lambda ^2}} 2g \sum _{\alpha =1}^{I} \frac{u_\alpha ^2 v_\alpha ^2}{(u_\alpha ^4+\lambda ^2)^2}{{\mathrm{d}}}\lambda . \end{aligned}$$
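The rank-one resolvent formula used in this computation can be checked numerically by multiplying the claimed inverse with the matrix \(d^2 + 2g|{\tilde{u}}\rangle \langle {\tilde{u}}|+ \lambda ^2\). The sketch below does this in plain Python for small I; the helper names and the sample vectors u, v are assumptions made for illustration only.

```python
def perturbed_matrix(u, v, g, lam):
    """Matrix d^2 + 2g |u~><u~| + lam^2 with u~_a = u_a v_a."""
    n = len(u)
    ut = [u[a] * v[a] for a in range(n)]
    return [[(u[a] ** 4 + lam ** 2 if a == b else 0.0) + 2.0 * g * ut[a] * ut[b]
             for b in range(n)] for a in range(n)]

def resolvent_rank_one(u, v, g, lam):
    """Claimed inverse: (d^2 + lam^2)^{-1} - 2g/(1 + 2g sum ...) |w><w|."""
    n = len(u)
    diag = [u[a] ** 4 + lam ** 2 for a in range(n)]
    w = [u[a] * v[a] / diag[a] for a in range(n)]
    denom = 1.0 + 2.0 * g * sum((u[a] * v[a]) ** 2 / diag[a] for a in range(n))
    return [[(1.0 / diag[a] if a == b else 0.0) - 2.0 * g * w[a] * w[b] / denom
             for b in range(n)] for a in range(n)]

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def max_deviation_from_identity(u, v, g, lam):
    """Largest entry of |M * M^{-1} - identity|."""
    P = matmul(perturbed_matrix(u, v, g, lam), resolvent_rank_one(u, v, g, lam))
    n = len(u)
    return max(abs(P[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))

# Hypothetical sample data.
print(max_deviation_from_identity([0.9, 0.6, 0.3], [0.2, 0.25, 0.1], 1.5, 0.4))
```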

Restoring the k-dependence, let

$$\begin{aligned} f_k (\lambda ) := 1+2g(k)\sum _{\alpha =1}^{I_k} \frac{u_\alpha (k)^2 v_\alpha (k)^2}{u_\alpha (k)^4+\lambda ^2}. \end{aligned}$$

Integrating by parts (noting that the boundary terms vanish since \(\lambda \log f_k(\lambda )\) vanishes at \(\lambda = 0\) and \(\log f_k(\lambda ) \sim 1/\lambda ^2\) as \(\lambda \rightarrow \infty \)), we find

$$\begin{aligned} {\text {tr}}\left[ d^{1/2} (d+2b) d^{1/2}\right] ^{1/2} = \frac{1}{2}{\text {tr}}D - \frac{1}{\pi } \int _0^\infty \lambda \frac{f_k'(\lambda )}{f_k (\lambda )}{{\mathrm{d}}}\lambda&= \frac{1}{2}{\text {tr}}D + \frac{1}{\pi } \int _0^\infty \log f_k (\lambda ){{\mathrm{d}}}\lambda . \end{aligned}$$
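The trace formula can be tested numerically for \(I = 2\), where the eigenvalues of the symmetric \(2\times 2\) matrix \(d^2 + 2g|{\tilde{u}}\rangle \langle {\tilde{u}}|\) are known in closed form. The sketch below compares \({\text {tr}}\big [d^2 + 2g|{\tilde{u}}\rangle \langle {\tilde{u}}|\big ]^{1/2}\) with \({\text {tr}}\, d + \frac{1}{\pi }\int _0^\infty \log f(\lambda ){{\mathrm{d}}}\lambda \). It assumes a simple midpoint rule after the substitution \(\lambda = t/(1-t)\); the sample values and function names are hypothetical.

```python
import math

def f_minus_one(lam, u, v, g):
    """f(lam) - 1 = 2g * sum_a u_a^2 v_a^2 / (u_a^4 + lam^2)."""
    return 2.0 * g * sum((ua * va) ** 2 / (ua ** 4 + lam ** 2)
                         for ua, va in zip(u, v))

def log_integral(u, v, g, n=100000):
    """Midpoint rule for (1/pi) * int_0^inf log f(lam) dlam, with lam = t/(1-t)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        lam = t / (1.0 - t)
        # dlam = dt / (1 - t)^2; log1p keeps accuracy where f is close to 1
        total += math.log1p(f_minus_one(lam, u, v, g)) / (1.0 - t) ** 2
    return total * h / math.pi

def trace_sqrt_2x2(u, v, g):
    """tr sqrt(d^2 + 2g|u~><u~|) for I = 2 from the closed-form eigenvalues."""
    a = u[0] ** 4 + 2.0 * g * (u[0] * v[0]) ** 2
    b = u[1] ** 4 + 2.0 * g * (u[1] * v[1]) ** 2
    c = 2.0 * g * u[0] * v[0] * u[1] * v[1]
    mean = 0.5 * (a + b)
    rad = math.sqrt((0.5 * (a - b)) ** 2 + c ** 2)
    return math.sqrt(mean + rad) + math.sqrt(mean - rad)

# Hypothetical sample data.
u, v, g = [0.9, 0.5], [0.3, 0.2], 1.2
lhs = trace_sqrt_2x2(u, v, g)
rhs = u[0] ** 2 + u[1] ** 2 + log_integral(u, v, g)
print(lhs, rhs)
```

The agreement reflects the identity \(f(\lambda ) = \det (d^2 + 2g|{\tilde{u}}\rangle \langle {\tilde{u}}|+ \lambda ^2) / \det (d^2 + \lambda ^2)\), which follows from the matrix determinant lemma.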

Thus, inserting in (5.13) and then in (5.12), we obtain

$$\begin{aligned} \langle \xi ,\mathcal {H}_{{\mathrm{corr}}} \xi \rangle= & {} \hbar \kappa \sum _{k\in \Gamma ^{{\mathrm{nor}}}} |k|\left( \frac{1}{\pi }\int _0^\infty \log f_k (\lambda ){{\mathrm{d}}}\lambda - g(k) \sum _{\alpha =1}^{I_k}v_\alpha (k)^2 \right) + \mathfrak {E} \end{aligned}$$
(5.14)

where we used that according to (4.24) \({\text {tr}}W= 2 {\text {tr}}b =2g\sum _{\alpha =1}^{I} v_\alpha ^2\).

Convergence to the Gell-Mann–Brueckner formula. To conclude the proof of Theorem 2.1, we show that (5.14) reproduces the Gell-Mann–Brueckner formula as stated in the theorem. Let

$$\begin{aligned} \tilde{f}_k (\lambda ) := 1+ 4\pi g(k)\left( 1-\lambda \arctan \left( \frac{1}{{\lambda }} \right) \right) . \end{aligned}$$

We claim that

$$\begin{aligned}&\Big |\Big ( \frac{1}{\pi }\int _0^\infty \log f_k (\lambda ){{\mathrm{d}}}\lambda - g(k) \sum _{\alpha =1}^{I_k}v_\alpha (k)^2 \Big ) - \Big ( \frac{1}{\pi }\int _0^\infty \log \tilde{f}_k (\lambda ){{\mathrm{d}}}\lambda - g(k)\pi \Big ) \Big |\nonumber \\&\quad \le C \left( M^{1/4 }N^{-\frac{1}{6}+\frac{\delta }{2}} + N^{-\frac{\delta }{2}} + M^{-\frac{1}{4}}N^{\frac{\delta }{2}} \right) . \end{aligned}$$
(5.15)

Since \(\log \tilde{f}_k (\lambda ) = g(k) = 0\) for all \(|k|> R\), inserting (5.15) into (5.14) we obtain

$$\begin{aligned} \langle \xi ,\mathcal {H}_{{\mathrm{corr}}} \xi \rangle= & {} \hbar \kappa \!\!\sum _{k\in \Gamma ^{{\mathrm{nor}}}}\!\! |k|\! \left( \frac{1}{\pi }\!\int _0^\infty \! \log \! \left[ 1 + 4\pi g(k) \left( 1- {\lambda } \arctan \left( \frac{1}{{\lambda }}\right) \right) \right] {{\mathrm{d}}}\lambda - g(k)\pi \right) \\&+ \tilde{\mathfrak {E}} \end{aligned}$$

with an error

$$\begin{aligned} \begin{aligned} |\tilde{\mathfrak {E}}|&\le C \Big [ N^{-1} + M^{-1} + N^{-1+\delta } M\Big ] + C\hbar \Big [ M^{1/4 }N^{-\frac{1}{6}+\frac{\delta }{2}} + N^{-\frac{\delta }{2}} + M^{-\frac{1}{4}}N^{\frac{\delta }{2}} \Big ]. \end{aligned} \end{aligned}$$

Recalling that \(M = N^{1/3 + \epsilon }\) and optimizing over \(0< \epsilon < 1/3\), \(0< \delta < 1/6 - \epsilon /2\), we find, with \(\epsilon = 1/27\) and \(\delta = 2/27\), that \(|\tilde{\mathfrak {E}} |\le C \hbar N^{-1/27}\). Replacing the sum over \(k \in \Gamma ^{{\mathrm{nor}}}\) by 1/2 times the sum over \(k \in \mathbb {Z}^3\), and replacing \(\kappa = \kappa _0 + \mathcal {O}(N^{-1/3})\) by \(\kappa _0 = (3/4\pi )^{1/3}\) (using also the Lipschitz continuity of the logarithm), we arrive at (2.1).
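The exponent bookkeeping behind the choice \(\epsilon = 1/27\), \(\delta = 2/27\) can be reproduced with exact rational arithmetic. The following sketch writes every term of the bound on \(|\tilde{\mathfrak {E}}|\) as a power of N, using \(\hbar = N^{-1/3}\) and \(M = N^{1/3+\epsilon }\), and returns the exponents relative to \(\hbar \). The dictionary labels and the function name are hypothetical.

```python
from fractions import Fraction as F

def relative_exponents(eps, delta):
    """Exponents x such that each error term equals hbar * N**x."""
    hbar = F(-1, 3)                 # hbar = N**(-1/3)
    m = F(1, 3) + eps               # M = N**m
    terms = {
        "N^-1": F(-1),
        "M^-1": -m,
        "N^(-1+delta) M": -1 + delta + m,
        "hbar M^(1/4) N^(-1/6+delta/2)": hbar + m / 4 - F(1, 6) + delta / 2,
        "hbar N^(-delta/2)": hbar - delta / 2,
        "hbar M^(-1/4) N^(delta/2)": hbar - m / 4 + delta / 2,
    }
    return {name: x - hbar for name, x in terms.items()}

exps = relative_exponents(F(1, 27), F(2, 27))
for name, x in exps.items():
    print(name, x)
```

With these values the largest relative exponent is \(-1/27\), attained by the terms \(M^{-1}\), \(\hbar M^{1/4}N^{-1/6+\delta /2}\), and \(\hbar N^{-\delta /2}\) simultaneously, which is what one expects at an optimum.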

We still have to show (5.15). To this end, recall from Proposition 3.1 that, in terms of the surface measure \(\sigma \) of the patch \(p_\alpha \) on the unit sphere, we have

$$\begin{aligned} v_\alpha (k)^2 = \sigma (p_\alpha ) u_\alpha (k)^2\Big ( 1 + \mathcal {O}\Big (\sqrt{M}N^{-\frac{1}{3}+\delta }\Big ) \Big ). \end{aligned}$$

Thus

$$\begin{aligned} f_k(\lambda )= & {} 1+2 g(k) \sum _{\alpha =1}^{I_k} \frac{u_\alpha (k)^2 v_\alpha (k)^2}{u_\alpha (k)^4+\lambda ^2} \\= & {} 1+2 g(k) \sum _{\alpha =1}^{I_k}\sigma (p_\alpha )\frac{u_\alpha (k)^4}{u_\alpha (k)^4+\lambda ^2} + \mathcal {O}\left( \sqrt{M}N^{-\frac{1}{3}+\delta }\right) . \end{aligned}$$

We approximate this Riemann sum by the corresponding surface integral over a subset of \(\mathbb {S}^2\). We write \(\cos \theta _\alpha = \hat{k}\cdot \hat{\omega }_\alpha = u_\alpha (k)^2\) and \(\varphi _\alpha \) for the azimuth of \(\omega _\alpha \). We parametrize the surface integrals in the same spherical coordinate systemFootnote 9 (i. e., the inclination \(\theta \) is measured with respect to k, and the azimuth \(\varphi \) in the plane perpendicular to k). We estimate every summand by

$$\begin{aligned}&\left|\int _{p_\alpha } \frac{\cos ^2 \theta }{\cos ^2 \theta + \lambda ^2} {{\mathrm{d}}}\sigma - \sigma (p_\alpha ) \frac{\cos ^2 \theta _\alpha }{\cos ^2 \theta _\alpha +\lambda ^2} \right|\\&\quad \le \int _{p_\alpha } \left|\frac{\cos ^2 \theta }{\cos ^2 \theta + \lambda ^2} - \frac{\cos ^2 \theta _\alpha }{\cos ^2 \theta _\alpha +\lambda ^2} \right|{{\mathrm{d}}}\sigma \\&\quad \le \iint _{\hat{\omega }(\theta ,\varphi ) \in p_\alpha } \left|\frac{\cos ^2 \theta }{\cos ^2 \theta + \lambda ^2} - \frac{\cos ^2 \theta _\alpha }{\cos ^2 \theta _\alpha +\lambda ^2} \right||\sin \theta |{{\mathrm{d}}}\theta {{\mathrm{d}}}\varphi \;. \end{aligned}$$

Bounding the difference using the supremum of the derivative

$$\begin{aligned} \left|\int _{p_\alpha } \frac{\cos ^2 \theta }{\cos ^2 \theta + \lambda ^2} {{\mathrm{d}}}\sigma - \sigma (p_\alpha ) \frac{\cos ^2 \theta _\alpha }{\cos ^2 \theta _\alpha +\lambda ^2} \right|&\le \sup _{\hat{\omega }(\theta ,\varphi ) \in p_\alpha } \left|\frac{{{\mathrm{d}}}}{{{\mathrm{d}}}\theta } \frac{\cos ^2 \theta }{\cos ^2 \theta + \lambda ^2} \right|\frac{C}{\sqrt{M}} \sigma (p_\alpha ), \end{aligned}$$
(5.16)

where we also used that, since the partition is diameter bounded, \(\sup _{(\theta ,\varphi ) \in p_\alpha } |\theta -\theta _\alpha |\le C/\sqrt{M}\). The derivative is bounded by

$$\begin{aligned} \left|\frac{{{\mathrm{d}}}}{{{\mathrm{d}}}\theta } \frac{\cos ^2 \theta }{\cos ^2 \theta + \lambda ^2} \right|\le 2 \frac{\lambda ^2}{\cos ^2 \theta + \lambda ^2 } \frac{|\cos \theta ||\sin \theta |}{\cos ^2 \theta + \lambda ^2 } \le \frac{2}{|\cos \theta |}. \end{aligned}$$

Recall that \(\alpha \in \{1,2,\ldots ,I_k\}\), which by definition of the index set implies \(\cos \theta _\alpha > N^{-\delta }\). The bound \(|\theta - \theta _\alpha |\le CM^{-1/2}\) implies that also \(\cos \theta > CN^{-\delta }\). So (5.16) implies

$$\begin{aligned} \left|\int _{p_\alpha } \frac{\cos ^2 \theta }{\cos ^2 \theta + \lambda ^2} {{\mathrm{d}}}\sigma - \sigma (p_\alpha ) \frac{\cos ^2 \theta _\alpha }{\cos ^2 \theta _\alpha +\lambda ^2} \right|\le C \frac{N^\delta }{M^{3/2}}. \end{aligned}$$

Since the number of patches is at most M we conclude that

$$\begin{aligned} \Big | \sum _{\alpha =1}^{I_k}\sigma (p_\alpha ) \frac{u_\alpha (k)^4}{u_\alpha (k)^4+\lambda ^2} - \int _{\mathbb {S}^2_{\mathrm{reduced}}} \frac{\cos ^2 \theta }{\cos ^2 \theta + \lambda ^2} {{\mathrm{d}}}\sigma \Big | \le C \frac{N^{\delta }}{\sqrt{M}}. \end{aligned}$$

Here we wrote \(\mathbb {S}^2_{\mathrm{reduced}}\) for a unit half-sphere excluding the collar of width \(N^{-\delta }\) and the corridors \(p_{\mathrm{corri}}\). Since \(\cos ^2 \theta /(\cos ^2 \theta + \lambda ^2) \le 1\) we can compare to the integral over the whole unit half-sphere \(\mathbb {S}^2_{\mathrm{half}}\),

$$\begin{aligned} \left|\int _{\mathbb {S}^2_{\mathrm{reduced}}} \frac{\cos ^2 \theta }{\cos ^2 \theta + \lambda ^2} {{\mathrm{d}}}\sigma - \int _{\mathbb {S}^2_{\mathrm{half}}} \frac{\cos ^2 \theta }{\cos ^2 \theta + \lambda ^2} {{\mathrm{d}}}\sigma \right|\le C \left[ N^{-\delta } + M^{1/2} N^{-1/3} \right] . \end{aligned}$$

The surface integral over the unit half-sphere is easy to compute,

$$\begin{aligned} \begin{aligned} \int _{\mathbb {S}^2_{\mathrm{half}}}\!\! \frac{\cos ^2 \theta }{\cos ^2 \theta + \lambda ^2} {{\mathrm{d}}}\sigma&= \int _{0}^{\pi /2}\!\! {{\mathrm{d}}}\theta \sin (\theta ) \frac{\cos (\theta )^2}{\cos (\theta )^2+\lambda ^2} \int _0^{2\pi }\!\! {{\mathrm{d}}}\varphi = 2\pi \Big (1-{\lambda } \arctan \Big ( \frac{1}{{\lambda }} \Big )\Big ). \end{aligned} \end{aligned}$$
(5.17)
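The closed form in (5.17) is easy to confirm numerically. The sketch below integrates over the inclination \(\theta \) with a midpoint rule and compares with \(2\pi (1-\lambda \arctan (1/\lambda ))\); the number of quadrature points and the function names are arbitrary choices made for illustration.

```python
import math

def half_sphere_integral(lam, n=20000):
    """Midpoint rule for the integral of cos^2/(cos^2 + lam^2) over the unit half-sphere."""
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        c2 = math.cos(theta) ** 2
        total += math.sin(theta) * c2 / (c2 + lam ** 2)
    return 2.0 * math.pi * total * h   # the azimuthal integral gives 2*pi

def closed_form(lam):
    """Right-hand side of (5.17)."""
    return 2.0 * math.pi * (1.0 - lam * math.atan(1.0 / lam))

for lam in (0.5, 1.0, 3.0):
    print(lam, half_sphere_integral(lam), closed_form(lam))
```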

Since \(g(k) = \kappa \hat{V}(k)\) is uniformly bounded (by assumption on \(\hat{V}\)), we conclude that

$$\begin{aligned} \left|f(\lambda ) - \tilde{f}(\lambda ) \right|\le C \Big ( \sqrt{M}N^{-\frac{1}{3}+\delta } + N^{-\delta } + \frac{N^\delta }{\sqrt{M}} \Big ). \end{aligned}$$

Since for \(x\ge 0\) the function \(x \mapsto \log (1+x)\) has Lipschitz constant 1 we get

$$\begin{aligned} \left|\log f(\lambda ) - \log \tilde{f}(\lambda ) \right|\le C \Big ( \sqrt{M}N^{-\frac{1}{3}+\delta } + N^{-\delta } + \frac{N^\delta }{\sqrt{M}} \Big ). \end{aligned}$$

It remains to compare the integrals over \(\lambda \). Since \(\log (1+x) \le x\) for all \(x \ge 0\), we have

$$\begin{aligned} \left|\log f(\lambda ) \right|\le 2g(k) \sum _{\alpha =1}^{I_k} \sigma (p_\alpha ) \frac{u_\alpha (k)^4}{u_\alpha (k)^4 + \lambda ^2} \le 2g(k) \sum _{\alpha =1}^{I_k} \frac{C}{M} \frac{1}{\lambda ^2} \le \frac{C}{\lambda ^2}, \end{aligned}$$

where we used \(\sigma (p_\alpha ) \le C/M\) and the two inequalities \(0 \le u_\alpha (k)^4 \le 1\). Using the integral identity (5.17) it is easy to see that also

$$\begin{aligned} \left|\log \tilde{f}(\lambda ) \right|\le 4\pi g(k) \Big | 1-{\lambda }\arctan \left( \frac{1}{{\lambda }}\right) \Big | \le \frac{C}{\lambda ^2}. \end{aligned}$$

Using the last three estimates, by splitting the integration at some \(\Lambda > 0\) to be optimized in the last step, we obtain

$$\begin{aligned}&\left|\frac{1}{\pi } \int _0^\infty \log f(\lambda ){{\mathrm{d}}}\lambda - \frac{1}{\pi } \int _0^\infty \log {\tilde{f}}(\lambda ){{\mathrm{d}}}\lambda \right|\nonumber \\&\quad \le \frac{1}{\pi } \int _0^{\Lambda } \left|\log f(\lambda ) - \log {\tilde{f}}(\lambda ) \right|{{\mathrm{d}}}\lambda + \frac{1}{\pi } \int _{\Lambda }^\infty \frac{8\pi g(k)}{\lambda ^2}{{\mathrm{d}}}\lambda \nonumber \\&\quad \le C \Lambda \left( \sqrt{M}N^{-\frac{1}{3}+\delta } + N^{-\delta } + \frac{N^\delta }{\sqrt{M}} \right) + C \Lambda ^{-1}\nonumber \\&\quad \le C \left( M^{1/4 }N^{-\frac{1}{6}+\frac{\delta }{2}} + N^{-\frac{\delta }{2}} + M^{-\frac{1}{4}}N^{\frac{\delta }{2}} \right) . \end{aligned}$$
(5.18)
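The last step of (5.18) is the elementary optimization of \(\Lambda X + \Lambda ^{-1}\) over \(\Lambda > 0\), where X stands for the sum of the three small terms, followed by \(\sqrt{a+b+c} \le \sqrt{a} + \sqrt{b} + \sqrt{c}\). A minimal sketch with all constants set to 1 (an assumption made only for illustration, as are the function names) reads:

```python
import math

def split_bound(x, cutoff):
    """Model bound cutoff * x + 1 / cutoff, with all constants set to 1."""
    return cutoff * x + 1.0 / cutoff

def optimal_cutoff(x):
    """Minimizer of the model bound: cutoff = x**(-1/2)."""
    return 1.0 / math.sqrt(x)

x = 1e-4                                   # hypothetical size of the small terms
best = split_bound(x, optimal_cutoff(x))   # equals 2 * sqrt(x) up to rounding
print(best, 2.0 * math.sqrt(x))
```

The optimal value \(2\sqrt{X}\) explains why every exponent in the final line of (5.18) is half of the corresponding exponent in the preceding line.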

By a similar (simpler) Riemann sum argument we obtain

$$\begin{aligned} -g(k)\sum _{\alpha =1}^{I_k} v_\alpha ^2(k)&= -g(k) \sum _{\alpha =1}^{I_k} \sigma (p_\alpha ) u_\alpha ^2(k) \left( 1 + \mathcal {O}\left( \sqrt{M}N^{-\frac{1}{3}+\delta }\right) \right) \\&= - g(k)\pi + \mathcal {O}\left( \sqrt{M}N^{-\frac{1}{3}+\delta } + N^{-\delta }\right) \end{aligned}$$

where the error is obviously smaller than (5.18). This concludes the proof of (5.15). \(\quad \square \)

6 Counting Particle–Hole Pairs in Patches

In this section we prove Proposition 3.1, which is concerned with estimating the number

$$\begin{aligned} n_{\alpha ,k}^2 = \sum _{\begin{array}{c} p \in B_{\mathrm{F}}^c \cap B_\alpha \\ h \in B_{\mathrm{F}} \cap B_\alpha \end{array}} \delta _{p-h,k} \end{aligned}$$
(6.1)

of particle–hole pairs with momentum \(p-h = k\) in patch \(\alpha \) under the condition that \(\hat{\omega }_\alpha \cdot \hat{k} \ge N^{-\delta }\). Recall that \(p_\alpha \) is a patch on the unit sphere, and \(P_\alpha = k_{\mathrm{F}} p_\alpha \).

To illustrate the idea of the proof we first consider \(k = e_3 = ( 0, 0, 1)\). Consider the lattice lines \(L_{n} := \{n+tk: t\in \mathbb {R}\}\), \(n \in \mathbb {Z}e_1 + \mathbb {Z}e_2 \subset \mathbb {Z}^3\). For each lattice line \(L_{n}\) intersecting \(P_\alpha \) there is exactly one contribution to the sum (6.1). In fact, a simple geometric consideration shows that, since \(N^{-\delta } \gg M^{-1/2}\) (which is implied by the assumption \(\delta \le \frac{1}{6} - \frac{\epsilon }{2}\)), a line never enters the Fermi ball at such a small angle (measured with respect to the tangent plane of the Fermi surface) that it would immediately cross the surface a second time and leave the Fermi ball without picking up a pair (i. e., the situation of Fig. 3 is excluded due to \(\hat{\omega }_\alpha \cdot \hat{k} \ge N^{-\delta }\)). There is only one exception to this argument: a lattice line might cross the surface at a distance less than R from a side of the patch. Depending on the angle it could then leave the patch to the side before picking up a pair, as represented in Fig. 4. However, the number of such lines is of the same order as the length of the boundary. We can thus absorb this number in the circumference error from the Gauss argument (see next paragraph).

Fig. 3 Fermi surface in bold. A line (dashed) intersects the patch but no particle–hole pair is picked up because both ends of k would be outside the Fermi ball. This could only happen if k was very long (excluded due to \(k \in {\text {supp}}\hat{V}\)) or almost tangent to the Fermi surface (excluded by \(\hat{\omega }_\alpha \cdot \hat{k} \ge N^{-\delta }\))

Fig. 4 Fermi surface in bold. A line (dashed) intersects the patch but no particle–hole pair is picked up because k points from a hole momentum h in the patch out into a corridor between patches. This can happen only for hole momenta near the boundary. Since the area of a patch grows faster with N than its boundary length, the number of such lines is an error term of lower order

Fig. 5 The number of lattice lines through the patch is the same as the number of lattice lines through the projection of the patch along k onto the plane spanned by \(e_1\) and \(e_2\)

Fig. 6 Particles and holes are indicated by black and white dots, respectively; they are paired along lines parallel to k. The number of pairs per line is given by the greatest common divisor \(\gcd (k_1,k_2,k_3)\) (here \( = 1\))

So to leading order \(n_{\alpha ,k}^2\) is the number of lines \(L_n\) intersecting \(P_\alpha \). The number of such lines is equal to the number of lines intersecting the projection \(P_\alpha ^k\) of \(P_\alpha \) to the plane spanned by \(e_1\) and \(e_2\); see Fig. 5. To count we use Gauss’ classical argument (in two dimensions):

$$\begin{aligned} \left|\left\{ L_{n} : n \in \mathbb {Z}e_1 + \mathbb {Z}e_2\right\} \cap P_{\alpha }^{k} \right|= \mu \left( P_{\alpha }^{k} \right) + \mathcal {O}\left( \text {circumference of }P_{\alpha }^{k}\right) , \end{aligned}$$

where \(\mu \) is the two-dimensional Lebesgue measure in the plane. Hence we conclude that to leading order, \(n_{\alpha ,k}^2 = \mu \left( P_\alpha ^k\right) \) if \(k = e_3\).
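Gauss' counting argument can be illustrated on the simplest domain, a disk of integer radius r centered at the origin: the number of lattice points equals the area \(\pi r^2\) up to an error of the order of the circumference. The sketch below is only such an illustration (the patches \(P_\alpha ^k\) are of course not disks), and the function name is hypothetical.

```python
import math

def lattice_points_in_disk(r):
    """Number of points of Z^2 in the closed disk of integer radius r."""
    count = 0
    for x in range(-r, r + 1):
        y_max = math.isqrt(r * r - x * x)   # largest y with x^2 + y^2 <= r^2
        count += 2 * y_max + 1
    return count

for r in (10, 50, 200):
    n = lattice_points_in_disk(r)
    print(r, n, math.pi * r * r, abs(n - math.pi * r * r) / (2.0 * math.pi * r))
```

The last printed column is the counting error in units of the circumference; it stays bounded as r grows, which is the content of the two-dimensional argument.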

If \(k = (0, 0, k_3)\) then for every lattice line there are \(k_3\) contributing pairs. As illustrated in Fig. 6, for the general case we have to take into account that the distance of lattice points along the lines changes, and the density of intersection points in the \(e_1\)-\(e_2\)-plane changes.

Proof of Proposition 3.1

We are going to prove that, assuming \(\delta \le \frac{1}{6} - \frac{\epsilon }{2}\) and \(\alpha \in \mathcal {I}_{k}^{+}\), the number of particle–hole pairs with momentum k in patch \(B_\alpha \) is

$$\begin{aligned} n_{\alpha ,k}^2 = u_\alpha (k)^2 k_{\mathrm{F}}^2 \sigma (p_\alpha ) |k|\left( 1 + \mathcal {O}\left( \sqrt{M}N^{-\frac{1}{3}+\delta }\right) \right) . \end{aligned}$$
(6.2)

The statement of the proposition then follows immediately.

Let \(k = (k_1, k_2, k_3)\), and consider a patch \(P_\alpha \). Possibly reflecting at coordinate planes, we can assume that \(k_1\), \(k_2\), and \(k_3\) are all non-negative, and without loss of generality we assume \(k_3 \ne 0\) (if \(k_3=0\) we would project onto another coordinate plane). Let \(P_\alpha ^k\) be the projection of \(P_\alpha \) along k onto \(\mathbb {R}^2 \times \{0\}\), the plane spanned by \(e_1\) and \(e_2\).

First we calculate \(\mu \left( P_\alpha ^k\right) \). Consider the lines \(\{k_{\mathrm{F}} \hat{\omega }(\theta ,\varphi ) + t k: t \in \mathbb {R}\}\); their intersection with \(\mathbb {R}^2 \times \{0\}\) is at \(t = -k_{\mathrm{F}}\hat{\omega }(\theta ,\varphi )_3/k_3\); so

$$\begin{aligned} P_\alpha ^k&:= \left\{ (x(\theta ,\varphi ), y(\theta ,\varphi ) , 0) = k_{\mathrm{F}} \hat{\omega }(\theta ,\varphi ) - \frac{k_{\mathrm{F}}}{k_3} \hat{\omega }(\theta ,\varphi )_3 \, k: \hat{\omega }(\theta ,\varphi ) \in p_\alpha \right\} \\&\quad \subset \mathbb {R}^2 \times \{0\}. \end{aligned}$$

Writing \(\Phi (\theta ,\varphi ) = (x(\theta ,\varphi ), y(\theta ,\varphi ))\), we find that \(P_\alpha ^{k}\) has two-dimensional Lebesgue measure

$$\begin{aligned} \mu \left( P_\alpha ^k\right) = \int _{P_\alpha ^k} {{\mathrm{d}}}x {{\mathrm{d}}}y = \int _{p_\alpha } |\det D\Phi (\theta ,\varphi ) |{{\mathrm{d}}}\theta {{\mathrm{d}}}\varphi . \end{aligned}$$

Using \(\hat{\omega }(\theta ,\varphi ) = (\sin \theta \cos \varphi , \sin \theta \sin \varphi , \cos \theta )\) it is easy to calculate the Jacobi determinant

$$\begin{aligned} \begin{aligned} |\det D\Phi (\theta ,\varphi ) |&= k_{\mathrm{F}}^2 \frac{|\sin \theta |}{k_3} |k_1 \sin \theta \cos \varphi + k_2 \sin \theta \sin \varphi + k_3 \cos \theta |\\&= k_{\mathrm{F}}^2 \frac{|\sin \theta |}{k_3} |k\cdot \hat{\omega }(\theta ,\varphi ) |. \end{aligned} \end{aligned}$$
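The Jacobi determinant can be cross-checked against central finite differences of the projection map \(\Phi \). In the sketch below the step size h, the sample point, and the function names are assumptions chosen for illustration; k is taken with \(k_3 > 0\) as in the text.

```python
import math

def projection(theta, phi, k, k_f):
    """Phi(theta, phi): projection of k_F * omega(theta, phi) along k onto the e1-e2 plane."""
    k1, k2, k3 = k
    w1 = math.sin(theta) * math.cos(phi)
    w2 = math.sin(theta) * math.sin(phi)
    w3 = math.cos(theta)
    return k_f * (w1 - w3 * k1 / k3), k_f * (w2 - w3 * k2 / k3)

def jacobian_fd(theta, phi, k, k_f, h=1e-6):
    """|det D Phi| from central finite differences."""
    xp, yp = projection(theta + h, phi, k, k_f)
    xm, ym = projection(theta - h, phi, k, k_f)
    x_t, y_t = (xp - xm) / (2.0 * h), (yp - ym) / (2.0 * h)
    xp, yp = projection(theta, phi + h, k, k_f)
    xm, ym = projection(theta, phi - h, k, k_f)
    x_p, y_p = (xp - xm) / (2.0 * h), (yp - ym) / (2.0 * h)
    return abs(x_t * y_p - x_p * y_t)

def jacobian_formula(theta, phi, k, k_f):
    """k_F^2 |sin(theta)| / k_3 * |k . omega(theta, phi)|."""
    k1, k2, k3 = k
    k_dot_w = (k1 * math.sin(theta) * math.cos(phi)
               + k2 * math.sin(theta) * math.sin(phi)
               + k3 * math.cos(theta))
    return k_f ** 2 * abs(math.sin(theta)) / k3 * abs(k_dot_w)

# Hypothetical sample point and momentum.
print(jacobian_fd(0.7, 1.1, (1, 2, 3), 3.0), jacobian_formula(0.7, 1.1, (1, 2, 3), 3.0))
```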

Since the patch is diameter bounded we have \(|k\cdot \hat{\omega }(\theta ,\varphi ) |= |k\cdot {\hat{\omega }}_\alpha |+ \mathcal {O}(M^{-1/2})\); and using \(|k\cdot \hat{\omega }_\alpha |\ge N^{-\delta }\) to convert the additive error into a multiplicative error, this implies

$$\begin{aligned} \mu \left( P_\alpha ^k \right) = \frac{k_{\mathrm{F}}^2}{k_3} \int _{p_\alpha } |k\cdot \hat{\omega }(\theta ,\varphi ) |\, |\sin \theta |{{\mathrm{d}}}\theta {{\mathrm{d}}}\varphi = \frac{k_{\mathrm{F}}^2}{k_3} |k\cdot {\hat{\omega }}_\alpha |\left( 1+\mathcal {O}\left( M^{-1/2} N^{\delta }\right) \right) \sigma (p_\alpha ). \end{aligned}$$
(6.3)

We now determine the distance between neighboring lattice points along every line

$$\begin{aligned} L_{n} := \left\{ n+tk: t \in \mathbb {R}\right\} \quad \text {where } n \in \mathbb {Z}^3. \end{aligned}$$

Let \(p := \gcd (k_1,k_2,k_3)\) be the greatest common divisor of the components of k. It is not difficult to see that the distance between neighboring lattice points on each line \(L_n\) is \(|k|/p\). Given a line \(L_n\) intersecting \(P_\alpha \), let \(h \in L_{n} \cap B_{\mathrm{F}}\) be the lattice point closest to \(P_{\alpha }\). Then on the line segment \(\left\{ h+tk: t \in (0,1] \right\} \) there are p lattice points; by shifting along the line, these correspond to p particle–hole pairs contributing to \(n_{\alpha ,k}^2\). We conclude that \(n_{\alpha ,k}^2\) is to leading order the number of lattice lines \(L_{n}\) intersecting \(P_\alpha ^{k}\), multiplied with \(\gcd (k_1,k_2,k_3)\).
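The claim that the segment \(\{h+tk: t \in (0,1]\}\) starting at a lattice point h contains exactly \(\gcd (k_1,k_2,k_3)\) lattice points can be verified by brute force. The sketch below enumerates all \(t \in (0,1]\) for which \(tk \in \mathbb {Z}^3\), using exact fractions; it assumes non-negative integer components with \(k_3 > 0\), and the function names are hypothetical.

```python
from fractions import Fraction
from math import gcd

def lattice_points_on_segment(k):
    """Number of t in (0, 1] with t*k in Z^3, for integer k with k[2] > 0."""
    k1, k2, k3 = k
    count = 0
    for m in range(1, k3 + 1):          # t * k3 = m must be an integer
        t = Fraction(m, k3)
        if (t * k1).denominator == 1 and (t * k2).denominator == 1:
            count += 1
    return count

def gcd3(k):
    """Greatest common divisor of the three components of k."""
    return gcd(gcd(k[0], k[1]), k[2])

for k in [(1, 2, 3), (2, 4, 6), (0, 0, 5), (6, 9, 12)]:
    print(k, lattice_points_on_segment(k), gcd3(k))
```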

We now determine how many lattice lines run through \(P_\alpha ^{k}\). Intersecting \(L := \bigcup _{n \in \mathbb {Z}^3} L_{n}\) with \(\mathbb {R}^2 \times \{0\}\) we find \(t= -n_3/k_3\). So

$$\begin{aligned} L \cap \left( \mathbb {R}^2\times \{0\} \right) = \left\{ \left( n_1 - n_3 \frac{k_1}{k_3} , n_2 - n_3 \frac{k_2}{k_3} , 0 \right) : n \in \mathbb {Z}^3 \right\} . \end{aligned}$$

This can be seen as the two-dimensional square lattice \(\mathbb {Z}^2\) (the translates of the unit square indexed by \(n_1\) and \(n_2\)) and a point pattern repeated in every lattice translation of the unit square. As soon as \(n_3 k_1/k_3\) and \(n_3 k_2/k_3\) simultaneously become integer, we start repeating the point pattern in another translate of the unit square. So the number of points in the unit square is the smallest positive integer \(n_3\) such that both \(n_3 k_1/k_3\) and \(n_3 k_2/k_3\) are integer. We claim that this is \(k_3/p\).

To prove this claim, consider the fraction \(k_1/k_3\). Obviously \(n_3 k_1/k_3\) is integer if and only if \(n_3\) is a multiple of \(k_3 / \gcd (k_1,k_3)\). Similarly \(n_3 k_2/k_3\) is integer if and only if \(n_3\) is a multiple of \(k_3 / \gcd (k_2,k_3)\). So the number of points in the unit square is given by the least common multiple,

$$\begin{aligned} \#\text {points in unit square} = {{\,\mathrm{lcm}\,}}\left( \frac{k_3}{\gcd (k_1,k_3)},\frac{k_3}{\gcd (k_2,k_3)} \right) . \end{aligned}$$

From the standard identity \(\gcd (a,b){{\,\mathrm{lcm}\,}}(a,b)=|ab|\) for all \(a,b \in \mathbb {Z}\) we get

$$\begin{aligned} {{\,\mathrm{lcm}\,}}\left( \frac{k_3}{\gcd (k_1,k_3)},\frac{k_3}{\gcd (k_2,k_3)} \right) = \frac{k_3^2}{ \gcd (k_1,k_3)\gcd (k_2,k_3)\gcd \left( \frac{k_3}{\gcd (k_1,k_3)} , \frac{k_3}{\gcd (k_2,k_3)} \right) } \end{aligned}$$

using twice the fact that \(m \gcd (a,b) = \gcd (ma,mb)\) for all \(m\in \mathbb {N}\); then the same fact in inverse direction with \(m=k_3\); then the fact \(\gcd (a,b,c) = \gcd (a,\gcd (b,c))\) and the analogous identity for four integers

$$\begin{aligned}&= \frac{k_3^2}{ \gcd \left( {k_3}{\gcd (k_2,k_3)} , {k_3}{\gcd (k_1,k_3)} \right) } = \frac{k_3}{ \gcd \left( {\gcd (k_2,k_3)} , {\gcd (k_1,k_3)} \right) } \\&= \frac{k_3}{\gcd (k_1,k_2,k_3)}. \end{aligned}$$
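The identity just derived can also be confirmed by direct enumeration. The following sketch counts the distinct points \(\left( -n_3 k_1/k_3, -n_3 k_2/k_3 \right) \) modulo \(\mathbb {Z}^2\) and compares the count with the least common multiple expression and with \(k_3/\gcd (k_1,k_2,k_3)\). The function names are hypothetical, and k is assumed to have non-negative integer components with \(k_3 > 0\).

```python
from fractions import Fraction
from math import gcd

def points_per_unit_square(k):
    """Distinct intersection points of the lattice lines with the plane, modulo Z^2."""
    k1, k2, k3 = k
    points = set()
    for n3 in range(k3):                # n3 and n3 + k3 give the same point mod Z^2
        points.add((Fraction(-n3 * k1, k3) % 1, Fraction(-n3 * k2, k3) % 1))
    return len(points)

def lcm_formula(k):
    """lcm(k3 / gcd(k1, k3), k3 / gcd(k2, k3))."""
    k1, k2, k3 = k
    a = k3 // gcd(k1, k3)
    b = k3 // gcd(k2, k3)
    return a * b // gcd(a, b)

def reduced_formula(k):
    """k3 / gcd(k1, k2, k3)."""
    return k[2] // gcd(gcd(k[0], k[1]), k[2])

for k in [(2, 4, 6), (4, 6, 12), (3, 5, 7), (0, 3, 9)]:
    print(k, points_per_unit_square(k), lcm_formula(k), reduced_formula(k))
```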

In extension of Gauss’ argument, the number of lines intersecting \(P_\alpha ^{k}\) is equal to the Lebesgue measure of \(P_\alpha ^{k}\) times the number of intersection points per unit square. We thus conclude that

$$\begin{aligned} n_{\alpha ,k}^2 = \mu \left( P_\alpha ^{k}\right) k_3 + \mathfrak {e}_{\alpha ,k}. \end{aligned}$$
(6.4)

The error term \(\mathfrak {e}_{\alpha ,k}\) is proportional to the circumference of \(P_\alpha ^{k}\), times the number of lines per unit square. Consider a patch that is not a spherical cap (the estimate for the two spherical caps works analogously); its circumference consists of four pieces. The first piece is parametrized by \(\gamma (\varphi ) :=\Phi (\theta _\alpha +\Delta \theta _\alpha ,\varphi )\), and has length

$$\begin{aligned} \int _{\varphi _\alpha -\Delta \varphi _\alpha }^{\varphi _\alpha +\Delta \varphi _\alpha } |{{\dot{\gamma }}}(\varphi )|{{\mathrm{d}}}\varphi = 2\Delta \varphi _\alpha k_{\mathrm{F}} |\sin \left( \theta _\alpha +\Delta \theta _\alpha \right) |= \mathcal {O}\left( \frac{k_{\mathrm{F}}}{\sqrt{M}}\right) . \end{aligned}$$

The second piece, parametrized by \(\varphi \mapsto \Phi (\theta _\alpha -\Delta \theta _\alpha ,\varphi )\), is of the same order. The third piece is parametrized by \(\tilde{\gamma }(\theta ) := \Phi (\theta ,\varphi _\alpha +\Delta \varphi _\alpha )\). By straightforward estimates

$$\begin{aligned} \begin{aligned} |\dot{\tilde{\gamma }}(\theta )|^2&= \frac{k_{\mathrm{F}}^2}{k_3^2} \Big |k_3^2 \cos ^2 \theta + (k_1^2 + k_2^2)\sin ^2 \theta \\&\qquad ~~~~ + 2 k_3 \cos \theta \sin \theta \big ( k_1 \cos (\varphi _\alpha + \Delta \varphi _\alpha ) + k_2 \sin (\varphi _\alpha + \Delta \varphi _\alpha ) \big ) \Big |\le 2 \frac{k_{\mathrm{F}}^2}{k_3^2} |k |^2. \end{aligned} \end{aligned}$$

Integrating and recalling that \(\Delta \theta _\alpha = \mathcal {O}(M^{-1/2})\), the length of this piece is at most of order \(k_{\mathrm{F}}/\sqrt{M}\). The fourth piece has length of the same order as the third piece. We conclude that \(|\mathfrak {e}_{\alpha ,k}|= \mathcal {O}(k_{\mathrm{F}} M^{-1/2})\). Combining (6.4) with (6.3), and using \(u_\alpha (k)^2 \ge N^{-\delta }\) to convert the additive error into a multiplicative error (the new contribution is the dominating error), we obtain (6.2). \(\quad \square \)