Optimal Upper Bound for the Correlation Energy of a Fermi Gas in the Mean-Field Regime

While Hartree–Fock theory is well established as a fundamental approximation for interacting fermions, it has been unclear how to describe corrections to it due to many-body correlations. In this paper we start from the Hartree–Fock state given by plane waves and introduce collective particle–hole pair excitations. These pairs can be approximately described by a bosonic quadratic Hamiltonian. We use Bogoliubov theory to construct a trial state yielding a rigorous Gell-Mann–Brueckner–type upper bound to the ground state energy. Our result justifies the random-phase approximation in the mean-field scaling regime, for repulsive, regular interaction potentials.


Introduction
While Hartree-Fock theory describes some aspects of interacting fermionic systems very well, it utterly fails at others. The best known example is that Hartree-Fock theory predicts a vanishing density of states at the Fermi momentum, which is incompatible with measurements of the conductivity and specific heat in metals [30]. It is therefore important to develop a rigorous understanding of many-body corrections to Hartree-Fock theory. The simplest theory of many-body correlations is the random-phase approximation (RPA).
In this paper we show that the RPA is mathematically rigorous, insofar as the RPA correlation energy provides an upper bound on the ground state energy of interacting fermions in the mean-field scaling regime. Our approach also sheds some light on the emergence of bosonic collective modes in the Fermi gas, described by an effective quadratic Hamiltonian. We consider a system of N 1 fermionic particles with mass m > 0 in the torus T 3 = R 3 /(2π Z 3 ), interacting via a two-body potential V , in the mean-field scaling regime. Setting the Hamiltonian is defined as and acts on the Hilbert space L 2 a (T 3 ) N consisting of square-integrable functions that are anti-symmetric under permutations of the N arguments. For simplicity we consider only the spinless case. 1 The choice of = N −1/3 and coupling constant 1/N defines the fermionic mean-field regime: it guarantees that both kinetic and potential energies are of order N , as N → ∞ (see [9] for a detailed introduction).
The ground state energy of the system is defined as In Hartree-Fock theory, one restricts the attention to Slater determinants (1) ) f 2 (x σ (2) ) . . . f N (x σ (N ) ) with { f j } N j=1 an orthonormal set in L 2 (R 3 ). Slater determinants are an example of quasi-free states: all reduced density matrices can be expressed in terms of the oneparticle reduced density matrix ω := N tr 2,...,N |ψ ψ|. For a Slater determinant, one has ω = N j=1 | f j f j |. In particular, the energy of a Slater determinant is given by the Hartree-Fock energy functional, depending only on ω: (The first two summands are typically of order N and called the kinetic and direct term, respectively; the third summand is typically of order 1 and called the exchange term.) Thus, minimizing E HF (ω) over all orthogonal projections ω with tr ω = N gives an upper bound to the ground state energy E N . Actually, it turns out that Hartree-Fock theory provides more than an upper bound for the ground state energy: the method developed in [2,3,33] for the jellium model can also be applied to show that in the present mean-field scaling the Hartree-Fock minimum agrees with the many-body ground state energy up to an error of size o (1) for N → ∞. Moreover, by projection of the timedependent Schrödinger equation onto the manifold of quasi-free states one obtains the time-dependent Hartree-Fock equation [10], which was proven to effectively approximate the many-body evolution of mean-field fermionic systems [5][6][7][8]58,59].
For N non-interacting particles on the torus, the ground state is given by the Slater determinant constructed from plane waves where the momenta k 1 , . . . , k N ∈ Z 3 are chosen to minimize the kinetic energy in a way compatible with the Pauli principle; i. e., by filling the Fermi ball, up to the Fermi momentum k F . The energy E F := k 2 F /(2m) is called the Fermi energy, and the sphere k F S 2 of radius k F is called the Fermi surface. (We assume that N is chosen so that this state is unique, no modes in the Fermi ball being left empty.) We shall denote by ω pw the reduced one-particle density matrix of this state, It turns out that this simple state is a stationary state of the Hartree-Fock energy functional even with interactions, and in our setting provides a good approximation to the minimum of the Hartree-Fock functional. The focus of the present paper is to quantify the effect of correlations in the true many-body ground state: in particular, we shall be interested in the correlation energy, defined as the difference of the ground state energy and the Hartree-Fock energy of the plane wave state, 2 E N − E HF (ω pw ).
The quest of calculating the correlation energy has been a driving force in the early development of theoretical condensed matter physics. Let us discuss the case of the jellium model: that is, fermions interacting via Coulomb repulsion, exposed to a neutralizing background charge on the torus, in the large volume limit. Let us consider the ground state energy per volume of the system, in the high density regime. As noticed already by Wigner [66] and Heisenberg [40], the computation of the correlation energy is an intricate matter because perturbation theory with respect to the Coulomb potential becomes more and more infrared divergent at higher orders. It was however quickly understood that these divergences are an artefact of perturbation theory [53]; a partial resummation of the perturbative expansion allows to capture the effect of screening, that ultimately trades the infrared divergence for a ρ log ρ contribution (ρ being the density) to the ground state energy.
In their seminal work [13,55] Bohm and Pines related the screening of the Coulomb potential to an auxiliary bosonic mode called the plasmon, and coined the name "randomphase approximation"; see also [27] for a reformulation of their result using Jastrowtype states. Gell-Mann and Brueckner showed that the RPA can be seen as a systematic resummation of the most divergent diagrams of perturbation theory [28], which has become the most popular point of view for physicists. Another interpretation of the RPA was given by Sawada et al. in [60,61] as an effective theory of approximately bosonic particle-hole pairs. A systematic mapping of particle-hole pairs to bosonic operators was introduced by Usui in [64] but does not lead to a quadratic Hamiltonian. (In Usui's approach there are parallels to bosonization in the Heisenberg model [18,20,21,41], which also gives rise to interesting problems in the calculation of higher order corrections to the free energy [4].) Sawada's approach has been systematically related to perturbation theory in [1]. Sawada's effective Hamiltonian has proved useful for further investigations into diamagnetism and the Meissner effect [65]. While Sawada's concept of bosonic pairs is very elegant, it remained unclear which parameter makes the error of the bosonic approximation small. This was clarified many years later, highlighting the role of collective excitations delocalized over many particle-hole pairs [15,16,[24][25][26]39,42,43,[45][46][47]52,54]; the main idea being that collective excitations of pairs of fermions do not experience the Pauli exclusion principle if they involve many fermionic modes of which only few are occupied.
Concerning rigorous works for the jellium model, the only available result for the correlation energy is the work of Graf and Solovej [33], which provided an upper and lower bound proportional to ρ 4/3−δ for some δ > 0. This bound has been obtained using correlation inequalities for the many-body interaction together with semiclassical methods. Unfortunately, this is still far from the expected ρ log ρ behavior: to improve on [33], new ideas are needed.
In the context of interacting fermions in the mean-field regime, the first rigorous result on the correlation energy has been recently obtained in [37], for small interaction potentials, via upper and lower bounds matching at leading order. One has: (1. 3) The strategy of [37] is based on a rigorous formulation of second order perturbation theory following [17,35,36,38], combined with methods developed in the context of many-body quantum dynamics [5,7,8,58]. For larger interaction potentials however, this method is limited to a lower bound of the right order in and N but not capturing the precise value.
Here we shall provide a rigorous upper bound on the correlation energy, without any smallness assumption on the size of the potential. It improves on the upper bound of [37], to which it reduces in the limit of small interactions. The method of the proof is inspired by a mapping of the particle-hole excitations around the Fermi surface to emergent bosonic degrees of freedom: this allows to estimate the correlation energy in terms of the ground state energy of a quadratic, bosonic Hamiltonian. The expression we obtain, if formally extrapolated to the infinite volume limit, agrees with the Gell-Mann-Brueckner formula for the jellium model.
Our method can be seen as a rigorous version of the Haldane-Luther bosonization for interacting Fermi gases, a nonperturbative technique widely used in condensed matter physics; see [44] for a review. To our knowledge, this is the first time that this method is formulated in a mathematically rigorous setting. We believe that this method, possibly combined with [37], will be crucial to rigorously understand the correlation energy for a large class of high density Fermi gases, including the jellium model.
Correlation corrections to the ground state energy of interacting Bose gases have been studied to a much larger extent. Upper and lower bounds have been proven for the mean-field scaling regime in [19,34,49,56,57,62], for the jellium model in [50,51,63], for the Gross-Pitaevskii scaling regime in [12], and in an intermediate scaling regime in [11,14,29]. The Lee-Huang-Yang formula for the low-density limit has been proven as an upper bound in [22] for small potential and in [67] for general potential, and only very recently as a lower bound [23].

Main Result
In this section we present our main result, Theorem 2.1. Our theorem provides an upper bound for the ground state energy, which is consistent with the Gell-Mann-Brueckner formula for the correlation energy.
Notice that for the interaction potential we normalize the Fourier transform such that

Remarks.
(i) We conjecture that there is actually equality in (2.1); i. e., a corresponding lower bound, possibly with different error exponent, should hold.
(ii) Recall that the Hartree-Fock energy E HF (ω pw ) consists of kinetic energy (order N ), direct interaction energy (order N ), and exchange interaction energy (order 1). Our many-body correction is of order = N −1/3 . As expected, it is negative, so that it improves over E HF (ω pw ). (iii) Notice that already with regular interaction potential the correlation correction at order involves arbitrarily high powers of the interaction potential. (iv) If we formally extrapolate our formula to the jellium model it agrees with the correlation energy first obtained by Gell-Mann and Brueckner [28,Equation (19)] as a power series; see also [61,Equation (37)] for the first appearance of the explicit expression. Gell-Mann and Brueckner also obtain a contribution from a second order exchange-type term denoted b (2) ; for us, in mean-field scaling and with compactly supportedV , this term is only of order 2 . However, since our trial state captures the second order direct-type term correctly and can be expanded in powers ofV , we expect that it would also capture the second order exchange-type term in models where it has a bigger contribution. (v) For small interaction potentialsV , we can expand This is consistent with [37], see (1.3) (notice that [37] considered the Fermi gas in [0, 1] 3 instead of [0, 2π ] 3 ). Whereas [37] uses rigorous second-order perturbation theory, here we use a non-perturbative bosonization method which directly yields a resummation of the dominant contributions of the perturbation series both of the ground state and the ground state energy to all orders in the potential. (vi) The assumption ofV being compactly supported is mainly used to control the number of particle-hole pairs that may be lost near the boundaries of patches (see Sect. 6) and to avoid interaction between different patches across the separating corridors (see Fig. 2). A sufficiently fast power law decay ofV (k) for large k should allow to control such error terms, but to keep the article readable we do not follow up on this question here.
In the remaining part of the paper we prove Theorem 2.1. Our proof is based on a reorganization of the particle-hole excitations around the Fermi surface in terms of approximately bosonic collective degrees of freedom, which we will introduce in the next section. Notice that 1/m can be factored out from the Hamiltonian, replacing the potential V by mV , so we consider only m = 1 and the dependence on m is easily restored at the end.

Collective Particle-Hole Pairs
In this section we represent the correlation energy in terms of particle-hole excitations around the Fermi surface. These excitations will be described by quadratic fermionic operators on the Fock space, that behave as almost bosonic operators. The advantage of this rewriting is that the correlation energy can thus be related to the ground state energy of a quadratic almost-bosonic Hamiltonian.

The correlation Hamiltonian.
Here we shall introduce a Fock space representation of the model. We shall follow the notations of [9,Chapter 6], to which we refer for more details. Let F := F(L 2 (T 3 )) be the fermionic Fock space built on the single-particle space L 2 (T 3 ). Let us denote by H N the second quantization of H N . We have where a * x , a x are the creation and annihilation operators (more precisely, operator-valued distributions), creating or annihilating a fermionic particle at x ∈ T 3 . They satisfy the usual canonical anticommutation relations (CAR) Let us define the Fermi ball where k F is the Fermi momentum. Let N be the number of points in the Fermi ball, N := |B F |. Then, by Gauss' classical counting argument, We also introduce the complement of the Fermi ball, The filled Fermi ball is obtained by considering the Slater determinant ψ pw built from the plane waves f k i (x) = (2π) −3/2 e ik i ·x , associated to the points k i ∈ B F , i = 1, . . . , N . Let ω pw be the reduced one-particle density matrix associated to such states, With the plane waves f k defined in (1.2), we define the unitary 3 particle-hole transformation R ω pw : F → F by setting Here we introduced the vacuum vector = (1, 0, 0, . . .) ∈ F. Particle-hole transformations are a particular kind of fermionic Bogoliubov transformation. In fact, formally writing a x = a(δ(·−x)) and δ(y −x) = k∈Z 3 f k (y) f k (x) one can rewrite the previous relation in position space, where u = I − ω pw , v = k∈B F | f k f k | and where we also introduced the short- The state R ω pw plays the role of the new vacuum for the model, on which the new fermionic operators R ω pw a( f k )R * ω pw act. We call momenta in B F hole modes, and momenta in B c F particle modes. We will use the notation a * k := a * ( f k ). If we want to emphasize that the index is outside the Fermi ball we write a * p , p ∈ B c F ("p" like "particle") and say that a * p creates a particle. Similarly we use a * h , h ∈ B F ("h" like "hole") and say that a * h creates a hole in the Fermi ball. We call N p := p∈B c F a * p a p the number-of-particles operator and N h := h∈B F a * h a h the number-of-holes operator. If we do not want to distinguish between particles and holes we use the word "fermion", for example calling N = N p + N h the number-of-fermions operator.
Let us consider the conjugated Hamiltonian R * ω pw H N R ω pw . Using (3.3), and rewriting the result into a sum of normal-ordered contributions one gets (see [9,Chapter 6] for a similar computation in the context of many-body quantum dynamics): with d (A) the second quantization 4 of a one-particle operator A. The operator h is the one-particle Hartree-Fock Hamiltonian, given by where X is the exchange operator, defined by its integral kernel X (x, y) = −N −1 V (x − y)ω pw (x, y). As for the operator Q N on the r. h. s. of (3.4), it contains all contributions that are quartic in creation and annihilation operators. It is given by where and As we shall see, both E 1 and E 2 will provide subleading corrections to the correlation energy, as N → ∞. The operator R * ω pw H N R ω pw − E HF (ω pw ) is called the correlation Hamiltonian, (3.8) 4 The second quantization of the one-particle operator A is defined on the n-particle sector of F as d (A) := n j=1 A j , where A j := I ⊗ j−1 ⊗ A ⊗ I ⊗n− j acts non-trivially only on the j-th particle. If A has an integral kernel A(x, y), its second quantization can be written as d (A) = A(x, y)a * x a y dxdy.
Let ψ ∈ F be a normalized N -particle state in the fermionic Fock space, that is ψ = (0, 0, . . . , 0, ψ (N ) , 0, . . .). By the variational principle, we have where ξ = R * ω pw ψ. The last step follows from the identity (3.4). We are going to construct an N -particle state ψ trial = R ω pw ξ such that ξ, H corr ξ is given by the Gell-Mann-Brueckner formula up to errors that are of smaller order as N → ∞. To construct this state, we shall represent H corr in terms of suitable almost-bosonic operators, obtained by combining fermionic particle-hole excitations. As we shall see, the resulting expression will be quadratic in terms of these new operators; the state ξ will be chosen to minimize the bosonic energy.

Particle-hole excitations.
We start by rewriting the quartic contribution to the correlation Hamiltonian as The main contribution to Q N is Q B N , which, as we shall see, can be represented as a quadratic operator in terms of collective particle-hole pair operators. These operators behave approximately like bosonic creation and annihilation operators.
Let us define the (unnormalized) particle-hole operator as Notice thatb * 0 = 0 since uv = 0. Writing this operator in momentum representation, we can think of it as creating a particle-hole pair of momentum k, delocalized over all the Fermi surface. In terms of these operators Recall thatV has compact support by assumption, so there exists It is convenient to group together k and −k modes, as follows. Define nor ⊂ Z 3 (3.11) as the set of all k ∈ Z 3 ∩ B R (0) with k 3 > 0 and additionally half of the k-vectors with k 3 = 0, such that for every k ∈ nor we have −k ∈ nor . We then rewrite Q B N as (3.12) It turns out that the operatorsb k behave as approximate bosonic operators, whenever acting on vectors of F with only a few particles; the Pauli principle is relaxed by summing over a large number of momenta of which typically only few are occupied.
The main problem, however, is that the term d (uhu − vhv) in (3.4) cannot be represented as a quadratic operator in terms ofb k andb * k . To circumvent this issue we shall split the operatorsb k ,b * k into partially localized particle-hole operatorsb α,k ,b * α,k involving only modes of one patch of a decomposition (indexed by α) of the Fermi surface. This allows us to linearize the kinetic energy around the centers of patches, so that states of the formb * The non-trivial question is whether we can localize (3.10) sufficiently to control the linearization of the kinetic energy, while at the same time keeping it sufficiently delocalized so thatb * α,k involves many fermionic modes, thus relaxing the Pauli principlecomplete localization would of course destroy the bosonic behavior since (a * p a * h ) 2 = 0. We are going to find that this can be achieved by decomposing the Fermi sphere into Patch decomposition of the Fermi sphere. We construct a partition of the Fermi sphere k F S 2 into M diameter-bounded equal-area patches following [48], see Fig. 1. Let or more precisely, this number rounded to the nearest even integer. Our goal is to first decompose the unit sphere S 2 as where p α are suitable pairwise disjoint sets, to be defined below, and p corri has small surface measure, σ ( p corri ) The error p corri is due to the introduction of a positive distance ("corridors") separating neighboring patches. The important properties to be ensured in the construction are that all patches p α have Diameter-bounded partition of the northern half sphere following [48]: a spherical cap is placed at the pole; then collars along the latitudes are introduced and split into patches, separated by corridors. The vectorŝ ω α are picked as centers of the patches, marked in black. The patches will be reflected by the origin to cover also the southern half sphere area of order 1/M and that they do not degenerate into very long, thin shapes as M becomes large. We use standard spherical coordinates: forω ∈ S 2 , denote by θ the inclination angle (measured betweenω and e 3 = (0, 0, 1)) and by ϕ the azimuth angle (measured between e 1 = (1, 0, 0) and the projection ofω onto the plane orthogonal to e 3 ). We writeω(θ, ϕ) to specify a vector on the unit sphere in terms of its inclination and azimuth angles.
The construction starts by placing a spherical cap centered at e 3 , with opening angle θ 0 := D/ √ M, with D ∈ R chosen so that the area of the spherical cap equals 4π/M. Next, we decompose the remaining part of the half sphere, i. e., the set of allω(θ, ϕ) with D/ √ M ≤ θ ≤ π/2, into √ M/2 (rounded to the next integer) collars; the i-th collar consists of allω(θ, ϕ) with θ ∈ [θ i − θ i , θ i + θ i ) and arbitrary azimuth ϕ. The inclination of every collar will extend over a range θ i ∼ 1/ √ M; the proportionality constant is adjusted so that the number of collars on the half sphere is an integer.
Observe that the circle ω(θ i , ϕ) : ϕ ∈ [0, 2π) has circumference proportional to sin(θ i ); therefore we split the i-th collar into √ M sin(θ i ) (rounded to the next integer) patches. This implies that the j-th patch in the i-th collar covers an azimuth angles We fix the proportionality constants by demanding that all patches have area 4π/M (this is not necessary though, it would be sufficient that all patches have area of order 1/M). The last step is to define θ i := θ i −D R N −1/3 and ϕ i, j := ϕ i, j −D R N −1/3 / sin(θ i ), withD > 0 to be fixed below. We then define p 1 as the spherical cap centered at e 3 with opening angle θ 0 and the other M/2 − 1 patches as The constantD is chosen such that, when patches are scaled up to the Fermi sphere there are corridors of width at least 2R between adjacent patches (i. e.,D has to be slightly larger than κ −1 0 ). Having concluded the construction on the northern half sphere, we define the patches on the southern half sphere through reflection by the origin, k → −k. Finally we switch from enumeration by i and j to enumeration with a single index α ∈ {1, . . . , M}. From the construction it is clear that the patches p α have the following three properties: (ii) The family of decompositions is diameter bounded, i. e., there exists a constant C 0 independent of N and M such that, for the decomposition into M patches, the diameter 5 of every patch is bounded by 2 . Next, we scale the patches from the unit sphere up to the Fermi surface k F S 2 by setting The patches P α then have the following properties.
Finally, we shall introduce a "fattening" of the patch decomposition, which will be used to decompose the operators b k as sums of operators corresponding to particle-hole excitations around the patches. This is motivated by the fact that the only modes affected by the interaction are those in a shell around the Fermi sphere, where the thickness of the shell is given by the radius of the support ofV . Recalling again that R > 0 is chosen such thatV (k) = 0 for |k| > R, we define the fattened Fermi surface as We lift the partition of the unit sphere to a partition of ∂ B R F , by introducing the cones C α := r ∈(0,∞) r p α and defining The set B corri consist of all the remaining modes in the similarly fattened corridors.) To every patch B α we assign a vector ω α ∈ B α as the center of P α on the Fermi surface; in particular |ω α | = k F . The vectors ω α inherit the reflection symmetry of the patches, ω α+M/2 = −ω α for all α = 1, . . . , M/2. Localization on the Fermi surface. We recall (3.10) in momentum representation, we are only interested in the case |k| ≤ R; hence, the sum in (3.13) effectively runs only over p and h at most at distance R from the Fermi sphere Next, we decompose the sum on the r. h. s. of (3.14) into contributions associated with different patches. If k · ω α < 0, there will be few or no particle-hole pairs ( p, h) in the patch B α satisfying p − h = k; geometrically, k is approximately pointing from outside to inside of the Fermi ball, which is incompatible with the requirements p ∈ B c F and h ∈ B F . Also if k · ω α is positive but small, there are only few particle-hole pairs ( p, h) with p − h = k. For this reason, for any k ∈ Z 3 , we define the index set 6 for a parameter δ > 0 to be chosen later. We then writẽ The operator r * k contains all particle-hole pairs that are not included in α∈I + kb * α,k . This can happen for two reasons: because an index α is not included in I + k , or because one or both momenta of a pair ( p, h) belong to a corridor between patches. As we shall see, this operator can be understood as a small error, due to the fact that the number of pairs ( p, h) not included in the first sum is small.
Normalization of particle-hole pair operators. We still have to normalize the pair operators so that they can be seen as an approximation of bosonic operators. The normalized operators are defined by We call these operators the pair creation operators; their adjoints are called pair annihilation operators. The normalization constant can be calculated as follows: This shows that n 2 α,k is the number of particle-hole pairs with momentum k = p − h that lie in the patch B α . Due to the symmetry of the partition under point reflection at the origin we have n α,k = n α+M/2,−k . We define v α (k) ≥ 0 by setting In the next proposition, whose proof is deferred to Sect. 6, we estimate the normalization constants.
is the surface area of the patch p α on the unit sphere.
Due to the cutoffω α ·k ≥ N −δ imposed through the index set I + k , it immediately follows that there exists a constant 7 C such that

Construction of the Trial State
In this section we shall introduce the trial state that will produce the upper bound in our main result, Theorem 2.1. To begin, let us show that the particle-hole operators b α,k defined in (3.16) behave as almost-bosonic operators when acting on Fock space vectors containing only few fermions.

Particle-hole creation via almost-bosonic operators.
Recall the definition of nor given after (3.11). For k ∈ nor , let We shall also set I k = I + k ∪ I − k . To unify notation, we define Then The operator E α (k, l) commutes with N , and satisfies the bound The same estimate holds for Proof. The two identities on the first line of (4.2) are obvious. We prove the second line.
First case: α ∈ I + k and β ∈ I + l . We have From the definition it is clear that b and b * operators belonging to different patches commute, explaining the δ α,β -factor. Thus, from now on α = β. By the CAR, The first term in (4.5) gives the following contribution to the commutator (4.4): The two remaining terms in (4.5) produce the error term In the present case, the error term in the lemma is E α (k, l) := E(α, k, l). Let us only consider the second term in the left-hand side; the first can be controlled in the same way. Setting Recall also, for the second quantization of any bounded one-particle operator, the stan- , with the same bound as before. Third case: α ∈ I + k and β ∈ I − l , and vice versa. For α = β the commutator vanishes, just like in the previous cases. So consider α ∈ I + k and β = α Since I + k ∩ I − k = ∅, α = β is possible only for k = l. Also k = −l is excluded since k, l ∈ nor . Consequently δ k,l = 0 = δ k,−l , so (4.7) agrees with the statement of the Lemma (if we set E α (k, l) := E(α, k, −l)). The estimate of the error term remains the same.
It is obvious that E(α, k, l) commutes with N . This completes the proof of the lemma.
The next lemma provides bounds for the c α (k), c * α (k) operators that are similar to the usual bounds valid for bosonic creation and annihilation operators.
where N (B) := i∈B a * i a i for any set of momenta B ⊂ Z 3 . Furthermore, for f ∈ 2 (I k ) and ψ ∈ F, we have α,k . This proves (4.8). To prove the first inequality from (4.9), we use (4.8) together with Cauchy-Schwarz, We now prove the second inequality from (4.9). By Lemma 4.1, we have (4.10) Consider the last term on the r. h. s. Recall from (4.6) that for α ∈ I + k we have Obviously ψ, E α (k, k)ψ ≤ 0. For α ∈ I − k we have −k replacing k on the r. h. s., again producing a negative semidefinite operator. Hence in (4.10) we can drop the last summand for the purpose of an upper bound. Together with the first bound from (4.9) this implies where the error terms contain at least one r k -operator (see the discussion following (5.7) for the rigorous proof of their smallness). Recalling the definition (3.17) of v α (k) and the definition of the c and c * operators (4.1), we get Let us now consider the operator d (uhu − vhv) appearing in the definition of the correlation Hamiltonian (3.8). To express d (uhu −vhv) in terms of c α (k), c * α (k), we observe that, for α ∈ I + k and neglecting the contribution of the constant direct term and of the exchange operator X on the r. h. s. of (3.5) (they will be proven to be small) where we used the fact that, for p, h ∈ B α , p ω α h. A similar computation for α ∈ I − k shows that If the operators c * α (k), c α (k) were bosonic creation and annihilation operators, satisfying canonical commutation relations, and if d (uhu − vhv) were quadratic in these operators, (4.12) would lead us to Thus, Equations (4.11) and (4.12) suggest that, if restricted to states with few particles, the correlation Hamiltonian should be approximated by the Sawada-type effective Hamiltonian (4.14) We defined The main difference to Sawada's original Hamiltonian is that he treated pairs a * p a * h as bosonic; our pair operators instead are delocalized over large patches, thus relaxing the Pauli principle and allowing a controlled bosonic approximation.) If the operators c α (k), c * α (k) were exactly bosonic, the effective Hamiltonian H eff could be diagonalized via a bosonic Bogoliubov transformation. We provide the details of this computation in Appendix A. The ground state of (4.13) would be given by where, for every k ∈ nor , K (k) is the 2I k ×2I k matrix (with (the superscript T denoting the transpose of the matrix) with (4.19) However, the particle-hole pair operators c * α (k), c α (k) are not exactly bosonic, and thus the ground state vector of (4.13) is not given by (4.16). Nevertheless, by Lemma 4.1, it is reasonable to expect that the true ground state of H corr will be energetically close to ξ , provided that the number of fermions in ξ is small. This last fact is proven in Sect. 4.3.
Motivated by the above heuristic discussion, we define as trial state for the full manybody problem the fermionic Fock space vector (4.20) Notice that B * = −B, so T is unitary and hence ψ trial = 1. We have to check that ψ trial is an N -particle state. In fact, writing ξ := R * ω pw ψ trial , we have which shows that ψ trial is an eigenvector of N with eigenvalue N if and only if ξ is an eigenvector of N p − N h with eigenvalue 0. This is the content of the next lemma. Proof. Let ξ λ = T λ , with T λ = e λB for λ ∈ [0, 1]. Then ξ 1 = ξ , ξ 0 = , and thus

Approximate bosonic Bogoliubov transformations.
Our next task is to evaluate the energy of the fermionic many-body trial state ψ trial = R ω pw ξ = R ω pw T , which by (3.4) and (3.8) reduces to calculating ξ, H corr ξ . To do so, we will need some properties of the operator T , which are going to be proven in this section. More generally, we shall consider the one-parameter family of unitaries T λ = e λB , with B defined in (4.20). The next proposition establishes that the action of T λ approximates a bosonic Bogoliubov transformation.
Proof. We start from the Duhamel formula From Lemma 4.1, the commutator is given by where the error term is with χ I k the indicator function of the set I k = I + k ∪ I − k and E γ (k, l) bounded as in (4.3). Thus We iterate n 0 times by plugging the second equation into the second summand on the r. h. s. of the first equation and so forth. The simplex integrals produce factors 1/n!, so we obtain 8 8 The range of summation used in the matrix multiplication is clear from the momentum dependence of K , e.g., K (l) 2 Here we introduced the notation c α (l), which in this formula means c * α (l) for n 0 odd, and c α (l) for n 0 even. The left term on every line is the leading term, the right term on every line is an error term which will be controlled later. The very last line is the 'head' of the iteration after n 0 steps; we are going to control the expansion as n 0 → ∞, showing that the head vanishes.
Notice that leading terms are of the form of an exponential series λ n K (l) n /n! but intermittently with c and c * . Separating creation and annihilation operators, we reconstruct cosh(λK (l)) and sinh(λK (l)). We find where, for an arbitrary n 0 ∈ N, (In every summand, e α (l) and c α (l) appear for even n or n 0 , e * α (l) and c * α (l) for odd n or n 0 .) Notice that for any function f : R → R the simplex integration simplifies to λ 0 dτ 1 Therefore, for all ψ ∈ F we have using the explicit expression (4.22) for e α (l) we have Let us start by estimating B γ . We shall neglect the symbol ; the bounds are the same whether for the operator or its adjoint. Using also (4.9), we get pulling E α (k, l) through (N + 1) 1/2 to the front and then using (4.3), we get where we used n α,k ≥ n as established in (3.18), we separated the term with n = 0 and, for n ≥ 1, we applied Cauchy-Schwarz to the sum over α. Again by Cauchy-Schwarz, we obtain The error term A γ can be treated similarly (applying first (4.3) and then (4.9)). We find As for the term C γ , it is controlled with (4.9) by This implies that Finally, the term D γ can be bounded by Since all these bounds hold for any n 0 ∈ N, we obtain ⎡ The next lemma provides the required bounds for the matrix K (k), defined in (4.17). For later estimates it is important that this bound implies K (k) = 0 outside the support ofV . (Actually the constant C here may be chosen independent of V .) (4.19), are strictly positive. Let K (k) be defined by (4.17). Then we have

Lemma 4.5 (Bound on the Bogoliubov Kernel). Let k ∈ nor . Then the matrices E(k), D(k)+ W (k)−W (k), and D(k)+ W (k)+W (k), all defined in
where K (k) tr denotes the trace norm of the matrix K (k).
Proof. Recall that I k = |I + k | = |I − k | and that the matrix K (k) is symmetric and has size 2I k × 2I k . All quantities in this proof depend on the same k, so we simplify notation by dropping this dependence where there is no risk of confusion, writing e. g., I for I k .
To exhibit the block structure of the matrices, we map the indices I + k to {1, . . . , I }, and the indices I − k to {I + 1, . . . 2I }. There are many such mappings, but due to the reflection symmetry of the patches (B α+M/2 = −B α and ω α+M/2 = −ω α in the original numbering), we can choose one such that v α (k) = v α+I (−k). This implies that where b α,β = gv α (k)v β (k) defines an I × I -matrix. We drop the k-dependence from the notation and just write v α = v α (k). In Dirac notation, where |v v| is the orthogonal projection onto v = (v 1 , · · · , v I ), we have b = g|v v| ∈ C I ×I .
Also D α,α = |k ·ω α | is invariant under reflection at the origin and so D simplifies to Recalling the definition of the index set I + k we notice that u 2 α ≥ N −δ for all α ∈ {1, . . . , I }, and thus d is invertible. Since b ≥ 0 (because g ≥ 0), we find d +2b ≥ d > 0; hence also d + 2b is invertible.
To simplify the computation further, let where I is the I × I -identity matrix. Obviously U T = U = U −1 , and it simultaneously blockdiagonalizes This shows that D + W +W and D + W −W are strictly positive, thus invertible, and have a positive square root. We also find Both blocks are strictly positive; E is therefore invertible and has a strictly positive operator square root. Now consider We find Since b is a positive operator, using operator monotonicity of the inverse and the square root, we find Using the equality of the spectra σ (AB) = σ (B A) for positive operators A and B, we conclude that (4.29) We introduce the polar decomposition S 1 = O|S 1 |; a priori O is a partial isometry, but since S 1 is invertible, O is actually an orthogonal matrix. Then |S T Using furthermore the blockdiagonalization (4.26), we find Equations (4.28) and (4.29) imply that log A 1 ≤ 0 and log A 2 ≥ 0. Hence From the definition (4.27), we arrive at where we used Proposition 3.1, which implies v 2 We are now ready to estimate the expectation of N n in the state ξ , defined in (4.16). We follow a strategy similar to the one developed in the dynamical setting in [8] for the control of the growth of many-body fluctuations around Hartree-Fock dynamics. T λ ψ, (N + 1) n T λ ψ ≤ e Cn ψ, (N + 5) n ψ .

Proof. From the CAR (3.1) we get
. We calculate the derivative w.r.t. λ of the expectation value of (N + 5) n : To distribute the powers of the number operator equally to both arguments of the inner product, we insert I = (N + 1) n 2 −1− j (N + 1) j+1− n 2 between (N + 5) j and c * α (k) and then pull (N + 1) j+1− n 2 through c * α (k)c * β (k) to the right. Thus where we have introduced ξ j := (N + 1)  From the differential inequality (4.30), using Grönwall's Lemma, we conclude that where in the last inequality we used (4.23) and the assumptions on V .

Evaluating the Energy of the Trial State
In this section we calculate the expectation value ξ, H corr ξ , for the trial state ξ defined in (4.16) and H corr defined in (3.8). We start with some simple estimates for the non-bosonizable terms. Afterwards we linearize the kinetic energy and calculate its contribution to the expectation value, before we eventually turn to the main part of the interaction.

Getting rid of non-bosonizable terms.
In the next lemma, we show that the contribution of the terms in (3.6) to the expectation ξ, H corr ξ is negligible for N → ∞.

Lemma 5.1 (Non-Bosonizable Interaction Terms). Let E 1 (x, y) be defined as in (3.6).
Let ξ be the trial state defined as in (4.16). Then we have Proof. We are going to show that for all ψ ∈ F we have The final claim then follows using Proposition 4.6. To prove (5.1), let us rewrite the first term on the r. h. s. of (3.6) by using the CAR and u x , u y = u(x, y), yielding for any bounded one-particle operator A and any ψ ∈ F. Thus, using that u op ≤ 1, The second summand in (5.2) can be estimated in the same way. The same holds true for the other two terms in (3.6).
Let us now consider the error term E 2 , defined in (3.7). We prove that this term vanishes in our trial state ξ . (3.7). Let ξ be the trial state defined in (4.16). Then we have

Lemma 5.2 (Interaction Terms of Wrong Parity). Let E 2 (x, y) be defined as in
Proof. Since terms in E 2 (x, y) create exactly two fermions, we have Recall that ξ = T , with T = exp(B) and B as in (4.20). We have [i N , B] = 0, since B creates or annihilates particles four at a time. This implies T i N = i N T . Using i N = , we get which thus vanishes.

Estimating direct and exchange operators. In this section we estimate the contribution of the direct and exchange terms to d (uhu − vhv). Recall that
and therefore it vanishes on ξ by Lemma 4.3. The next lemma allows us to control the contribution of the exchange term X .

Lemma 5.3 (Bound for the Exchange Term).
Let ξ be the trial state defined as in (4.16).
Proof. Notice that Thus X is translation invariant, and hence Using that u op = 1 = v op , we get, by Proposition 4.6:

Expectation value of the kinetic energy.
In this section we evaluate the contribution of the Laplacian to the expectation value of the correlation Hamiltonian in the trial state ξ defined as in (4.16). We start by linearizing in Fourier space, Notice that from the first to the second line, momenta p and h that lie in the corridors or are more than a distance R away from the Fermi surface have disappeared from the sums; this is justified since such modes are never occupied in the trial state, i. e., a p ξ = 0 and a h ξ = 0. Furthermore, thanks to Lemma 4.3, we have where we used that |ω α | = k F for all α. To estimate ( p − ω α ) 2 and (h − ω α ) 2 , we recall that the diameter of the patches is bounded by C N 2/3 /M (since the diameter of the patch on the Fermi surface is bounded by N 2/3 /M which is large compared to its thickness of order R). Therefore and where the error E lin is bounded by where in the last step we used Proposition 4.6 to bound the expectation value of the number operator and = N −1/3 . To compute the expectation of the linearized kinetic energy operator H kin , we will make use of the following lemma.
We are now ready to calculate the kinetic energy of our trial state. Proposition 5.5 (Kinetic Energy). Let ξ be defined as in (4.16). Then where D(k) is defined in (4.19) and the error term is such that |E kin | ≤ C /n 2 with n = N 1/3−δ/2 M −1/2 as in (3.18).

|E
(1) From Proposition 4.6 and Lemma 4.5, we conclude that |E (1) kin | ≤ C /n 4 . The third error term E (3) kin can be controlled similarly, using Lemma 4.2: The second error term E kin can be controlled in the same way.

Expectation value of the interaction energy.
We now evaluate the main contribution (3.9) to the interaction energy. This is the content of the next proposition.
Proof. We start by decomposing the b k -operators in the interaction Hamiltonian (3.12) by their patch decomposition (3.15), We recall that the error terms r k collect modes in the corridors and close to the equator: wherer k is a linear combination of products a h a p such that at least one of the two momenta is in the corridors B corri (see Fig. 2), and the second term collects the contributions coming from the patches close to the equator. We are going to show that r k gives a negligible contribution to ξ, Q B N ξ .

Contribution of corridors.
We claim that the error operatorsr k do not contribute to ξ, Q B N ξ . To see this, recall that T = e B , with B not containing any mode q ∈ B corri , see (4.20). Since at least one of the two momenta p and h appearing inr k is in the corridor B corri , we haver k ξ = 0. Plugging the decomposition (5.4) into Q B N from (3.12) and taking the expectation value on ξ , we realize that all terms containing at least one error operatorr k are zero, due to the fact that there is at least one error operatorr k directly acting on T .
Contribution of patches near the equator. We claim that the contribution to ξ, Q B N ξ coming from patches β ∈ I k is subleading as N → ∞. These are the patches β in the collar where |k ·ω β | < N −δ . The width of this collar is bounded above by Ck F N −δ , and its length-approximately equal to the circumference of the equator-is bounded above by Ck F ; we conclude that the surface area of the collar is of order k 2 F N −δ . Recall that n 2 β,k is the number of particle-hole pairs with relative momentum k in patch β; thus, adding in the corridors for an upper bound, β ∈I k n 2 β,k is bounded by the number of particle-hole pairs with relative momentum k in the collar. This number is bounded above by the number of hole momenta h ∈ B F that are at most a distance R from the collar (since |k| ≤ R). The number of such points of the lattice Z 3 can be counted by Gauss' classical argument: assign to each lattice point k the cube [k 1 , . Then the number of cubes belonging to lattice points near the collar is bounded by the Lebesgue measure of the collar "fattened" to a thickness R; i. e., We are now ready to estimate the contribution to ξ, Q B N ξ coming from the modes close to the equator. Consider, e. g., the term 1 2N k∈ norV (k)2b * kb k (all the other terms can be dealt with similarly). We get three contributions from (5.5), namely We give the detailed estimate for the first term in the list (the other two terms can be controlled similarly) where we used (5.6) to control the sum over β ∈ I k , n 2 α,k ≤ Ck 2 F /M due to Proposition 3.1 for the sum over α ∈ I + k , and the bound on the number of fermions N from Proposition 4.6.
Approximate Bogoliubov diagonalization of the effective interaction. By the discussion of the previous paragraph Introducing the normalization factors n α,k = k F √ |k|v α (k) and combining the b * α,k and b * α,−k operators to c * α (k) operators as in (4.1), we get int , (5.8) where, recalling that g(k) = κV (k), We shall evaluate ξ, H (i) int ξ , i = 1, 2, 3, with ξ = T , using the fact that the T operator behaves as an approximate bosonic Bogoliubov transformation, recall Proposition 4.4. Using also , c δ (k)c * γ (k) = δ δ,γ , we have and the error term is For the second part of the interaction we find ξ, H where Finally, for the third interaction term we find we used the fact that all terms are real to write the more symmetric expression in terms ofW (k) = W +− (k) + W −+ (k) (the latter is defined by exchanging the role of I + k and I − k in the former). The error term E int is given by Combining (5.9), (5.10), (5.11) and (5.8), we conclude that int . To control the error term E int , we decompose it as Recall that u α (k) 2 = |k ·ω α | ≤ 1 and hence, by Proposition 3.1, v α (k) ≤ C M u α (k) ≤ C M . Thus, using Proposition 4.4 and Cauchy-Schwarz, we find Lemma 4.5 and Proposition 4.6 imply that |E (1,3) int | ≤ C /n 4 . The term E (1,1) int can be controlled similarly: applying Cauchy-Schwarz in α and in β, using |I + k | = I k ≤ M/2 and (4.21) we arrive at Again, Lemma 4.5 and Proposition 4.6 show that |E (1,1) int | ≤ C /n 2 . Analogously, we obtain also |E (1,2) int | ≤ C /n 2 . Hence |E (1) int | ≤ C /n 2 . The error term E (2) int differs from E (1) int only in the replacement of the index set I + k by I − k . Therefore, we find |E (2) int | ≤ C /n 2 . As for the error term E int , it also differs from E (1) int in the index set, some hermitian conjugations, and the appearance of a cosh instead of a sinh. The estimates however remain valid and we also obtain |E (3) int | ≤ C /n 2 .

Proof of the main theorem.
Proof of Theorem 2.1. Recall the definition (3.8) of the correlation Hamiltonian and the decomposition (3.9) of the quartic interaction Q N . Combining the results of Sects. 5.1, 5.2, 5.3, Propositions 5.5 and 5.6, we conclude that for an error E such that To evaluate the main part of the expectation value explicitly, notice that by definition (4.17) of K (k) we have Now using the explicit form (4.18) of S 1 (k), E(k), and S 2 (k), this can be simplified to yield We are left with evaluating the traces in (5.12).
Evaluation of the traces. For simplicity, we shall drop the k-dependence in the notation (we will restore it in (5.14)). Recall the block diagonalization (4.25), by which is a rank-one perturbation of a diagonal operator, with diagonal part d 2 = diag(u 4 α : α = 1, . . . I ) and withũ = (v 1 u 1 , . . . , v I u I ) ∈ R I .
We still have to show (5.15). To this end, recall from Proposition 3.1 that, in terms of the surface measure σ of the patch p α on the unit sphere, we have v α (k) 2 = σ ( p α )u α (k) 2 Here we wrote S 2 reduced for a unit half-sphere excluding the collar of width N −δ and the corridors p corri . Since cos 2 θ/(cos 2 θ + λ 2 ) ≤ 1 we can compare to the integral over the whole unit half-sphere S 2 half , The surface integral over the unit half-sphere is easy to compute, (5.17) Since g(k) = κV (k) is uniformly bounded (by assumption onV ), we conclude that Since for x ≥ 0 the function x → log(1 + x) has Lipschitz constant 1 we get It remains to compare the integrals over λ.
where we used the two inequalities 0 ≤ u α (k) 4 ≤ 1. Using the integral identity (5.17) it is easy to see that also Using the last three estimates, by splitting the integration at some > 0 to be optimized in the last step, we obtain By a similar (simpler) Riemann sum argument we obtain −g(k) where the error is obviously smaller than (5.18). This concludes the proof of (5.15).
k ω α Fig. 3. Fermi surface in bold. A line (dashed) intersects the patch but no particle-hole pair is picked up because both ends of k would be outside the Fermi ball. This could only happen if k was very long (excluded due to k ∈ suppV ) or almost tangent to the Fermi surface (excluded byω α ·k ≥ N −δ )

Counting Particle-Hole Pairs in Patches
In this section we prove Proposition 3.1, which is concerned with estimating the number of particle-hole pairs with momentum p − h = k in patch α under the condition that ω α ·k ≥ N −δ . Recall that p α is a patch on the unit sphere, and P α = k F p α .
To illustrate the idea of the proof we first consider k = e 3 = (0, 0, 1). Consider the lattice lines L n := {n + tk : t ∈ R}, n ∈ Ze 1 + Ze 2 ⊂ Z 3 . For each lattice line L n intersecting P α there is exactly one contribution to the sum (6.1)-in fact, a simple geometric consideration shows that since N −δ M −1/2 (which is implied by the assumption δ ≤ 1 6 − 2 ) a line never enters the Fermi ball at such a small angle (measured with respect to the tangent plane of the Fermi surface) that it would cross the surface immediately a second time and leave the Fermi ball without picking up a pair (i. e., the situation of Fig. 3 is excluded due toω α ·k ≥ N −δ ). There is only one exception to this argument: A lattice line might cross the surface at a distance less than R from a side of the patch. Depending on the angle it could then leave the patch to the side before picking up a pair, as represented in Fig. 4. However, the number of such lines is of the same order as the length of the boundary. We can thus absorb this number in the circumference error from the Gauss argument (see next paragraph).
So to leading order n 2 α,k is the number of lines L n intersecting P α . The number of such lines is equal to the number of lines intersecting the projection P k α of P α to the plane spanned by e 1 and e 2 ; see Fig. 5. To count we use Gauss' classical argument (in two dimensions): where μ is the two-dimensional Lebesgue measure in the plane. Hence we conclude that to leading order, n 2 α,k = μ P k α if k = e 3 . If k = (0, 0, k 3 ) then for every lattice line there are k 3 contributing pairs. As illustrated in Fig. 6, for the general case we have to take into account that the distance of lattice points along the lines changes, and the density of intersection points in the e 1 -e 2 -plane changes.
k ω α Fig. 4. Fermi surface in bold. A line (dashed) intersects the patch but no particle-hole pair is picked up because k points from a hole momentum h in the patch out into a corridor between patches. This can happen only for hole momenta near the boundary. Since the area of a patch grows faster with N than its boundary length, the number of such lines is an error term of lower order k P α Fig. 5. The number of lattice lines through the patch is the same as the number of lattice lines through the projection of the patch along k onto the plane spanned by e 1 and e 2 Proof of Proposition 3.1. We are going to prove that, assuming δ ≤ 1 6 − 2 and α ∈ I + k , the number of particle-hole pairs with momentum k in patch B α is The statement of the proposition then follows immediately. Let k = (k 1 , k 2 , k 3 ), and consider a patch P α . Possibly reflecting at coordinate planes, we can assume that k 1 , k 2 , and k 3 are all non-negative, and without loss of generality we assume k 3 = 0 (if k 3 = 0 we would project onto another coordinate plane). Let P k α the projection of P α along k onto R 2 × {0}, the plane spanned by e 1 and e 2 .
Since the patch is diameter bounded we have |k ·ω(θ, ϕ)| = |k ·ω α | + O(M −1/2 ); and using |k ·ω α | ≥ N −δ to convert the additive error into a multiplicative error, this implies Let p := gcd(k 1 , k 2 , k 3 ) be the greatest common divisor of the components of k. It is not difficult to see that the distance between neighboring lattice points on each line L n is |k|/ p. Given a line L n intersecting P α , let h ∈ L n ∩ B F be the lattice point closest to P α . Then on the line segment {h + tk : t ∈ (0, 1]} there are p lattice points; by shifting along the line, these correspond to p particle-hole pairs contributing to n 2 α,k . We conclude that n 2 α,k is to leading order the number of lattice lines L n intersecting P k α , multiplied with gcd(k 1 , k 2 , k 3 ).
We now determine how many lattice lines run through P k α . Intersecting L := n∈Z 3 L n with R 2 × {0} we find t = −n 3 /k 3 . So This can be seen as the two-dimensional square lattice Z 2 (the translates of the unit square indexed by n 1 and n 2 ) and a point pattern repeated in every lattice translation of the unit square. As soon as n 3 k 1 /k 3 and n 3 k 2 /k 3 simultaneously become integer, we start repeating the point pattern in another translate of the unit square. So the number of points in the unit square is the smallest integer n 3 such that both n 3 k 1 /k 3 and n 3 k 2 /k 3 are integer. We claim that this is k 3 / p. To prove this claim, consider the fraction k 1 /k 3 . Obviously n 3 k 1 /k 3 is integer if and only if n 3 is a multiple of k 3 / gcd(k 1 , k 3 ). Similarly n 3 k 2 /k 3 is integer if and only if n 3 is a multiple of k 3 / gcd(k 2 , k 3 ). So the number of points in the unit square is given by the least common multiple, #points in unit square = lcm k 3 gcd(k 1 , k 3 ) , k 3 gcd(k 2 , k 3 ) .
Integrating and recalling that θ α = O(M −1/2 ), the length of this piece is at most of order k F / √ M. The fourth piece has length of the same order as the third piece. We conclude that |e α,k | = O(k F M −1/2 ). Combining (6.4) with (6.3), and using u α (k) 2 ≥ N −δ to convert the additive error into a multiplicative error (the new contribution is the dominating error), we obtain (6.2).