Correlation Energy of a Weakly Interacting Fermi Gas

We derive rigorously the leading order of the correlation energy of a Fermi gas in a scaling regime of high density and weak interaction. The result verifies the prediction of the random-phase approximation. Our proof refines the method of collective bosonization in three dimensions. We approximately diagonalize an effective Hamiltonian describing approximately bosonic collective excitations around the Hartree-Fock state, while showing that gapless and non-collective excitations have only a negligible effect on the ground state energy.


Introduction and Main Result
In the last thirty years, the study of the quantum many-body problem has made tremendous progress, in particular for weakly interacting regimes where the validity of mean-field theory (or slightly more generally the quasi-free approximation) as an effective theory can be proved. In particular for bosonic systems the mathematical results have been very rich. Just to name some: in the beginning of the 2000s the Gross-Pitaevskii functional for the ground state energy of dilute Bose gases was derived [LSY00,LS02]. Later the time-dependent Gross-Pitaevskii equation was derived [ESY09,ESY10]; bounds on the rate of convergence were obtained by [BdS15,BS19]. In 2011 validity of the quasi-free approximation for the excitation spectrum of Bose gases in the mean-field regime was proven [Sei11], thus obtaining also the next-to-leading order of the ground state energy. In contrast, for dilute gases, the quasi-free approximation is not sufficient for obtaining the second order of the energy, although it can be used to derive the leading order with optimal rate of convergence [BBCS18,BBCS20,NNRT20]. Very recently, results going beyond the quasi-free approximation were obtained: the excitation spectrum for dilute Bose gases was derived [BBCS17,BBCS19]; the Lee-Huang-Yang formula for the second order of the ground state energy was proven [FS19]; and nonlinear classical Gibbs measures were derived as an approximation at positive temperature [LNR20,FKSS20].
Compared to the development in the theory of bosonic systems, the mathematical progress in the derivation of effective theories for fermionic systems has been lagging behind. For fermions, the mean-field or quasi-free theory leads to the Hartree-Fock approximation 1 which is widely used in computational physics and chemistry. The validity of the Hartree-Fock approximation was established for the ground state energy of Coulomb systems in a number of seminal works [FS90,Bac92,GS94]. Rigorous results taking this analysis beyond the quasi-free effective theory have been notably absent, except for a second-order bound [HPR20] on the many-body correction (called correlation energy) to the ground state energy, inspired by [Hai03,HHS05]. In the present paper we derive an optimal formula for the correlation energy.
Our proof is based on a non-perturbative framework which we started to develop in [BNPSS20]. The central concept of our approach is that the dominant degrees of freedom are particle-hole pairs which are delocalized over patches on the Fermi surface in momentum space in such a way that they behave approximately as quasi-free bosons. In [BNPSS20], by means of a trial state, we proved that the formula known as the random phase approximation (RPA) in physics is an upper bound to the correlation energy of a three-dimensional Fermi gas in the mean-field scaling regime (i. e., high density and weak interaction) with a regular interaction potential. In the present paper, we again start from the interacting many-body Hamiltonian and prove the matching lower bound, thus completely validating the random-1 In this paper we focus on a setting where the pairing density is not relevant. If the pairing density becomes important, one is lead to Hartree-Fock-Bogoliubov theory or the Bardeen-Cooper-Schrieffer (BCS) theory of superconductivity. Already the study of these quasi-free theories is very challenging. Recently the mathematical properties of BCS theory were extensively analyzed [FHSS16,HHSS08], and the Ginzburg-Landau theory of superconductivity was derived from BCS theory [FHSS12].
phase approximation for the ground state energy of the three-dimensional Fermi gas in the mean-field scaling regime.
The problem of calculating corrections to the Hartree-Fock approximation has a long history in theoretical physics. Already in the early days of quantum mechanics the computation of the correlation energy was attempted using second order perturbation theory [Wig34,Hei47] for a Fermi gas with Coulomb interaction (the electron gas); however, this approach leads to a logarithmically divergent expression due to the long range of the Coulomb potential. It was then noticed [Mac50] that perturbation theory with Coulomb potential becomes even more divergent at higher orders and suggested that a resummation might cure this problem. Then in their seminal work [BP53], Bohm and Pines developed the RPA: they argued that the Hamiltonian can be partially transformed into normal coordinates which describe collective oscillations screening the long-range of the Coulomb potential, and thus leading to a better behaved perturbative expansion. However, they had to introduce additional bosonic collective degrees of freedom by hand. This was somewhat clarified by [SBFB57,Saw57], who showed that the collective modes can be understood as a superposition of particle-hole pair excitations. The formulation of the RPA due to Sawada et al. has in fact been an important inspiration for our work. Ultimately it was discovered that the RPA can be seen as a systematic partial resummation of perturbation theory; following this line, one even obtains a more precise result [GB57]. These works have been very influential in the establishment of theoretical condensed matter physics.
The particle-hole pair bosonization of Sawada et al. found application in many settings, for example to describe nuclear rotation and calculate moments of inertia of atomic nuclei [MW69,AP75]. A bosonization method considering only the radial excitations of the Fermi surface was developed by [Lut79]; similar methods applied to systems with square Fermi surface [FSL99,SL05]. Later, the bosonization of collective excitations of the Fermi surface became an important tool in the context of renormalization group methods [BG90,HM93,HKMS94,Hal94,CF94b]. The collective aspect was further emphasized in the operatorformalism by [CF94a,CF95]. In the functional-integral formalism [FGM95,KHS95,Khv95,KC96,KS96,FG97] bosonization was established as a Hubbard-Stratonovich transformation. Despite this popularity, difficulties in judging the quality of the bosonic approximation have been pointed out [Kop97]: "For example, scattering processes that transfer momentum between different boxes on the Fermi surface and non-linear terms in the energy dispersion definitely give rise to corrections to the free-boson approximation for the Hamiltonian. The problem of calculating these corrections within the conventional operator approach seems to be very difficult." As far as the mean-field scaling regime is concerned, with our result we quantify such corrections as being of subleading order.
A different mathematical approach to the fermionic many-body problem has been developed by employing rigorous renormalization group methods to construct convergent perturbative expansions. This allowed the construction of Gibbs states or ground states for two main classes of interacting fermionic models.
The first class concerns models in the Luttinger liquid universality class (which was first proposed by Haldane [Hal80,Hal81]), such as interacting fermions or quantum spin chains in one dimension and some two-dimensional models. These models show universal properties agreeing with those of the Tomonaga-Luttinger model which is solvable in one dimension by an exact bosonization method [ML65]. These predictions of bosonization have been verified rigorously, starting from [BG90,BGPS94] to [BM01,BM04,BM11,BFM09a,BFM09b,AMP18,MP17,GMT20]; the proofs however are by detailed analysis of the fermionic theory instead of justifying directly the bosonization. One justification of a bosonization method was achieved by [BW20], showing equivalence of the massless sine-Gordon model for a special choice of the coupling constant and the massive Thirring model at the free fermion point.
The second class concerns fermions in two or three dimensions at low temperature. In this context, the use of sectors on the Fermi surface, very similar to the construction of patches we use, has been introduced in [FMRT92] for the program of proving existence of superconductivity [FMRT95]. There, bosonization was implemented as a Hubbard-Stratonovich transformation of sectorized collective excitations. While this ambitious program has not been completed, the sector method was later used to prove Fermi liquid behavior of fermions in two dimensions with uniformly convex Fermi surface at exponentially small positive temperatures (and non-Fermi liquid behavior for fermions with flat Fermi surface) [DR00a,DR00b,Riv02,AMR05a,AMR05b,BGM06]. It furthermore lead to a proof of convergence for the zero-temperature perturbation theory in a special two-dimensional fermionic model with an asymmetry condition of the Fermi surface; this is a series of eleven papers an overview of which is given in [FKT04]. Partial results have been obtained for fermions in three dimensions at positive temperature [DMR01]. We see our approach, while sharing the 'sectorization' or 'patches' concept, as providing a complementary point of view on related physical problems, based on different, non-perturbative ideas.
Finally, our result should also be contrasted to the study of two-dimensional models that have been constructed to be exactly bosonizable. This goes back to a proposal of [Mat87], who was motivated by high-temperature superconductivity. The analysis and variants of the model were developed by [Lan10a,Lan10b,dL10,dL12a,dL12b,dL14]. Furthermore, one may also see similarities (such as the limitation of the number of bosons that can occupy a single bosonic mode) in the bosonization concept to methods such as the Holstein-Primakoff map [CG12,CGS15,Ben17] for spin systems.

Many-Body Hamiltonian in the Mean-Field Regime
To describe N spinless fermionic particles on the torus T 3 := R 3 /(2πZ 3 ), the Hilbert space is the space of totally antisymmetric L 2 -functions of N variables in T 3 , The Hamiltonian is defined as the sum of Laplacians describing the kinetic energy 2 and a pair interaction, i. e., a multiplication operator defined using a function V : R 3 → R, (1. 2) The positive parameters and λ adjust the strength of the kinetic energy and interaction operator, respectively. In this paper, we assume the interaction potential V to be smooth. Thus the Hamiltonian is bounded from below and its self-adjointness follows from the Kato-Rellich theorem or using the Friedrichs extension. Here we are interested in the infimum of the spectrum (the ground state energy) E N := inf spec (H) = inf ψ, H N ψ : ψ ∈ L 2 a (T 3N ) , ψ L 2 = 1 . (1.3) In full generality, the computation of E N is clearly out of reach, simply because the model is too general: it may describe physical systems from superconductors to neutron stars. We thus need to be more specific and consider a particular case of the model, the most accessible case being a mean-field scaling regime: by considering a high density of particles we expect the leading order of the theory to be approximately described by an effective one-particle theory. We thus consider the limit of large particle number on the fixed-size torus. However, kinetic energy and interaction energy in typical states scale differently: the kinetic energy like N 5/3 due to the Pauli exclusion principle, the interaction energy like N 2 since there are N (N − 1)/2 interacting pairs. To have a chance of obtaining a non-trivial limit we choose to scale the parameters by 3 := N − 1 3 and λ := N −1 with N → ∞ . (1.4) With this choice the kinetic energy and the interaction energy in typical states close to the ground state have the same order of magnitude (order N ). This scaling regime couples a semiclassical scaling ( = N − 1 3 → 0) and a mean-field scaling (coupling constant λ = N −1 ). If the interaction vanishes, V = 0, then the ground state of the system is exactly given by the Slater determinant (i. e., antisymmetrized tensor product) of plane waves Here the momenta k of the plane waves are chosen such that the expectation value of the kinetic energy operator is minimized. The set of the corresponding momenta B F is called the Fermi ball. For simplicity we assume that the ball is completely filled, namely we set (1.6) and then define the particle number accordingly as N := |B F |. The limit of large particle number is then realized by considering k F → ∞. According to Gauss' classic counting argument we have 4 If the system is interacting, V = 0, the ground state becomes a complicated superposition of Slater determinants. Nevertheless, in Hartree-Fock theory one minimizes only over the set of all Slater determinants. In our setting, the Hartree-Fock energy is attained by the plane waves as in (1.5) and (1.6); see Appendix A for a proof 5 . Thus in order to gain non-trivial information about the interacting system one must go beyond the Hartree-Fock theory. Note that by the variational principle, the Hartree-Fock energy E HF N is an upper bound to the ground state energy E N . It follows from the analysis of [Bac92,GS94] that Hartree-Fock theory also provides a good lower bound to the ground state energy. In our setting, the approach of [Bac92,GS94] shows that (1.7) In particular, both E N and E HF N contain the Thomas-Fermi energy (in our scaling of order N ) and the Dirac correction, also know as the exchange term (in our scaling of order 1).
From the physical point of view, Slater determinants are as uncorrelated as fermionic states (which have to satisfy the Pauli principle) can be, in the sense that they are just antisymmetrized tensor products. Due to the presence of the interaction, the true ground state will contain non-trivial correlations (i. e., it will be a superposition of Slater determinants). Therefore Wigner [Wig34] called the difference E N − E HF N the correlation energy. According to (1.7) we know that the correlation energy in our scaling is of size o(1) as N → ∞. In the present paper, we are going to determine the leading order of the correlation energy. It is of order = N − 1 3 and given by the explicit formula predicted by the random-phase approximation, as obtained by [Mac50,GB57] based on a partial resummation of the perturbation series. We believe that our result is of importance as a rigorous step beyond mean-field theory into the world of interacting quantum systems. Our proof shows that the leading order of the correlation energy can be understood as the ground state energy of an effective quadratic Hamiltonian describing approximately bosonic collective excitations.

Main Result
We write the interaction potential via its Fourier coefficients Theorem 1.1 (Main Result). There exists a v 0 > 0 such that the following holds true. Assume thatV : Z 3 → R is compactly supported, non-negative, satisfiesV (k) =V (−k) for all k ∈ Z 3 , and V ℓ 1 < v 0 . For every k F > 0 let the particle number be N := |{k ∈ Z 3 : |k| ≤ k F }|. Then as k F → ∞, the ground state energy of the Hamiltonian H N in (1.2) with = N −1/3 and λ = N −1 is Here the correlation energy E RPA N is of order and, with κ = 3 4π 1 3 , given by The upper bound, E N ≤ E HF N +E RPA N +O( 1+ 1 9 ), was proved in [BNPSS20], even without smallness condition on the potential. In the present paper we prove the lower bound. The smallness condition is technical, and we expect that the lower bound is also true without this condition.
As already explained in [BNPSS20], by expanding (1.9) for smallV , we obtain (1.10) Thus we recover the result for the weak-coupling limit of [HPR20]. Moreover, the leading order of the correlation energy of the jellium model as given by Gell-Mann and Brueckner [GB57,Eq. (19)] (see also [SBFB57,Eq. (37)] and [Mac50]) when applied to the case of bounded compactly supportedV agrees with (1.9).
Although some tools from the earlier papers [BNPSS20, HPR20] will be useful for us, the proof of Theorem 1.1 requires several important new ingredients. Conceptually, our justification of the random phase approximation is based on the main input that at the energy scale of the correlation energy there are rather few excitations around the Fermi ball. For the upper bound in [BNPSS20], we consider a trial state whose number of excited particles is of order 1, allowing to control most of error terms easily. However, for the lower bound, the best available estimate for the number of excited particles in a ground state is O(N 1 3 ), thanks to a kinetic inequality from [HPR20]. This weaker input breaks most of the error estimates in the upper bound analysis [BNPSS20], and this is also the reason why a less precise lower bound was obtained in [HPR20]. In fact, using only similar bounds to [BNPSS20], we can at best show that the error terms are of the same order as the correlation energy. In the present paper, we go beyond that and complete the bosonization approach for the first time.
Let us quickly mention the most important new ingredients of the proof; a more detailed explanation will be given in Section 1.3.
• A refined estimate for the number of bosonic particles. In [BNPSS20], we control the number of bosonic particles by the fermionic number operator N . This is insufficient here, since the bound N ≤ CN 1 3 mentioned above is too weak. It is natural to try to bound all error terms using the kinetic operator H 0 , but a serious problem is that H 0 is not stable under the Bogoliubov transformation introduced later. Instead, we introduce the gapped number operator N δ in (5.6), which takes into account only the fermionic particles far from the Fermi surface and has a much better bound N δ ≤ CN δ with δ > 0 small. Thus in practice, using N δ is as good as using the kinetic operator H 0 in many estimates, with the advantage that N δ is stable under the Bogoliubov transformation (see Lemma 7.2). Since N δ involves the fermionic particles far from the Fermi surface, we have to control separately the contribution from particles close to the Fermi surface, using an improvement of the kinetic inequality in [HPR20] (see Lemma 4.2). The latter issue does not appear in [BNPSS20] since for an upper bound we can simply take a trial state without any contribution from particles close to the Fermi surface.
• A refined linearization of the kinetic energy. Similarly to [BNPSS20], the bosonization approach in the present paper is based on the construction of patches, which allows to linearize the fermionic kinetic operator H 0 and relates it to a bosonic operator D B . In [BNPSS20], we prove that the expectation value of H 0 − D B against a well-chosen trial state is small, which requires that the number of patches is M ≫ N 1 3 . In the present paper, we only control the commutator of H 0 − D B with bosonic pairs operators (see Lemma 8.2). This weaker bound is sufficient to ensure that H 0 − D B is essentially invariant under the Bogoliubov transformation (see Lemma 8.1), and importantly it requires only M ≫ N 2δ with δ > 0 small. The possibility of taking a much smaller M is crucial to bound all error terms caused by the Bogoliubov transformation.
• A refined control on the Bogoliubov kernel. Similarly to [BNPSS20], we will diagonalize the bosonizable part of the Hamiltonian by a Bogoliubov transformation. In [BNPSS20] we prove that the kernel of the Bogoliubov transformation is bounded uniformly in the Hilbert-Schmidt topology. This information is sufficient to estimate the error terms when N ∼ 1 (as in the trial state used for the upper bound), but it is insufficient now that there are potentially many excitations. In the present paper, we will derive an optimal bound for the matrix elements of the Bogoliubov kernel (see Lemma 6.1). The new estimate encodes that due to the geometry of the Fermi surface, the interaction energy vanishes at the same rate as the kinetic gap closes. This bound is crucial for improving error estimates involving the Bogoliubov transformation (see Lemma 7.1), especially for controlling the non-bosonizable terms.
• A subtle analysis of the non-bosonizable terms. As explained in [BNPSS20], the contribution of the non-bosonizable terms can be controlled by N −1 N 2 . The trial state in [BNPSS20] satisfies N 2 ∼ 1, and hence the non-bosonizable terms are much smaller than the correlation energy. In the present paper, we only know that N ≤ CN 1 3 , which is not enough to rule out the possibility that the non-bosonizable terms are comparable to the correlation energy. It turns out that controlling the non-bosonizable terms is highly nontrivial since these terms couple the bosonic degrees of freedom with the uncontrolled low-energy fermions. Our idea is to bound these terms from below by the kinetic operator. Technically, it is easy to establish the lower bound −C V ℓ 1 H 0 by completing a square. However, the difficulty here is that we have to validate this bound after implementing the Bogoliubov transformation (see Lemma 9.1). Handling the non-bosonizable terms requires a subtle analysis, using the refined estimate on the Bogoliubov kernel and the smallness assumption on the innteraction potential.
• Analysis of the diagonalized effective Hamiltonian. After implementing the Bogoliubov transformation, we obtain the desired correlation energy plus For the upper bound in [BNPSS20], the term K does not cause any problem since its expectation value in the vacuum state is 0. In the present paper, however, we have to bound it from below as an operator (see (10.16)). This task is nontrivial and we have to use again the refined estimate on the Bogoliubov kernel and the smallness assumption on the innteraction potential.
In summary, in the present paper we provide a complete and unified bosonization approach which can handle the states with a lot of low-energy excitations. We believe that our approach is of general interest and could be useful in other contexts.
We also see our result as a possible starting point for further investigations. For example, our bosonization method is general enough to derive a norm approximation on the manybody dynamics [BNPSS21]. Many questions remain; given the historical context of the problem, maybe most importantly the extension to Coulomb interaction, i. e., the electron gas, at least in some coupled mean-field/large-volume limit, requiring to optimize our bounds for extensivity. Of course, to reach this goal, we would first need to remove the smallpotential condition, which at the moment plays a central role. The next key task is to deal with the divergence at small k which appears in the higher orders of perturbation theory. As the small-k singularity is improved to a logarithmic singularity in (1.9), we believe that the bosonization method contains intrinsically the necessary "resummation" that is responsible for this screening of the potential. Of course, hard technical refinements, e. g., in optimizing the k-dependence of our estimates will be necessary. Another question concerns the low-energy spectrum of the Hamiltonian: it is believed that a collective plasmon mode can be isolated from the bosonized excitation spectrum, realizing a theory of electrons dressed by a cloud of excitations and supporting the screening concept. Within the bosonic approximation, the emergence of the plasmon mode has been discussed in [Ben19]. We expect that through a detailed analysis of the spectrum, the screening of the Coulomb potential, and the properties of the approximate ground state, the bosonization method may support the future development of a rigorous, non-perturbative Fermi liquid theory.
Beyond the mean-field scaling regime and the electron gas, there are other systems of physical interest: for example the helium isotope 3 He is fermionic and has short-range isotropic interactions. Furthermore, a high-density limit is particularly important in the description of atomic nuclei; the short-range interactions there are however spin-and isospindependent and anisotropic and furthermore have attractive parts. We conjecture that even with attractive potentials the RPA formula for the correlation energy applies as long as the logarithm in E RPA N does not become ill-defined. In our scaling, we do not see any contribution from the pairing density related to superconductivity, but one may expect that even if it was non-vanishing, its effect on the energy may be exponentially small. One may speculate that in an appropriate scaling limit the state of a superconductor might be described using a product of a particle-hole pair Bogoliubov transformation as we construct it for the normal phase, times a BCS-type fermionic Bogoliubov transformation.

Sketch of the Proof
We will use the Fock space formalism. Recall the fermionic Fock space The vector Ω := (1, 0, 0, . . .) ∈ F is called the vacuum. For ψ = (ψ (0) , ψ (1) , ψ (2) , . . .) ∈ F and f ∈ L 2 (T 3 ) we define the creation operators a * (f ) and the annihilation operators a(f ) by their actions Since we will work in the discrete momentum space (Fourier space) Z 3 , it is convenient to write These operators satisfy the canonical anticommutator relations (CAR) (1.12) The Hamiltonian H N in (1.2), originally defined on the N -particle sector L 2 a ((T 3 ) N ) ⊂ F, can be lifted to an operator on the fermionic Fock space as (1.13) Restricted to L 2 a ((T 3 ) N ) ⊂ F, H N agrees with the Hamiltonian as given in (1.2).
Correlation Hamiltonian. Now we separate the degrees of freedom described by the Slater determinant of plane waves in (1.5) from non-trivial quantum correlations. Recall the Fermi ball and its complement We define the particle-hole transformation R : F → F by (1.14) This map is well-defined since vectors of the form j a * k j Ω constitute a basis of F. Moreover, it is easy to verify that R = R * = R −1 ; in particular R is a unitary transformation. (In fact, R is an example of a fermionic Bogoliubov transformation.) In practice, the action of R on an operator on Fock space is easily computed using the rules (1.14) and the CAR (1.12). For example, consider the particle number operator (1.15) This identity implies that if Rψ is a N -particle state, then namely after the transformation R the number of particles is equal to the number of holes. The transformed Hamiltonian R * H N R has been computed in [BPS14a,BPS14b,BPS14c,BPS16,BSS18], in a slightly different way for mixed states in [BJPSS16], and in the context of the correlation energy in [HPR20,BNPSS20]. Let us therefore just give a short sketch of the transformation of the interaction term; the transformation of the kinetic term uses (1.16) but is otherwise very similar to (1.15). We start by using the CAR once to write The second summand of (1.17) equals − 1 2 k∈Z 3V (k), which contributes to the Hartree-Fock energy. For the transformation of the first summand one computes where we have introduced for any k ∈ Z 3 the particle-hole pair creation operator 6 b * (k) := and the non-bosonizable operator Note that D(k) * = D(−k) and D(0) * = N p − N h . Observing that the constant terms (i. e., not containing any creation or annihilation operator) contribute to the Hartree-Fock energy E HF N and collecting all quadratic terms in the operator X, we arrive at the result where the summands are given by Note that we have introduced the set Γ nor of all momenta k = (k 1 , k 2 , k 3 ) in Z 3 ∩ suppV satisfying k 3 > 0 or (k 3 = 0 and k 2 > 0) or (k 2 = k 3 = 0 and k 1 > 0) .
This set is chosen such that The term Q B is the bosonizable part of the interaction and contains only the pair operators.
The term E 1 is purely non-bosonizable and E 2 couples bosonizable and non-bosonizable excitations. Note that unlike the other terms E 1 is not normal-ordered (this choice is made so that we have E 1 ≥ 0); for this reason X and E 1 differ slightly from the expressions given in [BNPSS20]. Since X is quadratic in fermionic operators, it can be easily bounded using N /N , which will be seen to have expectation value much smaller than the order of E RPA N .
In [BNPSS20], it was proved that H 0 +Q B evaluated in a trial state of quasi-free particlehole pairs gives rise to E RPA N as an upper bound to the correlation energy. Accordingly, an important part of our task will be to prove that the contribution from E 1 + E 2 is negligible. (Whereas this was easily achieved for the upper bound using the explicit form of the trial state, for the lower bound it actually turns out to be a major challenge.) The rest of the paper is devoted to the proof of the inequality Thanks to (1.20) it directly implies the main result, the lower bound in Theorem 1.1. In the following we explain the key estimates in our proof. We use the symbol C for positive constants that may change from line to line, but are independent of N , , and M (the number of patches, to be introduced in (1.32)). The constants C may depend on the momentum k, which does not play a role ultimately since we only consider the finitely many k ∈ suppV , i. e., we can always take the maximum and so treat all constants as independent of k. We generally absorb any dependence onV in the constants C; we only write thê V -dependence of estimates explicitly where the smallness condition on V ℓ 1 plays a role.
A priori estimates. Similarly to [BNPSS20,HPR20], many approximations used in our approach are based on the idea that the relevant quantum states have only few excitations. For the upper bound in [BNPSS20], this fact is easily justified by the strong bound Ψ trial , N m Ψ trial ≤ C m (for all m ∈ N) for the trial state used to compute the expectation value of H corr . Compared to that bound, for the ground state we can only derive weaker estimates. In Lemma 2.4 we prove that the particle number operator can be controlled by the kinetic energy (i. e., the kinetic energy operator has a tiny gap, of order 2 ) by (1.23) To avoid the particle number operator, where possible we bound pair operators directly by the kinetic energy, using an inequality from [HPR20], (1.24) (The idea of directly using the kinetic energy for bounds has appeared already in [Hai03,HHS05] in the context of rigorous second order perturbation theory.) The bounds (1.24) and (1.23) imply the rough estimates in Lemma 2.1, as in [HPR20]: Together with an upper bound of order such as the trivial variational one obtained using the trial state Ω (corresponding to the Slater determinant of plane waves before the particlehole transformation), this implies that the ground state ψ gs of H corr , the minimizer of the expectation value on the left hand side of (1.22), satisfies For technical reasons, we will also need to control the expectation of higher powers of N , which does not follow from (1.24) and (1.23). To overcome this difficulty, in Lemma 3.1 we replace the ground state ψ gs by an approximate ground state Ψ satisfying This is achieved by using the technique of localizing particle number on Fock space, which goes back to Lieb and Solovej [LS01]. In the proof we will use the formulation from [LNSS15, Proposition 6.1]. It is the state Ψ that most of our subsequent analysis will be applied to.
Approximately bosonic creation operators. When applied to states with few excitations, the pair creation operators behave approximately as bosonic creation operators, namely we have to leading order the canonical commutator relations (CCR) (1.28) Unfortunately there is no expression for the kinetic energy H 0 in terms of the b ♮ (k)operators 7 . We take inspiration from the solution of the Luttinger model [ML65]: if the dispersion relation were linear, the b * (k) would create eigenvectors of H 0 . Since the dispersion relation 2 |k| 2 is not linear, we will linearize it locally. This is achieved by localizing the creation operators to patches on the Fermi surface. More precisely, we cut the shell of width RV := diam suppV around the Fermi surface into patches {B α } M α=1 . The construction of the patches is recalled in Section 4. As discussed in the introduction, under the name of "sectors", this idea has already been employed in the rigorous renormalization group context.
We consider the pair excitations supported in each patch 8 To normalize the constant in the approximate CCR, the normalization constant m α (k) should be chosen such that b * α (k)Ω = 1, namely This has the meaning of the number of particle-hole pairs (p, h) ∈ B c F × B F inside the patch B α with relative momentum p − h = k. However, this number may be zero! In fact, if k ·ω α < 0 withω α the unit vector pointing in the direction of the patch B α , then a simple geometric consideration shows that the summation domain in (1.30) and (1.29) is empty (the condition k ·ω α < 0 is incompatible with p ∈ B c F and p − k ∈ B F ). The same problem occurs for m 2 α (−k) = 0 if k ·ω α > 0. Furthermore, as suggested by [RS05, Chapters 8, 9.2.3, and 9.2.4] and [CS16], bosonization is expected to be a good approximation only if m α (k) is large. This cannot be ensured for patches where k ·ω α ≈ 0 (if we think of the direction of k as defining the north pole of the Fermi ball, these are the patches near the equator). However, the momentum k of such excitations is almost tangential to the Fermi surface and thus their energy is very low. In fact, we will be able to show that their contribution to the ground state energy is small and exclude them from the bosonization. To do so, we introduce a cut-off near the equator by defining the index subset (1.31) We will choose the cut-off parameter δ and the number of the patches M such that (1.32) 7 The symbol ♮ may stand both for " * " (adjoint in Fock space F) and for absence of " * "; we use it whenever the choice does not play a role. 8 Where confusion may arise, we use the notation p : p ∈ B c F ∩ Bα, p − k ∈ BF ∩ Bα in specifying the range of summation: here it is over all p ∈ Z 3 (but not over k) satisfying p ∈ B c F ∩ Bα and p − k ∈ BF ∩ Bα.
(Eventually we will choose M = N 4δ and δ = 1 24 .) Note that unlike [BNPSS20] where we require M ≫ N 1 3 , here we allow a much smaller value of M , which is important to control the error terms due to the Bogoliubov transformation introduced later.
Then by [BNPSS20, Proposition 3.1], the constant can be computed to be given by (1.33) (Heuristically, the reader may think of the number of particle-hole pairs as given by the surface area of the patch, 4πk 2 F /M , times the depth inside the Fermi ball that can be reached by h, namely |k ·ω α |. For this counting argument to be justifiable, the diameter of a patch on the Fermi surface may not become too large, requiring M ≫ N 2δ .) Consequently, the operators are well-defined and behave like bosonic creation operators, namely This is proven in Lemma 5.2, which is a slight extension of [BNPSS20, Lemma 4.1].
Gapped Number Operator. As we have seen in (1.26) we do not have strong control on the particle number operator, due to the possibility of having many small-energy excitations near the Fermi surface; a problem which in the beginning is avoided by directly using H 0 for bounds. However, a serious problem of using H 0 is that it is not stable under the Bogoliubov transformation that we will later introduce to approximately diagonalize the effective Hamiltonian. A way of overcoming this problem, and a key improvement compared to [BNPSS20] is that instead of using the full fermionic number operator N to control error terms, wherever possible we use only the gapped number operator which does not count low-energy excitations. Here we have used the dispersion relation e(i) = | 2 |i| 2 − κ 2 | introduced in (1.21), and due to the artificial gap we obtain in (1.26). Thus in practice, controlling error terms by using N δ is as good as using the kinetic operator H 0 . Furthermore, unlike H 0 , the gapped number operator N δ is stable under the Bogoliubov transformation (see Lemma 7.2). The main instance where N δ finds use is Lemma 5.3, where we bound the approximately bosonic number operator by the fermionic gapped number operator, (1.37) This improves [BNPSS20,Lemma 4.2], where N was used as the bound. The key insight leading to this improvement is that only bosonic pair operators with α ∈ I k are needed in the effective Hamiltonian (1.48) and the diagonalizing Bogoliubov transformation (1.49) to obtain the RPA energy (1.53). Since α ∈ I k means |k ·ω α | ≥ N −δ , the relative momentum k between particles p and holes h = p − k cannot be tangential to the Fermi surface; i. e., p or h (or both) has to lie above the gap e(i) ≥ 1 4 N − 1 3 −δ . This is the reason for the same parameter δ > 0 appearing both in the gapped number operator and in the equator cut-off (1.31). The new bound allows us to work with the bosonic pairs at the energy scale relevant for the result, while keeping them as much as possible separate from the low-energy excitations on whose number we do not have strong control.
In the next steps, we will write the correlation Hamiltonian H corr as a quadratic Hamiltonian in terms of the approximately bosonic operators c * α (k) and c α (k).
Bosonization of the interaction energy. By decomposing we can write the main interaction term as In the approximation (1.39) we have ignored all excitations outside the patches. It is justified in Lemma 4.1, where we prove that is similar to Q B + E 2 but contains only pair excitations in the patches. The proof of (1.40) requires an improved version of the kinetic inequality (1.24) (see Lemma 4.2). Thanks to (1.27), the error term in (1.40) does not contribute to the leading order of the correlation energy.
Note that the bound (1.40) is not necessary for the upper bound in [BNPSS20] because the trial state there is constructed to contain only pair excitations inside the patches, so that the expectation value of a pair not belonging completely to relevant patches is identically zero.
Bosonization of the kinetic energy. The bosonization of the fermionic kinetic energy is more complicated. A key observation is that if α ∈ I + k , then using the CAR (1.12) and linearizing the dispersion relation around k Fωα , we find For linearizing the dispersion relation we used the fact that for any Obviously the same holds if α ∈ I − k . Therefore, within commutators with pair operators, H 0 can be approximated as in the Luttinger model [ML65] by independent modes (i. e., harmonic oscillators) of energies κ2k ·ω α , namely (1.43) A key idea of our analysis is to justify (1.43) not by estimating the difference H 0 − D B directly, but rather by proving that it is essentially invariant under the approximate Bogoliubov transformation T which we will introduce below to diagonalize the quadratic bosonized Hamiltonian. More precisely, in Lemma 8.1 we show that with ψ := T * Ψ we have (1.45) Note that only here, in the first error summand, due to the linearization of H 0 , does M enter in the denominator. With ψ, N δ ψ ≤ CN δ (this bound is stable under the Bogoliubov transformation), we need to take M ≫ N 2δ . We will eventually choose M = N 4δ . The bound (1.44) is a crucial improvement over the linearization technique in [BNPSS20] which requires M ≫ N 1 3 , a condition that we cannot fulfill due to the second error summand in (1.45) (recall that in our approximate ground state we only know N ≤ CN 1 3 ). This improvement is achieved because in [BNPSS20] we unnecessarily linearized the expectation value of H 0 , whereas in the present paper we only linearize the necessary commutator with a pair operator c * α (k). In general, this new possibility of choosing a rather small M means that we gain flexibility in the technical steps because we can afford arbitrarily high powers of M as long as there is a negative power of N .
To apply (1.44), prior to using the Bogoliubov transformation, we will decompose Diagonalization of the bosonized Hamiltonian. By combining the approximation (1.39) and the operator +D B from (1.46), we find the effective quadratic bosonic Hamiltonian We have arrived at an effective quadratic Hamiltonian in terms of the approximately bosonic creation and annihilation operators. If the effective Hamiltonian were exactly bosonic, it could be diagonalized by a Bogoliubov transformation [Bog47]. While we do not have this tool available since our operators are not exactly bosonic, we can still use the explicit formula as for a true Bogoliubov transformation and define the unitary map where the real symmetric matrices K(k) are computed as in the exactly bosonic case. The choice of K(k) is the same as in [BNPSS20], following the abstract formulation given in [GS13]. We will quickly recall it in Section 6.
Another key aspect of our proof is the observation that the Bogoliubov kernel K(k) satisfies a refined entry-wise bound, for all k ∈ Γ nor and α, β ∈ I k .
(1.50) This is proved in Lemma 6.1. An important role in the proof is played by the fact that due to the geometry of the Fermi surface the normalization factor n α (k) 2 is proportional to |k ·ω α | (see (1.33)) which is also the linearization of the dispersion relation (see (1.43)), leading to cancellations. This means that as the gap of the kinetic energy closes when we consider particle-hole pairs that are almost tangential to the Fermi surface, the energy gain due to the interaction of such an excitation vanishes at the same rate. While the proof is essentially a detailed computation, it is crucial in controlling the non-bosonizable terms E 2 , see (9.5). In Lemma 7.1, we show that T acts approximately as a bosonic Bogoliubov transformation, namely where the error operators satisfy This is an improvement of [BNPSS20, Prop 4.4] in that we replaced some N by N δ . In order to put the error estimate (1.52) in good use, we need also that the particle number operators be stable under the approximate Bogoliubov transformation; this is the content of Lemma 7.2, based on a refinement of the Grönwall argument in [BPS14a,BNPSS20].
To diagonalize the bosonizable part of the Hamiltonian, we insert (1.51) in T * (D 0 + Q R B )T and write the transformed expression in Wick-normal order (with respect to the approximately bosonic operators). Up to a small error, this produces the ground state energy as desired, inf spec (1.53) Additionally we obtain the (up to a one-particle unitary) diagonalized quadratic Hamiltonian which in exact Bogoliubov theory would be the excitation spectrum; for some explicit matrix K(k) α,β it has the form As mentioned before, a further new idea of our proof is that we sacrifice the positive contribution of the excitation spectrum to control the negative term −D B left from the comparison of the fermionic and bosonic kinetic energy (1.44). In fact, we will prove that (see (10.16)) The proof of (1.54) is based on an explicit computation of the operator K(k) and the nice property (1.50) of the Bogoliubov kernel. When V ℓ 1 is small, the error term − V ℓ 1 H 0 in (1.54) is controlled by the positive term H 0 left from the comparison of the fermionic and bosonic kinetic energy (1.44).
Controlling non-bosonizable parts of the Hamiltonian. We still have to show that the non-bosonizable terms E 1 + E R 2 have only a small effect on the ground state energy. As explained in [BNPSS20] these error terms can be easily controlled by N 2 /N . In the trial state in [BNPSS20], the expectation value of N 2 is of order 1, so that N 2 /N is a small error. For the lower bound however, in the (approximate) ground state we only know that N is of order N 1 3 , so that N 2 /N would be of the same order = N − 1 3 as the correlation energy. Another way to see the difficulty in dealing with these terms is to observe that E R 2 couples the "good" bosonic degrees of freedom with "bad" uncontrolled fermions near the Fermi surface (the latter were, by construction, absent in the trial state used for the upper bound).
Thus the non-bosonizable parts require a subtle analysis. The following argument relies on the fact that E 1 is non-negative (asV (k) ≥ 0), which helps us in obtaining a lower bound for E R 2 . By the Cauchy-Schwarz inequality and the kinetic energy estimate (1.24), it is easy to see that Of course, this bound is useless because H 0 is of the same order as H corr . However, we are able to rescue this idea by proving a similar lower bound for the transformed operator T * (E 1 + E R 2 )T . In fact, in Lemma 9.1 we prove that, with ψ = T * Ψ, (1.55) The bound (1.55) is one of the most subtle estimates of our analysis and does not have any counterpart in the proof of the upper bound. Note that on the right hand side, once and only once the vector Ψ appears. The proof of this bound relies on the nice property (1.50) of the Bogoliubov kernel. The second and third summand on the right hand side of (1.55) are simply bounded by the a-priori estimates (1.27). Unlike CN − 1 2 ψ H 1/2 0 Ψ with its small pre-factor N − 1 2 the expectation value − V ℓ 1 ψ, H 0 ψ has to be controlled differently: since V ℓ 1 is assumed to be small, we can control it using the positive term H 0 left after the Bogoliubov transformation of the difference of fermionic and bosonic kinetic energy, see the right hand side of (1.44).
Eventually we will take the parameters M = N 4δ and δ = 1 24 , resulting in the total error O( 1+ 1 16 ) to the correlation energy. This completes the sketch of the proof.

Kinetic Estimates
Our goal is to derive some rough estimates on the correlation Hamiltonian H corr in (1.20) using the kinetic energy The main result of this section is the following estimate for H corr . The proof is based on the estimates of [HPR20, Section 4.1], which we shall review for the convenience of the reader.

Lemma 2.1 (A-Priori Estimates for the Correlation Hamiltonian).
There exists a v 0 > 0 such that forV : Z 3 → R compactly supported, non-negative, withV (k) =V (−k) for all k ∈ Z 3 , and V ℓ 1 < v 0 , the following holds true: Before coming to the proof of this lemma at the end of the section we need a couple of auxiliary lemmas. We start by recalling [HPR20, Lemma 4.7], which allows us to control the pair operators b(k) using the kinetic energy H 0 .

Lemma 2.2 (Kinetic Bound for Pair Operators).
For every k ∈ Z 3 and ψ ∈ F we have Since we will use this bound several times and we also need a modified version in Section 4, a simplified proof of (2.2) is provided in Appendix B for the reader's convenience.
As a consequence of Lemma 2.2 we can bound the pair operators by the kinetic energy.
k Figure 1: The grey area represents the set B c F ∩ (B F + k) ⊂ Z 3 , i. e., momenta which are affected by creation of particle-hole pairs with relative momentum k.
Proof. For every ψ ∈ F, from Lemma 2.2 and the triangle inequality we have The last estimate follows from a simple counting argument: the set B c F ∩ (B F + k) (sketched in grey in Fig. 1) is contained in the volume obtained by extending an area of size O(N 2 3 ) on the Fermi surface to a shell of thickness of order O(1). Therefore, this set contains no more than CN 2 3 points of Z 3 . Thus The following new bound is our main tool to control the number of excited fermions in the ground state. It is based on the observation that because of the discreteness of momentum space the kinetic energy operator has a tiny (i. e., order 2 ) gap.

Lemma 2.4 (Kinetic Bound for Number of Fermions
Moreover, using N p − N h ψ = 0 we find Thus we conclude As a consequence of Lemma 2.4 we can easily prove that the exchange term has a very small contribution to the ground state energy, namely bounded as in the following lemma.
Proof. This follows from the simple estimate 9 ±X ≤ C V ℓ 1 N /N and Lemma 2.4. Now we are ready to prove the main result of the section.
Proof of Lemma 2.1. By the Cauchy-Schwarz inequality and Lemma 2.3 Combining this with Lemma 2.5 we obtain The desired result follows from the smallness condition onV .

Localization of Particle Number
From the previous kinetic energy estimates it is possible to derive a-priori bounds for the ground states of H corr (see Lemma 3.2 below). For example, we can control the expectation value of the particle number N in a ground state using Lemma 2.4. To estimate also the expectation values of higher powers of N , inspired by [LS01] we use IMS localization with respect to particle number to construct an approximate ground state which has energy close to the ground state energy and at the same time fulfills the desired bounds for powers of the number operator. This is the main outcome of the section, given in the following lemma.
Lemma 3.1 (Localization in Particle Number). Let ψ gs be a ground state vector for H corr satisfying (N p − N h )ψ gs = 0, i. e., a minimizer of the left hand side of (1.22). Then there exists a normalized vector Ψ ∈ F such that , Ψ lives in the Fock space sectors with particle number less or equal to N 1 3 ) and As a first ingredient for the proof of Lemma 3.1, we have a-priori estimates based on Section 2. Note that for ψ = Ω we have Ω, H corr Ω = 0 ≤ C , and thus for any ground state ψ gs of H corr we have ψ gs , H corr ψ gs ≤ 0 by the variational principle. Thus we can apply the following lemma to ψ gs .
Lemma 3.2 (A-Priori Estimates). Let ψ ∈ F such that ψ, H corr ψ ≤ C . Then we have Proof. From the lower bound in Lemma 2.1 and the assumption of Lemma 3.2 we have This implies ψ,  Then The proof of Lemma 3.3 in [LNSS15] is based on the double commutator identity This is an analogue of the standard IMS localization formula in position space [Sim83] ∆ − f ∆f − g∆g = |∇f | 2 + |∇g| 2 , f 2 + g 2 = 1 .
Now we are ready to prove the main result of the section.
Proof of Lemma 3.1. We will apply Lemma 3.3 for A = H corr + . We can take ℓ = 4 as the Hamiltonian H corr changes particle number by at most ±4. By Lemma 2.1, we have because N commutes with H 0 and E 1 . Thus by Lemma 3.2 we get (with the constant C 0 > 0 fixed for reference in the further proof) Now applying Lemma 3.3, for all L ≥ 1 we can bound Combining with the variational principle (and using that ψ gs is a ground state for A) g L ψ gs , Ag L ψ gs ≥ g L ψ gs 2 ψ gs , Aψ gs and then together with f 2 L + g 2 L = 1 we obtain Consequently, (3.3) implies ψ gs , Aψ gs ≥ Ψ, AΨ − CN −1 with Ψ := f L ψ gs f L ψ gs .
Since A = H corr + and Ψ and ψ gs are normalized, the previous inequality is equivalent to Finally, from the definition of Ψ and 0 ≤ f L ≤ 1(N ≤ L) we get Ψ = 1(N ≤ L)Ψ. Since N and N p − N h commute, it follows also that (N p − N h )Ψ = 0.

Reduction to Pair Excitations on Patches
Our bosonization method is based on decomposing the pair excitations b * (k) into smaller pieces localized in disjoint patches on the Fermi surface. This procedure has been introduced in the context of the renormalization group [BG90,HM93,HKMS94,Hal94,CF94b]. In this section we will define the patches precisely. Moreover, we prove that the correlation Hamiltonian H corr can be properly represented by the pair excitations in patches up to an explicitly estimated error. First, we will decompose the Fermi surface into patches; a partition of the Fermi surface is sketched in Figure 2. Then we thicken the patches on the Fermi surface by allowing a relative momentum of order O(1). With a parameter δ ∈ (0, 1/6) that will eventually be optimized, the number M of patches will be chosen in the range (4.1) (We will eventually take δ = 1/24 and M = N 4δ .) The parameter δ is the same as will appear in the gapped number operator (5.6) and in the equator cut-off (4.6). The details of the construction are given in the following paragraphs, leading to the patch definition (4.5).
Patches on the unit sphere. We start our construction on the unit sphere (following, e. g., [Leo06]) and later scale up to the Fermi sphere of radius k F = κN 1 3 . We use standard spherical coordinates: forω ∈ S 2 , denote by θ the inclination (measured betweenω and e 3 = (0, 0, 1)) and by ϕ the azimuth (measured between e 1 = (1, 0, 0) and the projection ofω onto the plane perpendicular to e 3 ). We writeω(θ, ϕ) to specify a vector on the unit sphere by its inclination and azimuth.
We place a spherical cap centered at e 3 with opening angle ∆θ 0 := D/ √ M , D > 0 chosen such that the area of the cap is 4π/M . Then we decompose the remaining part of the northern half sphere into √ M /2 (rounded to the next integer) collars; the i-th collar consists of allω(θ, ϕ) with θ ∈ [θ i − ∆θ i , θ i + ∆θ i ) and arbitrary ϕ. The inclination of every collar extends over ∆θ i ∼ 1/ √ M ; the proportionality constant is adjusted so that the number of collars is integer.
Observe that the circle {ω(θ i , ϕ) : ϕ ∈ [0, 2π)} has circumference proportional to sin(θ i ); therefore we split the i-th collar into √ M sin(θ i ) (rounded to the next integer) patches. This implies that the j-th patch in the i-th collar covers an azimuth ϕ ∈ We fix the proportionality constants demanding that all patches have area 4π/M . SinceV is compactly supported, we can set RV := diam suppV .
Next we introduce corridors between the patches by redefining The constantD is chosen such that when scaled up to the Fermi sphere adjacent patches are separated by corridors of width strictly larger than 2RV .
We then define p 1 as the spherical cap with opening angle ∆ θ 0 centered at e 3 and the other M 2 − 1 patches as Extended patches around the Fermi sphere. Next we extend the patch decomposition radially, over the shell around the Fermi surface that is affected by the interaction with momenta k ∈ suppV . For any α ∈ {1, 2, . . . , M } we introduce the extended patch Removing patches near the equator. Now we assign a unit vectorω α to every patch on the northern half such that k Fωα ∈ P α ⊂ B α . Reflecting the construction to the southern half sphere, the vectorsω α inherit the reflection symmetrŷ ω α+M/2 = −ω α , ∀α = 1, . . . , M/2 .
For any momentum k ∈ Z 3 \ {0}, we are only interested in a subset of the constructed patches, as labeled by the index set (the parameter δ ∈ (0, 1/6) is the same as in (4.1)) Pair excitations near the equator k ·ω α ≈ 0 are almost tangential to the Fermi surface and cannot be treated with the bosonization technique. Fortunately their contribution to the energy turns out to be small. For any k ∈ Z 3 \ {0} we define the operators without the corridors and the excitations near the equator as (4.7) (4.8) The main result of this section is the following lemma, which takes care of estimating the difference to the original H corr .
Lemma 4.1 (Reduction to Pair Excitations on Patches). We have In order to prove Lemma 4.1 we will need the following modified version of the kinetic energy estimate in Lemma 2.2.
Lemma 4.2 (Kinetic Bound for Pairs near the Equator). Let δ ∈ (0, 77/624). Then for all k ∈ Z 3 we have The condition e(p) + e(p − k) ≤ 4N − 1 3 −δ implies that the momentum p is located near the equator of the Fermi surface (if we think of k as defining the direction of the north pole). This is easily seen expanding e(p) + e(p − k) = 2 2p · k − 2 k 2 ; because |p| ∼ N 1 3 we then have k ·p < CN −δ . The idea of Lemma 4.2 is that the estimate in Lemma 2.2 can be improved since here we sum only over a ribbon parallel to the equator on the Fermi surface. The ribbon covers a fraction of order N −δ of the Fermi surface, explaining the improvement from N to N 1−δ in (4.10). The proof of Lemma 4.2 can be found in Appendix B.
Proof of Lemma 4.1. For every k ∈ Z 3 , recall from (4.7) that i. e., the summation is over all p in the set Thus the error term compared to the full pair operator becomes In words: U consists of all those particle momenta p ∈ B c F that correspond to a kinematically permitted particle-hole pair (i. e., h := p−k is inside the Fermi ball) but do not belong to any included patch. Thus in (4.11) we sum over pairs belonging to a corridor between patches or to the cut-off equator region (i. e., they belong to a patch B α but α ∈ I k ). To estimate r R (k), let ψ ∈ F. By the triangle inequality we can bound To make contact with our earlier heuristic explanations, note that Y corresponds to the region (both patches and corridors between patches) of the Fermi surface near the equator, i. e., where k ·p ≈ 0, and U \ Y to the corridors between the patches on the rest of the Fermi surface. This may be seen by expanding e(p) + e(p − k) = 2 2p · k − 2 k 2 ; because p is close to the Fermi surface we have |p| ∼ N To estimate the first factor, note that p ∈ Y implies e(p) (4.14) To estimate the second factor, note that the number of lattice points of Z 3 in U \ Y can be bounded by the number of lattice points in the corridors between all patches (4.15) (Here we used that the length of a corridor surrounding a patch is of order N Putting (4.12) and (4.16) together we arrive at (4.17) Now we turn to the Hamiltonian. Expanding b(k) = b R (k) + r R (k) in the formula for H corr in (1.20), we get It is easy to see that the kinetic bounds in Lemma 2.3 hold also with b(k) replaced by b R (k). Therefore, in combination with (4.17), by the Cauchy-Schwarz inequality we get

Approximately Bosonic Creation Operators
Let {B α } M α=1 be the patches constructed as in the previous section and let k Fωα ∈ B α . Recall that for every k ∈ Z 3 \ {0} we have defined By the reflection symmetry we decompose further I k := I − k ∪ I + k where Then we define the local pair excitations {c * α (k)} α∈I k by Thus, for all k ∈ Γ nor , the operator b R (k) in Lemma 4.1 can be decomposed as The quantity n α (k) 2 counts the number of particle-hole pairs of relative momentum k belonging to patch B α . We cite the result from [BNPSS20, Proposition 3.1]. The condition M ≫ N 1 3 in [BNPSS20] is not necessary, M ≫ N 2δ is sufficient.
Then for all k ∈ Γ nor and α ∈ I k , we have (1)) .
(This lemma may heuristically be understood as follows: the surface area covered by the patch is 4πk 2 F /M . We think of the particle-hole pairs (p, h) ∈ B c F × B F with p − h = k as organized on lines through the lattice Z 3 parallel to k. To count how many lines intersect the Fermi surface we project the patch onto a plane, picking up the factor |k ·ω α |. The condition M ≫ N 2δ ensures that patches remain so small that even near their boundaries the assumption |k ·ω α | ≥ N −δ implies that k points from inside the Fermi ball to outside. The error term arises since we may miscount a pair when one of its components falls into the surrounding corridor, so it is proportional to the circumference of the patch. We write only 1 + o(1) because the precise estimate is not important for us.) A crucial idea of our analysis is that the local pair excitation operators {c * α (k)} α∈I k behave similarly to bosonic creation operators. More precisely, we have approximate canonical commutator relations as given by the following lemma. The lemma is a simple consequence of [BNPSS20, Lemma 4.1], but since it is a key idea of the collective bosonization concept we provide a self-contained proof again.
Lemma 5.2 (Approximate CCR). Let k ∈ Γ nor and l ∈ Γ nor . Let α ∈ I k and β ∈ I l . The operators c α (k) and c * β (l) satisfy the following commutator relations: The operator E α (k, l) = E α (l, k) * commutes with N and, for any γ ∈ I k ∩ I l , satisfies the operator inequalities Furthermore, for all ψ ∈ F we have Proof. By the CAR (1.12) it is easy to see that Moreover, if α = β, then B α ∩ B β = ∅, and hence [c α (k), c * β (l)] = 0. Now let us focus on the case β = α and compute [c α (k), c * α (l)]. We only consider the case α ∈ I + k ∩ I + l (the other cases are simple variations). By the CAR (1.12) it is straightforward to compute that

Let us focus on |E
(1) α (k, l)| 2 ; the term |E (2) α (k, l)| 2 can be bounded similarly, and the mixed terms are controlled by the Cauchy-Schwarz inequality. Symmetrizing, we find The first bound in (5.3) (without the summation) is a trivial consequence. The bound (5.4) follows from (5.3) and the Cauchy-Schwarz inequality.
In the following, we show that the approximately bosonic number operator can be controlled by a fermionic number operator. One of our main technical improvements compared to [BNPSS20, Lemma 4.2] is that instead of using the full N we use only the gapped number operator N δ := The parameter δ > 0 is the same as that in the cut-off parameter N −δ defining the index set I k of relevant patches. Compared to N as in Lemma 2.4, the gain in using the gapped number operator N δ is that it can be controlled by Ψ, N δ Ψ ≤ CN δ in an approximate ground state Ψ.
Lemma 5.3 (Bosonic Number Operator). For all k ∈ Γ nor we have Consequently, for all ψ ∈ F, Proof. First we take α ∈ I + k (the case α ∈ I − k is similar). For any ψ ∈ F, by the triangle and Cauchy-Schwarz inequalities, Using the definition of n α (k) and the fermionic property a i op ≤ 1 we deduce that and the condition α ∈ I + k ensures that k ·ω α ≥ N −δ ; hence By the same method we obtain the same bound when α ∈ I − k . Thus by the definition of the gapped number operator we can bound α∈I k c α (k)ψ 2 ≤ α∈I k q∈Bα : e(q)≥ 1 4 N − 1 which is equivalent to (5.7). Moreover, it can be seen from E α (k, k) ≤ 0 in (5.5) that By the Cauchy-Schwarz inequality we obtain Using that [c α (k), c * β (k)] vanishes for α = β, by (5.10) we obtain This concludes the proof.
For further application, it is useful to extend the definition of c α (k) to include a weight function. Given g : Z 3 × Z 3 → R, we define weighted pair operators The weighted pair operators satisfy similar bounds as the simple pair operators.
Lemma 5.4 (Weighted Pair Operators). For all k ∈ Γ nor and ψ ∈ F we have and for all f ∈ ℓ 2 (I k ) also Proof. Compared to Lemma 5.3 the only non-trivial modification is that we now use where before we used [c α (k), c * α (k)] ≤ 1. We omit the further details.

Bogoliubov Kernel
In this section we study the Hamiltonian h eff (k) introduced in (1.48). Let us usek := k/|k|. It is convenient to write where D(k), W (k), and W (k) are I k × I k real symmetric matrices with elements If c * α (k) were exactly bosonic creation operators, then the quadratic Hamiltonian h eff (k) could be diagonalized by a Bogoliubov transformation of the form (Formulas (6.9) and (6.10) below show that the square roots here involve only positive matrices, so that E(k) and S 1 (k) are well-defined.) Then the Bogoliubov kernel K(k) is The following lemma provides strong estimates for the matrix elements of K(k). While in most parts the simpler bound |K(k) α,β | ≤ CV (k)/M is sufficient for our analysis, the sharp bound of the lemma is crucial for controlling the non-bosonizable terms E 2 ; see (9.5). The simpler bound can be proved without smallness assumption on the potential; the sharp bound requires the smallness because we prove it using a power series expansion in (6.14).
The proof of the lemma is a lengthy but mostly straightforward computation. A key role in the proof is played by the fact that the factor k ·ω α arising from the linearized kinetic energy (through the matrix D(k)) appears also for the independent geometric reason of Lemma 5.1 in the normalization factor n 2 α (k) in the bosonized interaction, i. e., in W (k) and W (k). The geometry of the Fermi surface implies that the excitation-creating operators of the interaction vanish at the same rate u α = |k ·ω α | as the leading order of their kinetic energy when we move toward "tangential" excitations. Lemma 6.1 (Bogoliubov Kernel). Let K(k) be defined in (6.6). If V ℓ ∞ is sufficiently small, then for any k ∈ Γ nor , K(k) is a real symmetric matrix satisfying Proof. In the following, we frequently drop the k-dependence from the notation for simplicity. Let us introduce Recall that κ = ( 3 4π ) 1 3 . By definition of the index set I k , we have 1 ≥ u 2 α ≥ CN −δ for all α ∈ I k . Moreover, Lemma 5.1 implies the important relation v α ≃ u α 4π/M (up to a lower order error term) between the normalization factor n α (k) and the linearization of the kinetic energy, which will be used repeatedly for cancellations.
As in [GS13], denoting by I the I × I-identity matrix, we define Obviously U ⊺ = U = U −1 and it simultaneously block-diagonalizes D + W + W and D + W − W , namely Recall the matrix applying the block-diagonalization we find Now defining the matrix L := S 1 S ⊺ 1 − I with S 1 as in (6.5), we find (6.11) We are going to prove (6.12) In particular, thanks to the assumption of V ℓ ∞ being small, also, e. g., the Hilbert-Schmidt norm of L can be assumed to be uniformly smaller than 1, which is sufficient to ensure convergence of the matrix power series (6.11).
From L to K. Let us deduce (6.7) by assuming (6.12). Spelling out the matrix product The same holds with exchanged roles of u α and u β . From (6.11) we obtain (6.14) The convergence of the series of the logarithm follows from the assumption that |V (k)| is small. This implies (6.7), thanks to Lemma 5.1.
Bound for L. We now prove (6.12). The 2I × 2I-matrix L can be block-diagonalized using the orthogonal matrix U from (6.8), i. e., with I × I-blocks (6.16) Inverting (6.15) we obtain (6.17) Thus, with the matrix indices on L 1 and L 2 to be read as α mod I and β mod I, we have Estimating L 1 . In the square brackets in the definition (6.16) of L 1 we have a rank-one perturbation of a diagonal matrix, namely defining the vectorṽ := d 1/2 v we have Using the Sherman-Morrison formula for the resolvent of any invertible matrix A with rankone perturbation given by vectors x, y such that 1 + y, A −1 x = 0, i. e., we explicitly calculate the resolvent of (6.19) and then enter with it in the integral representation A −1/2 = 2 π ∞ 0 dλ(A + λ 2 ) −1 (for any positive matrix A), with the result that The function f (λ) here is given by Multiplying from both sides by d 1/2 , and subtracting the identity matrix, we obtain Recall v α ≤ u α CM − 1 2 and observe that 1/f (λ) ≤ 1, so we get Estimating L 2 . Recall that v| is a rank-one perturbation of (d + 2b) 2 , so by employing again the integral expansion as used for L 1 we obtain Now consider the functionf (λ) := 1 − 2g v, d+2b (d+2b) 2 +λ 2 v , the inverse of which is appearing in the integral. For λ = 0, using the Sherman-Morrison formula (6.20), this time expanding d + 2b around d, we find is uniformly bounded we havef (0) > 0, strictly and uniformly in k and M . Furthermore λ → v, d+2b (d+2b) 2 +λ 2 v is monotone decreasing for all λ ≥ 0, thusf (λ) ≥f (0) for all λ ≥ 0. We expand d+2b (d+2b) 2 +λ 2 = (d+ 2b)(d+ 2b+ iλ) −1 (d+ 2b− iλ) −1 and use the Sherman-Morrison formula separately for both the resolvents (d + 2b ± iλ) −1 . Using the Dirac ket notation, this results in where for the last line we used b = g|v v|. Keeping the 1 from the big square bracket separate while combining the other terms, this simplifies to (6.26) The vector d+2b (d+2b) 2 +λ 2 v has real elements since v is a real vector and d+2b (d+2b) 2 +λ 2 is a real matrix. However, (6.25) and (6.26) are not explicitly real (by choosing an order out of the two options (d + 2b) 2 + λ 2 = (d + 2b + iλ) −1 (d + 2b − iλ) −1 and (d + 2b) 2 + λ 2 = (d + 2b − iλ) −1 (d + 2b + iλ) −1 we have broken this symmetry). To make the expression explicitly real again, let us add the complex conjugate, yielding d 2 +λ 2 v (and dividing by 2) this becomes (6.29) In the denominator 1+2g v, 2d d 2 +λ 2 v +4g 2 | v, (d+iλ) −1 v | 2 ≥ 1; hence (6.29) can be dropped for an upper bound. Inserting (6.28) in (6.23) we obtain First summand times first summand in (6.30). Consider the product of the first summands from inside each of the absolute values. This is of the same type as (6.21), so Second summand times second summand in (6.30). We have 1 2dλ ≥ 1 (6.31) Assuming without loss of generality u α ≥ u β , by dropping a non-negative u 4 β from the numerator we find Mixed term in (6.30). We turn to the remaining two terms obtained from the product in the integral, for which we have to estimate . (6.32) The integral is .
Without loss of generality u α ≤ u β ; then In the last step we used that both fractions in the sum are bounded by 1. This completes the proof of Lemma 6.1.

Approximate Bogoliubov Transformation
Given the Bogoliubov kernel K(k) in (6.6), for any λ ∈ R we define a unitary transformation T λ : F → F by The following lemma is the main result of this section, showing that T λ acts approximately as a bosonic Bogoliubov transformation.
Here the matrices cosh(K(k)) and sinh(K(k)) are defined by functional calculus, or more explicitly by the series As a consequence of Lemma 6.1, by a calculation similar to that in (6.14) using the power series, we can verify that, for all λ ∈ [−1, 1] and k ∈ Γ nor , In order to prove Lemma 7.1, we need to show that the fermion number is stable under the approximate Bogoliubov transformation. This is the content of the next lemma.
Proof of (7.4). For any function Recall from Lemma 6.1 that |K(k) α,β | ≤ C/M , so that (5.9) from Lemma 5.3 implies Furthermore, by (5.8) from Lemma 5.3 we have Hence for all vectors X, Y ∈ F we have Using this bound we find In the last estimate we used that N δ commutes with N , that 0 ≤ N δ ≤ N , and that The estimate (7.6) closes a Grönwall bound for T λ ψ, (N + 4) m T λ ψ and (7.4) follows.
Proof of (7.5). In view of definition (5.11) we can write for some weight function g with g(p, k) ∈ {0, 1, 2}. Similarly to the above calculation Similar to the bounds used above, Lemma 5.4 provides us with We then get Thus (7.5) follows by Grönwall's inequality.
which follows from Lemma 5.2. When iterating the expansion, the term K(l) γ,α c * α in the commutator formula gives rise to the power series of the sinh and cosh; the rest of (7.8) will be considered as an error term. The conclusion is that (7.2) holds true, with the error term summed over all iteration steps being for any n 0 ≥ 1. (The two terms on the last line are the non-explicit integral term from the Duhamel formula and the powers missing to complete the series of the sinh and cosh from n = n 0 to +∞.) Here in every summand X ♮ means X for n even and X * for n odd.
Recall that as a consequence of Lemma 6.1 we have Therefore, by the triangle inequality and summing over γ, Using Lemma 5.3 we can bound the operators in the last two lines; then taking n 0 → ∞ we obtain (7.10) It remains to bound (7.10) term by term. For the first term, using Lemma 5.2, Lemma 5.3 and Lemma 7.2 we can estimate which is just what we bounded in (7.11). If β = α we have the decomposition (This decomposition can be verified by noticing that (7.12) consists of two commutators, recalling that E α (k, l) = [c α (k), c * α (l)] − δ k,l , and using the Jacobi identity.) With this decomposition we proceed to Here we have used the first bound from Lemma 5.2 (without the summation), Lemma 5.3 and then proceeded similarly to (7.11). The second term of (7.12) is treated similarly. For the last term of (7.12), we bound c * α (l)ξ ≤ (N δ + 1) 1/2 ξ (which follows from (5.9) with f = δ α ) and then use the fact that E α (k, k) commutes with N δ (this is clear from (5.5) with k = l) to get In the last estimate we used Lemma 7.2 again. Therefore (7.14) By the same argument, we obtain similar bounds for E * α (k, l)c β (k) and c β (k)E * α (k, l) (recall that E * α (k, l) = E α (l, k) to reduce to the previous estimates). Collecting the estimates for the terms of (7.10), this completes the proof of Lemma 7.1.

Linearization of the Kinetic Energy
In this section we prove that the fermionic kinetic energy H 0 behaves similarly to the bosonized kinetic energy Lemma 8.1 (Comparing Fermionic and Bosonized Kinetic Energy). Let T λ be the approximate Bogoliubov transformation defined in (7.1). Then for all ψ ∈ F we have The first of the two error terms in the lemma is the main reason for taking M large. More precisely, the first error (involving M − 1 2 ) comes from linearizing the fermionic kinetic energy on each patch, for which the size of the patch must not be too large. The linearization error of the fermionic kinetic energy is estimated in the following lemma.
Lemma 8.2 (Linearization of Kinetic Energy). For all k ∈ Γ nor and all α ∈ I k we have where the error term satisfies, for all ψ ∈ F, All bounds here gain a factor M − 1 2 compared to the bounds in Lemma 5.3.
Proof. Let us consider α ∈ I + k , the other case is similar. By the CAR (1.12) we have [a * i a i , a * p ] = δ i,p a * p . Therefore where, using definition (5.11), we can write E lin α (k) = c g α (k) with the weight function The claimed error estimates now follow from Lemma 5.4.
Proof of Lemma 8.1. Recall that T 0 = I. We will show that for all λ ∈ [0, 1] we have The claim then follows by integration over λ ∈ [0, 1].
Consider the approximately bosonic operator D B . We have where, with an indicator function χ(α ∈ I l ), the error term is For all ψ ∈ F, by the non-summed first bound from Lemma 5.2 and by Lemma 5.3 Now using we can decompose and estimate It remains to estimate the right side of (8.8), term by term. For I 1 , since |K(k) α,β | ≤ CM −1 , using Lemma 5.3, Lemma 8.2 and Lemma 7.2 we have We can bound I 2 similarly to I 1 , simply exchanging the roles of the E lin -operator with the c-operator. For I 3 using (8.6) instead of Lemma 8.2 we have For I 4 , we split the sum over α, β ∈ I k into two parts. If α = β, then and this part can be treated similarly to I 3 . When α = β, the corresponding contribution is Here we have inserted the definition (8.5) and used (7.12) to obtain The advantage of the last expression is that E α (l, l) commutes with N δ . Therefore, using the Cauchy-Schwarz inequality together with Lemma 5.2, Lemma 5.3 and Lemma 7.2 we obtain Adding up the contributions to (8.8) we arrive at the claimed bound (8.3).

Controlling Non-Bosonizable Terms
In this section we consider the non-bosonizable terms E 1 + E 2 . Lemma 4.1 allows us to replace E 2 by E R 2 , hence we estimate E 1 + E R 2 . Recall that after the expansion into patches (5.1) we have Now we prove that after the Bogoliubov transformation, the non-bosonizable terms can be bounded from below using the fermionic kinetic energy. This step relies on the assumption V (k) ≥ 0.
Lemma 9.1 (Non-Bosonizable Terms). Let T λ be the approximate Bogoliubov transformation defined in (7.1). Then for all λ ∈ [−1, 1] and for all ψ ∈ F we have Note that in the lemma H 0 acts once on ψ and once on T λ ψ. The sharp bound on the matrix elements of the Bogoliubov kernel from Lemma 6.1 is crucial to the proof of this lemma.
The smallness assumption onV is important to control the term −C V ℓ 1 H 0 appearing on the right hand side in this lemma.
Proof. Take k, l ∈ Γ nor . We denote D(l) := T * λ D(l)T λ . By Lemma 7.1 we can write Therefore, by the triangle inequality, the contribution of E R 2 to (9.2) can be bounded by We proceed to bound the right side of (9.4) term by term. The first term can be bounded using n α (k) ≤ CN 1 3 M − 1 2 and Lemma 7.1:

Using (7.3) we have
α∈I k n α (k) |cosh(λK(k)) α,β | + |sinh(λK(k)) α,β | ≤ Cn β (k) . (9.5) Consequently, using Lemma 2.2 we get The third term is more difficult. We have The term I 4 can be bounded again by Lemma 2.2 as The commutator in I 5 can be computed by undoing the approximate Bogoliubov transformation, T λ , and using (7.2) with T λ = T * −λ , i. e., Therefore, by the triangle inequality and (9.5) we can decompose For I 6 , we simply expand the commutator and use Lemma 7.1 similarly as done for I 1 to get Here we used D(l)T λ ψ = D(l)ψ as T λ is unitary.
For I 7 we compute the commutator explicitly. We decompose the operator D(l) * as In the case α ∈ I + k we can then compute By the Cauchy-Schwarz inequality and the kinetic energy estimate in Lemma 2.2 we obtain For α ∈ I − k and D h (l) we get similar estimates. The commutator in I 8 can be bounded exactly the same way, using D(l) * = D(−l). Thus Collecting all estimates for I 1 , . . . , I 8 we conclude from (9.4) that The bound (9.6) holds true for all k, l ∈ Γ nor . In particular, by the Cauchy-Schwarz inequality we deduce that This concludes the proof of (9.2).
We now focus on the approximately bosonic Hamiltonian D B + Q R B , with D B and Q R B as defined in (8.1) and (4.8). With h eff (k) being the effective Hamiltonian introduced in (6.1), we can write The main result of this section is the following lemma in which we approximately diagonalize the effective Hamiltonian, extract its ground state energy E RPA N , and then bound the excitation spectrum below by D B − C V ℓ 1 H 0 .
Lemma 10.1 (Diagonalization of Bosonized Hamiltonian). Let T 1 be the approximate Bogoliubov transformation defined in (7.1). For all normalized ψ ∈ F we have where E RPA N is the RPA correlation energy defined in (1.9).
The smallness assumption onV is important to control the term −C V ℓ 1 H 0 appearing on the right hand side in this lemma.
Proof. Error Terms. Recall that K(k) is a real symmetric matrix, and hence also sinh(K(k)) and cosh(K(k) are real symmetric matrices. Defining we have, according to Lemma 7.1, In the first step we are going to control the contribution of E α (1, k). By (7.3) we have |cosh(K(k)) α,β − δ α,β | + |sinh(K(k)) α,β | ≤ C M and thus using Lemma 5.3 (to treat the contribution of δ α,β , recall that c α (k)ψ ≤ N 1/2 δ ψ for all ψ ∈ F follows from the first bound in the lemma) we get The main contribution is To bound the error terms in (10.5), first observe that Now using the Cauchy-Schwarz inequality together with (10.4) and Lemma 7.1 we get (10.6) In this calculation we used the crude bound Bosonic Terms. Now we compute h diag eff (k) by inserting the transformation (10.2). In this step let us suppress the k-dependence in the notation. We havẽ where we used the approximate CCR (5.2) to achieve bosonic Wick-normal order. Moreover, using [c * α , c * β ] = [c α , c β ] = 0 we symmetrize the coefficients of c * α c * β and c α c β ; thus Adding both terms, we thus have To simplify (10.7) further, recall from (6.5) that and set S 2 := (S ⊺ 1 ) −1 . We use the polar decomposition to write S ⊺ 1 = O|S ⊺ 1 | with an orthogonal matrix O. Inserting our choice K = log|S ⊺ 1 | into the exponentials defining the cosh and the sinh, and noting that each exponential by itself is a symmetric matrix, we obtain Note that since K is symmetric, it cannot equal the non-symmetric matrix 1 2 (S 1 + S 2 ). Thus the inclusion of the unitary operator O in (10.8) is inevitable. However, O corresponds to a change of basis in the one-boson Hilbert space, which, at least in the bosonic approximation, does not change the energy of the many-body state.
Ground state energy. As in [BNPSS20, Proof of Theorem 2.1 and Appendix A.2], with the matrix E defined in (6.4), we have Using the fact that cosh(K) and sinh(K) are symmetric matrices, the constant term of (10.7) thus simplifies to 1 2 α∈I k 2 sinh(K) D + W sinh(K) + sinh(K) W cosh(K) + cosh(K) W sinh(K) This is the term giving rise to the ground state energy we aim to derive.
To estimate the error term proportional to E α (k, k) in (10.7), note first that From (5.5) one easily derives the bound This bound in combination with (10.9) and the fact that, due to the equator cut-off |k ·ω α | ≥ N −δ in the definition of the index set I k , Vanishing of the off-diagonal terms. Next we show that the terms in h diag eff (k) proportional to (c * α c * β + c β c α ) vanish, as intended with the Bogoliubov transformation. Indeed cosh(K)(D + W ) sinh(K) + sinh(K)(D + W ) cosh(K) + cosh(K)W cosh(K) + sinh(K)W sinh(K) Summarizing we get the lower bound with the matrix 10 K(k) := cosh(K(k)) D(k) + W (k) cosh(K(k)) + sinh(K(k)) D(k) + W (k) sinh(K(k)) + cosh(K(k)) W (k) sinh(K(k)) + sinh(K(k)) W (k) cosh(K(k)) .
10 A computation shows that K(k) = O(k) ⊺ E(k)O(k) but we are not going to use this formula.
Bogoliubov-diagonalized effective Hamiltonian. Summing (10.12) over k ∈ Γ nor and including the pre-factors as in (10.1) (in particular an ) we conclude that The error terms on the last two lines of (10.13) come from (10.6) and (10.10). In [BNPSS20, Eq. (5.15)] we already showed that the constant term on the right hand side of (10.13) gives rise to the correlation energy E RPA N we aimed to derive, (10.14) This is the correlation energy as given in the main theorem in (1.9).
Controlling −D B by the diagonalized effective Hamiltonian. To make use of the positive contribution of α,β∈I k K(k) α,β c * α (k)c β (k), it is convenient to subtract D(k). We can then expand cosh(K(k)) = (cosh(K(k))−I)+I; the term in K(k) where D(k) is multiplied from both sides by the identity cancels with the explicitly subtracted D(k). For the remaining terms we can use (7.3) and |W (k) α,β | ≤ C M u α (k)u β (k) to get (10.15) Thus, by Lemma 2.2 and recalling u α (k) ≤ CM 1 2 N − 1 3 n α (k), we conclude that for all ξ ∈ F ξ, Summing over k ∈ Γ nor we conclude that Inserting the last bound together with (10.14) in (10.13), the proof is complete.
Let Ψ ∈ F be the approximate ground state constructed by the particle number localization, Lemma 3.1, from some exact ground state ψ gs of H corr (i. e., from a minimizer of the expectation value on the left hand side of (1.22)). By the localization we have for all m ≥ 0. Here we have used N δ ≤ CN 1 3 +δ H 0 to estimate the gapped number operator. Let T 1 be the approximate Bogoliubov transformation defined in (7.1) with the Bogoliubov kernel K(k) from (6.6). Since T 1 is unitary, we can define ψ by setting Ψ = T 1 ψ .
From (11.1) and Lemma 7.2 we also have Now we collect the main bounds and estimate the error terms using (11.1), (11.2), and (11.3). First, by Lemma 2.5 we can bound the quadratic operator X below by (11.4) By Lemma 4.1, we have (11.5) Next we can use Lemma 8.1 to deduce that (11.6) and, because of the estimate for the matrix elements of the Bogoliubov kernel derived in Lemma 6.1, Lemma 9.1 shows that . (11.8) To conclude, we sum all error bounds from (11.4)-(11.8). The quantities from the error terms suggest us to take With this choice, collecting all of (11.4)-(11.8), we conclude that The contribution of 1 − C V ℓ 1 ψ, H 0 ψ is non-negative thanks to the smallness assumption for the potential. Thus it remains the error of order N − 1 48 = 1+ 1 16 , which completes the proof of the main result.

A Hartree-Fock Theory
In this appendix we consider the Hartree-Fock energy with the Hamiltonian H N in (1.2). We assume that N = |B F |, namely the Fermi ball is completely filled. It is clear that if the interaction vanishes, i. e., V = 0, then the Slater determinant of plane waves ψ pw as in (1.5) is the unique minimizer for E HF N . However, it is less trivial that the Hartree-Fock minimizer is unchanged if the interaction is sufficiently weak. Then the Slater determinant of plane waves ψ pw in (1.5) is the unique minimizer (up to a phase) for E HF N .
Note that Theorem A.1 does not require any specific choice of parameters λ and . In our semiclassical mean-field scaling, = N −1/3 and λ = N −1 , the condition 0 ≤ λ V ℓ 1 < 2 /2 holds for N large provided that V ∈ ℓ 1 .
We follow the argument in [GHL19] where the Hartree-Fock energy of the electron gas is studied. The main difference is that in our finite volume setting, the spectral gap of the Laplacian is strong enough to dominate the interaction, ensuring the exact equality E HF N = ψ pw , H N ψ pw instead of just an exponential closeness as in [GHL19].
Proof. Let Ψ = N j=1 u j be a Slater determinant. Then a straightforward computation shows that where γ = N j=1 |u j u j | is the one-body density matrix of Ψ, γ(x, y) = N j=1 u j (x)u j (y) and ρ γ (x) = γ(x, x). Note that the one-body density matrix γ pw of the plane waves has the integral kernel γ(x, y) = γ pw (x − y) = (2π) −3 p∈B F e ip·(x−y) .
In the following we estimate the right side of (A.1) term by term.
Here in the last equality we have used tr(γ) = tr(γ pw ) = N . Thus the desired equality (A.6) holds true if γ ⊥ pw Aγ ⊥ pw − γ pw Aγ pw + C 0 = − 2 ∆ − λG which is equivalent to The latter equality holds true when Note that since the Fermi ball is completely filled we have the gap |k 2 | 2 − |k 1 | 2 ≥ 1 for all k 1 ∈ B F and k 2 ∈ B c F (since |k 2 | 2 − |k 1 | 2 is positive and integer). Furthermore, When 0 ≤ λ V ℓ 1 ≤ 2 , we can choose and obtain The desired estimate (A.5) follows immediately.

B Kinetic Energy Estimates
In this appendix we provide a simplified proof of Lemma 2.2, which was first established in [HPR20, Lemma 4.7]. Like [HPR20] we use the following special case of a result by [Hux03].
Theorem B.1 (Integer points in ellipses). Let d 0 ∈ N. For every R > 0 consider the ellipse Then the number of points in E(R)∩Z 3 is |E(R)|+O(R γ ) for R → ∞, with any γ > 131/208.
We do not need the full power of this theorem; for our purpose, any γ < 1 is sufficient. With exponents γ = 2/3, it is a classic result due to Van der Corput's thesis [Cor19] from 1919.
Proof of Lemma 2.2. We start by proceeding as in [HPR20,Lemma 4.7]. Using the Cauchy-Schwarz inequality we get Heuristically, one may understand this bound as follows: the sum is over lattice points in the grey area of Fig. 1; since |k| = O(1) and the area of the Fermi surface is of order N 2 3 , the number of these points is also of order N 2 3 . Since |p| ∼ N 1 3 (p has to be close to the Fermi surface) and |k| ∼ 1, we expect that in average |p| 2 − |p − k| 2 = 2p · k − k 2 ∼ N 1 3 , leading us to the order N 1 3 in (B.1). Strictly speaking, |p| 2 − |p − k| 2 can be much smaller than the average size. Actually |p| 2 − |p − k| 2 can be of order O(1) if both p and p − k are very close to the Fermi surface. Fortunately, integer points very close to the surface of the sphere are quite rare and the contribution from this part can be controlled by number theoretic results.
This argument was formulated rigorously in [HPR20] but the detailed proof is rather technical. Here we present a simplified proof of (B.1) for the reader's convenience.