High-dimensional sphere packing and the modular bootstrap

We carry out a numerical study of the spinless modular bootstrap for conformal field theories with current algebra $U(1)^c \times U(1)^c$, or equivalently the linear programming bound for sphere packing in $2c$ dimensions. We give a more detailed picture of the behavior for finite $c$ than was previously available, and we extrapolate as $c \to \infty$. Our extrapolation indicates an exponential improvement for sphere packing density bounds in high dimensions. Furthermore, we study when these bounds can be tight. Besides the known cases $c=1/2$, $4$, and $12$ and the conjectured case $c=1$, our calculations numerically rule out sharp bounds for all other $c<90$, by combining the modular bootstrap with linear programming bounds for spherical codes.


Introduction
To what extent do self-consistency principles constrain, or even determine, the behavior of a system? This question underlies many topics in mathematics and physics. One notable example is the conformal bootstrap program [1,2,3,4] (see [5,6,7] for reviews), which seeks to map the space of possible conformal field theories (CFTs) and to identify those on the boundary of theory space (i.e., those with extremal properties, almost but not quite inconsistent). A more down-to-earth example is the sphere packing problem, in which the goal is to maximize the fraction of $\mathbb{R}^d$ covered by congruent spheres whose interiors are not allowed to overlap. In low dimensions it is not hard to guess the optimal packings, but even that remains mysterious in high dimensions. Proving upper bounds for the packing density is particularly difficult, and in most cases the best bounds currently known are obtained via the linear programming bound of Cohn and Elkies [8], which relies on harmonic analysis.
While these problems sound completely unrelated, Hartman, Mazáč, and Rastelli [9] discovered a surprising connection between them: the spinless modular bootstrap for two-dimensional CFTs is very nearly the same as the linear programming bound for sphere packing. The underlying optimization problems are exactly equivalent when the current algebra $U(1)^c_{\text{left}} \times U(1)^{\bar c}_{\text{right}}$ with total central charge $c_{\text{total}} = c + \bar c$ acts on the CFT and the sphere packing dimension is given by $d = c_{\text{total}}$, and they are closely related (but not equivalent) under the Virasoro algebra. The relationship between the modular bootstrap and linear programming bounds seems to be specific to these particular techniques, rather than being based on a direct connection between CFTs and sphere packings.
In this paper we will focus on the spinless modular bootstrap for $U(1)^c_{\text{left}} \times U(1)^{\bar c}_{\text{right}}$ with $c_{\text{total}}$ large. The analysis depends only on $c_{\text{total}}$, not on the left and right central charges individually. To simplify the notation we set $\bar c = c$ and refer simply to the $U(1)^c$ modular bootstrap, parameterizing our results by $c = c_{\text{total}}/2$.
Neither the modular bootstrap nor the linear programming bound has been completely analyzed, either theoretically or numerically. Each depends on producing some additional information (namely, an auxiliary function or linear functional satisfying certain inequalities), which must be chosen carefully to optimize the resulting bound, and this optimization has proved difficult. The equivalence between these problems adds to the motivation for studying them, because any consequences will shed light on two seemingly disparate topics. A third application is to generalizations of the Bourgain-Clozel-Kahane uncertainty principle for signs of functions [10,11]. Thus, these problems live at a particularly fruitful intersection of several fields.
In this paper, we carry out the first large-scale numerical study of the $U(1)^c$ spinless modular bootstrap with $c$ large, or equivalently the linear programming bound on sphere packing in high dimensions, by adapting the numerical techniques introduced by Afkhami-Jeddi, Hartman, and Tajdini for the Virasoro case [12]. These techniques closely parallel the approach independently taken by Cohn, Elkies, Kumar, and Gonçalves [8,13,11,14] in the sphere packing literature, but the paper [12] introduced better extrapolation techniques and achieved superior performance.
In CFT terms, the spinless modular bootstrap corresponds to constraints on the partition function at zero angular potential. A natural question is whether the spinning modular bootstrap, i.e., including an angular potential, also bounds the density of general sphere packings. The answer is that it does not, as this would contradict known packings. The spinning bootstrap analysis for CFTs with $U(1)^c_{\text{left}} \times U(1)^{\bar c}_{\text{right}}$ symmetry has interesting implications for holographic duality and will appear in a separate paper [15].

Results from the spinless modular bootstrap for large c
For sphere packing in high dimensions, the central question is the asymptotic behavior of the packing density. It is at least $2^{-d}$ in $\mathbb{R}^d$, with only much lower-order improvements known [16,17,18], and it is at most $2^{-(\kappa+o(1))d}$ with $\kappa = 0.59905576\ldots$. The latter bound was found by Kabatyanskii and Levenshtein [19] in 1978, and the exponential decay rate has not been improved since then. Cohn and Zhao [20] showed how to obtain it via the linear programming bound, and a fundamental open question is whether the linear programming bound is capable of improving on this decay rate.
In terms of the $U(1)^c$ spinless modular bootstrap, bounding the packing density amounts to bounding the spectral gap of the CFT. Specifically, the Kabatyanskii-Levenshtein bound says that the scaling dimension of the lowest non-vacuum primary is at most $c/(K + o(1))$ as $c \to \infty$, where $K = e\pi 2^{2\kappa-1} = 9.79674646\ldots$. No better bound is known for the spectral gap.
One of our primary results in this paper is a numerical estimate of the fully optimized $U(1)^c$ spinless modular bootstrap bound for the spectral gap (Conjecture 3.1). In sphere packing terms, it amounts to an upper bound of $2^{-(\lambda+o(1))d}$ for the sphere packing density in $\mathbb{R}^d$ as $d \to \infty$ with $\lambda \approx 0.6044$; in modular bootstrap terms, it amounts to an upper bound of $c/(\Lambda + o(1))$ for the spectral gap as $c \to \infty$ with $\Lambda \approx 9.869$. This bound is based on numerical extrapolation, with no proof or even heuristic derivation, but we give a careful accounting of the potential error from the extrapolation. We furthermore guess that the exact value of $\Lambda$ is $\pi^2$ (Conjecture 3.2), although that conjecture is much more speculative.
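The conversion between the two sets of constants can be checked with a few lines of arithmetic. The sketch below assumes the relation $K = e\pi 2^{2\kappa-1}$ quoted above applies equally to the pair $(\lambda, \Lambda)$; the function and variable names are illustrative and do not come from the paper.

```python
import math

def gap_constant(rate: float) -> float:
    """Map a density decay rate r (density <= 2^(-r d)) to the constant C
    in the spectral gap bound Delta_1 <= c / (C + o(1)), via C = e*pi*2^(2r - 1)."""
    return math.e * math.pi * 2.0 ** (2.0 * rate - 1.0)

def decay_rate(constant: float) -> float:
    """Inverse of gap_constant: recover the decay rate from the constant C."""
    return 0.5 * (1.0 + math.log2(constant / (math.e * math.pi)))

KAPPA = 0.59905576        # Kabatyanskii-Levenshtein rate quoted in the text
LAMBDA_ESTIMATE = 0.6044  # numerical estimate quoted in the text

K = gap_constant(KAPPA)
CAPITAL_LAMBDA = gap_constant(LAMBDA_ESTIMATE)
RATE_IF_PI_SQUARED = decay_rate(math.pi ** 2)  # rate implied by Lambda = pi^2

print(f"K from kappa            : {K:.6f}")
print(f"Lambda from lambda      : {CAPITAL_LAMBDA:.4f}")
print(f"pi^2                    : {math.pi ** 2:.4f}")
print(f"lambda if Lambda = pi^2 : {RATE_IF_PI_SQUARED:.6f}")
```

The last line shows how little room there is between the numerical estimate and the speculative exact value: the two agree to the four quoted digits.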
Conceptually, what our computations indicate is that the Kabatyanskii-Levenshtein upper bound can be decreased by an exponential factor through optimizing the linear programming bound. If proved, this bound would settle a longstanding open problem in discrete geometry. However, the improvement in the decay rate will be small.
The analytical [9] and numerical [12] results for Virasoro symmetry are quite a bit further away from each other (8.503 vs. 9.08). We have no conceptual explanation for why the Kabatyanskii-Levenshtein bound should come rather close to optimizing the $U(1)^c$ case, yet fall slightly short. Perhaps generalizing this bound will offer new techniques for optimizing the modular bootstrap more broadly, but we do not expect that it will lead to an exact solution without some new idea.
Sphere packings are error-correcting codes for a continuous communication channel, and they therefore play an important role in information theory. Their discrete counterpart is error-correcting codes for a binary channel, and these two theories are in many ways closely analogous [21], with substantial interplay between them, both in results and in techniques. The linear program bound originated in the discrete setting, in a fundamental paper by Delsarte [22], before being generalized to sphere packing by Cohn and Elkies [8], and the Kabatyanskii-Levenshtein bound was inspired by the MRRW bound, due to McEliece, Rodemich, Rumsey, and Welch [23].
Much like the case of sphere packing, the asymptotic rate in the MRRW bound has not been beaten by any method, and it is an open problem whether it optimizes the linear programming bound. Barg and Jaffe [24] examined this issue numerically, and they conjectured that it is the optimal rate in the linear programming bound. Their conjecture is widely believed, but the evidence is not conclusive. While our results have no direct implications for binary error-correcting codes, they suggest that the MRRW bound may not be optimal, because it is the discrete analogue of the Kabatyanskii-Levenshtein bound. It would be valuable to perform a more extensive study than Barg and Jaffe were able to do in 2001.
At the optimum, the linear programming approach provides not just a bound on the spectral gap, but a candidate spectrum for a CFT that saturates it. In sphere packing terms, this spectrum amounts to the pair correlation function of the packing. We study the spectrum numerically in Section 4 and find some intriguing structure. For computational purposes, the infinite set of bootstrap constraints is truncated to a finite system of $2N$ equations, with $N$ taken as large as possible. The corresponding spectrum has $N$ states other than the vacuum, with scaling dimensions $\Delta_1 < \Delta_2 < \cdots < \Delta_N$. We conjecture a formula for the ratio $\Delta_n/N$ in the limit $N \to \infty$ with $n/N$ held fixed. The formula is piecewise smooth, with an abrupt transition from linear to nonlinear behavior at $n \sim (2/\pi)N$. We have no analytic explanation for this transition. The linear portion of the spectrum matches the spectrum of the generalized free fermion in one dimension, which was used to construct analytic functionals for CFT in [25] and adapted to sphere packing in [9].

New constraints on tight sphere packing bounds
In addition to studying the asymptotic behavior of the modular bootstrap, we also search for exceptional behavior at finite central charge. Four particular values are known to play a special role, namely c = 1/2, 1, 4, and 12. In sphere packing terms, these cases correspond to exact solutions of the sphere packing problem in dimensions 1, 2, 8, and 24. While d = 1 is trivial, d = 8 and d = 24 are far deeper, and they are the only cases in which the sphere packing problem has been solved above d = 3 (which was solved by Hales [26,27], with no connection to the modular bootstrap). Dimension 8 was a breakthrough due to Viazovska [28], and dimension 24 built on her techniques [29]. The linear programming bound seems to be exact when d = 2 as well, but no proof is known, although the two-dimensional sphere packing problem can be solved directly [30,31].
These cases are more subtle for CFTs than they are for sphere packings. For $(c,\bar c) = (4, 4)$, there is indeed a CFT invariant under $U(1)^c_{\text{left}} \times U(1)^{\bar c}_{\text{right}}$ that achieves the spinless modular bootstrap bound, namely eight free fermions with a diagonal GSO projection. No such CFT exists for $(c,\bar c) = (12, 12)$ (see [15]), but there is a chiral CFT with $(c,\bar c) = (24, 0)$, namely 24 chiral bosons compactified using the quotient of $\mathbb{R}^{24}$ by the Leech lattice. The case $c = 1/2$ is not an integer, so $U(1)^c_{\text{left}} \times U(1)^{\bar c}_{\text{right}}$ symmetry does not even make sense, but again we can use a chiral boson with $(c,\bar c) = (1, 0)$. This time, however, it is not fully conformally invariant. Instead, it has a nontrivial phase under the action of the generator $T$ of $SL_2(\mathbb{Z})$, but the spinless modular bootstrap nevertheless applies. Finally, in the case $c = 1$ no CFT invariant under $U(1)^c_{\text{left}} \times U(1)^{\bar c}_{\text{right}}$ achieves the spinless bound (see [15]), but it is achieved by two chiral bosons with $(c,\bar c) = (2, 0)$ and a nontrivial phase under the $T$ transformation. Thus, the CFT picture encompasses all four exceptional cases, provided we allow chiral CFTs and are willing to relax conformal invariance.
Why should the exceptional solutions of the sphere packing problem be limited to these specific dimensions? It comes as no surprise to see sporadic behavior tied to $E_8$ and the Leech lattice (for $c = 4$ and $12$, respectively), but it is difficult to explain why this behavior is not more widespread. For a provocative example, why shouldn't the linear programming bound solve the sphere packing problem in all sufficiently large dimensions? We do not know how to rule out this possibility, although it is utterly implausible.
To shed light on this problem, we examine the conditions that would have to hold to obtain a sphere packing meeting the linear programming bound. To do so, we incorporate additional constraints beyond the modular invariance of the partition function. Specifically, we study the implied kissing number, the average number of tangencies between spheres in a hypothetical packing with this property. In all dimensions up through $d = 250$ other than 1, 2, 8, 24, 180, 181, and 192, we show that the implied kissing number from our numerically optimized bound is impossibly large. Thus, no sphere packing can attain the exact linear programming bound in these dimensions. We do not expect optimal solutions in dimensions 180, 181, or 192, and we see no sign of them, but our bounds do not rule them out.
As the unexpected occurrence of dimensions such as 181 indicates, this problem has a surprisingly intricate structure. While certain aspects behave in straightforward ways that are not hard to extrapolate, other aspects are far more subtle. One feature of our numerical solutions for which we have no conceptual explanation is a kind of periodicity: the degeneracies are especially well described by a Cardy-like entropy formula when c is a multiple of 8, and the scaling dimensions are especially close to those for generalized one-dimensional free fermions when c is 4 more than a multiple of 8. In other words, multiples of 4 behave particularly well, but no value of c looks equally simple from all perspectives. The reason for this behavior remains mysterious.

The spinless modular bootstrap

Setting up the bootstrap
In this section, we will briefly review the spinless modular bootstrap, which is a technique for proving bounds on the possible scaling dimensions of primary fields in a compact, unitary 2d CFT [33,34,35]. Given such a CFT, let $Z(\tau,\bar\tau)$ be its partition function, i.e., the sum over all states of $q^{h-c/24}\,\bar q^{\bar h-\bar c/24}$, where $h$ and $\bar h$ are the conformal weights of the state, $c$ and $\bar c$ are the left and right central charges, and $q = e^{2\pi i\tau}$ and $\bar q = e^{-2\pi i\bar\tau}$ (with $\tau$ and $-\bar\tau$ in the upper half-plane). Because of conformal invariance, the partition function satisfies modular invariance: $Z(\tau+1,\bar\tau+1) = Z(\tau,\bar\tau)$ and $Z(-1/\tau,-1/\bar\tau) = Z(\tau,\bar\tau)$, corresponding to the generators $T$ and $S$ of $SL_2(\mathbb{Z})$. For the spinless modular bootstrap, we specialize the partition function to have zero angular potential. In other words, we set $\bar\tau = -\tau$ (i.e., $\bar q = q$) and use the restricted partition function $Z(\tau) = Z(\tau, -\tau)$.
The action of $S$ on $\tau$ and $\bar\tau$ preserves the condition $\bar\tau = -\tau$, and thus $Z(-1/\tau) = Z(\tau)$, but the action of $T$ does not. Thus, we expect that usually $Z(\tau + 1) \ne Z(\tau)$. The spinless modular bootstrap is based on the identity $Z(-1/\tau) = Z(\tau)$. Because we make no use of the action of $T$, the bound applies even to theories that are invariant only under $S$. A simple example is a single chiral boson at the self-dual radius, for which $Z(\tau,\bar\tau) = \theta_3(\tau)/\eta(\tau)$ in terms of the Jacobi theta function and Dedekind eta function. Such theories are not fully conformally invariant, but the spinless modular bootstrap still applies.
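The chiral boson example can be checked numerically. The following sketch evaluates $\theta_3(\tau)/\eta(\tau)$ from truncated series at an arbitrarily chosen point in the upper half-plane and compares the $S$ and $T$ images; the truncation orders and the sample point are assumptions made for illustration.

```python
import cmath
import math

def theta3(tau: complex, terms: int = 12) -> complex:
    """Jacobi theta_3(tau) = sum over integers n of exp(pi*i*n^2*tau), truncated."""
    return 1 + 2 * sum(cmath.exp(1j * math.pi * n * n * tau) for n in range(1, terms))

def eta(tau: complex, terms: int = 60) -> complex:
    """Dedekind eta(tau) = exp(pi*i*tau/12) * prod_{n>=1} (1 - exp(2*pi*i*n*tau)), truncated."""
    q = cmath.exp(2j * math.pi * tau)
    product = 1 + 0j
    q_power = 1 + 0j
    for _ in range(terms):
        q_power *= q
        product *= 1 - q_power
    return cmath.exp(1j * math.pi * tau / 12) * product

def partition_function(tau: complex) -> complex:
    """Z(tau) = theta_3(tau) / eta(tau) for a chiral boson at the self-dual radius."""
    return theta3(tau) / eta(tau)

tau = 0.3 + 1.1j  # arbitrary sample point with positive imaginary part
s_defect = abs(partition_function(-1 / tau) - partition_function(tau))
t_defect = abs(partition_function(tau + 1) - partition_function(tau))

print("|Z(-1/tau) - Z(tau)| =", s_defect)   # vanishes up to rounding
print("|Z(tau+1)  - Z(tau)| =", t_defect)   # visibly nonzero
```

The $S$ defect is at the level of floating-point rounding, while the $T$ defect is of order one, which is the behavior described above.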
The combined contribution of the descendants of a primary field of scaling dimension $\Delta = h + \bar h$ to the partition function $Z(\tau)$ is a character $\chi_\Delta(\tau)$ of a Verma module of the current algebra, and thus the partition function is given by a sum over the scaling dimensions of the primary fields, $Z(\tau) = \sum_\Delta d_\Delta \chi_\Delta(\tau)$, each with multiplicity given by the degeneracy $d_\Delta$. The vacuum corresponds to $\Delta = 0$, with degeneracy $d_0 = 1$, and the other scaling dimensions are positive numbers $\Delta_1 < \Delta_2 < \cdots$ that tend to infinity.
The precise form of the characters depends on the current algebra. Our main interest in this paper will be the algebra $U(1)^c_{\text{left}} \times U(1)^{\bar c}_{\text{right}}$ (more precisely, the corresponding affine Lie algebra), in which case $\chi_\Delta(\tau) = e^{2\pi i\tau\Delta}/\eta(\tau)^{c+\bar c}$, where $\eta$ is again the Dedekind eta function. In particular, the only dependence on the central charges is through their sum $c + \bar c$. The spectral gap of the CFT is the lowest scaling dimension $\Delta_1$ of a primary other than the vacuum. We can obtain an upper bound for the spectral gap by producing a linear functional that acts in a certain way on functions of $\tau$. The key observation is that if we set $\Phi_\Delta(\tau) = \chi_\Delta(\tau) - \chi_\Delta(-1/\tau)$, then we obtain the crossing equation $\sum_\Delta d_\Delta \Phi_\Delta(\tau) = 0$ by modular invariance. Now suppose $\omega$ is a linear functional such that $\omega(\Phi_0) > 0$ and $\omega(\Phi_\Delta) \ge 0$ whenever $\Delta \ge \Delta_{\text{gap}}$ for some constant $\Delta_{\text{gap}}$. If we apply $\omega$ to the crossing equation, we find that $\omega(\Phi_0) + \sum_{\Delta > 0} d_\Delta\,\omega(\Phi_\Delta) = 0$, which would be impossible if all the non-zero scaling dimensions were at least $\Delta_{\text{gap}}$, because the total would be positive. Thus, we conclude that some primary must have a scaling dimension strictly between $0$ and $\Delta_{\text{gap}}$. In other words, $\Delta_{\text{gap}}$ is a strict upper bound for the spectral gap. One can show that it is a weak upper bound even if $\omega(\Phi_0) = 0$, as long as $\omega(\Phi_\Delta)$ is not identically zero (see Appendix A), and that $\omega(\Phi_0) = 0$ must hold for the optimal choice of $\omega$.
The optimal functional $\omega$ is not known, except in a handful of special cases discussed below. In Sections 3 and 4, we will give the most detailed numerical study so far of how $\omega$ and $\Delta_{\text{gap}}$ behave. As noted earlier, because the spinless modular bootstrap for $U(1)^c_{\text{left}} \times U(1)^{\bar c}_{\text{right}}$ depends only on $c + \bar c$, we will set $\bar c = c$ and refer just to $c$. Strictly speaking this notation is misleading when $c + \bar c$ is odd, because the current algebra $U(1)^c_{\text{left}} \times U(1)^{\bar c}_{\text{right}}$ makes sense only when $c$ and $\bar c$ are nonnegative integers. For example, the only physically meaningful cases with $c + \bar c = 1$ are $(c,\bar c) = (1, 0)$ or $(0, 1)$, and they are therefore what we mean when we refer to the $c = 1/2$ case. More generally, the abstract problem of optimizing the bound makes sense for any $c > 0$, but there are consequences for CFTs only when $c$ is an integer or half-integer.

Uncertainty principle
Hartman, Mazáč, and Rastelli [9] reformulated the $U(1)^c$ spinless modular bootstrap in terms of an uncertainty principle for eigenfunctions of the Fourier transform as follows. Suppose $d = 2c$ is an integer, which is the meaningful case for CFTs. Given a functional $\omega$ as above, we define a radial function $f_\omega$ on $\mathbb{R}^d$ by $f_\omega(x) = \omega(\Phi_{|x|^2/2})$; then $f_\omega$ is an eigenfunction of the Fourier transform with eigenvalue $-1$; in other words, $\widehat{f_\omega} = -f_\omega$. To see why, we start with $\Phi_\Delta(\tau) = \bigl(e^{2\pi i\tau\Delta} - (i/\tau)^{c}\, e^{2\pi i(-1/\tau)\Delta}\bigr)/\eta(\tau)^{2c}$, because the Dedekind eta function satisfies the identity $\eta(-1/\tau) = (\tau/i)^{1/2}\eta(\tau)$. The complex Gaussian $x \mapsto e^{\pi i\tau|x|^2}$ on $\mathbb{R}^d$ has Fourier transform $y \mapsto (i/\tau)^{d/2} e^{\pi i(-1/\tau)|y|^2}$. Thus, the function $x \mapsto e^{\pi i\tau|x|^2} - (i/\tau)^{d/2} e^{\pi i(-1/\tau)|x|^2}$ in the numerator of $\Phi_{|x|^2/2}(\tau)$ is a $-1$ eigenfunction of the Fourier transform for each $\tau$, because it is the difference of a Gaussian and its Fourier transform. We conclude that $f_\omega$ also satisfies $\widehat{f_\omega} = -f_\omega$, by the linearity of $\omega$. The same holds even when $2c$ is not an integer, if we interpret the radial Fourier transform in non-integral dimension as a Hankel transform.
Conversely, every radial $-1$ eigenfunction of the Fourier transform in $\mathbb{R}^d$ arises as $f_\omega$ for some $\omega$, which we can obtain as follows by constructing a basis. If we let $\omega_k = \partial^k/\partial\tau^k\big|_{\tau=i}$, then $\omega_k(\Phi_\Delta)$ is the product of $e^{-2\pi\Delta}$ with a polynomial in $\Delta$ of degree at most $k$, in which the coefficient of $\Delta^k$ is $0$ if $k$ is even, and $2(2\pi i)^k/\eta(i)^{2c}$ if $k$ is odd. For comparison, the Laguerre polynomials $L_k^{(d/2-1)}$ give a basis for radial functions on $\mathbb{R}^d$ as $x \mapsto L_k^{(d/2-1)}(2\pi|x|^2)e^{-\pi|x|^2}$, with eigenvalues $(-1)^k$ under the Fourier transform. We conclude that the functions $x \mapsto \omega_k(\Phi_{|x|^2/2})$ with $k = 1, 3, 5, \dots, 2m-1$ must span the same space as the Laguerre eigenfunctions with these values of $k$, and thus they span the entire $-1$ eigenspace as $m \to \infty$.
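A concrete consequence of the $-1$ eigenvalue can be tested on the $E_8$ lattice, which corresponds to $c = 4$. Poisson summation over a self-dual lattice forces every $-1$ eigenfunction to sum to zero over the lattice, which is the crossing equation for the spectrum $\Delta_n = n$ with degeneracies $240\sigma_3(n)$. The sketch below checks this for the odd Laguerre eigenfunctions; the use of the $E_8$ theta series coefficients is a standard fact brought in for illustration rather than something stated in this section, and the helper names are hypothetical.

```python
import math

def genlaguerre(k: int, alpha: float, x: float) -> float:
    """Generalized Laguerre polynomial L_k^{(alpha)}(x) via the three-term recurrence
    (m+1) L_{m+1} = (2m + 1 + alpha - x) L_m - (m + alpha) L_{m-1}."""
    if k == 0:
        return 1.0
    previous, current = 1.0, 1.0 + alpha - x
    for m in range(1, k):
        previous, current = current, (
            (2 * m + 1 + alpha - x) * current - (m + alpha) * previous
        ) / (m + 1)
    return current

def sigma3(n: int) -> int:
    """Sum of the cubes of the divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)

def e8_crossing_sum(k: int, terms: int = 40) -> float:
    """Sum of x -> L_k^{(3)}(2 pi |x|^2) exp(-pi |x|^2) over the E8 lattice,
    grouped by Delta = |x|^2 / 2 = n, with 240 * sigma3(n) vectors in each shell."""
    total = genlaguerre(k, 3.0, 0.0)  # vacuum: Delta = 0 with degeneracy 1
    for n in range(1, terms):
        weight = 240 * sigma3(n) * math.exp(-2.0 * math.pi * n)
        total += weight * genlaguerre(k, 3.0, 4.0 * math.pi * n)
    return total

for k in range(0, 6):
    print(k, e8_crossing_sum(k))
```

The sums for odd $k$ vanish up to rounding, while the sums for even $k$ (the $+1$ eigenfunctions) are unconstrained and generally nonzero.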
We have seen that choosing the linear functional $\omega$ in the $U(1)^c$ spinless modular bootstrap amounts to choosing an integrable, radial function $f \colon \mathbb{R}^d \to \mathbb{R}$ with $\widehat f = -f$ such that $f$ is not identically zero. The constraints on $\omega$ say that $f(0) \ge 0$ and $f(x) \ge 0$ whenever $|x| \ge r$ for some radius $r$. Then $\Delta_{\text{gap}} = r^2/2$, and optimizing the bound means minimizing $r$. This optimization problem for signs of eigenfunctions was first studied by Cohn and Elkies [8], and it was placed in the context of more general uncertainty principles for signs of functions by Cohn and Gonçalves [11].
The corresponding problem for $+1$ eigenfunctions asks for an integrable, radial function $g \colon \mathbb{R}^d \to \mathbb{R}$ with $\widehat g = g$ such that $g(0) \le 0$ and $g(x) \ge 0$ for $|x| \ge r$. Again the goal is to minimize $r$, without letting $g$ vanish identically. This problem does not arise in the spinless modular bootstrap as set up above, but it would apply if the partition function satisfied $Z(-1/\tau) = -Z(\tau)$ and $d_\Delta < 0$ for $\Delta > 0$ (see Section 2.1 of [11]). It behaves much like the $-1$ case. For example, Cohn and Gonçalves [11] obtained an exact solution of the $+1$ problem for $c = 6$, which is analogous to the solutions of the $-1$ problem with $c = 4$ or $12$. The partition function in this case is given by $Z(\tau) = \sqrt{j(\tau) - 1728}$, which also arises as the Norton series for a certain pair of elements in the Monster group (see equation (7.3.4c) in [36, p. 425]). Although we do not have a direct physical interpretation for this problem, generalized modular transformations do arise in theories with discrete anomalies [37] or fermions [38,39], including sectors that obey $Z(-1/\tau) = -Z(\tau)$.

Sphere packing
The $-1$ eigenfunction uncertainty principle plays a fundamental role in discrete geometry, where it underlies the linear programming bound for the sphere packing density. Linear programming bounds are a powerful technique for proving upper bounds for packing density or error-correcting code rates. They were introduced for discrete error-correcting codes by Delsarte [22] in 1972, and extended to sphere packing in Euclidean space by Cohn and Elkies [8] in 2003. The connection with the spinless modular bootstrap for $U(1)^c$ was derived by Hartman, Mazáč, and Rastelli [9] in 2019.
The linear programming bound for sphere packing in $\mathbb{R}^d$ converts an auxiliary function satisfying certain inequalities into a sphere packing density bound, as follows.
Theorem 2.1 (Cohn and Elkies [8]). Let $h \colon \mathbb{R}^d \to \mathbb{R}$ be an integrable, continuous, radial function such that $\widehat h$ is integrable, and let $r$ be a positive real number. If $h(0) = \widehat h(0) = 1$, $h(x) \le 0$ whenever $|x| \ge r$, and $\widehat h(y) \ge 0$ for all $y$, then every sphere packing in $\mathbb{R}^d$ has density at most the volume of a sphere of radius $r/2$ in $\mathbb{R}^d$, i.e., $\pi^{d/2}(r/2)^d/\Gamma(d/2+1)$.
The problem of choosing $h$ so as to minimize $r$ is clearly reminiscent of the $-1$ eigenfunction uncertainty principle, but not obviously equivalent to it. One direction is simple: if $h$ satisfies the hypotheses of Theorem 2.1, then the function $f = \widehat h - h$ satisfies the conditions of the $-1$ eigenfunction uncertainty principle with the same value of $r$. Conversely, Cohn and Elkies conjectured that an optimal solution $f$ of the $-1$ eigenfunction problem can always be lifted to a function $h$ for use in Theorem 2.1 with the same value of $r$, such that $\widehat h - h = f$. In other words, the linear programming bound for sphere packing should be identical to the spinless modular bootstrap. No proof is known, but no counterexample has been found, either numerically or analytically.
At first glance, it is not obvious that any auxiliary function satisfies the hypotheses of the linear programming bound. For a first example, let $\chi \colon \mathbb{R}^d \to \mathbb{R}$ be the characteristic function of a ball $B_{r/2}$ centered at the origin, with its radius $r/2$ chosen so that $\mathrm{vol}(B_{r/2}) = 1$. Then the convolution $h := \chi * \chi$ has Fourier transform $\widehat h = \widehat\chi^{\,2}$. By construction, $h(x) = 0$ for $|x| \ge r$ and $\widehat h(y) \ge 0$ for all $y$; furthermore, $h(0) = \mathrm{vol}(B_{r/2}) = 1$ and $\widehat h(0) = \mathrm{vol}(B_{r/2})^2 = 1$. Thus, the sphere packing density in $\mathbb{R}^d$ is at most $\mathrm{vol}(B_{r/2}) = 1$. This bound is sharp when $d = 1$, but it is of course not an exciting packing bound. For $d > 1$ the linear programming bound is much better than this first attempt.
In fact, it is the best upper bound known for the sphere packing density in high dimensions [20], but it is generally far from a tight bound [40]. Only four cases seem to be sharp: $d = 1$ (as shown above), $2$, $8$, and $24$. The case $d = 8$ was a breakthrough due to Viazovska [28], and the case $d = 24$ extended her techniques [29]. These are the only cases in which the sphere packing problem has been solved above three dimensions. The optimal auxiliary functions for $d = 8$ and $24$ can also be derived from analytic functionals constructed that same year by Mazáč [25] in the four-point function bootstrap for 1d CFTs, as shown by Hartman, Mazáč, and Rastelli [9]. Remarkably, the case $d = 2$ remains unsolved analytically. There is no doubt that it matches the two-dimensional packing density, but no proof is known.
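The density bound of Theorem 2.1 is simple to evaluate for the sharp cases. The sketch below assumes the standard minimal distances $r = \sqrt{2}$ for $E_8$ and $r = 2$ for the Leech lattice, which are well-known values rather than numbers quoted in this section, and it uses $\Delta_{\text{gap}} = r^2/2$ to translate between the packing radius and the spectral gap. The function names are illustrative.

```python
import math

def ball_volume(d: int, radius: float) -> float:
    """Volume of a ball of the given radius in R^d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1) * radius ** d

def lp_density_bound(d: int, r: float) -> float:
    """Density bound of Theorem 2.1: the volume of a ball of radius r/2 in R^d."""
    return ball_volume(d, r / 2)

def gap_from_radius(r: float) -> float:
    """Spectral gap bound corresponding to the sign-change radius r."""
    return r * r / 2

# (dimension, radius r) for the sharp cases d = 1, 8, 24
sharp_cases = [(1, 1.0), (8, math.sqrt(2.0)), (24, 2.0)]
for d, r in sharp_cases:
    print(f"d = {d:2d}  gap = {gap_from_radius(r):.3f}  density <= {lp_density_bound(d, r):.6f}")
```

For $d = 8$ and $d = 24$ the bound reproduces the densities $\pi^4/384$ and $\pi^{12}/12!$ of the $E_8$ and Leech lattice packings, and the corresponding gaps are $\Delta_{\text{gap}} = 1$ and $2$.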
Linear programming bounds can be applied not just to sphere packing, but to understand ground states under pair potential functions more broadly [41,42,43]. We will not address that topic in this paper, except to note that our numerical results seem consistent with Conjecture 7.2 in [43], which says that the linear programming bound for sphere packing extends to the Gaussian core model and thereby proves a form of universal optimality, despite the failure of the analogous statement for binary error-correcting codes [44].

Numerics
To obtain numerical bounds for the $U(1)^c$ spinless modular bootstrap, we must choose a finite-dimensional space of functionals $\omega$. We truncate at derivative order $4N - 1$; in other words, $\omega$ will be a linear combination of $\omega_1, \omega_3, \dots, \omega_{4N-1}$, where as above $\omega_k = \partial^k/\partial\tau^k\big|_{\tau=i}$. For convenience let $f(\Delta) = \omega(\Phi_\Delta)$, which differs from the $-1$ Fourier eigenfunction in being a function of $\Delta$ rather than $x$ with $\Delta = |x|^2/2$. Then $f(\Delta)$ can be written in terms of the Laguerre eigenfunctions as $f(\Delta) = \sum_{j=1}^{2N} \alpha_j L_{2j-1}^{(c-1)}(4\pi\Delta)\,e^{-2\pi\Delta}$. For fixed $\Delta_{\text{gap}}$ and $N$, the question is whether $f$ can be chosen to satisfy the positivity conditions $f(0) \ge 0$ and $f(\Delta) \ge 0$ for $\Delta \ge \Delta_{\text{gap}}$ without vanishing identically. This problem can be solved using semidefinite programming, or approximated using linear programming. Let $\Delta_1^{\mathrm{LP},N}(c)$ be the best bound that can be obtained for a fixed truncation order $N$, and let $\Delta_1^{\mathrm{LP}}(c)$ be the best bound without restriction on $\omega$. Increasing $N$ improves the bound, and we expect that $\lim_{N\to\infty} \Delta_1^{\mathrm{LP},N}(c) = \Delta_1^{\mathrm{LP}}(c)$. Numerical linear or semidefinite programming succeeds at small $N$, but has been limited to $N \lesssim 100$ by the computational cost. It is much faster to trade the linear program for a nonlinear optimization over the roots of $f(\Delta)$.
At the optimum, $f(\Delta)$ is found empirically to have single roots at $\Delta_0 = 0$ and $\Delta_1 = \Delta_{\text{gap}}$, and $N - 1$ double roots $\Delta_2, \Delta_3, \dots, \Delta_N$. Assuming this to hold in general, we can restate the optimization problem as follows: fix $\Delta_1$, and maximize $f(0)$ over the parameters $\alpha_j$ for $1 \le j \le 2N$ and $\Delta_n$ for $2 \le n \le N$, subject to the pattern of roots $f(\Delta_1) = 0$ and $f(\Delta_n) = f'(\Delta_n) = 0$ for $2 \le n \le N$. If the optimized function has $f(0) > 0$, then this value of $\Delta_1$ is excluded. The marginal bound has $f(0) = 0$, and the corresponding $\Delta_1$ gives $\Delta_1^{\mathrm{LP},N}(c)$. Aside from $c = 3/2$, this method has always worked in practice; we have no conceptual explanation for why this case seems to differ from all the others. The resulting bound can be made rigorous simply by proving that the optimal functional satisfies the positivity conditions.
This is essentially the method used by Cohn and Elkies [8]. In the conformal bootstrap, a similar approach was first discussed by El-Showk and Paulos in the context of 1d correlation functions [45]. Recently, their methods were improved by Afkhami-Jeddi, Hartman, and Tajdini and applied to the modular bootstrap [12]. The first step is to dualize the optimization problem. The dual problem, together with the equation $f(0) = 0$, leads to the equations $\sum_{n=0}^{N} d_n L_{2j-1}^{(c-1)}(4\pi\Delta_n)\,e^{-2\pi\Delta_n} = 0$ for $1 \le j \le 2N$, with $d_0 = 1$ and $\Delta_0 = 0$. Here the spectrum attaining $\Delta_1^{\mathrm{LP},N}(c)$ has exactly $N$ states and can be found efficiently by Newton's method. The last ingredient we need in the algorithm is a procedure to generate the initial guess for Newton's method. We start at small $N$, where it is easy to find a guess by hand, and then gradually increase $N$, using the results from lower $N$ to generate the next guess as in Appendix B of [12]. This method allows for large jumps in $N$, while still converging to the bound within a few Newton steps.
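The smallest truncation, $N = 1$, is simple enough to work through directly and gives a feel for the root-based formulation. The sketch below is an illustration, not the large-$N$ algorithm of [12]. It assumes that the truncated functional is spanned by the Laguerre eigenfunctions with $k = 1$ and $k = 3$, so that $f(\Delta) = \bigl(\alpha_1 L_1^{(c-1)}(x) + \alpha_2 L_3^{(c-1)}(x)\bigr)e^{-x/2}$ with $x = 4\pi\Delta$. The marginal condition $f(0) = 0$ then fixes $f$ up to scale, and the bound is its largest root, located here by Newton's method. The starting point and all names are assumptions made for the example.

```python
import math

def genlaguerre(k: int, alpha: float, x: float) -> float:
    """Generalized Laguerre polynomial L_k^{(alpha)}(x) via the three-term recurrence."""
    if k == 0:
        return 1.0
    previous, current = 1.0, 1.0 + alpha - x
    for m in range(1, k):
        previous, current = current, (
            (2 * m + 1 + alpha - x) * current - (m + alpha) * previous
        ) / (m + 1)
    return current

def n1_gap_bound(c: float, tol: float = 1e-13, max_steps: int = 200) -> float:
    """Spectral gap bound from the N = 1 truncation at central charge c."""
    alpha = c - 1.0
    l1_at_0 = genlaguerre(1, alpha, 0.0)
    l3_at_0 = genlaguerre(3, alpha, 0.0)

    def p(x: float) -> float:
        # Polynomial part of the unique combination of L_1 and L_3 with p(0) = 0,
        # signed so that p(x) > 0 for large x.
        return l3_at_0 * genlaguerre(1, alpha, x) - l1_at_0 * genlaguerre(3, alpha, x)

    def dp(x: float) -> float:
        # Uses d/dx L_k^{(a)}(x) = -L_{k-1}^{(a+1)}(x).
        return -l3_at_0 + l1_at_0 * genlaguerre(2, alpha + 1.0, x)

    x = 4.0 * c + 12.0  # assumed starting point to the right of the largest root
    for _ in range(max_steps):
        step = p(x) / dp(x)
        x -= step
        if abs(step) <= tol * x:
            break
    return x / (4.0 * math.pi)

for c in (0.5, 4.0, 12.0):
    print(f"c = {c:4.1f}  N = 1 bound on Delta_1: {n1_gap_bound(c):.6f}")
```

For $c = 4$ the cubic factors by hand, giving the bound $(9 + \sqrt{21})/(4\pi)$, which lies above the sharp value $\Delta_1 = 1$ as it must; increasing $N$ is what closes this gap.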

Data and plots
The linear programming bound for the sphere packing density in $\mathbb{R}^d$ is shown in Figure 3.1 for $d \le 48$, along with the record packing densities from [21]. The linear programming bound is sharp when $d = 1$, $2$ (conjecturally), $8$, or $24$, and seemingly nowhere else. From the perspective of discrete geometry, one of the most mysterious aspects is the role of these special dimensions. Unlike some other cases of the conformal bootstrap (see, for example, [46, Section V.B.4]), the bound itself shows no sign of kinks or other non-analytic behavior at these points.
For that purpose, a log-log plot is more effective, as in Figure 3.3. This figure shows the linear programming bound and the record sphere packing densities from [21] in black. The green and red curves show the best bounds that have been analytically derived: the green curve is Levenshtein's bound [47], and the red curve is the Kabatyanskii-Levenshtein bound [19], computed using Levenshtein's universal bound for spherical codes [48,49] and the approach of Cohn and Zhao [20] (we review these bounds in Sections 5.2 and 5.3). It is known that the linear programming bound is at least as strong as these bounds [8,20], but no further analytic results are known. As shown in Figure 3.3, our numerical calculations indicate that the linear programming bound is not much stronger than the better of these two bounds.
The blue curve in Figure 3.3 is a lower bound for the linear programming bound due to Scardicchio, Stillinger, and Torquato [50], which is a variant of a bound obtained by Torquato and Stillinger [51]. No better lower bound is known for the linear programming bound in the limit as $d \to \infty$, and the known lower bounds for the sphere packing density are much lower, decaying like $2^{-d}$ up to lower-order factors (see Figure 3.3); this exponential decay rate is the same as in the ideal glass phase of the hard sphere model (see [52, p. 247]), and no known lower bound improves on this rate. These lower bounds are obtained from probabilistic or averaging arguments, and as far as we are aware, no explicit construction in dimension 2048 or greater has been shown to achieve the Minkowski-Hlawka bound.

Extrapolation
The annotations on the right side of Figure 3.3 show the limits of the various curves as d → ∞. The limits of the green, red, and blue curves are known explicitly, and one can see that even d = 2048 is not especially close to the asymptotic limit as d → ∞. We know that the black curve always lies below the red curve, and it appears to be getting steadily closer. Thus, we expect the linear programming bound to be at most slightly better than the Kabatyanskii-Levenshtein bound in the limit as d → ∞. What Figure 3.3 does not reveal is whether the gap in fact tends to zero.
To estimate the asymptotic gap between the red and black curves, it is helpful to examine numerical data. Table 3.1 shows the difference KL − LP between these two bounds, as well as the differences between consecutive values of KL − LP and their ratios. The KL − LP column does indeed seem to decrease for d ≥ 2, and we expect that it is converging towards its limit at a rate proportional to 1/d. In that case, the difference column to its right should be decreasing towards zero at the same rate 1/d, and so the ratios in the last column should tend to 2. The behavior of the ratio column is not absolutely clear, but it does look like it may be increasing towards 2 beyond d = 32.
We therefore hypothesize that the difference column will continue to decrease by a factor between 1.92 and 2 in each additional row. In that case, the sum of all the entries below 0.00070 in the difference column must lie between $0.00070\sum_{k\ge 1} 2^{-k} = 0.00070$ and $0.00070\sum_{k\ge 1} 1.92^{-k} < 0.00077$. In other words, the KL − LP column will fall below 0.00611 by an amount between 0.00070 and 0.00077, and so its limiting value will be between 0.00534 and 0.00541. We conclude that the limit of the LP column must be $-0.6044 \pm 0.0001$ (i.e., between $-0.6045$ and $-0.6043$). In other words, the linear programming bound will be approximately $2^{-0.6044d}$ when $d$ is large.
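The extrapolation is a geometric-series estimate and can be reproduced in a few lines. The sketch below uses only the numbers quoted above (the last table entries 0.00611 and 0.00070, the ratio window 1.92 to 2, and the Kabatyanskii-Levenshtein rate $\kappa$); the variable names are illustrative.

```python
KAPPA = 0.59905576   # Kabatyanskii-Levenshtein rate
LAST_GAP = 0.00611   # last tabulated value of KL - LP
LAST_DIFF = 0.00070  # last tabulated difference of consecutive KL - LP values

def geometric_tail(first: float, ratio: float) -> float:
    """Sum of first/ratio + first/ratio^2 + ... for ratio > 1."""
    return first / (ratio - 1.0)

tail_min = geometric_tail(LAST_DIFF, 2.0)    # differences shrink as fast as allowed
tail_max = geometric_tail(LAST_DIFF, 1.92)   # differences shrink as slowly as allowed

gap_limit_low = LAST_GAP - tail_max
gap_limit_high = LAST_GAP - tail_min

lambda_low = KAPPA + gap_limit_low
lambda_high = KAPPA + gap_limit_high

print(f"remaining decrease: between {tail_min:.5f} and {tail_max:.5f}")
print(f"limit of KL - LP  : between {gap_limit_low:.5f} and {gap_limit_high:.5f}")
print(f"decay rate lambda : between {lambda_low:.5f} and {lambda_high:.5f}")
```

Both ends of the resulting window round to 0.6044, which is the origin of the estimate $-0.6044 \pm 0.0001$ for the limit of the LP column.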
Conjecture 3.1. There exists a constant λ with 0.604 < λ < 0.605 such that the linear programming bound for the sphere packing density in $\mathbb{R}^d$ is $2^{-(\lambda+o(1))d}$ as d → ∞ when the auxiliary function is fully optimized.
Of course we have no proof of this conjecture, or even a heuristic derivation. It is possible that the numbers could behave entirely differently when d is much larger, but that does not seem plausible. We are very confident in the first three decimal places of the estimate 0.6044 for λ, and fairly confident in the fourth. In fact, we have a proposal for the exact constant:
Conjecture 3.2. The constant λ in Conjecture 3.1 is given by $2^{-\lambda} = \sqrt{e/(2\pi)}$. Equivalently, $\lim_{c \to \infty} \Delta^{\mathrm{LP}}_1(c)/c = 1/\pi^2$ and $\lim_{d \to \infty} A_-(d)/\sqrt{d} = 1/\pi$, where $A_-(d)$ denotes the optimal radius for the −1 eigenfunction uncertainty principle in $\mathbb{R}^d$.
Of course Conjecture 3.2 is speculative, and four digits of accuracy is far from enough to make a definitive argument for this value. The equivalent limits are much more appealing than the formula for λ, and their simplicity justifies going out on a limb. We have no great faith in this conjecture, but it is worth noting that a simple formula fits the data beautifully.
Our calculations also support Conjecture 1.5 from [11], which says that the sign change radii for the +1 and −1 eigenfunction uncertainty principles in $\mathbb{R}^d$ are the same asymptotically as d → ∞. See Table 3.2 and https://hdl.handle.net/1721.1/125646 for the numerical data. Specifically, the ratio $A_+(d)/A_-(d)$ seems to be 1 + O(1/d).
7 In the notation of [11], [...]
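The proposed constant is easy to compare with the numerical estimate. The sketch below reads the conjectured value with a square root, $2^{-\lambda} = \sqrt{e/(2\pi)}$, since that is the reading consistent with 0.604 < λ < 0.605; the reading without the square root is computed for contrast. The second half assumes the standard fact that the linear programming bound with sign change radius r is the volume of a ball of radius r/2, and takes r = √d/π as a hypothetical asymptotic radius.

```python
import math

lam = 0.5 * math.log2(2 * math.pi / math.e)    # from 2**(-lam) == sqrt(e/(2*pi))
lam_no_sqrt = math.log2(2 * math.pi / math.e)  # reading without the square root

def log2_ball_volume_per_dim(d, radius):
    """(1/d) * log2 of the volume of a d-dimensional ball of the given radius."""
    ln_vol = ((d / 2) * math.log(math.pi) + d * math.log(radius)
              - math.lgamma(d / 2 + 1))
    return ln_vol / (d * math.log(2))

d = 10 ** 6
# Ball of radius r/2 with r = sqrt(d)/pi (assumed asymptotic sign change radius).
rate = log2_ball_volume_per_dim(d, math.sqrt(d) / (2 * math.pi))
print(round(lam, 5), round(lam_no_sqrt, 5), round(rate, 5))
```

The exponent obtained from the ball volume approaches −λ as d grows, with a correction of order (log d)/d.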

Properties of the spectrum and degeneracies
The spectra of our numerically optimized solutions of the spinless modular bootstrap behave remarkably regularly when the central charge c is large: the scaling dimensions are closely approximated by $\Delta_n = n + (c-4)/8$. Later in this section we will examine how close this approximation is. The equation $\Delta_n = n + (c-4)/8$ amounts to the 1d generalized free fermion spectrum. This spectrum arose in analytic functionals for the 1d conformal bootstrap constructed by Mazáč [25], which were generalized to a basis by Mazáč and Paulos [53]. Hartman, Mazáč, and Rastelli [9] discovered that these functionals could be adapted to the 2d modular bootstrap with $U(1)^c$ or Virasoro symmetry, and special cases were independently constructed by Rolen and Wagner [54] and by Feigenbaum, Grabner, and Hardin [55]. Figure 4.2 shows the sphere packing bounds obtained from these functionals. No proof is known that the functionals satisfy the required inequalities, and indeed they do not for 8 < d < 24; in particular, they would prove an impossibly good linear programming bound for d = 16 (see the dual bounds in [40]). However, the inequalities seem to hold for other values of d. Unfortunately, the resulting bounds are disappointing. The fact that the $\Delta^{\mathrm{LP}}_1(c)$ curve in Figure 4.1 bends below the free fermion line $\Delta_1 = 1 + (c-4)/8$ is crucial for obtaining a strong bound, and the quality of the bound depends on the degree of deflection.
In contrast to the behavior for large c, the spectra for small c are much less regularly spaced. As we decrease c below 4, the scaling dimensions in Figure 4.1 start to diverge unpredictably from the green lines, and the behavior for c ∈ {1/2, 1} is entirely different. Our numerical techniques break down at c = 3/2, and it is presumably not a coincidence that this failure occurs at the transition between different regimes. It would be interesting to explore this transition for 1 < c < 2.
[...] Figure 2]. Although we have no proof, we can predict the form of this divergence. It occurs starting at n ∼ (2/π)N, with shape determined by the following conjecture: [...]
Conjecture 4.1 says that the distribution shifts from uniform to a beta distribution around n ∼ (2/π)N. The uniform distribution corresponds to the 1d generalized free fermion spectrum, and the beta distribution describes the root distribution of high-degree Laguerre polynomials (see [57, Theorem 1]). Specifically, if we normalize the roots of the highest-degree polynomial $L^{(c-1)}_{4N-1}(4\pi\Delta)$ from the truncated crossing equation (2.2) by dividing by a factor of N, then their distribution converges as N → ∞ to the beta distribution on [0, 4/π] with density $x \mapsto x^{-1/2}(4/\pi - x)^{1/2}/2$. From this perspective, the transition in Conjecture 4.1 is between the uniform limiting behavior as N → ∞ and a generic root distribution corresponding to high-degree Fourier eigenfunctions.
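As a sanity check on the limiting density quoted above, the following sketch verifies that $x^{-1/2}(4/\pi - x)^{1/2}/2$ integrates to 1 on [0, 4/π]. The closed-form distribution function comes from the substitution x = (4/π) sin²φ, which is our own derivation and not a formula from the text.

```python
import math

A = 4 / math.pi   # right endpoint of the support [0, 4/pi]

def density(x):
    """Limiting density of the normalized Laguerre roots, as stated in the text."""
    return x ** -0.5 * (A - x) ** 0.5 / 2

def cdf(x):
    """Closed form obtained from the substitution x = A*sin(phi)**2."""
    phi = math.asin(math.sqrt(x / A))
    return (A / 2) * (phi + math.sin(2 * phi) / 2)

def midpoint_integral(f, a, b, n=20000):
    """Plain midpoint rule on an interval where f is smooth."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

numeric = midpoint_integral(density, 0.2, 1.0)
exact = cdf(1.0) - cdf(0.2)
print(cdf(A), numeric, exact)
```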

Convergence to the free fermion spectrum
Conjecture 4.1 gives the following description of the curves in Figure 4.3 via Theorem 4 in [58]. We wish to approximate $\Delta^{\mathrm{LP},N}_n(c)/N$ as N → ∞ with n/N → α for some constant α. The approximation is given by formula (4.1), [...] where β is the solution of β − sin β = (1 − α)π/2 with 0 ≤ β ≤ π/2. We will give some motivation for the high-energy portion of this formula in Section 4.6, after discussing degeneracies.
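The auxiliary angle β has no closed form, but it is simple to compute. The sketch below solves β − sin β = (1 − α)π/2 by bisection; the function name `solve_beta` is ours. Because β − sin β increases from 0 to π/2 − 1 on [0, π/2], a solution in that interval exists exactly when 2/π ≤ α ≤ 1, which matches the transition point n ∼ (2/π)N mentioned earlier.

```python
import math

def solve_beta(alpha, tol=1e-12):
    """Solve beta - sin(beta) = (1 - alpha)*pi/2 for beta in [0, pi/2] by bisection."""
    target = (1 - alpha) * math.pi / 2
    if not 0 <= target <= math.pi / 2 - 1 + 1e-15:
        raise ValueError("no solution with 0 <= beta <= pi/2")
    lo, hi = 0.0, math.pi / 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid - math.sin(mid) < target:
            lo = mid        # left side still too small: move right
        else:
            hi = mid
    return (lo + hi) / 2

for alpha in (2 / math.pi, 0.7, 0.8, 0.9, 1.0):
    print(round(alpha, 4), round(solve_beta(alpha), 6))
```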

Deviations from the free fermion spectrum
Aside from c = 4 or 12, the spectrum ∆ LP n (c) is not exactly equal to the 1d generalized free fermion spectrum ∆ n (c) = n + (c − 4)/8. Instead, we always find some error in this approximation, and one natural question is whether the error tends to 0 as n → ∞. Figure 4.6 shows four test cases, namely c ∈ {16, 20, 24, 28}. In each of these cases, the error |∆ LP n (c) − (n + (c − 4)/8)| becomes small, no more than 10 −5 when n is large. However, there is a striking difference between two different scenarios: when c = 16 or 24, the error stabilizes above zero, while it seems to converge to 0 when c = 20 or 28. We have no conceptual explanation of this behavior, which seems to be periodic modulo 8 as c varies, with multiples of 8 being the worst case and 4 modulo 8 being the best case (and the only case with convergence to zero). Note that c = 4 and 12 fall into the latter category. In the plot, there are singularities at c = 4 and 12 that correspond to ε(c) = 0, with similar cusps at c = 20, 28, and 36, and again we see the largest error terms when c is a multiple of 8. This unexpected periodicity shows that the spinless modular bootstrap has a much richer number-theoretic structure than is apparent from the smooth plot of ∆ LP 1 (c) in Figure 3

Growth rate of the degeneracies
Solving the truncated crossing equation (2.2) yields not just scaling dimensions $\Delta^{\mathrm{LP},N}_n(c)$ but also corresponding degeneracies $d^{\mathrm{LP},N}_n(c)$, which converge as N → ∞ to the degeneracies $d^{\mathrm{LP}}_n(c)$ of a hypothetical CFT that attains the spinless modular bootstrap bound. For c ≤ 50 and c ∉ {1/2, 1, 4, 12}, numerical calculations show that these degeneracies are not integers, and thus they cannot come from an actual CFT. (They could still come from a sphere packing. In that case $d_n$ would be the average number of sphere centers at distance $\sqrt{2\Delta_n}$ from a given sphere center, which need not be an integer.) For larger c, it is difficult to assess integrality, because the degeneracies grow exponentially as c → ∞ and must therefore be computed to high precision.
The cumulative growth rate of the degeneracies is determined by modularity as follows. Because $\eta(-1/\tau) = (\tau/i)^{1/2}\eta(\tau)$, the modular invariance of the partition function $Z(\tau) = \sum_\Delta d_\Delta e^{2\pi i \tau \Delta}/\eta(\tau)^{2c}$ implies that $\sum_\Delta d_\Delta e^{2\pi i \tau \Delta} = (i/\tau)^c \sum_\Delta d_\Delta e^{-2\pi i \Delta/\tau}$. If we set τ = iβ/(2π) and let β → 0, we find that $\sum_\Delta d_\Delta e^{-\beta\Delta} \sim (2\pi/\beta)^c$.

Now the Karamata Tauberian theorem [59, Theorem 4.3 of Chapter V] implies that $\sum_{\Delta \le A} d_\Delta \sim (2\pi A)^c/\Gamma(c+1)$ as A → ∞.
In other words, the function $\rho_c(\Delta) = (2\pi)^c \Delta^{c-1}/\Gamma(c)$ is the $U(1)^c$ analogue of the Cardy formula [60] for degeneracies, because $\sum_{\Delta \le A} d_\Delta \sim \int_0^A \rho_c(\Delta)\, d\Delta$. A similar formula applies to operator-product coefficients that appear in the bootstrap equations for conformal correlators [61]. Because the scaling dimensions $\Delta^{\mathrm{LP}}_n(c)$ are uniformly spaced with distance 1 asymptotically, we expect that $d^{\mathrm{LP}}_n(c)$ will be roughly $\rho_c(\Delta^{\mathrm{LP}}_n(c))$ as n → ∞. The asymptotic formula (4.2) gives a sense in which this approximation is true on average, but we will see that the precise behavior is far more delicate. In the Virasoro case, a more fine-grained understanding of the asymptotic spectrum has been obtained recently using complex Tauberian theorems [62,63,64,65,66]. It would be interesting to do the same for sphere packing. This could perhaps explain the linear portion of the large-c spectrum, where the level spacing is very close to 1.
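The statement that the Cardy-type formula holds on average but not term by term can be seen concretely for c = 4, where the degeneracies are the coefficients 240σ₃(n) of the E8 theta series at scaling dimensions ∆ₙ = n. The sketch below assumes the density $\rho_c(\Delta) = (2\pi)^c \Delta^{c-1}/\Gamma(c)$, which is the form consistent with the identity $d_n/\rho_4(n) = \sigma_3(n)/(\zeta(4) n^3)$ used in the next subsection; the function names are ours.

```python
import math

def sigma3(n):
    """Sum of the cubes of the divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)

def rho(c, delta):
    """Assumed U(1)^c Cardy density (2*pi)^c * delta^(c-1) / Gamma(c)."""
    return (2 * math.pi) ** c * delta ** (c - 1) / math.gamma(c)

def cumulative_ratio(A):
    """(number of states with dimension <= A) / ((2*pi*A)^4 / Gamma(5)) for c = 4."""
    count = 1 + sum(240 * sigma3(n) for n in range(1, A + 1))  # vacuum included
    return count / ((2 * math.pi * A) ** 4 / math.gamma(5))

# Normalized degeneracies d_n / rho_4(n) for the first few levels.
pointwise = [240 * sigma3(n) / rho(4, n) for n in range(1, 50)]
print(cumulative_ratio(200), min(pointwise), max(pointwise))
```

The cumulative ratio is close to 1, while the individual normalized degeneracies stay in a band around 1 without approaching it.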

Degeneracies for c = 4 and 12
The degeneracies d LP n (4) and d LP n (12) are the coefficients of the theta series of the E 8 and Leech lattices, respectively. Much is known about these modular forms, including precise descriptions of their coefficients (see, for example, [21, p. 122 and p. 134]). The degeneracies are well understood, but far more subtle than the scaling dimensions ∆ LP n (4) = n and ∆ LP n (12) = n + 1. Figure 4.8 shows the normalized degeneracies d LP n (4)/ρ 4 (∆ LP n (4)) for c = 4. They are bounded above and below, but do not converge to 1; instead, they are strictly bounded away from 1. The most noteworthy aspect of Figure 4.8 is that the normalized degeneracies are almost, but not quite, periodic. 9 This near periodicity is explained by a classical formula for coefficients of Eisenstein series: d LP n (4)/ρ 4 (∆ LP n (4)) = σ 3 (n)/(ζ(4)n 3 ), where σ k (n) denotes the sum of the k-th powers of the divisors of n, and ζ is the Riemann zeta function. This function is not periodic, but for each ε > 0, there exists a natural number m such that if n 1 ≡ n 2 (mod m), then |σ 3 (n 1 )/n 3 1 − σ 3 (n 2 )/n 3 2 | < ε. In other words, for each ε > 0 it is approximately periodic to within ε, with the period length growing as ε → 0. Similarly, the function n → d LP n (12)/ρ 12 (∆ LP n (12)) is the sum of the almost periodic function n → σ 11 (n)/(ζ(12)n 3 )

Degeneracies for arbitrary c
The unexpected periodicity modulo 8 for scaling dimensions has a counterpart for degeneracies, as shown in Figure 4.9. Here the multiples of 8 are the best case for the accuracy of the U (1) c Cardy formula, while integers that are 4 modulo 8 are the worst case. Unlike the case of scaling dimensions, the cusps in Figure 4.9 do not seem to correspond to zero error in the limit as n → ∞. Instead, see Figure 4.10. Perhaps this discrepancy indicates that ρ c (∆) should be replaced with some better approximation.
Based on this data and the cases c = 4 and 12, we make the following conjecture, which is a more precise analogue of Conjecture 4.2: with equality whenever c is an integer that is congruent to 4 modulo 8.
Equality provably holds for c = 4 or 12.

Possible explanation of the high-energy spectrum
The formula for the upper portion of the high-energy spectrum, given by the second line in (4.1), was motivated by the following calculation. It does not fully explain the formula, but it indicates why it is a reasonable guess.
The counterpart of the Cardy formula in terms of functionals acting on the crossing equation is the integral identity which is the evaluation at 0 of the condition of being a radial Fourier eigenfunction. We will use this identity to generate an exact solution of the truncated crossing equations, which has too many states but is otherwise suggestive of the optimal solution. For small enough k, we can evaluate the integral exactly using the Gauss-Laguerre quadrature formula If we take n = 2N and p(x) = L (c−1) k (2x), we find that for k ≤ 4N − 1, where d m = w m /Γ(c) and the dimensions ∆ m satisfy L (c−1) 2N (2π∆ m ) = 0. In other words, we have generated a solution to the equations for odd k ≤ 4N − 1. This solution has 2N + 1 states, while our numerical method involves finding a solution to these truncated crossing equations with only N + 1 states, so this solution has no direct bearing on the bound. However we observe numerically that the high energy spectrum approximately agrees. The roots of the Laguerre polynomial L (c−1) 2N (2π∆) for large N and ∆ are given by the beta distribution described in Section 4.1, which motivates its appearance.
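The integral identity referred to above is elided in this copy of the text. One natural reading, which we adopt here as an assumption, is $\int_0^\infty \rho_c(\Delta) L^{(c-1)}_k(4\pi\Delta) e^{-2\pi\Delta}\, d\Delta = (-1)^k L^{(c-1)}_k(0)$, that is, the statement that $L^{(c-1)}_k(2\pi|x|^2) e^{-\pi|x|^2}$ is a radial Fourier eigenfunction in dimension 2c with eigenvalue $(-1)^k$, evaluated at the origin. With x = 2π∆ and α = c − 1 this becomes an identity for the Laguerre weight, which the sketch checks in exact integer arithmetic for integer α. The function names are ours.

```python
from math import comb

def laguerre_integral(k, alpha):
    """(1/alpha!) * integral_0^inf x^alpha e^(-x) L_k^(alpha)(2x) dx, alpha a nonnegative integer.

    Uses L_k^(alpha)(y) = sum_j (-1)^j C(k+alpha, k-j) y^j / j!
    and (1/alpha!) * integral x^(alpha+j) e^(-x) dx = (alpha+j)!/alpha!.
    """
    return sum((-2) ** j * comb(k + alpha, k - j) * comb(alpha + j, j)
               for j in range(k + 1))

def laguerre_at_zero(k, alpha):
    """L_k^(alpha)(0) = C(k + alpha, k)."""
    return comb(k + alpha, k)

for k in range(6):
    print(k, laguerre_integral(k, 3), (-1) ** k * laguerre_at_zero(k, 3))
```

For odd k the identity says that the vacuum term plus the integral vanishes, which is the continuous counterpart of the odd-k truncated crossing equations; Gauss-Laguerre quadrature with 2N nodes then reproduces it exactly for k ≤ 4N − 1.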

Spherical codes and implied kissing numbers
In sphere packing terms, the scaling dimensions ∆ n measure the distances √ 2∆ n between distinct sphere centers in the packing, and the corresponding degeneracy is the average number of centers at that distance from a given center. In particular, the first degeneracy d 1 is the average number of tangencies for spheres in the packing, i.e., the average kissing number of the packing. De Laat, Oliveira, and Vallentin [68] showed how to strengthen the linear programming bound for sphere packing by incorporating geometric bounds for the degeneracies, which go beyond the modular invariance of the partition function. In this section, we will use this idea systematically to explore when the linear programming bound can be sharp and how to improve on it. Along the way, we will review the Kabatyanskii-Levenshtein bound.
As motivation for this line of work, consider the implied kissing number d LP 1 (d/2), which is the average kissing number in a hypothetical d-dimensional packing achieving the linear programming bound. When d = 1, 2, 8, or 24, we of course obtain the kissing number of the optimal packing, but in general we obtain unrealistic numbers. For example, when d = 4 the implied kissing number is 26.43 . . . , which exceeds Musin's optimal bound of 24 for the four-dimensional kissing number [69]; it is therefore impossible for any packing to achieve the exact linear programming bound in R 4 . As Figure 5.1 shows, the implied kissing number is impossibly high for every d ≤ 24 except the known sharp cases. Within this range of dimensions, it perfectly delineates which cases are sharp. Figure 5.2 shows how the implied kissing number grows in high dimensions. Comparing it with upper bounds turns out to be surprisingly subtle, and we will do so in Figure 5.3 once we have explained more about the needed bounds. Aside from low dimensions, Figure 5.2 looks similar to Figure 3.3, and that is not a coincidence: Table 5.1 indicates that the implied kissing number is 2 d+o(d) times the linear programming bound as d → ∞. This relationship is easily explained using the U (1) c Cardy formula, because the sphere packing density is ρ c (∆ 1 )∆ 1 /(c4 c ) in terms of the spectral gap ∆ 1 and c = d/2. We can approximate the implied kissing number d LP 1 (c) by ρ c (∆ LP 1 (c)) using the Cardy formula; because of the size of ∆ LP 1 (c) we expect some error, but the error factor should be subexponential in c, in fact roughly ∆ LP 2 (c) − ∆ LP 1 (c). We conclude that the linear programming bound for the packing density is d LP 1 (c)/(4 + o(1)) c as c → ∞, which is the desired relationship. To prove bounds for kissing numbers, the relevant optimization problem is the spherical code problem, which is a compact analogue of the sphere packing problem. 
In dimension d and with minimal angle θ, this problem asks how large a subset C of the unit sphere in $\mathbb{R}^d$ can be if ⟨x, y⟩ ≤ s for all distinct x, y ∈ C, where s = cos θ. In other words, all points in C must be separated by at least a distance of θ along the surface of the sphere, and so C yields a packing with spherical caps of radius θ/2. Such a set is called a spherical code with minimal angle θ. The kissing problem amounts to the case θ = π/3; note that here we are considering the kissing problem for a single sphere, rather than averaged over a packing in Euclidean space.
[Figure: The implied kissing number from the linear programming bound, compared with the best upper bound known for the actual kissing number [70] and the current record [21].]
Let A(d, s) be the largest possible size of such a code. Delsarte, Goethals, and Seidel [73] introduced a linear programming bound for A(d, s), which Kabatyanskii and Levenshtein [19] used to obtain the best sphere packing density bounds known in Euclidean space.
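As a concrete instance of a spherical code, the sketch below builds the 24 minimal vectors of the D4 root lattice, rescaled to the unit sphere in four dimensions, and checks that all pairwise inner products are at most 1/2. This shows A(4, 1/2) ≥ 24, which matches Musin's upper bound of 24 cited above. The choice of D4 is ours, made because it is the standard example in this dimension.

```python
import itertools
import math

def d4_minimal_vectors():
    """All vectors with two entries +-1/sqrt(2) and two entries 0 (24 in total)."""
    vecs = []
    for i, j in itertools.combinations(range(4), 2):
        for si, sj in itertools.product((1, -1), repeat=2):
            v = [0.0] * 4
            v[i], v[j] = si / math.sqrt(2), sj / math.sqrt(2)
            vecs.append(tuple(v))
    return vecs

def max_inner_product(code):
    """Largest inner product between distinct points of the code."""
    return max(sum(a * b for a, b in zip(x, y))
               for x, y in itertools.combinations(code, 2))

code = d4_minimal_vectors()
print(len(code), max_inner_product(code))
```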
After briefly reviewing this linear programming bound, we will discuss two applications of spherical codes to the sphere packing problem, followed by a new average kissing bound. First we discuss the Kabatyanskii-Levenshtein bound, using the approach from [49]. Then we discuss a strengthening of the linear programming bound for the sphere packing problem through bounds for spherical codes, and its implications for when the bound can be tight. We conclude with a linear programming bound for the average kissing number.
The results of this section rely on the geometry of the sphere packing problem and do not appear to have any direct application to CFTs. On the other hand, conceptually, upper bounds on the average kissing number are similar to upper bounds on operator-product coefficients often considered in the bootstrap literature (e.g., [74,75]). It is also our hope that the methods in this section (in particular, the application of the Christoffel-Darboux formula to produce positive auxiliary functions) will inspire new analytic approaches to the conformal bootstrap.
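The expression $\rho_c(\Delta_1)\Delta_1/(c\,4^c)$ for the packing density used earlier in this section can be checked against the ball-volume form of the bound. The sketch assumes $\rho_c(\Delta) = (2\pi)^c \Delta^{c-1}/\Gamma(c)$ and the standard fact that the bound is the volume of a ball of radius $\sqrt{2\Delta_1}/2$ in dimension 2c. It uses the two sharp cases c = 4 with ∆₁ = 1 and c = 12 with ∆₁ = 2, whose densities π⁴/384 and π¹²/12! are the known E8 and Leech lattice values.

```python
import math

def rho(c, delta):
    """Assumed U(1)^c Cardy density (2*pi)^c * delta^(c-1) / Gamma(c)."""
    return (2 * math.pi) ** c * delta ** (c - 1) / math.gamma(c)

def density_from_gap(c, delta1):
    """Packing density bound written in terms of the spectral gap, as in the text."""
    return rho(c, delta1) * delta1 / (c * 4 ** c)

def ball_volume(d, radius):
    """Volume of a d-dimensional ball."""
    return math.pi ** (d / 2) * radius ** d / math.gamma(d / 2 + 1)

for c, delta1 in ((4, 1), (12, 2)):
    print(c, density_from_gap(c, delta1),
          ball_volume(2 * c, math.sqrt(2 * delta1) / 2))
```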

The linear programming bound
The analogue of the radial Fourier transform on the surface of a sphere is the expansion in terms of zonal spherical harmonics, which uses the following orthogonal polynomials. 10 Let d be the dimension of the spherical code, and let a and b be nonnegative integers. (We can take a = b = 0 for now, but we will make use of a and b in the Kabatyanskii-Levenshtein bound.) Let where the normalization is chosen so that 1 −1 dt w a,b (t) = 1. Define the orthogonal polynomials Q a,b i (t) with deg(Q a,b i ) = i and positive leading coefficients by for i, j ≥ 0. Up to normalization, these polynomials are the Jacobi polynomials with parameters where e ∈ S d−1 is an arbitrary point and µ is the surface measure on the sphere S d−1 , normalized so that µ(S d−1 ) = 1. Therefore the polynomials Q i are orthogonal if we think of them as zonal functions on S d−1 , i.e., functions x → Q i ( x, e ) invariant under the stabilizer subgroup of O(d) with respect to e. Moreover, these polynomials are of positive type: for all finite C ⊆ S d−1 and all coefficients c x ∈ R for x ∈ C, This inequality follows from the addition formula where the functions v i,j : R d → R for j = 1, . . . , r i are an orthonormal basis of the spherical harmonics of degree i (see [76,Theorem 9.6.3]); specifically, the addition formula shows that the left side of (5.1) is a square and therefore nonnegative. The linear programming bound for spherical codes converts an auxiliary function into an upper bound on the greatest size of a spherical code: One can optimize this bound numerically for given d and N using semidefinite programming. Specifically, we can create a semidefinite program where f 0 , . . . , f N are one-by-one positive semidefinite matrices, and the inequality constraint is modeled as where X and Y are positive semidefinite matrices and v k (t) = (Q 0 (t), . . . , Q N/2 −k (t)).
If we additionally require f 0 = 1, then the objective is f (1), which means we are minimizing a linear functional over positive semidefinite matrices with linear constraints. This semidefinite program can be solved numerically on a computer.
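To make the bound concrete, the sketch below checks a classical certificate rather than solving the semidefinite program. It assumes the usual form of the theorem (with a = b = 0): if $f = \sum_i f_i Q_i$ with $f_0 > 0$, all $f_i \ge 0$, and f(t) ≤ 0 for −1 ≤ t ≤ s, then A(d, s) ≤ f(1)/f_0. The polynomial $f(t) = (t+1)(t+1/2)^2 t^2 (t-1/2)$ for d = 8 and s = 1/2 is the well-known choice from the literature on kissing numbers, not a function taken from this paper. The orthogonal polynomials are built by Gram-Schmidt in exact rational arithmetic, so they are monic rather than normalized as in the text.

```python
from fractions import Fraction

D = 8  # the code lives on the unit sphere in R^D

def moment(k):
    """Integral of t^k against the normalized weight (1 - t^2)^((D-3)/2) on [-1, 1]."""
    if k % 2:
        return Fraction(0)
    m = Fraction(1)
    for j in range(k // 2):
        m *= Fraction(2 * j + 1, D + 2 * j)
    return m

def inner(p, q):
    """Inner product of two polynomials given as coefficient lists (low degree first)."""
    return sum(a * b * moment(i + j) for i, a in enumerate(p) for j, b in enumerate(q))

def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_eval(p, t):
    return sum(a * t ** i for i, a in enumerate(p))

def monic_orthogonal(n):
    """Gram-Schmidt on 1, t, t^2, ...: monic orthogonal polynomials up to degree n."""
    basis = []
    for i in range(n + 1):
        mono = [Fraction(0)] * i + [Fraction(1)]
        q = list(mono)
        for b in basis:
            coef = inner(mono, b) / inner(b, b)
            for j, bj in enumerate(b):
                q[j] -= coef * bj
        basis.append(q)
    return basis

half = Fraction(1, 2)
f = [Fraction(1)]
for factor in ([1, 1], [half, 1], [half, 1], [0, 0, 1], [-half, 1]):
    f = poly_mul(f, [Fraction(x) for x in factor])

Q = monic_orthogonal(6)
coeffs = [inner(f, q) / inner(q, q) for q in Q]   # f = sum_i coeffs[i] * Q[i]
bound = sum(f) / coeffs[0]                        # f(1) is the sum of the coefficients
print(coeffs, bound)
```

All expansion coefficients are nonnegative and the resulting bound is the kissing number in dimension 8, one of the cases in which the linear programming bound is sharp.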
Shtrom [78] computed the exact linear programming bound for the kissing number A(d, 1/2) when d ≤ 146, by determining the optimal value of N , beyond which there is no improvement. We have extended these computations to d ≤ 424 using N = 95, which appears to be high enough in this range of dimensions and in any case should closely approximate the optimum. Figure 5.3 shows the ratio of the implied kissing number to this upper bound. They are very close to each other in size, but their precise ratio seems difficult to predict, and we do not know what happens as d → ∞. No sphere packing can match the linear programming bound for density when this ratio is strictly greater than 1. Our initial hope was that this condition would rule out every dimension d > 24, but it does not. Instead, further progress may depend on more powerful bounds for the kissing number, such as semidefinite programming bounds [79,70]. This approach can rule out exact equality in the linear programming bound for packing density, but it does not give a quantitative improvement. We will return to the problem of improving the density bound, once we explain Levenshtein's universal bound and the Kabatyanskii-Levenshtein bound.
Since Q 1,0 k (t) is of positive type, these inequalities show K 1,0 k−1 (t, s) is of positive type as a function of t. Moreover, they also show (t − s)K 1,0 k−1 (t, s) is of positive type as a function of t by using the Christoffel-Darboux formula, which says [...] where c k > 0 is the leading coefficient of Q 1,0 k . Since the product of functions of positive type is also of positive type, it follows that f (s) (t) is of positive type for t 1,1 k−1 ≤ s < t 1,0 k . By using the property that (t + 1)Q 1,1 is of positive type, this argument can be extended to show f (s) (t) is of positive type for all s ∈ [−1, 1]. Thus, these functions can be used as auxiliary functions for Theorem 5.1, which gives Levenshtein's universal bound for the sphere. In terms of the normalized polynomials Q i (s) = Q i (s)/Q i (1), we arrive at the bound [...] In certain cases this bound is the best that can be obtained from Theorem 5.1, but in general it does not fully optimize the choice of auxiliary function.

The Kabatyanskii-Levenshtein bound
The following geometric inequality shows how the sphere packing density ∆ R d in R d can be bounded using A(d, s): (see (6.9) in [49] or Proposition 2.1 in [20]). To obtain a good bound for fixed d, this inequality can be combined with Levenshtein's universal bound for A(d, s), where the best value of s can be found by optimizing a piecewise differentiable function. The resulting bound is the one shown as the Kabatyanskii-Levenshtein bound in Figure 3.3, although Kabatyanskii and Levenshtein [19] used a slightly worse bound for A(d, s) as well as for ∆ R d in terms of A(d, s).
To obtain an asymptotic bound as d → ∞, we can use the inequality [...] when $s \le t^{1,1}_k$ (see (6.13) in [49]). If k, d → ∞ with k/d → α, then [...] by Corollary 5.17 in [49]. For θ < π/2, taking α = (1 − sin θ)/(2 sin θ) ensures that $t^{1,1}_k \to \cos\theta$, and applying Stirling's formula shows that [...] as d → ∞. It now follows from (5.2) that the sphere packing density is at most $2^{-(\kappa+o(1))d}$ [...]. Optimizing for the best choice of θ between π/3 and π/2 yields the root θ = 1.09951240 . . . of $\sec\theta + \tan\theta = e^{(\tan\theta + \sin\theta)/2}$, at which point we obtain the Kabatyanskii-Levenshtein bound κ = 0.59905576 . . . . Cohn and Zhao [20] gave a general transformation showing that any bound obtained from Theorem 5.1 and (5.2) can also be obtained directly from the Euclidean linear programming bound. Thus, there is no need to use spherical codes to obtain the Kabatyanskii-Levenshtein bound. However, the transformation sheds little additional light on this bound, and it is difficult to see how someone might think of it without using spherical codes.
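The optimal angle and the exponent can be reproduced numerically. The sketch solves the stated equation for θ by bisection on its logarithmic form. The formula used for κ is an assumption on our part: it is the standard Kabatyanskii-Levenshtein expression $\kappa(\theta) = -\log_2 \sin(\theta/2) - \big[(1+\alpha)\log_2(1+\alpha) - \alpha\log_2\alpha\big]$ with α = (1 − sin θ)/(2 sin θ), since the intermediate displays are elided in this copy of the text.

```python
import math

def g(theta):
    """log(sec + tan) - (tan + sin)/2; its root in (pi/3, pi/2) is the optimal angle."""
    return (math.log(1 / math.cos(theta) + math.tan(theta))
            - (math.tan(theta) + math.sin(theta)) / 2)

def bisect(func, lo, hi, tol=1e-13):
    """Bisection for a function that is positive at lo and negative at hi."""
    assert func(lo) > 0 > func(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if func(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def kappa(theta):
    """Assumed exponent: spherical code rate combined with the factor sin(theta/2)^d."""
    s = math.sin(theta)
    alpha = (1 - s) / (2 * s)
    rate = (1 + alpha) * math.log2(1 + alpha) - alpha * math.log2(alpha)
    return -math.log2(math.sin(theta / 2)) - rate

theta = bisect(g, math.pi / 3, 1.5)
print(theta, kappa(theta))
```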

Implied kissing numbers
One can strengthen the linear programming bound by taking into account constraints on spherical codes. The following relaxation of the kissing number will prove useful in doing so. Let C be the set of sphere centers in a packing. For x ∈ C and r ≥ 0, let N x (r) = #{y ∈ C : 0 < |x − y| ≤ r}, and let N (r) be the average of N x (r) over x ∈ C (we can restrict our attention to periodic packings, so that this average is well defined). If r 0 is the minimal distance in C, then N (r 0 ) is the average kissing number of C, and for t > 0 we define the average t-neighbor number to be N (tr 0 ). In other words, the average kissing number is the average 1-neighbor number.
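The definition of the average t-neighbor number can be illustrated on the simplest periodic example. The sketch below uses the square lattice Z² (our choice, purely for illustration), where every center has the same neighborhood, so the average equals the count at the origin. The cutoff `reach` and the tolerance `eps` are implementation details of the sketch.

```python
import itertools
import math

def neighbor_number(points, x, r, eps=1e-9):
    """N_x(r): number of points y with 0 < |x - y| <= r."""
    return sum(1 for y in points if 0 < math.dist(x, y) <= r + eps)

def square_lattice_t_neighbor_number(t, reach=4):
    """Average t-neighbor number of Z^2, whose minimal distance is r0 = 1.

    Valid for t <= reach, since only points in the box [-reach, reach]^2 are listed.
    """
    pts = list(itertools.product(range(-reach, reach + 1), repeat=2))
    r0 = 1.0
    return neighbor_number(pts, (0, 0), t * r0)

for t in (1.0, math.sqrt(2), 2.0):
    print(round(t, 3), square_lattice_t_neighbor_number(t))
```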
The following strengthening of the linear programming bound is a special case of Theorem 1.4 in [68], where it was used to give improved bounds in dimensions 4 through 7 and 9. The proof in [68] amounts to retaining more terms in the Poisson summation argument from [8].
Theorem 5.2 ([68]). Let $g : \mathbb{R}^d \to \mathbb{R}$ be a radial Schwartz function, and suppose that g satisfies the following inequalities for some η ≥ 0 and s > r > 0: (1) g(0) > 0 and $\hat g(0) > 0$, [...] Suppose furthermore that every sphere packing in $\mathbb{R}^d$ has average s/r-neighbor number at most M. Then the sphere packing density in $\mathbb{R}^d$ is at most [...]
The linear programming bound is equivalent to taking η = 0. The extra flexibility of being able to choose g(0) and $\hat g(0)$ is irrelevant, because we can rescale g and its input variable, but it will be convenient below.
When the implied kissing number is impossibly large, we can apply Theorem 5.2 to improve on the linear programming bound from Theorem 2.1 as follows. Suppose h is a Schwartz function satisfying the hypotheses of the linear programming bound with radius r. One can check by a rescaling argument that the average kissing number of any sphere packing that achieves this bound must be [...] where h′(r) denotes the radial derivative at radius r (see Lemma 5.1 in [56]). Thus, if h is the optimal auxiliary function in the bound, then K must be the implied kissing number. Suppose furthermore that for some t > 1 we can prove an upper bound B for the average t-neighbor number in every packing in $\mathbb{R}^d$ with B < K. For example, B could be an upper bound for A(d, cos θ) for some θ > π/3, which can be arbitrarily close to π/3. Given such a bound B < K, Theorem 5.2 proves a strictly stronger density bound than Theorem 2.1 using h, as follows. Let s = (1 + ε)r for some small ε > 0, and define g by g(x) = h(x/(1 + ε)). Then g satisfies the hypotheses of Theorem 5.2 with η = −rh′(r)ε + O(ε²), g(0) = 1, and [...] If ε is small enough, then we can take M = B in Theorem 5.2, and [...] which is less than 1 when ε is sufficiently small, because B < K. The improvement here is not large, but the resulting density bound is strictly better than that from Theorem 2.1. Thus, Figure 5.3 shows that we can extend the improved density bound from [68]

Bounds for the average kissing number
The implied kissing number has a concrete geometric meaning, beyond being the average kissing number of a hypothetical packing. It turns out to be an upper bound for the average kissing number of any sphere packing, subject to some conjectures about interpolation. The key tool is the following theorem, which is the Euclidean analogue of Proposition 4.1 in [80] by Bourque and Petri.
Then the average kissing number of any d-dimensional sphere packing is at most Here f (r) denotes the value of f (x) when |x| = r.
Proof. It suffices to prove the inequality for finite packings and take a limit. Let C be any finite subset of R d with minimal distance r, and let N = #{(x, y) ∈ C 2 : |x − y| = r}/|C| be its average kissing number. Then Fourier inversion implies that thanks to the inequalities for f . By combining these two bounds, we conclude that N ≤ −f (0)/f (r).
One can also prove this theorem using Poisson summation, along the lines of [8] or [80]. The conditions for equality are similar to those for the linear programming bound for packing density, if we assume self-duality. Specifically, equality holds iff f vanishes at radius $r_n := \sqrt{2\Delta_n}$ for n ≥ 2, $\hat f$ vanishes at $r_n$ for n ≥ 0 (with $r_0 = 0$), $r = r_1$, and $f'(r_1) = 0$ (because otherwise shifting r would improve the bound). For comparison, the equality conditions for h in Theorem 2.1 are identical, except that the conditions $f'(r_1) = 0$ and $\hat f(0) = 0$ are replaced with $h(r_1) = 0$.
Suppose $r_n = \sqrt{2\Delta^{\mathrm{LP}}_n(d/2)}$ and $d_n = d^{\mathrm{LP}}_n(d/2)$ come from the optimal solution to the linear programming bound in $\mathbb{R}^d$. The crossing equation says that $\sum_{n \ge 0} d_n f(r_n) = \sum_{n \ge 0} d_n \hat f(r_n)$ with $d_0 = 1$, and under the equality conditions it reduces to $f(0) + d_1 f(r_1) = 0$. In other words, the bound $-f(0)/f(r_1)$ for the average kissing number is $d_1$, as desired.
When should such a function f exist and satisfy the hypotheses of Theorem 5.3? First, note that the condition $\hat f(0) = 0$ is redundant, for the following reason. If we let $F(x) = |x| f'(x)$, then $\hat F(y) = -d\,\hat f(y) - |y|\,\hat f'(y)$. The other conditions on f guarantee that $F(0) = F(r_n) = \hat F(r_n) = 0$ for n ≥ 1, and then the crossing equation implies that $\hat F(0) = 0$ and hence $\hat f(0) = 0$. What we need is for f to satisfy the same equality conditions as h, except for changing $h(r_1) = 0$ to $f'(r_1) = 0$.
These conditions arise naturally in interpolation problems [81,43]. Specifically, Open Problem 7.3 from [43] raised the question of whether radial Schwartz functions $g : \mathbb{R}^d \to \mathbb{R}$ are uniquely determined by the values and radial derivatives of g and $\hat g$ at the radii $r_n$ for n ≥ 1. While this assertion fails for d ≤ 2 and is difficult to test for d = 3, it seems to hold numerically for d ≥ 4. Proving or disproving it would be an important step forward in our understanding of the modular bootstrap.
The conditions on f and h mean they are part of an interpolation basis for reconstructing g from these values, since all but one of the values must vanish for f and h. Thus, Theorem 5.3 gives a natural geometric interpretation for one of the basis functions, just as Theorem 2.1 does.
Aside from d = 8 or 24 (in which case [43] proves an interpolation theorem), we do not know how to prove that an interpolation basis exists, or that the basis functions satisfy the right sign conditions for these theorems. However, the numerical evidence indicates that both are true. If so, the implied kissing number is an upper bound for the average kissing number of every sphere packing.
This relationship has a pleasing consequence: in each dimension, either the implied kissing number is the best bound known for the average kissing number, or we can use a better bound in Theorem 5.2 to improve on the packing density bound. In other words, if we fail to improve on the linear programming bound for density, it can only be because we have obtained an excellent bound for the average kissing number.

A The limiting case of the spinless modular bootstrap
In this appendix, we explain why ∆ gap is an upper bound for the spectral gap in the spinless modular bootstrap even if ω(Φ 0 ) = 0 (in the notation of Section 2.1). We expect that any such functional ω is the limit of functionals with ω(Φ 0 ) > 0, but it is not clear how to justify that expectation. Instead, we can use essentially the same argument as the proof of Proposition 2.4 in [11]. We will translate it into modular bootstrap terms for the convenience of the reader.

B Convergence of the spinless bootstrap
In this appendix, we examine the convergence rate of the spinless modular bootstrap as a function of the truncation order, and in particular explain why we are confident that the numerical calculations in Sections 3 and 4 have been fully optimized. Figure B.1 shows the density bounds obtained using truncation orders N = 1, 2, 4, . . . , 512 for dimensions d ≤ 2048, in the same format as Figure 3.3. 13 Each fixed N seems to lead to the same limit as d → ∞, analogously to [34, Section 3.2], but they closely approximate the optimal linear programming bound over increasingly large ranges of d. In particular, doubling N more or less doubles the range of dimensions over which we obtain a close approximation.
What Figure B.1 indicates is that N should be chosen proportionally to d if we wish to obtain comparably accurate results. Figure B.2 makes this assertion more precise, and shows that the required truncation order is remarkably close to linear in d. We obtained the limiting values ∆ LP 1 (c) in Figure B.2 by taking N quite large, in particular more than twice as large as needed to make the plotted values stop changing.
We do not know a formula for the slopes in Figure B.2. When we need to estimate the convergence rate in high dimensions, we extrapolate from lower dimensions and then make a conservative underestimate. What makes this procedure reliable is how close to linear  (c) have converged as a function of N . Here, the behavior is not nearly as linear, and it is more difficult to extrapolate.
The most delicate numerical estimation in this paper occurs in obtaining the number −0.6044 as the infinite-dimensional limit of the LP column in Table 3.1. Table B.1 gives evidence that the values in Table 3.1 are correctly extrapolated to infinite truncation order. Specifically, for each dimension Table B.1 lists the largest truncation order N we have computed, together with the smallest order $N_k$ that agrees with order N to k decimal places. The numbers in black are exact, meaning that truncation order $N_k - 1$ is not enough. In each such case N is at least $2N_k$, and often much larger than that; this margin of safety gives us confidence that these values do reflect the limit as N → ∞. The red numbers are obtained by doubling the numbers above them, which seems to produce an overestimate and would work in every other case with d > 2. Even for the red numbers, $N_5 < N$, and therefore we believe that our truncation orders are high enough for all the numbers in Table 3.1 to have stabilized.
Table B.1: The truncation order $N_k$ required to approximate (log₂ density)/d to within $0.5 \cdot 10^{-k}$, so that rounding to k decimal places leads to error at most $10^{-k}$.