Bootstrapping $\mathcal{N}=4$ sYM correlators using integrability

How much spectral information is needed to determine the correlation functions of a conformal theory? We study this question in the context of planar supersymmetric Yang-Mills theory, where integrability techniques accurately determine the single-trace spectrum at finite 't Hooft coupling. Corresponding OPE coefficients are constrained by dispersive sum rules, which implement crossing symmetry. Focusing on correlators of four stress-tensor multiplets, we construct combinations of sum rules which determine one-loop correlators, and we study a numerical bootstrap problem that nonperturbatively bounds planar OPE coefficients. We observe interesting cusps at the location of physical operators, and we obtain a nontrivial upper bound on the OPE coefficient of the Konishi operator outside the perturbative regime.


Introduction
The study of N = 4 supersymmetric Yang-Mills theory (sYM) has driven advancements in key areas of theoretical physics, including insights into the AdS/CFT correspondence through the development of scattering amplitudes, correlation functions, supersymmetric localization, integrability, and the conformal bootstrap. Some of these techniques offer exact results in special subsectors, while others are perturbative approximations which often exploit exact results as boundary conditions. It is interesting to ask to what extent the full theory is nonperturbatively determined by exact subsectors.
In the planar 't Hooft limit N = 4 sYM, integrability has led to impressive advancements for the computation of the spectrum of local operators, some correlations functions and scattering amplitudes at finite coupling in various kinematical limits [1][2][3]. This boundary data was instrumental in recent perturbative scattering amplitudes and form factors which reached a record high loop order (8 loops) [4] and high multiplicity (up to seven-points) [5,6]. In the strong coupling limit, most efforts have focused on correlators of 1 2 -BPS operators, which is primarily driven by string theory, supergravity, conformal bootstrap, and supersymmetric localization [7,8] techniques. Stringy corrections [9][10][11] and non-planar corrections [12,13] have been computed for four points, while the highest multiplicity correlator is the tree-level five-point function [14], obtained by exploiting all existing constraints. Although still out of reach for most observables, these developments suggest that more quantities will eventually be computed at finite coupling in this theory.
A promising nonperturbative avenue originates from the integrability literature: in the planar limit, the spectral problem is completely solved as a result of the Quantum Spectral Curve (QSC) [15][16][17], which governs the spectrum of single-trace operators at arbitrary values of the coupling; see ref. [18] for a recent pedagogical review. At the moment the QSC only provides spectrum of single-trace operators, and therefore dynamics, encoded through OPE coefficients for example, remain elusive at finite coupling. Nonetheless, there are encouraging developments with respect to the hexagon [19][20][21][22] and octagon [23,24] formalisms, which may overcome this challenge; the hexagon and octagon formalisms reinterpret the Feynman expansion as the scattering of magnons and they are readily computable in the large Rcharge limit. Unfortunately, away from this limit, the complexity of such computations grow exponentially.
An independent nonperturbative framework is the numerical conformal bootstrap [25][26][27]. The latter combines unitarity, crossing symmetry, and other nonperturbative constraints to resolve the spectrum of a generic conformal field theory and to bound its OPE coefficients; see [28] for a recent review of numerical bootstrap results. In the context of N = 4 sYM, this method provides a handle into energy-momentum correlation functions for finite gauge groups SU(N c ) [29], and, when combined with exact results from localization, arbitrary 't Hooft coupling [8].
It was recently proposed that integrability and bootstrap techniques could be combined to solve N = 4 sYM in the planar limit [30,31]. The authors introduced an approach ("Bootstrability") to study the 1D defect CFT defined by inserting local operators along a 1 2 -BPS Wilson line in N = 4 sYM. Taking exact spectral data as input from integrability, they used bootstrap techniques to derive tight bounds on the OPE coefficients of the first few lightest operator in the spectrum.
In this paper, we tackle a similar problem for a fully four-dimensional correlator involving four stress tensors in the planar limit of N = 4 sYM theory, meaning the N c → ∞ of a SU(N c ) gauge theory with fixed 't Hooft coupling λ = g 2 YM N c . We will use spectral information from integrability, together with suitable nonperturbative sum rules on the spectrum, to constrain OPE coefficients. In the planar limit, both single-trace and double-trace operators are exchanged as illustrated in fig. 1. The spectrum of single-trace operators can be calculated precisely owing to the QSC. Our goal will be to use the numerical bootstrap to bound the OPE coefficient of the lightest unprotected single-trace operators in the spectrum, the socalled Konishi operator.
At strong coupling, the Konishi operator is dual to a genuine massive string mode. Its properties have been studied extensively over the years. Its scaling dimension has been computed perturbatively in both weak and strong 't Hooft coupling limits [32,33], results which are now exactly connected at intermediate coupling by the QSC [15,34,35]. However, less is known about the Konishi operator's OPE coefficient. It has been computed up to five loops in the weak coupling limit [36], and is known to the leading order at strong coupling through its connection with the flat-space Veneziano-Shapiro amplitude [9,10,37]. Can it be bootstrapped at finite coupling, given spectral information from the QSC?
While our work is similar in spirit to refs. [30,31], the passage from D = 1 defects to D = 4 correlators presents significant challenges. A main one is that an infinite number of double-trace operators enter the OPE, polluting it with undesirable operators about which we have no spectral information. Worse, the numerical bootstrap leverages the positivity of OPE coefficients, but O(1/N 2 c ) corrections to OPE coefficients control planar correlators (see fig. 1) and do not have definite signs. These hurdles are overcome by the recently introduced dispersive CFT functionals [38][39][40][41], which decouple the double-traces; planar correlators are reconstructed from single-trace data only [42]. This work constitutes the first systematic application of dispersive functionals to a numerical bootstrap problem. A second challenge is that the single-trace operators entering the OPE are more numerous than in 1D, being now labelled by dimension and spin, but in practice only a finite number of dimensions can be computed from the QSC. This paper is organized as follows. In section 2, we describe our setup by detailing properties of the stress tensor multiplet correlator, the decoupling of the double-twist sector, and aspects of integrability relevant for our bootstrap algorithm. In section 3, we discuss the dispersive functionals used in this paper; details of their construction and numerical evaluation are described in the appendix. Section 4 contains our primary results -bounds on the OPE coefficient of the Konishi operator -obtained from the numerical bootstrap. Finally, we summarize our findings and discuss future work in section 5. Three appendices review integrability formulas; detail our construction of functionals which solve the 1-loop problem; and detail efficient formulas for numerically evaluating of dispersive functional.

Stress tensor multiplet correlators
We consider the simplest half-BPS operator in the sYM theory, which transforms as a symmetric traceless tensor so(6) R . Its supermultiplet notably contains the stress tensor. Working in index-free notation, this operator can be viewed as function of spacetime coordinates x and a null 6-vector y: O(x, y) ∝ Tr (y·φ(x)) 2 . (2.1) We use the canonical normalization O(x 1 , y 1 )O(x 2 , y 2 ) = (y 2 12 /x 2 12 ) 2 . Due to conformal symmetry, the four-point correlator factors through spacetime and R-charge cross-ratios, Furthermore, superconformal Ward identities constrain the dependence on R-charge vectors. Namely, they imply that the correlator with z = α is protected and does not depend on the coupling. This allows to separate the free theory limit (g 2 → 0) from the dynamical part H of the correlator [43,44]: where c ≡ N 2 c −1 4 , and H is independent of R-symmetry cross-ratios α, α, which only appear in its prefactor. We are interested in the planar limit, where H is N c -independent.
The function H effectively behaves like a correlator of four scalar primaries with ∆ = 4. It enjoys the following properties: 2. Operator Product Expansion: it can be expressed as a sum of short (protected) and long (unprotected) multiplets, the latter being labelled by their dimension and spin (∆, J): The protected part, equal to minus the g → 0 limit of the sum, will be discussed below.
3. Regge limit: as z, z → ∞ with fixed ratio z/z, H ∼ z J * −4 where z is the Regge intercept. Nonperturbatively in N c we have J * ≤ 1, ie. the quantity uvH is bounded. In the planar limit this bound becomes trivial because of the overall 1/c, but the bound of chaos still ensures J * ≤ 2, which is saturated at infinite 't Hooft coupling. We will assume that at finite coupling, J * < 2 strictly.
4. Weak and Strong limits: in our conventions, at extreme values of the 't Hooft coupling (2.7) In the above, we use the standard conformal block . Furthermore, F 1 is the box integral andD 2,4,2,2 is a derivative of it (see [46]): Note that we have factored the large-N c scaling of the single-trace OPE coefficients into λ 2 so that λ does not depend on N c : As an abuse of notation we will still refer to λ as OPE coefficients.
In the planar limit c → ∞, the OPE (2.6) receives contributions only from single-and double-trace operators. Our main goal will be to constrain the single-trace coefficients λ 2 ∆,J given input about the single-trace spectrum from integrability.

Decoupling double-traces
For our purposes, the double-trace contribution to the OPE is a nuisance. Since double-traces enter already in the disconnected correlator (∼ c 0 terms in eq. (2.4)), their contributions to H represent 1/c corrections to coefficients and scaling dimensions that do not have definite signs. To formulate a nonperturbative bootstrap in the planar limit, it is crucial to project out all double-traces. This is naturally achieved by taking a double-discontinuity of the correlator. For z, z < 0, let: where the arrows denote analytic continuation paths starting from the Euclidean region with 0 < z, z < 1. The first term is simply the Euclidean correlator (the path maintains z = z * ), which enjoys the usual OPE, while for the other terms the analytic continuation simply adds phases, so the OPE (2.6) yields The crucial point is the trigonometric factor, which has double-zeroes at the position of double trace operators, ∆−J = 4+2m + O(1/c) with m ∈ N. Thus, in the planar limit, the above sum is saturated by single-trace operators. The protected double-discontinuity is simple to describe: only operators of twist exactly two, from the stress-tensor multiplet, contribute. Taking the singular terms in eq. (2.31) of [29], we find (2.14) Note that the double-discontinuity of 1/u is a nonvanishing singular distribution near u = 0 [47]. A simple check is that this is precisely the double-discontinuity of the strong coupling result (2.7): This happens because in the supergravity limit all non-protected single-traces become heavy and decouple from (2.13). The double-discontinuity kills double-traces but is not crossing symmetric since it picks a specific channel (above, the s-channel). How do we get crossing equations? The nontrivial fact is that conformal correlators are uniquely determined by their double-discontinuity and Regge limits. Concretely, they can be reconstructed through dispersive integrals [42]: (2.16) The kernel K is recorded in eq. (C.1) but won't be important for the present discussion. The integration region lies inside s-channel kinematics z , z ≤ 0, where the OPE (2.13) converges.
(The integration region is further restricted by step-functions and delta-functions inside K.) By defining "Polyakov-Regge block" as the dispersive transform of a single s-channel block, the correlator can thus be expressed as The crucial point is that only single-traces enter this sum in the planar limit, since P inherits the double zeroes from dDisc. The protected contribution is simply H strong (u, v) because of the decoupling just mentioned; we verified this numerically from the formulas in appendix. The above is valid for any Euclidean u, v, namely, u and v which come from complexconjugate cross-ratios z = z * . This condition can be stated as: The dispersive representation manifests u ↔ v crossing symmetry, which correspond to the s ↔ t-channel crossing equation: To get crossing relations, the idea is that the second relation in (2.5) is nontrivial, and amounts to an infinite number of constraints: (2.20) This statement of crossing symmetry involves only single-trace data in the planar limit.
It is not the most general statement yet, because the (unsubtracted) dispersion relation (2.16) only relied on the Regge behavior J * < 4. (The threshold is shifted by four compared with the usual threshold of an unsubtracted dispersion relation due to supersymmetry and the factors in (2.4).) But since we expect J * < 2 at finite 't Hooft coupling, more is true: anti-subtracted dispersion relations converge. As explained in [39] and reviewed in section 3.1, the difference between different subtraction schemes are "dispersive sum rules" characterized by their patterns of zeroes on double-twist operators. Here we are not allowed to cancel any double-trace zero, so there is only a one-parameter family of extra constraints. We can take it to be the B 2,v sum rule in eq. (4.39) of [39] applied to u v H. Dividing it by v, we will call it simply the B v functional: It can be proved directly that B v [H] = 0, essentially by deforming the integration contour from the s-channel to the t-channel double-discontinuity, and exploiting u ↔v antisymmetry of the integrand [39]. For a generic correlator the contour deformation would pick a contribution from u-channel identity, but this is absent for H. The integral against (u ) δ becomes singular for u → 0 if δ ≤ −1, but can be defined by analytic continuation in δ. One finds in this way that when acting on twist-two exchanges, B v simply returns the coefficient of 1/u [39], (2.14). Therefore, the B v sum rules take the form: Here and below we use the notation for the action of a functional on a block. The salient feature of these sum rules is the protected contribution, which will provide an absolute normalization to OPE coefficients; it will play a similar role in our analysis as the identity operator in numerical bootstrap studies. At strong coupling, it can be interpreted as a relation between protected graviton exchanges and heavy string modes. It is crucial for its validity that the 't Hooft coupling is finite, so the Regge intercept is strictly less than 2.
The crossing relation (2.20) and B v sum rule (2.21) will be our main tool: they exhaust the constraints on single-trace data coming from crossing symmetry and good Regge behavior. Formulas for their efficient numerical evaluation are detailed in appendix C. Following the bootstrap method, the key idea will be to exploit positivity of the unknowns λ 2 ∆,J .

Input from integrability: Spectrum from quantum spectral curve
Operators in N = 4 SYM can be identified through their charges under the global symmetries, the conformal group SO(4, 2) : {∆, J 1 , J 2 } and the R-symmetry group SO(6): {r 1 , r 2 , r 3 }. However, the real "fingerprint" of a (single-trace) operator is its set of charges under the infinite family of symmetries that make the theory integrable. This fingerprint is encoded in a QSC [16,34]. The latter is composed of a set of 8 functions: P a (u) and Q j (u) with indices a, j ∈ {1, 2, 3, 4}, which depend on the spectral parameter u. 1 For each operator, there is a unique set {P a , Q j }. In particular, the global charges are recovered in their large-u asymptotics: Since we look at a single correlation function, of the stress-tensor multiplet, all the operators we are interested in have the same R-charges, and spacetime charges of the form (∆, J 1 = 1 We recognize the overload of the letter "u", which represents in turn the cross-ratio u, the spectral parameter u, and below, the Mellin-Mandelstam variable u. We hope that no confusion will appear from the context. Analytic structure of P a and Q j . The P a have a single square-root type branch cut at u ∈ [−2g, 2g], while the Q j have an infinite ladder of short branch cuts in the lower halfplane. Alternatively, the principal sheet of Q j could be defined so it is an analytic function outside two long cuts on the real axis. J + 2, J 2 = 0, r i = 0), corresponding to the exponents: (2.25) In this paper we use the quantum spectral curve to determine the scaling dimensions for a few values of the coupling g for the first few primary operators in the leading and sub-leading Regge trajectories. Specifically, we consider the operators with the following identification at weak coupling: Additionally, we will use the Asymptotic Bethe Ansatz to study the leading and subleading trajectory at asymptotically large spins. We now briefly review how the QSC is solved to find the dimension ∆ of a given operator. The starting point is the analytic properties of the P a , which is analytic outside a short cut [−2g, 2g], where it has a square-root branch point, see figure 2. This allows to parametrize it as where the Zhukovsky variable is (2.28) The series proceeds in even powers of 1/x due to the left-right symmetry of our operators [16]. The QSC equations allow us to gauge-fix c 4,1 = 0. This parametrization converges in a neighborhood of the cut on the second sheet, where the continued function is obtained by a simple replacement x → 1/x:P Given P a (u), the Q j (u) are obtained by solving a finite difference equation known as P-Q system. It involves an intermediate function Q a|j (u) which satisfies with χ ab = (−1) a δ a,5−b is a constant antisymmetric matrix. Given P a (u), the two preceding equations give a homogeneous finite difference equation which can be solved for Q a|j (u), subject to the boundary condition Q a|j ∼ u→∞ u Ma+M j +1 at large imaginary u. This determines Q a|j as an analytic function in the upper-half-plane, with a sequence of short cuts in the lowerhalf-plane starting at u = − i 2 + [−2g, 2g]. To close the equations, one uses eq. (2.31) together with the values of Q a|j (u) at u ∈ i 2 + [−2g, 2g] to evaluate Q j (u) for u ∈ [−2g, 2g] along the real axis. On the sheet shown in fig. 2, the function Q j (u) has an infinite series of short cuts in the lower-half-plane. A crucial requirement is that if one were to go through the first short cut at [−2g, 2g], one would find a functionQ j (u) that is analytic in the lower-half-plane. For the symmetrical operators that we consider in (2.26), we use the gluing conditions in (5.13) of [18] (with β = γ = 0 therein), where α is a constant (only g-dependent), * is complex conjugation, and the continuationQ is obtained simply by usingP a instead of P a in (2.31);Q j (u) ≡ Q j (u * ) * . These can be viewed as relations between analytic functions in the lower-half-plane, which can be analytically from there. For numerical implementation, following [18] we take (2.32) to the real axis and define the following ratios: 2 (2.33) These four quantities should be constant (u-independent) and equal to each other. We demand that this holds for a set of N cut sampling points u m ∈ [−2g, 2g]. These gluing conditions determine the parameters c a,n in (2.27), from which one reads off the scaling dimension from the exponents in (2.24), which are themselves related to the constants c a,0 .
To solve these conditions we follow the numerical algorithm based on the multivariate Newton's method described in section 6 of [18], as used initially in [34]. This requires to start with a seed of approximate values for the c a,n in eq. (2.27). One then solve the difference equation (2.30)-(2.31) to evaluate the four ratios α jk (u) at a discrete set of m points along the cut. Since all α's should be equal and constant, the variance should vanish: 3 where the changes in parameters are estimated from the derivatives ∂α jk (um) ∂ca,n (estimated numerically by varying each c a,n by a small amount ), similar to [18]. By iterating the algorithm several times starting from an adequate seed, the parameters {c a,n } converge to a value which solves the QSC equations with high accuracy. The algorithm and the parameters involved at each step are summarized in table 1.
We benefit from the fact that the algorithm has been extensively applied already to lowlying operators. Results for the Konishi operator will be described below. Later in [35], the same algorithm was used to produce data for higher spin operators J = 2, 4, 6, for a large range of coupling values g ∈ [0. 1,8]. In figure 3 we show data provided by the ancillary of [35]. We have used this data to test our own code.
For our purposes, we extended this database to include higher spin operators J = 8, 10, · · · in the leading Regge trajectory and the lightest operator of the first sub-leading trajectory. For this, we need to find good seeds to start the numerical algorithm. We use three complementary ways to find them: • In [48] we can find a database of solutions to the QSC at weak coupling. The solutions are presented as series expansions in g, see for instance (2.36). We use these results as seeds when the coupling is small g < 0.25, which is the typical radius of convergence of perturbation theory for many quantities.
• After generating a list of data for small coupling we can extrapolate to generate seeds for g > 0.25. After using the numerical algorithm to refine them, we can extrapolate again to larger and larger values of g.
• We can also extrapolate on the spin J and move along a Regge trajectory for a fix value of the coupling. This is possible after producing a database for a few spins, such as the one provided in [35].
Description P a Ansatz Start with a guess {c a,n } seed in (2.27) truncated at order n max = N p .

Glueing conditions
Find the series Q large a,j = u Ma+M j +1 Nq n=0 b a,j,n u 2n by solving (2.30) given {c a,n }. Iterate (2.24) to find Q a,j (u m + i 2 ) starting from Q large a,j (u m + i 2 + iu ∞ ). Use (2.31) to find Q j (u m ),Q j (u m ) and α ij (u m ) at N cut points on the cut.
Update {c a,n } Evaluate the errors α ij (u m ) for parameters that differ by , and use Newton's method to update {c a,n }. Table 1: Summary of steps and parameters in numerical algorithm for QSC. The last step is repeated until coefficients {c a,n } are found which minimize the error Ncut m=1 |F (u m )| 2 for the chosen parameters N p , N q , u ∞ and N cut . We normally choose those such that the dominant error is from N p . The operators on the subleading trajectory have been less studied but we can obtain seeds using the code from [48] which solves the QSC as a weak coupling series. There are two nearly degenerate operators at spin 0 and twist near 4, whose scaling dimensions are respectively: The seed corresponding to the first operator is recorded in (A.5). We observe using the quantum spectral curve that the spacing between the two operators continues to increase with coupling in the range of interest in this paper, so they do not cross. To confirm that these operators are indeed the ones corresponding to our problem, we constructed their eigenfunction and leading order structure constants as discussed in (A.7): We compared this data with the leading-logarithm terms in our four point correlator expanded to three loops using [49] (minus the contribution from twist-two operators): (2.37) The first two terms are not expected to match due to double-trace contributions, but the perfect agreement of the log 2 u and log 3 u terms nontrivially confirms that we correctly identified the exchanged operator.
At large spin, we use that operators in the leading Regge trajectory have a universal anomalous dimension with logarithmic scaling: 39) The cusp anomalous dimension, which also controls the UV divergences of lightlike cusped Wilson loops, and virtual anomalous dimension are computed by the formulas recorded in (A.2)-(A.3). We use the second version of the formula, in which twist is expressed as a function of the conformal Casimir, and which can be solved iteratively for τ . It is more accurate since it automatically removes 1/J corrections [50][51][52]. In addition, we can compute the gap between the leading and first subleading trajectory at large spin by adding excitations over the so-called GKP string, which represents the reference state corresponding to large spin operators: (2.40) The lightest R-singlet excitations is a pair of zero-momentum scalars, ∆τ J→∞ (g) = 2E φ (g), whose energy is calculated from eq. (A.4). The raw data we use from integrability are summarized in table 2.  Table 2: Spectral data for the leading Regge trajectory and gap to the first subleading trajectory, used for numerical bootstrap. These include the twists obtained from the Quantum Spectrum Curve, as well as large-spin asymptotics from the asymptotic Bethe Ansatz.  Figure 4: Scaling dimension of the Konishi operator at finite coupling, using data g ∈ [0, 2] from [35]. This was originally plotted in [15] . We also include comparison with the first 3 terms at weak and strong coupling, which were originally obtained in [32] and [33] respectively.

Properties of the Konishi operator at weak and strong coupling
The Konishi operator will be particularly important in our study since its scaling dimension effectively defines the 't Hooft coupling, from the point of view of the correlation function we are studying; bounding its OPE coefficient will be our main focus.  Figure 5: Rescaled OPE coefficient of the Konishi operator to two stress tensor multiplets at weak (2.42) and strong coupling (2.44). We display three forms of the strong coupling formula, which agree at asymptotically large g but become distinct at intermediate g. The rescaling was applied to remove oscillations and the overall exponential trend.
At weak coupling, the scaling dimension of the Konishi operator has been provided to 11 loop orders in [48] (building on much earlier work referenced there). We reproduce here the first 5 orders: Its OPE coefficient is currently know to 5 loop orders [36]: (2.42) Recall that we removed an overall factor 1 c from our OPE coefficients, see eq. (2.11). At strong coupling, the scaling dimension is known to 3 loop order (λ −5/4 ) from [53], while the OPE coefficient is known through its relation to the Virasoro-Shapiro amplitude [9,10,37]. Recently, ref. [54] also obtained subleading corrections by adding spectral information from integrability, together with constraints from localization: where f 1 = 23 4 and f 2 = 405 32 + 2ζ 3 on the first line. 4 The second line is an equivalent rewriting of the formula using the factorK N =4 defined below (C.14), which arises naturally in the derivation of the coefficient. Note that this rewriting neatly removes the λ − 1 4 term. We thus expect the second series to proceed in integer powers of 1/ √ λ, which would be interesting to verify.
The quantum spectral curve reviewed above enables to compute the scaling dimension ∆ K numerically with arbitrary precision at any g. Figure 4 displays the resulting curve from the original article [34], along with its comparison with weak and strong coupling expansions.
It is amusing to similarly plot the weak and strong coupling predictions for the OPE coefficient, for which exact results are not yet available. Since (2.44) depends strongly on the coupling g, different forms which agree asymptotically become distinct at moderate g, as visible in figure 5. (Without the subleading terms in (2.44), the curves would differ from each other much more strongly.) Below we study the OPE coefficient at g = 0.3. It is hard to extract a definite value from the plot, but it seems reasonable to assume that the true value should lie somewhere between a linear extrapolation of the weak coupling curve and the lowest of the strong coupling curve, giving 1. Swappability: Each functional commutes with the infinite sum over the CFT spectrum, ie. each gives a valid sum rule.
2. Asymptotic positivity of finite linear combinations: finite linear combinations must exist which are positive on all but a finite range of (∆, J).
3. Completeness, as the number of elements tend to infinity.
The first requirement is clearly essential, and is rigorously satisfied by the X u,v and B v thanks to the Regge boundedness of correlators and the analysis of [39]. We discuss convergence for Mellin-transformed functionals below, as well as the asymptotic positivity of various functionals.
Our strategy to fulfill the third requirement is to present the numerical optimization solver with a varied menu of functionals, and see which ones it prefers.

Dispersion relations in Mellin space
We now describe the Mellin formulation of the Polyakov-Regge block and collinear functionals corresponding to eq. (2.20) and (2.21) respectively. This formulation is also convenient for numerical evaluation. It was explained in [39] how the position space dispersion relation is equivalent to a straightforward dispersion relation in Mellin space. The Mellin representation for identical-dimension operators takes the form: Here s, t, u are the Mellin-Mandelstam variables, which are constrained to satisfy s + t + u = 4∆ φ , and from here we set ∆ φ ≡ 4 as the effective external dimension of our reduced correlator. For example, in the g → ∞ limit, the Mellin amplitude goes to M strong .
The u-channel Regge behavior of H (see section 2.1) implies that lim s →∞ M (s , t ) ∼ s J * −4 , where J * < 2 and t = t + s − s [38]. In particular the reduced correlator satisfies an unsubtracted dispersion relation: where again t = t + s − s and the contour encircles all the poles of M except that at s = s. In fact the above only assumes J * < 4. The stronger expectation J * < 2 at finite coupling implies that an "anti-subtracted" dispersion relation also converges, where we put zeros at some subtraction point: A natural choice is s 0 = 6, which suppresses the contribution from twist-two operators. These two relations are not independent, and their equality amounts to the sum rule for any s, t. This constraint is essentially independent of s 0 . The poles of the Mellin amplitude, according to the OPE (2.6), occur at descendants of primaries where Q is a Mack polynomial discussed further in appendix C.2, and m ≥ 0 is an integer.
Assuming the same spectrum in the s-and t-channel, their contribution can be combined in the form where the Polyakov-Regge block is defined as .
On the other hand, the constraint (3.5) amounts to the sum rule where (without loss of generality) we focus on the residue at s = s 0 and set s 0 = 6: The salient property of these sum rules is that they have double-zeros on all double-trace locations ∆ = 4 + 2m + J, originating from the Mack polynomials. They are similar but distinct from those used recently in [54] to constrain stringy corrections to double-trace OPE data: here we concentrate on sum rules which strictly remove all double-traces. The Polyakov-Regge expansions (2.17) and (3.7) are formally similar, and our nomenclature is not an accident: a result of [39] is that P is precisely the Mellin transform of P ! This is established using the uniqueness properties of Polyakov-Regge block namely, singlevaluedness, Regge boundedness, and the pattern of zeroes on Regge trajectories. Similarly, B v and B t are related by a Mellin transform. Explicitly, The physical requirements on these contours is that they run to the left of all s and tchannel poles (so Re(s), Re(t) < min(8, ∆ − J + 4)), and to the right of u-channel poles (so Re(s + t) > 8). Since all operators considered in this paper have ∆ − J ≥ 2, the simple choice Re(s) = Re(t) = 5 indicated above works uniformly. The identities (3.12)-(3.13) are highly nontrivial and give us independent methods to compute functionals numerically. In appendix C, we discuss our current best numerical implementations for each. Roughly, position-space methods seem to scale better with increasing spin, while our Mellin-space implementations scale better with increasing precision and are generally faster. The precise numerical agreement between these independent methods is very helpful for debugging.
As simple consistency check, note that B protected v and B protected t are indeed Mellin-transform of each other: For future reference, we also define a s↔u crossing functional in Mellin space as which is the Mellin transform of X u,v in (2.20).

Projection functionals derived from B v
So far we have two versions of the collinear functional, B v and B t , which diagonalize respectively a position cross-ratio or a Mellin moment. It is natural to try to diagonalize other quantities, for example the action operators of twist close to two and various spins. Since twist-two operators dominate dispersive sum rules at weak coupling, up to ∼ g 4 contributions from operators of twists τ ≥ 4, these functionals effectively solve the 1-loop problem analytically. They could also potentially be useful to suppress large-spin contributions. Such projection functionals can be constructed by integrating B t against a kernel W [t]: These can be thought of as an infinite sums of collinear functionals finely tuned to possess desirable properties. We constructed the following projectors, labelled by even spins ≥ 0, whose details can be found in appendix B. They are characterized by their zeros near twist two: 1. The Φ , +2 functional (B.25) has simple zeros at τ = 2 for J = or + 2, and double zeros on all other spins: 2. The Ψ functional (B.35) has double-zeros for all spins except = J, where it has a nonvanishing intercept and slope: All these functionals have double-zeros on double traces with τ ≥ 4, as required for our applications: they are saturated by single-trace operators at large N c . The first was inspired by the Φ functional used in [39] to prove the existence of operators below the double-twist threshold and fixed spin. Assuming that the above functionals exist, it is not hard to guess the values of the constants Φ ∞ and β by expanding the sum rules at weak coupling. The anomalous dimensions for the leading family of long operators have been known for some time (see [55][56][57] for threeloop results): where J ≥ 0 and S a (m) denote harmonic sums S a (m) ≡ m k=1 sign(a) k ) k | a| . The stress tensor multiplet is formally the J = −2 member of this family, but it is included in the "protected" part 5 . Our focus is on long operators, which have J ≥ 0. Denoting as Y (i) the coefficient of (g 2 ) i in the quantity Y , the data can be expanded as Generally, each of the above functionals yields a sum rule on single-trace data of the form with W = Φ , +2 or Ψ . The respective protected parts follow immediately from the protected The salient feature of all sum rules we consider is their double zeros at twists 4,6,. . . . This means that at one-loop the sum rules are saturated by the twist-two family, which we can evaluate from (3.17) and (3.18): Thus, consistency of the bootstrap sum rule (3.22) with the known perturbative data requires that This is in precise agreement with the constants Φ ∞ and β that come out of the derivation, recorded in eqs. (B.24) and (B.34). In other words, the Φ and Ψ sum rules analytically bootstrap the one-loop theory.
The fact that the one-loop data is determined by crossing was first noticed from large spin expansions in [52], and later extended to finite spin [58]. This is a quite generic behavior, which generally works up to a finite number of constants. For N = 4 sYM the one-loop corrections are fixed up to a single overall factor g 2 . The novel feature here is that this is obtained from sum rules with nice sign properties (the Ψ are non-negative, see appendix B.5), thereby uplifting the one-loop approximations to nonperturbative inequalities. In general, the action of the projection functionals Φ , [∆, J] and Ψ [∆, J] on an arbitrary state can be calculated exactly using the formulas for Mack polynomials (C.20).

Convergence of Mellin functionals
It is interesting to consider the sum rules X s,t and B t on their own right, rather than simply Mellin representations of position-space sum rules. A crucial fact that is that they are saturated by single-traces in the planar limit, which makes them sensible for our purposes. This can be seen from the Mack polynomials Q ∆+4,J , which have double zeros when ∆−J = 4+2n with n ≥ 0; physically this happens because the Γ-functions in the Mellin representation (3.12), (3.13) already account for double-twist operators. To use Mellin functionals, we need to determine the range of s, t such that the functionals can be swapped with the OPE.
Regge boundedness ensures that the functionals converge at large twist. Swappability with respect to the OPE therefore requires that the sum over spin converges.
Let us first consider the B t functional. Given eq. (3.10), the protected part of this sum rule, we expect convergence to be bounded by the strip 4 < t < 6. We can verify convergence of the OPE explicitly by taking the large spin and fixed twist limit of the Mack polynomials: The above suggest that the domain of convergence is saturated by the lowest-twist operator.
Since the OPE coefficients scale as λ 2 ∆,J ∼ J 1/2 , we conclude that swappability is guaranteed provided that 4 < t < 6. This is further verified numerically at zero-coupling in the left plot of fig. 6. (A similar conclusion was reached in appendix B of [54].) This exercise further demonstrates that the convergence rate is fixed by the Mack polynomials at large spin, and therefore it is invariant under Regge boundedness (anti)subtractions.
For X s,t , the arguments of the Mack polynomials are now dependent on both Mellin-Mandelstam variables s and t. A natural expectation is that convergence is allowed within the triangle-shaped domain Re s, t, u < 6, which includes the symmetrical point s = t = u = 16 3 . By evaluating the Mack polynomials in the fixed twist and large spin limit, we find that the forward-channel, the (s ↔ u) crossed-channel, and the (t ↔ u) crossed-channel blocks scale as follows: After accounting for the OPE coefficient and setting τ = 2, we conclude that the domain of convergence perfectly matches the triangle-shaped strip with Re s, t, u < 6 ! While we are unable to obtain an analytic expression for the asymptotics of the projected functionals, their growth is expected to be bounded by the collinear B t functional given the Functional

Range
Protected part Equation(s)  Table 3: Complete list of sum rules we used to bootstrap single-trace operators, including their allowed parameter ranges and protected contribution. The Mellin functionals B t and X s,t allow complex parameters. We include links to defining equations and to efficient evaluation formulas.
nature of the projection operation. This is confirmed numerically as shown in the right plot of fig. 6.

List of functionals we used
To summarize this section, we list the complete set of functionals we use to bootstrap singletrace OPE coefficients. In particular, we use both position-and Mellin-space sum rules. Although these functionals encode the same information in their respective spaces, the labelling of these functionals (cross-ratios (u, v) in position-space versus Mellin-Mandelstam (s, t) in Mellin-space) highlight how this encoding can lead to distinct functionals; generally, we put hats on Mellin functionals.
In each case, we produce a list of functionals by sampling a range of values. When testing functionals on perturbative data, we observe that numerical convergence is best achieved near the crossing-symmetric points, which are: z = z = −1 for the position-space crossing functionals X u,v , s = t = u = 16/3 for the Mellin-space Polyakov-Regge blocks X s,t , or v = 1 and t = 5 for the Mellin-space collinear functionals B v and B t respectively. In principle we could consider derivatives around these points, but for our numerical implementation we find it easier to sample a random selection of points to high numerical accuracy. The list of sum rules with appropriate ranges is recorded in table 3.

Numerical bootstrap
In this section we describe numerical bounds on OPE coefficients obtained using the functionals discussed in the preceding sections, given information about the single-trace spectrum. We focus on the coefficient of the lightest unprotected scalar, the Konishi operator, at weak and strong(ish) values of the coupling: g = 0.1 and g = 0.3.

Generalities
The main concept is similar to OPE bounds in the numerical conformal bootstrap: we look for combinations of functionals that have a definite sign on all allowed states. The main difference is our choice of basis of functionals. Traditionally, the numerical bootstrap exploits functionals that are derivatives of a crossing equation around the crossing-symmetric point; as reviewed in introduction, these must be projected out to obtained N c -independent bounds in the planar limit. Rather, we rely on the menu of dispersive functionals in table 3. Since we do not a priori know which one are most effective, we use a sample of all of them. Since the different functionals behave differently in various limits (small twist, large twist, or large spin), this strategy is intended to help the linear optimization problem and minimize computational resources.
Explicitly, let us label as W k any of the functional shown in table 3 for some particular choice of its parameter. Note that even when functionals allow continuous labels (like (u, v) or (s, t)), we only consider discrete choices that lie in the allowed ranges. By swappability, each functional W k leads to a valid sum rule: and we can take finite linear combinations to get We then separate the protected part and the target OPE coefficient we want to bound, λ 2 K , and impose that all other terms are positive. Namely, we look for linear combinations α k such that any of which proves an inequality The prime in (∆ , J ) indicates that the Konishi operator is omitted. The optimal bound (for a particular finite list of functionals {W k }) is thus found by solving a standard linear optimization problem on the decision variables α k , where one maximizes the right-hand-side subject to the inequalities (4.3) and a suitable normalization condition:  1. For the leading trajectory, we interpolate between results from integrability at low spins and large spin asymptotics, placing a conservative window around each operator. For subleading trajectories, we demand that the optimized functional be non-negative in a continuum above a twist gap. In this instance, we put exaggerated windows to see if the bootstrap discovers the spectrum.
The plus (minus) sign yields a upper (lower) bound, respectively. We use the SDPB solver to efficiently solve this type of problem [59]. In principle, the inequality (4.3) needs to be imposed on the infinite list of all single-trace operators of the theory. In practice, we can only solve the quantum spectral curve for a finite set, and in any case, we can only solve finite systems of inequalities. Truncation is necessary. Our solution is to identify ranges of ∆ where operators can be present for each spin, and to impose positivity on a dense sample of discrete values of ∆ in that range. Discretization errors can be controlled by plotting the obtained functionals and ensuring that it does not become negative between the sampled points. 6 Typically, for each spin, we can make good estimates about the leading operator ∆ min (J) and the gap to the next operator, which we call ∆ thresh (J). Thus, for each spin, we allow to operators in (4.3) to lie in a small window around ∆ min (J) (with width determined by our error estimates), or in a continuum at ∆ > ∆ thresh (J). Our spectral assumptions are further discussed below at weak and strong(ish) coupling.
Having only an incomplete spectrum implies that the inequalities we find, although conservatively valid, may not be optimal. This discrepancy cannot be removed unless we know the exact spectrum of the theory. Indeed it was observed in the study of 1d defects in N = 4 that including more operators improves the numerics significantly [31].

Bounds at weak coupling: toy problem with 2 functionals
We start our analysis by looking at a toy example with only two functionals, namely Ψ 0 and Φ 0,2 . Indeed, in the one-loop approximation, these functionals respectively compute the Konishi OPE coefficient (2.42) and the spin-2 anomalous dimension given the Konishi anomalous dimension defined by eq. (2.41). It is thus interesting to ask what they prove at finite but small coupling, ie. g = 0.1.
Our spectral assumptions are shown in fig. 7. For the leading twist we impose positivity in a large window close to twist 2 without using any perturbative data. We also add a conservative gap of ∆τ = 1.9 between the leading twist and a continuum; this is equal to the asymptotic gap at large spin (see (2.40)), while the gap appears to decrease monotonically with spin.
We find that the optimal combination of these two functionals produces an upper bound that is indeed quite close to the weak coupling OPE coefficient of eq. (2.42), and is almost saturated already by Ψ 0 : These can be contrasted with the free theory result λ 2 K (g=0) = 1 3 and the five-loop prediction λ 2 K (g=0.1) 5−loop = 0.30067(1). We recall that our OPE coefficients have 1 c ≈ 4 N 2 c factored out as shown by eq. (2.11). Bounds with more functionals are discussed in the next subsection.
The optimal two-functional combination is whose action on the leading trajectory states is displayed in fig. 8. We explicitly see that Ψ 0 nearly saturates the action of this functional. These bounds, which use functionals optimized for the one-loop problem, improve oneloop perturbation theory in two respects. First, they are numerically closer to the correct answer (in contrast with λ 2 K (g=0.1) 1−loop = 0.29333). Second, and perhaps most importantly, they are rigorously valid at finite g.

Bounds at weak coupling: adding more functionals
It is now interesting to add functionals from table 3 to see if the improved bounds capture higher-loop effects. We consider two options: with 20 functionals and with 40 functionals. Our 20 functionals consist of four Φ 1 , 2 , Ψ 0 , as well as five B t for t evenly spread in the (4, 6) interval. Moreover, we use ten X u,v for a set of u, v satisfying chosen randomly with a flat measure above the s↔u symmetric curve to avoid redundancy.
The largest set of functionals we were able to use consists of 40 functionals including all types from in table 3, except for X s,t , which we found challenging to stabilize. In addition to those in the preceding paragraph, we used the Ψ 2 , Ψ 4 functionals, nine more X u,v , four B v with v with 0.95, 1.01, 10 and 50, and finally, five more B t . We find that our bounds are stable with respect to adding more X u,v 's, however adding more B t or B v 's could spoil the convergence with twist and spin (and would lead to bounds which rules out the theory). Since X u,v are subdominant at large twist and B's dominates in that region, we suspect that this limitation would be removed if we impose a tighter grid in that region or alternatively use an asymptotic formula.
To obtain robust bounds, we impose positivity up to a large J cutoff ∼ 6000 for the leading trajectory and near the gap in the continuum. 7 However, starting τ = 20 we use a smaller cutoff (J τ >20 cutoff ∼ 250). We sample twists up to a cutoff τ cutoff ∼ 250. We numerically check that this twist is large enough to ensure positivity on the asymptotic spectrum for the set of functionals considered.
We find that with this increased number of functionals, the upper bound at g = 0.1 gets reduced to:  yet with respect to B v and B t functionals, as mentioned above.
However, as we add more functionals, we gain a higher resolution of the leading-twist spectrum. This means that despite the small change to the upper bound, the optimized functional is in fact manifestly different than the toy model optimized functional.
To investigate our functionals' ability to resolve the spectrum, we put slightly exaggerated windows around the expected position of single-trace operators (from perturbation theory), requiring positivity within these windows. We then observe that the optimal functional develops double zeros at these positions. In fig. 9, we show the optimized functional near (J, τ ) = (2, 2); the double-zero of the optimized functional probes the leading spin-2 singletrace operator in the spectrum. With 20 functionals, we are able to discover the first 6 operators in the leading family. With more functionals one expects of course to discover more states, but we did not try exaggerated windows with 40 functionals, since our main goal was to bound λ 2 K . We have not observed stable double-zeros on the subleading families. An alternative way to resolve the spectrum is to adjust the size of the positivity window imposed around a given operator; as long as the physical operator lies inside the window, the upper bound on λ 2 K should not vary. Therefore, by adjusting the upper and lower edges of the positivity window, a kink should form exactly where the operator exits the window. We observe precisely such kinks as shown in fig. 10 for the leading spin-2 operator. Finally, it is worth mentioning that if the window includes the physical operator in the leading family operators, its size does not affect our upper bound at all. Furthermore, omitting windows around operators that are discovered did not change our upper bounds on couplings.
In fig. 11, we illustrate the action of the individual functionals and the optimized one on the leading-twist states. This is qualitatively different from the toy model example considered previously, and the optimized functional is distinct from all individual functionals used in the optimization problem. We repeat our procedure for other weak coupling values such as g = 0.2, using the spectral data summarized in table 2. For this coupling, we again find a stable upper bound, essentially independent on the number of functionals:  Figure 12: Spectral assumption for g = 0.3. We use the QSC results up to spin 10, and for higher spins we use the large-spin asymptotics. Our estimated error is less than 0.0005 for each leading twist operator and is not visible on the plot. For the subleading trajectory, we conservatively use the large-spin gap for J ≥ 2.
which may be compared with the 5-loop estimate 0.25 (2). Lastly, let's briefly discuss lower bounds on the Konishi operator OPE. Proceeding similarly to the above, we were unable to obtain a non-trivial (nonnegative) lower bound with the set of functionals at hands. We have experimented with including different sets of functionals, as well as with different spectral assumptions and even tried unphysically large twist gaps. We observe that lower bounds converge towards the perturbative value only when we impose an infinite gap for spin 2 and spin 0 operators. What improvements are necessary to obtain a lower bound with realistic spectral assumptions is still an open question.

Stronger coupling: g = 0.3
Having analyzed the weak coupling regime extensively, we are now ready to take the next step and move to values for which perturbation theory is not expected to converge, namely g > 1 4 . As we will see, accurate spectral information becomes increasingly essential. Our spectral assumptions are shown in fig. 12. For the leading trajectory, we used precise results from the QSC up to spin J = 10, followed by the improved large-spin asymptotics (2.39). The latter has an estimated error smaller than 0.0005 for the twist of higher spin states, and even gives per-mil accuracy for J = 0! To ensure that our bounds are rigorously valid, we include a window of size ±0.0005 around each operator up to spin 146. For the subleading trajectory we use the spin 0 gap derived in table 2, and we use the conservative large spin gap for J ≥ 2. We use the same J cutoff and τ cutoff as in the preceding subsection. The only difference is that in order to obtain the asymptotic large spin behavior, we need to go higher in spin (as can be seen from comparing fig. 12 and fig. 7). We conservatively choose to calculate up to spin 250 and then extrapolate. We emphasize that we do not have complete control over the large twist, large spin region; for instance, for 20 ≤ τ ≤ τ cutoff we are unable to reach the large-spin asymptotic region and extrapolate. This could potentially affect the rigour of our bound. Further analysis is needed to completely tame this region.
Accurate spectral information seems important. In our earliest attempts, we did not use the improved form of large-spin asymptotics (2.39) which led to much larger error estimates (and larger window sizes ∼0.02 near J = 10). Using the more accurate spectrum immediately impacted the bounds.
We use the same set of 40 functionals as described at the top of section 4.3. As in the g = 0.1 case, the optimal functional does not exhibit zeros near the subleading trajectory, but it exhibits zeros on several operators on the leading trajectory. 8 This is plotted in fig. 13, along with the contributions from some individual functionals.
We quote the result we get with this data to be:

Discussion
In this paper, we initiated a numerical study of nonperturbative constraints on correlators of four stress tensor multiplets in planar four-dimensional N = 4 sYM. Our methodology consists of three main steps. First, we compute the scaling dimension of low-twist operators using integrability as described in section 2.3; the resulting data that we used is summarized in table 2. Second, we construct dispersive CFT functionals with distinct characteristics as summarized in table 3; each provides a nonperturbative sum rule on planar OPE coefficients. Finally, we implement a numerical bootstrap algorithm to find linear combinations of these sum rules that prove optimal bounds on the desired OPE coefficient, as detailed in section 4.
Our main result is that at small but finite coupling, the OPE coefficient of the Konishi operator satisfies a rigorous upper bound that is nearly saturated by the perturbative series. As a function of other scaling dimensions, the bound displays kinks (see figure 10) which suggest that with more functionals, the optimal bound will be saturated by the theory. At stronger coupling g = 0.3, outside the typical domain of convergence of the perturbative expansion, we obtain bounds with conservative spectral assumptions.
We focus on the 't Hooft planar limit since exact spectra from integrability are only available in this limit. On the other hand, this limit is challenging for traditional numerical bootstrap techniques, due to sign-indefinite double-trace contributions. The dispersive functionals described in section 3 avoid this problem by formulating crossing directly at the level of single-trace data, by virtue of having double-zeros that suppress all double-trace operators. We developed new technology to apply dispersive functionals in a numerical bootstrap context, summarized in appendix C, which could be useful for applications to other (non-supersymmetric) conformal field theories.
There is a lot of flexibility in the space of dispersive functionals. We constructed special combinations, the projected functionals Φ , +2 and Ψ (see eqs. (3.17)-(3.18)), which isolate specific twist-two anomalous dimensions and OPE coefficients in the weak coupling limit and determine them at one-loop. Significantly, the Ψ are nonnegative, which ensure that they prove rigorous bounds at finite coupling. By studying numerically combinations of more functionals, we find extremal combinations that develop double zeros at the location of various operators (see section 4.3), demonstrating that the functionals nonperturbatively constrain the spectrum.
As we increase the coupling and move beyond the convergence radius of perturbation theory, we observe that the optimal functionals become qualitatively different from Φ and Ψ. A main outstanding challenge to reach larger values of coupling is to gain more analytic control over positivity of the optimal functional in asymptotic regions of large spin and large twist. This will be needed to make completely rigorous the bounds presented in this paper. Furthermore, we expect that this will enable the stable inclusion of more B-type functionals. As discussed in subsection 4.4, this could enable the functionals to probe subleading Regge trajectories, whose scaling dimensions can be provided by the QSC.
We only obtain nontrivial upper bounds on OPE coefficients. This is to be contrasted with the 1D Wilson line defects considered in [30,31], where tight lower bounds are also obtained. It is possible that with more spectral information or more functionals this situation will improve. An alternative scenario is that, like in the 3D Ising model, bootstrapping the theory down to an isolated island will require to study more correlation functions [60].
Our analysis was performed in the strict planar limit: all the sum rules we used are homogenous in N c . It could be interesting to compare results with the conventional numerical bootstrap at some large but finite N c .
We anticipate many possible extensions of our analysis. First, with some optimization it should be possible to find from the QSC the scaling dimension of many more operators, with higher spin and/or higher twist. Second, it might be possible to derive polynomial approximations for the ∆-dependence of dispersive functionals, which would make it possible to include a vastly larger number of functionals. Such approximations were a crucial step in the development of the modern numerical bootstrap [61], and this technology would also be a key step toward applying dispersive functionals to other models. Third, it may be possible to incorporate sum rules beyond those we considered, for example the integrated constraints from localization [7,62], or constraints from mixed correlators involving other half-BPS operators. It remains to be determined if the method can yield tight two-sided bounds on OPE coefficients. matrix, which effectively inverts the Beisert-Eden-Staudacher (BES) kernel [65]: By inverting the matrix (1 + K) and dotting into suitable vectors, one finds the cusp and virtual anomalous dimensions Similarly, the ground state energy of a scalar excitation is .
In practice, these are calculated by truncating the matrix K to a finite size and extrapolating the results to infinite size. For the relatively small values of the coupling that we consider, convergence is fast.
Since the leading twist-4 operator has not been discussed earlier, we also record weak coupling QSC data corresponding to it, extracted from the code of [48] As mentioned in the main text, we confirmed that this is indeed the twist-four, spin-0 operator which is exchanged between stress tensor multiplets, by computing its OPE coefficient directly at weak coupling. For this we worked out the corresponding eigenstate of the 1-loop dilatation operator, using the Hamiltonian from [66] (eq. 3.6 therein). Including double-trace terms, there are four color-singlet operators involving 4 scalars and no derivatives, and we find that the eigenfunction corresponding to ∆ = 4 + g 2 (13 − √ 41) + O(g 4 ) is By normalizing the two-point function of O ∆ and computing its Wick contractions with two protected operators O(x 1 , y 1 )O(x 2 , y 2 ) of the form (2.1), we obtained the tree-level OPE coefficients (2.36). It was crucial in this calculation to retain the double-trace terms, which do contribute to the planar OPE coefficient. These are extremal three-point functions in the free-theory (the sum of twists of two operators is equal to the twist of the third operator) which are not predicted by the tree-level hexagon formulas [67,68].

B Projection Functionals
In this appendix we detail the construction of projection functionals, which diagonalize the action of the B v (or B t ) on twist-two operators. The idea is exploit orthogonality relations for the polynomials which control their action on twist-two operators.

B.1 Action of B t on operators near twist two
When acting on an operator with twist close to two, the functional of can be expanded as The factor 1 2 is included for compatibility with the literature. The two shown terms are insensitive to descendants, thanks to the double-zeros of B t , which suppress operators of twists 4,6. . . . Thus, the Mellin representation (3.11) simplifies to Comparing, one thus finds a (t) in terms of the Mack polynomial (C.10), explictly These polynomials are even under t → 10−t, and are normalized so they obey the following orthogonality relation for ≥ 0: where c = Γ( + 3) 4 Γ(2 + 6) 2 ( + 1) 4 (2 + 5) . (B.5) These properties are similar to those of the leading-twist functional B 2,t discussed in section 4 of [39], although the details differ slightly since here we are interested in diagonalizing the action on twist-two operators, rather than on the leading double-twists. 9 The slope b (t) is more complicated. For our applications below, it will be useful to know its integrals against a j . To this aim, we decompose it into odd and even parts. The odd part can be computed directly using symmetries of (B.2), while for the even part we adopt an expansion over Mack polynomials: In the second term we included a j = −2 term to account for the polar part of b : .
From investigation of many cases, we find explicitly the coefficients where the second line is for the generic case 0 ≤ j < . The factor outside the parenthesis matches the unsubtracted dispersion relation (4.45) of [39]) evaluated at ∆ φ = 3. This representation enables us to find simply the integral of b (t) against an even-spin a j . The slight subtlety is that is that the non-polynomial function a −2 (t) is not orthogonal to the polynomial ones, rather: Combining this with the orthogonality relation (B.4) and expansion (B.7), we find nice cancellations such that for even j, the integral of a Mack polynomial a against b j gives for 0 ≤ < j even, for > j even. (B.14) This result, together with (B.7), will now be used to define various projectors; we will not further need the b ,j coefficients.

B.2 One-loop anomalous dimensions: the Φ , +2 projection functionals
In [39] a functional Φ was constructed analytically, which had double-zeros on all operators of the first double-twist family of an arbitrary CFT, and single zeros on just the J = one. The sign properties of that functional, for some and external operator dimensions, established that mean field theory maximizes the twist gap for that spin.
Here we will construct functionals Φ 1 , 2 which analogously has double-zeros near twist τ = 2, except for two operators J = 1 , 2 , where it has a single zero. This problem is similar, but distinct, from the one studied in [39] since here we insist to maintain double-zeros on every double-twist τ ≥ 4.
As a first step, we attempt to construct a functional Φ with nonvanishing slope only for J = , by writing it as an integral over B t : We will find that the kernel Φ [t] does not lead to a convergent integral unless we allow a nonvanishing slope on at least two spins. The construction follows that in [39], although Φ here is slightly different. The conditions on Φ are that, for every even J ≥ 0, Note that a factor 2 arose from (B.1). The question mark emphasizes that Φ will be obtained by ignoring convergence constraints, which will be addressed below. First, we observe that first of eqs. (B.16) can be satisfied if the kernel is odd: Since the a J 's form a complete basis for a reasonable function space, we expect this to be the only solution. For an odd kernel, the second condition, about the slope, can be computed from (B.6) as Assuming that Φ [t] vanishes at infinity faster than any polynomial (otherwise the functional (B.15) doesn't make much sense), we integrate by parts to write the second condition as Comparing with the orthogonality relation (B.4), we deduce that the parenthesis should be proportional to c a (t), specifically This differential equation admits a unique solution, given that Φ [t] is odd and so must vanish at t = 5: where we let t = 5 + iy in the second line. The first few cases can be found analytically, for example 28 πy log(1+e −πy ) − Li 2 (−e −πy ) + π 2 12 (3y 2 −1) + π 2 (9−7y 2 ) tanh( πy 2 ) , (B.23) which are indeed odd functions of y. The only issue with this calculation is that the Φ kernels don't actually vanish as y → ∞: the corresponding functionals do not make sense as the integral (B.15) fails to converge at large y. The solution is to take finite linear combinations. By investigating the large-t limit of the kernels we find the following simple formula: Since the integrand in (B.21) decays exponentially, the corrections to the limit are proportional to e −πy . Thus, to define valid functionals, it suffices to combine any two kernels Φ [t] so as to cancel the constant: We have normalized it so it has unit slope as τ → 2 when acting on operators of spin J = 1 . The Φ 1 , 2 are of course not all linearly independent, they are spanned by the Φ , +2 considered in the main text. A convenient formula for their evaluation is discussed in B.4 below.
As discussed in the main text below (3.24), the combination (B.25) admits a simple interpretation in terms of one-loop anomalous dimensions, since the one-loop sum rule forces the following proportionality (with an -independent constant): This is indeed satisfied by (B.24). In other words, the Φ 1 , 2 sum rules analytically prove the one-loop formula for anomalous dimensions. This also makes it physically clear why a functional with a single zero on only one spin could not exist, since that would prove that one-loop anomalous dimensions identically vanish.

B.3 One-loop OPE coefficients: the Ψ functionals
The functionals just constructed possess desirable zero structure which makes them highly particularly sensitive to the scaling dimension of leading-twist operators of a given spin. However, they lack sensitivity to OPE coefficients, which are the main focus of this paper. We now construct, for each even ≥ 0, a functional Ψ which has double zeros around twist two for all spins except J = , where it has nonvanishing constant term and slope. At weak coupling, Ψ effectively relates the OPE coefficient and scaling dimension of the leading spin-operator. In terms of a kernel Ψ [t] defined similarly to (B.15), these conditions are interpreted as follows: for all even J ≥ 0, and where β is an a-priori unknown constant.
To construct the kernel Ψ [t], we propose to make an ansatz as a sum of its even and odd parts. The orthogonality relation (B.4) immediately fixes the even part: . We now substitute the ansatz into the second condition (B.28) and evaluate the even contribution using the integral I ,J in (B.13): Since we formally diagonalized the b J integral using the kernel Φ J [t] (see (B.16)), this system can be inverted as an infinite sum The single parameter left to determine is β . As above, it is fixed by the requirement that the kernel must vanish at large t for the functional to make sense, which gives where we plugged in the explicit limit (B.24). Perhaps surprisingly, these sum up to simple rational numbers. In fact, as discussed below (3.25) in the main text, there is a simple analytic guess for the result, which is required for the sum rule to be satisfied at one-loop: We verified numerically that this agrees precisely with (B.33) for many values of . To summarize starting from (B.29), the action of the Ψ functional is computed is with the kernel defined by

B.4 Formulas for evaluating Φ and Ψ functionals
We now describe a practical way to perform the t integral that define the projection functionals, starting with Φ 1 , 2 in (B.15), when acting on a generic state. Following the method in section C.3, it suffices to evaluate the following basic integrals for integer k ≥ 0: Indeed it is clear from the definition of B t that the action of Φ 1 , 2 on any state can be written as a finite sum of these integrals (see (3.11)). When plugging in the integral representation (B.21), the 1/(t − 6) denominator neatly cancels out: (B.38) Again we have set t = 5 + iy. This looks daunting, but the trick is to integrate by parts, using that the inner integral vanishes exponentially at infinity. This yields a difference of two integrals, Note that we antisymmetrized in y , which was not strictly necessary but makes each [Φ ] k real. The calculation is thus reduced to integrating a polynomial divided by cosh 2 . This can be completed term-by-term using the identity where B q on the right is the Bernouilli number, not to be confused with the position-space functional B v . In this way [Φ ] k can be evaluated as an exact rational number. The y lower bound in (B.40) is arbitrary since any constant added to the integral would cancel out in the combination (B.39), thanks to the definition of Φ ∞ . The choice made above, which makes the integral proportional to (1+y 2 ), turns out to ensure that [Φ ] k = 0 for > k, which will be convenient below. This can be interpreted as an orthogonality property of the Mack polynomials, although we were not able to strictly derive it from (B.4 The crucial fact is that that the n sum terminates, thanks to the vanishing properties just mentioned. Thus, while we were unable to find a closed-form expression for the kernel Ψ [t] itself, it is possible to compute its action on a state of arbitrary spin J as a finite sum of ∼ J terms. The sum gives rational numbers exemplified in table 4. We have verified that they agree with the direct numerical integration of (B.35), with the sum over n truncated to a large order.

B.5 Sign properties of Φ and Ψ functionals
The formulas from the preceding section, combined with the Mack polynomials reviewed in section C.2, enable to rapidly compute the Φ , +2 and Ψ functionals on generic states. One readily sees from figure 14 that Φ 0,2 indeed has the claimed single zeros at spins 0 and 2 and twist 2, and double zeros on higher spins and all higher twists. However, since the slopes on spin 0 and 2 have opposite signs, it is not sign definite as also visible from the figure. In fact, for J > 0 the Φ 0,2 functional is negative for τ < τ * and eventually becomes positive at large enough twist τ . Similarly, we show the action of Ψ 0 in fig. 15, which features double-zeros on all doubletwists for J ≥ 2 and it is non-zero when acting on the Konishi operator. Remarkably, we find that the functional is nonnegative for all τ ≥ 2. The same property is actually shared by all the Ψ : we observe that Ψ [∆, J] identically vanishes for J < and is positive otherwise.

C Formulas for efficient evaluation of functionals
The bootstrap method requires the evaluation of a large menu of trial functionals on a large sample of states (∆, J). High accuracy is required, since optimal combinations tend to involve large numerical cancellations between different functionals. Here we detail fast and accurate numerical methods.

C.1 Dispersion relations in position space
The most straightforward method to compute the Polyakov-Regge block P ∆,J (u, v) is perhaps to compute the dispersive integral (2.17). The kernel is that of the unsubtracted dispersion relation of [42]; our conventions follow those in (2.9) of [39], where we combine the s and t-channels: where x is a remarkable combination of four cross-ratios: In practice, we perform the double integral in (2.16) by changing to ρ coordinates, in which the kernel also takes a concise form. By adapting the radial coordinates from [69] to the u-channel, we obtain The choice of numerical integration strategy matters. In Mathematica, we force the use of the DoubleExponential method, which essentially computes a simple Riemann sum after a clever change of variable that makes the integrand decay exponentially near its endpoints. In theory, for a sufficiently "nice" integrand, the error with this method decreases exponentially with effort, however we only observe a decrease in error if we also force subdivisions using the MinRecursion option. For example, the two-dimensional ρ integral with NIntegrate[..., Method -> {"DoubleExponential", "SymbolicProcessing" -> 0},

MinRecursion -> 2, WorkingPrecision -> 25]
will typically achieve near 20 digits of accuracy, which generally suffices for bootstrap problems with O(30) functionals or fewer. One way to estimate accuracy is to make simple changes of variable such as ρ w → ρ 2 w , which theoretically should not change the integral but in practice do so with this method. We observe that each increase in MinRecursion typically adds about 10 significant figures at the cost of quadrupling the computation time. Therefore, arbitrary accuracy is in principle achievable with this method, but only for a limited number of functionals/states.

C.2 Mack polynomials
In this appendix we detail our evaluation of formulas involving the Mellin representation and Mack polynomials. For future reference, we keep explicit the dependence on external operator dimensions {∆ i } = {∆ 1 , ∆ 2 , ∆ 3 , ∆ 4 } and spacetime dimension d; only the case {∆ i } = {4, 4, 4, 4} and d = 4 is relevant for the main text.
To fix our conventions, we use the following Mellin representation for unequal operators following the convention of [41]: For our applications below, it is useful to explicit the m and t dependence of the Mack polynomials. Up to overall Γ-functions, the dependence is essentially through a polynomial of total degree J which is naturally written using Pochhammer symbols [38,70]: (C.6) Here and below a = ∆ 2 −∆ 1 The prefactor, which contains the double-twist zeros mentioned in the main text, is whereas the coefficients Q a,b ∆,J q,k are rational functions of ∆. It is useful to view them as a (J+1) × (J+1) matrix where q, k range from 0 to J, setting to zero the entries with q + k > J. The matrix can be populated efficiently using the Casimir recursion in appendix A of [9]. In the Pochhammer basis, we find ∆,J q,k+1 This recursion is seeded with the boundary condition Q a,b ∆,J 0,J = (−1) J+1 for the top-right element, which corresponds to the conformal block normalization lim We fill the matrix row-by-row using (C.8) to move leftward in k. For example, for the first row (q = 0) one finds the simple analytic solution which resums (see (C.6)) to give the standard m = 0 Mack polynomial [9,70] Q 0,a,b ∆, A slightly upsetting feature of (C.8) is that it produces spurious poles at values of ∆ that are not particularly meaningful, where the factor on the left-hand-side vanishes. We either avoid these values or fall back on the following analytic expression for the coefficients (see the formula recorded in appendix of [41]), which is explicitly free of spurious poles: Populating the Q a,b ∆,J q,k matrix with this formula requires O(J 3 ) multiplications.

C.3 Formulas for B functionals using Mack polynomials
The representation (C.6) is convenient for our purposes because the m dependence is isolated in a few Pochhammer symbols. Consider for example the B t functional, which we recall from The dependence on m is explicit in K and (−m) q , which allows the sum over the infinite number of descendants to be performed analytically. The basic formula is . (C.14) From here, we specialize to the ∆ i = 4 case of interest and abbreviate: ∆+4,J . The nonnegative factorK N =4 ∆,J will be present in front of most functionals. Explicitly, we thus compute B t in (3.11) by forming the vector (which is linear in t) which we dot into the Mack coefficients: The result takes the form of a polynomial in t of degree J + 1, divided by (t − 6). At this stage we typically express it as a Pochhammer sum, where the coefficients B[∆, J] k can be obtained from the above using simple vector operations. This representation is useful to compute various functionals related to B. For example, the Mellin transform which gives the position functional B v (3.13) can be done analytically using The results (and performance) can be compared with the position space integrals described in section C.1. For a sample operator with J = 150 and twist 4 + 1 19 , and v = 3/2, we find for example: The accurate agreement between methods is a crucial debugging tool which gives us high confidence in our implementation. The Mellin space version of B functionals is clearly faster, especially when high accuracy is needed. This is partly due to the sub-exponential convergence of the ρ integrations discussed above. In contrast, the Mellin formula boils down to the exact calculation of a matrix of rational numbers, times a numerical vector of 2 F 1 functions, whose cost increases very slowly with the requested precision. The projection functionals Φ , and Ψ are computed similarly using respectively the vectors (B

C.4 Formulas for Polyakov-Regge blocks using Mack polynomials
A similar strategy works for the Mellin-space Polyakov-Regge block (3.8), whose definition we recall: It turns out that this sum can also be evaluated analytically, now in terms of 3 F 2 hypergeometric functions. Explicitly, for q = 0 we find: (C.23) The terms with q > 0 admit similar expressions, but a more efficient strategy is to compute them recursively, using combinations that cancel out the m-dependent denominator: Again this can be rapidly calculated to very high accuracy. The above expression is also well-suited for performing the Mellin transform to obtain the position-space blocks P u,v . For the term with P s , the Mellin integral over t can be done analytically and gives (with u = 16 − s − t): (C.27) There is a single integral left to perform numerically, over s, in contrast with the twodimensional ρ integral in the position-space approach. However, the integrand is a fairly complicated function of s, especially at large spin J.
Fortunately, it turns out that the s integral in (C.27) is "nice" for numerics. We use an exponential parametrization s = 5 + i sinh(x) and simply approximate the integral by a Riemann sum, sampling x at discrete values uniformly spaced in a range such as [−6, 6]. Since the integrand decays doubly exponentially with x, it is easy to ascertain that the contribution from x outside the range is smaller than say 10 −200 . Furthermore, since the function is smooth, the Euler-Maclaurin theorem predicts that discretization errors decay nonpertubatively with the spacing ∆x. We observe empirically that the error decays as e −#/∆x with # ∼ 4 for a wide range of spins, twists, and cross-ratios.
Convergence is extremely fast. The Riemann sum using just 300 sampling points is typically accurate to 50 digits. Thus, our method for evaluating P ∆,J (u, v) boils down to evaluating the vectors P N =4 This method agrees precisely with the position space integrals described above. Again, it is instructive to compare the results (and performance on one of the authors' laptop) of the two methods:  (Mellin, 1200s). (C.28) Again, the precise agreement gives us high confidence in the validity of our codes. Generally, our position space implementation tends to be faster for getting a small number of figures, but its cost increases rapidly with the requested accuracy. On the other hand, the Mellin method requires some effort to get any significant figure at all (due to strong numerical cancellations in the matrix product), but it scales much better with requested accuracy. The above timings reflects a naive implementation of the 3 F 2 function (C.23), which is the most expensive step in the calculation; faster timings are achieved using the optimizations below.
A significant advantage of the Mellin approach is that the same ingredients can be recycled for many functionals. This makes the average evaluation time per functional much smaller than the above numbers suggest. Most of our intensive runs were computed using this method.

C.5 Some algorithmic improvements
For the reasons just mentioned, we invested significant effort to optimize Mellin-based formulas. The most expensive ingredient in the preceding subsection is to perform the sum over descendants in (C.22) for q = 0, which we expand here for convenience: We focus here on the sum which appears in N = 4 sYM, but we expect similar techniques to work for the more general dispersive sum rules discussed in [41]. A relevant fact is that for each (∆, J), we need the above sum for several hundred values of s. Instead of using Mathematica's HypergeometricPFQ, we find it advantageous to compute the sum by combining exact evaluation of the summand at small m with its asymptotic series at large m: (C.30) where X m collects all other factors in (C.29) and the Y n parametrize its 1/m expansion. In the last term, the m-sum can be rapidly computed as a series in 1/m max up to order n max . For calculations aiming for O(100) digits, we typically choose n max ∼ 100 or more, and increase it and m max until the error in the formula becomes smaller than the requested precision. Errors are readily estimated using the observation that the X m sum up to exactly 1 (see (C.13)).
The crucial point is that the expensive ingredients in this formula, X m and Y n , only need to be evaluated once for each operator (∆, J): the s-dependent factor is very simple. The sum can thus be computed for multiple s values for essentially the price of one, easily reducing the timings quoted in (C.28) by two orders of magnitude.
The second expensive ingredient is the vector of 2 F 1 functions in (C.26). We evaluate it by replacing it by linear combinations that involve ρ-like variables and computing those recursively. Specifically, define the two-vector: The desired integral (C.26) can be expressed in terms of those using More precisely, the integral [P s,v ] k is equal to a sum of the left-hand-side of the preceding equation with 0 ≤ r ≤ k 2 and integer coefficients. The two-vectors P s+2r,v which appear on the right-hand-side can be populated recursively in terms of the one with the largest r, using (We avoid using the recursion in the opposite direction because it is not numerically stable for v ≈ 1.) Even without the change of basis (C.33), hypergeometric relations would allow the vector [P s,v ] k to be populated with only two hypergeometric evaluations, a significant speedup over O(J) evaluations. The special combinations (C.31) further optimize the computation of two seeds by making the hypergeometric argument numerically smaller (see [69]).