Bounding Violations of the Weak Gravity Conjecture

The black hole weak gravity conjecture (WGC) is a set of linear inequalities on the four-derivative corrections to Einstein--Maxwell theory. Remarkably, in four dimensions, these combinations appear in the $2 \to 2$ photon amplitudes, leading to the hope that the conjecture might be supported using dispersion relations. However, the presence of a pole arising in the forward limit due to graviton exchange greatly complicates the use of such arguments. In this paper, we apply recently developed numerical techniques to handle the graviton pole, and we find that standard dispersive arguments are not strong enough to imply the black hole WGC. Specifically, under a fairly typical set of assumptions, including weak coupling of the EFT and Regge boundedness, a small violation of the black hole WGC is consistent with unitarity and causality. We quantify the size of this violation, which vanishes in the limit where gravity decouples and also depends logarithmically on an infrared cutoff. We discuss the meaning of these bounds in various scenarios. We also implement a method for bounding amplitudes without manifestly positive spectral densities, which could be applied to any system of non-identical states, and we use it to improve bounds on the EFT of pure photons in absence of gravity.


Introduction
The effective field theory (EFT) describing the known universe at the lowest energies includes only photons and gravitons. The broad array of massive particles in the Standard Model and beyond leave their imprints on the low-energy world in the form of higherderivative operators. The resulting EFT includes the Einstein-Hilbert term of gravity and the Maxwell term of electromagnetism, plus an infinite number of higher-dimensional operators, (1.1) In general, an n-derivative operator will introduce corrections to the observables which are suppressed by a factor of (E/M ) n−2 compared to the leading two-derivative contribution.
Here M refers to the scale of new physics -it is the energy at which new poles or cuts appear in the amplitude. The Planck mass M P determines the strength of the gravitational interaction; 1 gravity decouples in the limit where M P /M → ∞. The coefficients α 1 , α 2 , β, and so on are dimensionful. In examples where their contribution from UV physics is known, such as the Euler-Heisenberg EFT where they arise from integrating out a massive electron, they are order one numbers times powers of the coupling constant, in units of the scale of new physics M . It is a general expectation that this dimensional analysis should hold universally. Recent developments have made it possible to put this general expectation on a more rigorous footing. It has been clear for some time that not every EFT is consistent with some of the most basic principles of physics. Unitarity and causality imply positivity bounds [1][2][3][4][5][6] -constraints on the signs of EFT coefficients; such bounds are most efficiently derived with the aid of dispersion relations. An enormous amount of effort has gone into exploring the extent of these constraints, applying them broadly to EFTs across particle physics, quantum gravity, and cosmology . Recently, the methods for extracting constraints on EFTs from these basic requirements have been given a more systematic foundation [37][38][39][40][41][42]. This has led to a number of important outcomes, including a demonstration that S-matrix consistency implies two-sided bounds on ratios of EFT coefficients, essentially "proving" the intuition of dimensional analysis above, as well as a precise numerical recipe for obtaining optimal bounds. These methods have since been used to bound the Standard Model EFT [43], systems of scalars [44][45][46] and spinning particles [47,48] including photons [49] and gravitons [50]. See also [51] for a recent review.
Gravity presents a particular challenge for these methods, due to the so-called graviton pole, a 1/u divergence in the forward limit that arises from graviton exchange. This obstacle may be surmounted by considering a dispersion relation with more subtractions, which simply removes the pole entirely [37,50]. This is not, however, entirely satisfactory because including more subtractions in the sum rule will typically remove the four-derivative 1 In our conventions, the metric expands as gµν = ηµν + 2 M P hµν , and Newton's constant is given by interactions from the sum rule as well. These are the leading corrections, and they often have considerable theoretical interest. For an example of relevance to this paper, the fourderivative corrections to Einstein-Maxwell theory are required to obey a certain inequality if the weak gravity conjecture [52] is to be satisfied by the spectrum of black holes alone [53]. We shall review this in more detail below, but essentially this requires that (1.2) in the parametrization of (1.1). Remarkably, it was shown [54,55] that this so-called "blackhole weak gravity conjecture" immediately follows if the graviton pole may be safely ignored.
A number of consequences of this observation were subsequently explored [27,56,57], with the conclusion that such bounds are probably not applicable, as they would imply impossibly strong constraints or a unrealistically low EFT cutoff. An alternative possibility was conjectured in [57]: (1.2) may be violated by a small amount without spoiling the consistency of the S-matrix. This insight was supported by recent results [29,58,59] where a weakening of the causality criteria was observed in EFT coupled to gravity. In fact, problems of superluminality in EFTs with gravity had been understood since 1980, when Drummond and Hathrell [60] showed that the EFT that arises from integrating out an electron with dynamical gravity can allow light to travel superluminally on some backgrounds (see [61] for nice recent analysis). Roughly, the resolution seems to be that gravitational interactions universally cause a time delay, so EFT operators that cause a time advance are allowed in principle as long as the advance is smaller than the gravitational time delay.
In fact, many of these ideas are implicit in the work of [62], where three-point couplings such as β are shown to cause a time advance which overwhelms the gravitational time delay unless there is an infinite tower of higher-spin particles. For the theory described by (1.1), this time delay argument requires that these new particles must enter with masses satisfying M 2 HS 1 β . This may be thought of as a bound on β. We must have M M HS , which suggests the naive scalings where M is the scale of new physics. Provided that M is lower than the Planck mass, the inequalities (1.2) will hold provided α 1 and α 2 are positive. 2 However, as anticipated above, we shall see in this paper that that is not the case. Our goal is to use 2 → 2 photon scattering amplitudes to derive bounds on Einstein-Maxwell theory, including on the four-derivative coefficients appearing in (1.1). A general method for finding such bounds in the presence of a graviton pole was given in [65] and [66], where it was shown how to extract bounds on the leading four-derivative coefficients by acting on the dispersion relations with a more general class of functional. The result is, as expected, that a small amount of negativity is tolerated, but this negativity is essentially proportional to M 2 /M 2 P , and thus vanishes in the limit M P → ∞, where gravity decouples.
Applying this method to 4d requires care because infrared divergences preclude the existence of the positive functional needed for the argument. However, it was shown in [65] that this issue can be circumvented in some cases by regulating the divergences with an infrared cutoff (IR), leading to a number of interesting bounds on modifications to Einstein gravity in four dimensions. We shall use the same approach to handling the graviton pole in this paper, though we shall see that there are a few issues plaguing us which did not appear in [66] (essentially because corrections to Einstein gravity in 4d do not include any four-derivative operators).
Another technical improvement we make in this paper, especially relative to [49], is to show how to bound amplitudes which do not have manifestly positive partial wave expansions. This may be accomplished using a more general approach to unitarity constraints, sometimes called the "generalized optical theorem". A similar method has been used recently in the case of gravity [50,66] and more explicitly in [46] for a system of multiple scalars. In the present case, this will allow us to obtain bounds on helicity amplitudes without positive partial wave expansions, such as M +++− , in terms of other amplitudes with manifest positivity. For the case when gravity decouples, we shall see that these bounds are stronger than the bounds we previously obtained in [49]. For the case with gravity, we shall see that some negativity is allowed in the coefficients α 1 and α 2 , and we find that β is bounded by α 1 and α 2 .

The black hole weak gravity conjecture
Let us review the black hole weak gravity conjecture, and why our work is relevant to it. For a recent review of the literature, see [67].
The weak gravity conjecture (WGC) [52] was formulated as a criterion for determining which EFTs can be consistently coupled to quantum gravity. Such EFTs are said to live in the "Landscape," in contrast with the EFTs which are inconsistent with quantum gravity and therefore live in the "Swampland." The original version of the WGC states that there must be a particle whose charge is greater than its mass in Planck units, meaning (1. 4) In this case, the electric repulsion of two equally charged particles would be stronger (or equal, if the equality is saturated) than the gravitational attraction; hence it is a state for which "gravity is the weakest force." The requirement that such a state exists was motivated by the requirement that any non-supersymmetric black hole should be able to decay. The simplest possible case where such electrically charge black holes exist is Einstein-Maxwell theory, described by the leading terms in (1.1), Ignoring rotation and magnetic charges, this theory has a two-parameter family of black hole solutions, parametrized by mass m and electric charge q: (1.6) The curvature of these spacetimes blows up as r approaches zero; only those solutions where this point is hidden behind an event horizon are physically sensible. This means that the functions f (r) and g(r) must have a zero, which only happens when This is the black hole extremality bound: states satisfying it are called subextremal, those saturating it are extremal, and those violating it are called superextremal. Consider now the decay of a black hole with mass m m 1 + m 2 and charge q = q 1 + q 2 into two daughter states: If the initial black hole is extremal, i.e. √ 2q = m/M P , then one of two options holds. Either both of the daughter states are exactly extremal (and m = m 1 + m 2 ), or at least one of them is superextremal. This leads to the conclusion (1.4). More precisely, the WGC, in the original form of (1.4), states that theories of quantum gravity must have superextremal states so that their nearly extremal black holes can decay.
It is important to stress that there is no proof of this (or any) version of the WGC. For one, it is not at all clear why all black holes must be able to decay. Original arguments have included issues with large numbers of species [68] or remnants [69]. Another hint is the conceptual consistency with the no-global-symmetry conjecture [70,71]. The charge q depends implicitly on the gauge coupling g, so the WGC will be violated if one takes g → 0. In a sense, the WGC may be thought of as forbidding "nearly global symmetries." None of these arguments amount to a proof of the conjecture. Nonetheless, the WGC has been observed in every UV complete model known. In our own universe, it is resoundingly satisfied by the electron, where √ 2qM P /m 2 × 10 21 . The WGC has been given a number of interesting extensions and generalizations (see [67]). One possibility, considered almost as early as the WGC itself, is that the superextremal states satisfying (1.4) are black holes themselves [53]. The key idea of this work is that higher-derivative operators will shift the solution to the equations of motion, which may introduce corrections to the extremality bound. For the case of the Lagrangian given in (1.1), the equations of motion get corrected to < l a t e x i t s h a 1 _ b a s e 6 4 = " P R N m 1 o 3 N / 9 5 4 f 1 P p 9 Q v W 2 X 0 0 g e s = " > A A A B 7 n i c b V D L S g N B E O y N r x h f U Y 9 e B o P g K e y G o B 6 D X r w I E c w D k j X M T m a T I b O z y 0 y v E E I + w o s H R b z 6 P d 7 8 G y f J H j S x o K G o 6 q a 7 K 0 i k M O i 6 3 0 5 u b X 1 j c y u / X d j Z 3 d s / K B 4 e N U 2 c a s Y b L J a x b g f U c C k U b 6 B A y d u J 5 j Q K J G 8 F o 5 u Z 3 3 r i 2 o h Y P e A 4 4 X 5 E B 0 q E g l G 0 U u v u s U r C X q V X L L l l d w 6 y S r y M l C B D v V f 8 6 v Z j l k Z c I Z P U m I 7 n J u h P q E b B J J 8 W u q n h C W U j O u A d S x W N u P E n 8 3 O n 5 M w q f R L G 2 p Z C M l d / T 0 x o Z M w 4 C m x n R H F o l r 2 Z + J / X S T G 8 8 i d C J S l y x R a L w l Q S j M n s d 9 I X m j O U Y 0 s o 0 8 L e S t i Q a s r Q J l S w I X j L L 6 + S Z q X s X Z S 9 + 2 q p d p 3 F k Y c T O I V z 8 O A S a n A L d W g A g x E 8 w y j q a Q R a n 8 y P 3 d K z q z S J 2 G s b E l D 5 u r v i Q m N t B 5 H g e 2 M q B n q Z W 8 m / u d 1 U h N e + R M u k 9 S g Z I t F Y S q I i c n s d 9 L n C p k R Y 0 s o U 9 z e S t i Q K s q M T a h g Q / C W X 1 4 l z U r Z u y h 7 9 9 V S 7 T q L I w 8 n c A r n 4 M E l 1 O A W 6 t A A B i N 4 h l d 4 c x L n x X l 3 P h a t O S e b O Y Y / c D 5 / A B B g j r 0 = < / l a t e x i t > M 4 g 2 Figure 1. Schematic representation of our bounds: the weak gravity conjectures is satisfied at leading order (dark shaded region), but violations are still admissible at sub-leading order in M/M P (light shaded region).
with analogous corrections to f (r) and F 01 which are not important. Working to first order in the coefficients α 1 and β, one finds that the shifted solution leads to a shifted extremality condition: Let us imagine comparing two black holes, in the shifted and unshifted theory, which have the same charge and the minimal possible mass. Then the black hole in the theory with higher-derivative corrections can have a superextremal charge-to-mass ratio, compared to (1.7), and still have an event horizon, if the mass shift in (1.10) is negative. We see that this occurs when We derived this inequality by considering only electric black holes: the other two inequalities of (1.2) come from considering purely magnetic and dyonic black holes [72]. One of the main points of this paper is that causality constraints alone allow these inequalities to be violated by corrections proportional to M 2 /M 2 P .

Overview of results
The purpose of this paper is to explore the use of dispersion relations and positivity bounds in the Einstein-Maxwell EFT. Our main conclusions are • Using the generalized optical theorem, we bound quantities without manifestly positive spectral densities. This allows us to derive positivity bounds involving all three independent amplitudes f ∼ M ++++ , g ∼ M ++−− and h ∼ M +++− .
• In the limit where gravity decouples, it is easy to prove the WGC inequalities (1.2) by expanding in the forward limit. This is consistent with previous work [27,55], where it was shown that the WGC immediately follows if the graviton pole is discarded.
• We show how to derive corrections to the M P → ∞ limit. The strongest possible bounds with our approach allow for a violation of the WGC: introducing the notation 16α 1,2 = g 2 ± f 2 , the WGC would require g 2 |f 2 | 0. Instead we find where we give upper bounds on the O(1) constant c 1 < 24.2571. A schematic representation of our bounds in the (g 2 , f 2 ) plane is shown in figure 1. The WGC appears to be satisfied at leading order in M/M P , but zooming in on the boundaries of the allowed region unveils a region where it is violated. The size of the violation is suppressed by (M/M P ) 2 but enhanced by the logarithm of an infrared cut-off.
This paper is organized as follows: In section 2, we review the fundamentals of 2 → 2 photon scattering and the assumptions we use. Among these are (1) weak coupling: the requirement that loops are suppressed in the EFT, and (2) Regge boundedness: the requirement that, at fixed u, the amplitude grows slower than s 2 at large |s|. We also describe the approach to scattering non-identical states known as the "generalized optical theorem," and show how it improves the bounds obtained in the limit without gravity.
In section 3, we consider the problem of bounding the EFT coefficients in the presence of gravity. We derive a number of improved sum rules and use them to bound the fourderivative coefficients. Some explicit examples of functionals which yield these bounds are given. We end the section with a discussion on the relevance of our bounds to the WGC.

Bounding Photon Scattering
Let us first review the technical ingredients we will need in order to derive bounds. The goal will be to apply dispersion relations to 2 → 2 scattering amplitudes of photons. The result will be a set of sum rules which depend on Mandelstam invariants s and u. Semidefinite programming may then be used to derive optimal constraints on EFT coefficients from these sum rules. This numerical approach to deriving EFT constraints was pioneered in [40], and generalized to handle the graviton pole in [65].
At the heart of this method is the S-matrix, which maps ingoing states to outgoing states. For the four-particle amplitudes considered here, this amounts to out ψ 3 ψ 4 |ψ 1 ψ 2 in = free ψ 3 ψ 4 |S|ψ 1 ψ 2 free , (2.1) for particles ψ 1 , ψ 2 , ψ 3 , and ψ 4 . The S-matrix can be split into the identity operator and the interacting part as For the four-particle amplitude, we consider the external states to be two-particle center-of-mass plane waves. We will be concerned with the scattering of photons, hence the amplitudes will be depend on the helicities of the external particles, M λ 1 λ 2 λ 3 λ 4 , where λ i = ± denote states of circular polarization. We define the amplitude by This describes two photons with helicities λ 1 and λ 2 coming in along the z-axis, and scattering to two photons with helicities λ 3 and λ 4 , going in the direction (θ, φ) and (π − θ, π + φ).
We shall use all-ingoing conventions in this paper. The dynamics are symmetric with respect to rotating φ, so we will set it to 0. It will be convenient to package the individual helicity amplitudes into a matrix Here we see 16 amplitudes, but in the scattering of identical indistinguishable particles, there are discrete symmetries which reduce the number of independent functions which these depend on. For our case, where all particles have spin 1, the amplitudes are related by

5)
T : following from parity, time-reversal and boson exchange respectively. In addition to the helicities, these amplitudes are functions of the momenta of the external particles, parametrized by the usual Mandelstam invariants, s = −(p 1 + p 2 ) 2 , t = −(p 1 − p 4 ) 2 , and u = −(p 1 − p 3 ) 2 . The amplitudes are also related by crossing symmetry, which acts on their helicities and permutes the Mandelstam invariants (e.g. M ++−− (s, t, u) = M +−+− (t, s, u), and by complex conjugation of the helicities which relates M −λ 1 −λ 2 −λ 3 −λ 4 (s, t, u) = M λ 1 λ 2 λ 3 λ 4 * (s, t, u) ≡ (M λ 1 λ 2 λ 3 λ 4 (s * , t * , u * )) * . Now let us count the number of independent amplitudes. In the most general situation, we only allow two symmetries B and PT . This reduces the number of independent functions from 16 to 7 (real) functions, which reduces to 5 after crossing symmetry: In this case, g(s|t, u) is a real function with t-u symmetry. f (s, t, u) and h(s, t, u) are complex functions, fully symmetric under any permutation of s-t-u. Their real and imaginary parts reflect the parity-even and parity-odd parts respectively. In the rest of this paper, we will restrict ourselves to parity-even interactions. In this case, f * (s, t, u) = f (s, t, u) and h * (s, t, u) = h(s, t, u). As a result, we are left with only three independent real functions, f , g, and h.

Dispersion relations
Dispersion relations are a standard technique for deriving positivity bounds. The typical strategy is the following: Consider an amplitude M(s, u) which obeys the Froissart bound at fixed u in the physical region where u < 0. This behavior has been demonstrated for gapped systems [73,74], however for scattering of massless particles its status is less clear -see [75] for an interesting recent discussion, and [76] for a proof of the required property for scalar amplitudes in d > 4. In this paper, we will take (2.9) as an assumption. Equation (2.9) implies that the following doubly-subtracted contour integral vanishes, If the amplitude is analytic in the upper half s-plane, which follows from causality, then the contour can be deformed towards the real-s axis, defining the amplitude on the lower-half plane via M(s, u) ≡ M(s * , u * ) * . Then there are two contributions to the integral, which must therefore cancel: one contribution from three simple poles, and one contribution from the discontinuity across two cuts along the real axis, see figure 2. In a parity-respecting theory, the discontinuity picks up the imaginary part of the amplitude, and we get . (2.11) The strategy we will follow below is to parametrize the amplitude in the top line of (2.11) using the EFT, where it is given as a sum of undetermined coefficients. The amplitude in the bottom line will be parametrized using the partial wave expansion. Unitarity implies that the partial wave densities are positive (or, more generally, form positive definite matrices). This will allow us to convert (2.11) into an equation of the form L = H, where L and H are the low-and high-energy results of the dispersion integral and will be defined below.

(2.12)
By a direct computation, we note that the first terms in these expressions agree with the action (1.1) upon identifying 16α 1 = g 2 + f 2 and 16α 2 = g 2 − f 2 . The remaining parametrization allows for all possible terms consistent with the mentioned symmetries, and the assumption that contact interactions give a contribution to g(s|t, u) proportional to s 2 . Gravity decouples in the M P → ∞ limit, and the terms that involve graviton propagators go to zero in this limit. Let us comment on the terms with β, which are a little special: β arises from the Feynman diagrams with a single or double insertion of the operator F µν F ρσ W µνρσ , together with a graviton propagator in the diagram. From the way we 3 In graviton scattering, the terms in g(s|t, u) are multiplied by an universal helicity factor of s 4 , so once this is stripped, the scaling of g ∼ s 2 at large s means that unsubtracted or even antisubtracted sum rules are possible. In our case, the universal helicity factor is s 2 , so the stripped amplitude still requires an inverse power of s to kill the pole at infinity that appears in the dispersion integral. The result is that, unlike [66], we can not immediately read off improved sum rules by using unsubtracted dispersion relations. Instead we will need to derive improved sum rules by systematically subtracting off higher-derivative coefficients, as is done in [40,65].
have written it, it is clear that these terms vanish in the M P → ∞ limit. But one might ask why not also include an independent h 2 term, i.e. a term h 2 (s 2 + t 2 + u 2 ) in h(s, t, u). In fact, we shall see that forward-limit sum rules, which are applicable in the limit where gravity has decoupled, preclude the existence of any such term (and in fact also show that h 4 = 0). The h 2 -type interaction must shut off in that limit.
In terms of the matrix of amplitudes, we will introduce L, the low-energy matrix, by . (2.13) Using "prime" to denote this sum over residues, this can be written as The low-energy part is entirely determined by these functions f , g , and h .

High-energy: partial waves and unitarity
Now we turn to the high-energy part of the dispersion relation, defined by .
The amplitudes M(s , u) and M(−s − u, u) can be related by crossing, and we will use this fact to derive the exact form of the sum rules. 4 At high-energies, i.e. above the scale M , the EFT no longer applies, and we are forced to be more agnostic about the form of the amplitude. However the symmetries alone strongly constrain the possible form the amplitude can take. This motivates the use of the partial wave expansion. For spinning particles in four dimensions, this takes the form [77] M IJ = 16π Here I and J label the rows and columns of the matrix, or equivalently pairs of helicities. 5 The partial wave densities A IJ (s) are given by where |s λ 1 λ 2 refers to a two-particle state with definite angular momentum and energy and the scattering angle θ = arccos 1 + 2u m 2 . The set of allowed values of the spin of exchanged states depends on the external helicities. As we will see below, for f (s, t, u) and g(s|t, u) we have = 0, 2, 4, . . ., for g(t|s, u) and g(u|s, t), = 2, 3, 4, . . ., and for h(s, t, u), = 2, 4, 6, . . .. Now we define the spectral densities ρ λ 1 λ 2 λ 3 λ 4 (s) = Im A λ 1 λ 2 λ 3 λ 4 (s). Using the partial wave expansion in the integral, and combining the right-hand and left-hand cuts using crossing symmetry, we find an expression for the components of H. From here on, we will use s = m 2 . Then we have For convenience, let us denote the integrand by h IJ , so that (2.20)

Positivity from unitarity
Having defined h IJ , we shall now write it in a way that separates the dynamical and kinematical content of the high-energy amplitude. The aim is to do it in such a way that it makes the positivity conditions manifest. The key to finding positivity condition is to invoke unitarity, which implies that Contracting the second equation with external states of definite helicity and angular momentum gives where we have inserted a complete set of intermediate states of spin , labeled by X, which accounts for all other labeling of the state. If we define and use ρ = Im A , the result is From this result, sometimes called the generalized optical theorem (e.g. [46]), we can see that ρ IJ (s) is a positive definite Hermitian matrix. In what follows, we shall use the positivity of ρ IJ (s) to show that specific linear combinations of the high-energy part of the dispersion relation are positive. Specifically, they will be constructed by acting on H by certain linear functionals which will involve contracting H IJ with vectors v I , and taking various integrals over u. To make this concrete, let us start by rewriting h IJ as where for generality we consider a vector c ,X and a matrix V IJ Q . The V IJ Q may be determined from (2.19) and are given in explicit form in appendix A. The purpose of defining h IJ this way is that it makes the positivity constraints easier to deal with. This follows from the fact that for a real-valued symmetric matrix V , the condition V 0 implies that c † V c ≥ 0 for all complex vectors c. Since all the dynamical information of the high-energy amplitude is contained in the vectors c, we can analyze its positivity without making any further assumptions than those following from analyticity, unitarity and symmetry. Below we shall see how to construct sum rules by contracting (2.25) with vectors different vectors v, and derive positivity constraints by applying linear functionals on such sum rules.
In (2.25), the sum over Q represents a sum over "selection sectors" -defined by the parity P X and spin X of the exchanged state. Invariance under parity and boson exchange imply 6 From this, we can see that c ++ = c −− = 0 for odd spins, and c +− = c −+ = 0 for odd parity. Thus, we have that Q ranges over the following sectors: • Spin zero and parity-even, denoted Q = 0. We have c 0 0,X = (c ++ 0,X ) * , and V IJ 0 is a number for any fixed I, J.
• Odd spin = 3, 5, . . . and parity-even, denoted Q = o. We have c o ,X = (c +− ,X ) * , and V IJ o is a number for any fixed I, J.
There are no parity-odd exchanges for odd spin. With these considerations at hand, we are able to write the matrix entries of the high-energy integrand h as (2.28)

Sum rules
We may now express the result of our dispersive arguments in the form where L and H, as defined above, are matrices of functions of s and u. To proceed we will first contract the matrices with real vectors v I , which will lead to The left and right sides of this equation are both functions of s and u. From here, we make use of two basic ways to derive sum rules: We refer to the first type of sum rules as "forward-limit sum rules" because they essentially amount to a series expansion around u = 0. This means that they are not valid in the presence of a 1/u graviton pole. The second type we shall call "integral sum rules." These are more general: they include the forward-limit sum rules when φ(u) has a δ(u) factor. In practice, we shall consider linear combinations of such sum rules. The primary reason to do this is to derive "improved sum rules," where the low-energy part only depends on a finite number of EFT coefficients. We shall show in detail how these are constructed below. A general linear combination of rules can be constructed from a linear combination of sum rules formed from different vectors v i . The practical algorithm will therefore be The argument is as follows. It follows from (2.28) that if the condition (2.33) is true, then Then by (2.20), the high-energy part of the dispersion relation is positive. This implies that the low-energy part must also be positive, implying the positivity in (2.34).
To connect with the numerical bootstrap philosophy, we will think of the (weighted) sum over different choices of i in the argument above as acting on a set of sum rules with a linear functional Λ. Specifically, on matrix-valued functions f IJ (s, u, m 2 ) Then the algorithm can be reformulated in terms of searching for optimal functionals, a problem that can be implemented as a semi-definite program. Specifically,

Null constraints
Another important part of the numerical method of this paper is the addition of null constraints, first introduced in [39,40]. The are equations that arise when a single EFT coefficient can be written in terms of the high-energy expansion in two different ways. As such, they take the form of constraints only on the high-energy data. Including them in the numerics significantly improves the possible bounds. The use of null constraints for photon scattering in the forward limit was explained in [49]: in practice one expands the dispersion relations as in (2.31), with p sufficiently large to kill the graviton pole, and equates high-energy expansions leading to the same low energy expression. In this paper, however, we find more null constraints than in our previous work. Consider the sum rules derived from the "g-type" amplitudes only: g(s|t, u), g(t|s, u) and g(u|s, t). Each of these amplitudes enters a dispersion relation, not manifestly positive. In [49], only s-t symmetric combinations were considered, effectively reducing the number of amplitudes to consider to two: g(s|t, u) + g(t|s, u) and g(u|s, t).
The counting of null constraints is given in table 1. When considering only g-type sum rules, this leads to stronger constraints than the previous work. Moreover, we now have sum rules and null constraints involving the "h-type" amplitudes.
Another addition to the previous work is the null constraints of the form of integral sum rules. We will return to them in section 3.

Bounds without gravity
The rest of this section will be devoted to applying the methodology laid out above to the case where M P → ∞. In this limit, gravity decouples and the graviton pole vanishes. This means that we may apply forward-limit sum rules with as few as two subtractions, i.e. p 0 in (2.31). After reviewing these sum rules in more depth, we present a number of bounds derived from them. Among other things, we show how this immediately implies the WGC inequalities (1.2). This is consistent with the known fact that the WGC is directly provable from forward-limit bounds if the graviton pole is ignored [27].
The problem of bounding 2 → 2 photon amplitudes in the absence of gravity was addressed using a less general method in [49]. As such, we also include a discussion of the difference between the results obtained here and the results of that paper. We see that they are significantly stronger and we are also able to bound coefficients, such as h 3 , which remain unconstrained in [49].

Forward-limit sum rules
Recall that the sum rules take the form L = H, with components Contracting with a vector v and writing out the sum over Positivity of g 2 Let us illustrate this with a simple example. If we choose v = (1, 0, 0, 0) T then we find the low-energy part Consider the lowest-order sum rule by specifying p = 0, q = 0. This picks out v T Lv s 0 u 0 = g 2 . For the high-enery part, we use the explicit formulas in appendix A. For our choice of where θ = 1 + 2u s is the scattering angle. Now the forward limit . The result is a sum rules for g 2 , It is clear that this is a sum over positive terms. Hence g 2 must be positive! This example, therefore, turns out to be a translation of known results [49,63,64] into the language of this paper. Likewise, if one picks the power s 2k u 0 , one finds a sum rule that implies positivity of the coefficient of s 2k+2 u 0 in g(s|t, u), in agreement with [37], see (A.25) in appendix A.
WGC bounds without gravity Let us consider a slightly more complicated example, which will give us a very interesting result. Let us choose v T = 1 4 (1, −1, −1, 1), and again look at the leading (four-derivative) coefficients by specifying p = q = 0. From the lowenergy expansion, we can see that This is proportional to the exact combination that appears in the (electric) WGC bound in (1.2). Here we have defined h 2 = β/M 2 P . The reason is that h 2 = β/M P → 0 in the decoupling limit. However for the moment, we would like to be agnostic about the source of h 2 in the amplitude, and instead think of it as the nothing more than the coefficient of Now let us look at the high-energy parts The consequence of this is a sum rule for the WGC combination: The result is a sum of squares, and must therefore be positive. This result is also known in the literature. It was pointed out in [27,55] that these forward limit bounds directly imply the positivity of the WGC combination. Note also that we can easily obtain the magnetic and dyonic WGC inequalities by choosing instead choosing v = (1, 1, 1, 1) T and v = (1, 1, 1, −1) T , respectively. Let us be clear that this does not prove the WGC: we are required to take the decoupling limit M P → ∞ before we are allowed to use the forward-limit sum rules in the first place. As a result, it is sort of a silly example. Of course "gravity is the weakest force" in the limit where the strength of gravity goes to zero. Still, it serves to illustrate an important point: the bounds we can derive in the absence of gravity by expanding in the forward limit are stronger than the bounds available when the graviton pole is present. This shall be a major theme of section 3, where we will explore the bounds in the presence of gravity. Figure 3. Some bounds on the six-derivative term M 2 h 3 /g 2 . The dots refer to the partial UV completions as above.
Vanishing of h 2 without gravity Let us define h 2 to be the term proportional to s 2 + t 2 + u 2 in the amplitude h(s, t, u). The parametrization h 2 = β/M 2 P used in (2.12) indicates that h 2 = 0 in the absence of gravity, and in fact it is easy to derive a sum rule that shows this fact. Consider for instance the entry L 12 | s 0 = 2h 2 − h 3 u + . . . in the lowenergy amplitude. The corresponding entry H 12 | s 0 in the high-energy amplitude is in fact proportional to u, giving which shows that h 2 = 0. Note that argument is not valid in the presence of gravity, since it requires expanding the twice-subtracted dispersion relation in the forward limit. 7

Numerical results
The strategy to get bounds is very similar to the one in [49], but with two differences. The first one is that in this case we have a more general set of sum rules, which include the amplitude M +++− , too. Thus, we can now get bounds on the coefficients in the h amplitude; the corresponding sum rules and null constraints are non-diagonal in V IJ + . For instance, the sum rule for h 3 reads In figure 3 we give some examples of bounds derived when including the sum rule (2.50) in the set of sum rules and null constraints. We include in the plots the values of some known partial UV completions (table 3), which consist of integrating out massive fields at 7 One might believe that it would be safe to expand (2.49) in the forward limit even with gravity, since the dangerous gravity pole is not present in this particular amplitude. This example makes it clear that this is not allowed, since there are known partial UV-completions of Einstein-Maxwell theory with non-zero values of β = M 2 P h2. An example is the theory of a charged spin-1 2 fermion (QED), which will be discussed in section 3.4.  In the left figure we show in orange the one sided bound (allowed above, disallowed below) from [37], in light blue the result from [49] and in dark blue our new result. In the other figure we zoom in the new allowed region, adding some partial completions 3. tree-and loop-level. The notations in the plots for them is the following: massive axion (a), scalar (φ), graviton (h), QED (e), scalar QED (ẽ) and W ± sector (W ).
Another novelty with respect to our previous work is that we are not building crossinginvariant sum rules as before. This leads to a number of new null constraints, so in general the bounds will be stronger than those found in [49]. For instance, we can make use of a new null constraint of order m −6 , which immediately gives an improved bound for g 3 /g 2 . Previously, while the upper bound g 3 /g 2 1/M 2 was found without null constraints, the lower bound g 3 /g 2 −4.82/M 2 required the use of null constraints at order m −8 and higher, and the precise value of that bound depended on the number of null constraints used. Now, using the non-crossing symmetric sum rules, and therefore the new null constraint at order m −6 , we are able to obtain both an upper and lower bound which does not improve when adding more null constraints. The new optimal bound is In fact, this is just the first instance of an infinite sequence of two-sided bounds involving the coefficients of the powers of s p in the amplitude g(s|t, u), as shown in (A.26) in appendix A.
Moving to eight-derivative order, we can see that our new approach significantly reduces the allowed region in the plane given by g 4,1 and g 4,2 , see figure 4. More precisely, the new allowed region fits into a triangle determined by the inequalities 0 2g 4,2 g 4,1 + 2g 4,2

Results with Gravity
In this section we turn to the main novelty of this paper, which is bounds on the EFT coefficients that describe photon amplitudes in the presence of gravity. These bounds are derived via integral sum rules [65,66], which provide a way to circumvent the problem with the graviton pole. In four dimensions, using such sum rules introduces a logarithmic dependence on an infrared cutoff m IR . The details on how to derive such bounds will be laid out below; here we will summarize the main results. Figure 5. Exclusion plot in the plane (g 2 , β 2 ) properly normalized and divided by log(M/m IR ). The shaded regions represent the allowed values obtained by using various combinations of dispersion relations. In particular the light blue only uses the I g dispersion relation, while the darker blue uses the I g ,I 0 and the I g , I 0 , I β 2 dispersion relations. The bounds have been obtained in the parametric limit log(M/m IR ) 1.
Our most interesting result is that we find that the positivity approach used in this paper cannot rule out violations to the black hole weak gravity conjecture. Specifically, we find that the coefficient g 2 must satisfy an inequality of the form where c 1 = 24.257 and c 0 = 33.328. Moreover, assuming that β = 0, this inequality is strengthened to where nowc 1 = −10.557 andc 0 = 11.659. We can also construct bounds in the plane g 2 , β 2 by considering arbitrary values of the ratio β 2 M 2 P /(g 2 M 2 ). In figure 5 we present such bounds in the limit m IR → 0, where the IR logarithm dominates.
The rest of this section will be devoted to a detailed description of how to obtain bounds in the presence of gravity. In section 3.1 we outline the method used to generate the bounds in (3.1)-(3.2) and figure 5. Then in section 3.2.1 we present a completely explicit functional that gives a weaker version of the bound (3.1). The stronger bound (3.1) is simply found by extending this method to allow for more complicated functionals.
In section 3.4 we make an interpretation of our bounds. It is noteworthy that the violations to the black hole gravity conjecture vanish in the limit M 2 /M 2 P → 0. By assuming a scaling that is compatible with integrating out charged matter, which covers the case of QED, we find that in this limit the usual QED positivity bounds such that g 2 0 are recovered. In the limit where the electromagnetic strength becomes comparable to the gravitational, the bound (3.2) applies, and negative values of g 2 cannot be ruled out.

Algorithm
In order to obtain bounds on the low-energy parameters (3.4), we act on the vector dispersion relation (3.3) with a functional Λ and demand positivity of each term appearing on the right-hand side of the equation. If such a functional exists, it will produce a constraint on the low-energy parameters g 2 , f 2 , β, g 3 , etc., in terms of the parameters M , M P and m IR . More concretely, we consider functionals of the form where S i is a set of functions q n (p) in the variable p. Each q n contains integer or half-integer powers of p. We will discuss the choice of these sets in the next section. The index i runs over any non-empty subset of {1, 2, 3, 4, 5}. For instance, in section 3.2.1 and section 3.2.2 we will consider functionals using respectively only i = 1 and i = 1, 2.
The lower extreme of integration in (3.11) deserves special attention. We will discuss it in the next sections. Let us first spell out the concrete algorithm: 1. Choose a subset of dispersion relation to use. This corresponds to selecting which of the Λ i appear in (3.11).
2. Check if there exists a choice of coefficients c n,i such that:  3. If such a functional exists then we obtain the constraint where the . . . includes the other low-energy coefficients, if present. There is a constraint for any functional satisfying (3.13).
4. In order to obtain the optimal constraint on a given parameter, say g 2 , one can fix the values of the other parameters (β = β * , f 2 = f 2 * , etc) and choose the functional that optimizes the following conditions:

. (3.17)
In this way one obtains a bound on g 2 for fixed β * , f 2 * , etc. By scanning over them one gets bounds as a function of the other parameters. Similarly one can obtain bounds on any other parameter. 8 9

Positivity
In order to implement the algorithm, we need to find a method to impose positivity on the whole parameter space in the high-energy regime. Specifically, we need to impose that the functional is positive in all of the following regions, see figure 6, 8 When only a subset of dispersion relation is considered (i.e. Λi ≡ 0 for certain values of i), the function FΛ does not depend on some parameters and the resulting bounds are independent of them. 9 When β and β 2 do not appear simultaneously in FΛ, it is more convenient to fix the ratio between the parameters and get a bound on the overall normalization. Demanding positivity in the first three regions in the numerical implementation is a standard task and is achieved by a suitable discretization. We give more details about this in appendix C. The fourth region, namely in the limit of large and m 2 for fixed impact parameter b, requires a very careful consideration, which will be the topic of the remainder of this subsection.

Large-, m 2 behavior and choice of functionals
As explained in [65], the major obstruction in getting a positive functional comes from the tension between the need to cancel the oscillating behavior of the hypergeometric functions at large and the need to have convergent integrals in p. Let us review this problem and see how we can choose the functionals Λ i . To study more carefully the limit of large , we introduce the (dimensionless) impact parameter b = 2 M/m and consider the behavior of the terms in the dispersion relation in the limit of large , large m and fixed b. In order to extract the leading behavior, we need the known limit of hypergeometric functions. More precisely we need the asymptotics of the expression where the limit is taken for fixed b = 2 M m , and we have indicated that this limit is independent of the finite shifts a 1 and a 2 . We use the following formula, extracted from [79] 10 where J n is a Bessel function. Applying it to (3.18), we find Using the expression for the integral of p a J k (bp/M ) (which is a known integral, convergent for n > −1 and implemented for instance in Mathematica), we get Given our ansatz (3.11) for the functional Λ, we need to consider the large spin limit of the elementary integrals p n V + ,m 2 ,−p 2 , p n V − ,m 2 ,−p 2 and p n V o ,m 2 ,−p 2 . For the moment let us take the lower extreme of the integral m IR = 0. Using the above results we obtain where for simplicity we put M = 1 and omitted the b dependence inside C ν,n . Let us begin by focusing on the I g dispersion relation only, namely the first entry of (3.3), discarding the others for the moment. In this case the asymptotic behavior is controlled by C 0,n . The functional Λ ≡ Λ 1 should then satisfy (in addition to other conditions) n a n C 0,n (b) 0 , for any b 0 , (3.25) where the a n are linearly related the coefficients c n,1 appearing in (3.11) by the actual choice of functions q n ∈ S 1 . Inspecting the large b expansion of C 0,n (b), one observes potentially dangerous oscillating terms: where the higher order terms correspond to half-integer powers only. In order to have a chance to fulfill the positivity condition one must suppress the oscillating behavior. One possibility would be to include, in the set S 1 , a power n < 1/2 such that the first nonoscillating term in (3.26) could dominate over the rest. Unfortunately in four dimensions the integral of such a term would produce a divergence in the low-energy part of the dispersion relation due to the graviton pole: Alternatively, following [65], we can engineer a linear combination to cancel the leading oscillating terms. For numerical reasons it is convenient (although not necessary) not to introduce integer powers but to preserve the expansion in half-integer powers only. Hence the smallest power at our disposal is n = 3/2. In order to have this term dominate at large b we must cancel the first two oscillating terms. We also note that when n is an odd positive integer, the first term in (3.26) vanishes, thus we can use two such powers and create the combinations 11

Reintroducing the IR cut-off
Unfortunately, the above discussion does not hold for finite values of the dimensionless impact parameter b. As discussed in [65], there exists a tension between the conditions (3.25) and (3.27) -the two conditions are mutually exclusive. This fact is made manifest when passing to the impact parameter space by taking the two dimensional Fourier transform of a function of the transverse momentum p: where again J 0 is the Bessel function which also appears in the large limit (3.18). Recalling the definition of our functional Λ, we can interpret the positivity conditions in the large m, limit as the condition φ(b) > 0 for any b 0 . On the other hand, the finiteness of the functional on the gravity pole (3.27) would require Clearly the conditions (3.31) and (3.32) are incompatible. As a consequence, in order to proceed further we must relax one of the two conditions. Given the presence of IR divergences in gravity, it seems natural to introduce an IR regulator in the form of a maximal impact parameter b max that can be probed by the scattering process. If that were the case, we would only need to demand positivity of C 0,n (b) for b b max . In principle we could search for functionals subject to this reduced positivity condition and obtain bounds as a function of b max . However, is more convenient to introduce an IR cutoff as a regulator m IR > 0 at small momenta as in (3.11). This modification gets rid of the restriction (3.27) on the polynomials q n , since now all the integrals are finite, but at the same time makes the bounds on couplings explicitly dependent on m IR . More precisely, including a term q n with n < 1 introduces factors of the form (M/m IR ) 1−n . 13 A milder dependence on the cut-off can be obtained by only including q 1 , which instead gives a logarithmic dependence in (3.14) In conclusion, our tentative choice for the polynomials in S 1 is A second important effect of m IR is that the cancelation of the oscillating terms in (3.28) is not exact anymore: from the power p n ⊂ q 1 (p) one gets a correction to (3.21) of the form (ν = n = 1) (3.35) In the large b limit the leading decay is controlled by b −5/2 , coming from (3.28) with n = 3/2, which is comparable to the above correction for Hence, the price of regularizing the action of the functional on the graviton pole with an IR cutoff is to introduce a small negativity at large impact parameter. The smaller the cutoff, the farther away we push the negativity, but at the same time we make the bound on the 13 We checked that adding powers of −1 < n < 1 does not lead to stronger bounds (for n −1 the integral (3.20) does not converge). This is due to the following: the new power would determine the leading behavior at large b, hence the sign of its coefficient c n is fixed to be positive. But then the bound on, say, g2 would look like g2 −c n ( M m IR ) 1−n which is optimized by taking c n ∼ 0. low-energy coefficients less stringent, specifically the bounds will depend logarithmically on m IR .
Finally, let us consider the other dispersion relations in (3.3): we see that their asymptotic behavior enters with both signs or in the out-of-diagonal components of a matrix. In order for them not to spoil the suppression of large-b oscillations, we must not introduce new dominant contributions. To achieve this, it is enough to cancel the leading universal b −3/2 power. Hence we choose: , . . . , n max } , (3.37) , . . . , n max } , (3.38) More details about our numerical setup can be found in Appendix C.
Putting all the pieces together, our approach will then be the following. We will numerically look for a functional of the form (3.11) (with m IR = 0) with the choice of functions q n as in (3.34) and (3.37) subject to the conditions (3.13). This is done by running the numerical semi-definite program solver SDPB [81,82]. The output of the algorithm are the coefficients c n,i of (3.11). Taken at face value, such a functional would give a divergent result when applied to I| low . We then modify the functional by taking m IR > 0: (3.40) The new functional satisfies Λ [ I| low ] = finite, and produces bounds with a logarithmic dependence on m IR . On the other hand, the positivity condition is violated at large impact parameter b b max . This is acceptable since it does not make sense to probe infinitely large distances in a theory with an IR cut-off.

Example bounds from simple functionals
In this section, we will derive two bounds on the four-derivative coefficients by considering two explicit functionals. This will give concrete examples of the considerations above, and produce bounds that share the qualitative features with those presented in figure 5.

Example 1: Global minimum of g 2
As a first example, we will derive a bound involving g 2 only, by finding a functional Λ = Λ 1 that is manifestly positive. For the sake of simplicity, we take a slightly different form of the functional and allow integer powers. 14 Here we shall only use the sum rules derived from I g . Let us start by the following ansatz: (3.41) 14 Restricting to half-integer powers is merely a trick to optimize the numerics.  We can fix two of the coefficients to assume that The first condition is just a normalization condition, while the second condition is chosen to produce a bound that is independent of β 2 . Solving for c 2 and c 3 gives At this point we first look for a functional that satisfies all the positivity conditions (3.13) in the limit m IR → 0; this is done in the next sub-section. Once we have found it, we can now use the same value of c 1 to define a functional Λ 1 where now m IR is kept small but finite. As explained in the previous section, the newly defined functional mildly violates the positivity conditions (3.13) at large m 2 , and large impact parameter. Neglecting this violation, we get Λ 1 [I g | low ] 0, with 44) and in the parametric limit m IR /M → 0 where the log dominates, we would get the bound Optimizing over c 1 We will now look for a functional of the form we denote c * 1 . It turns out that it is the last condition in figure 6, at large m 2 and , that puts the strongest constraints on what c 1 can be used to find a positive functional. We will examine this limit to find for a small δ, which has to be numerically determined.
In the high-energy expression (3.4), there are two different expressions that appear: I g,1 and I g,2 (3.10). In the limit of large and m 2 , for fixed impact parameter, these two expressions have the same asymptotic behavior (3.22) 48) and C 0,n defined in (3.18). Given the expansion we see that for c 1 > 28, it is positive in the large b limit. To find a functional that is positive also for finite b, we choose c * 1 = 28 + δ, and find that δ can be taken as small as δ = 0.0033. 15 Thus we get By inspection, we can verify that the action of the functional is positive in the first entry V + 0,m 2 , V + ,m 2 , V − ,m 2 , V o ,m 2 and also C inf (b) 0 for any value of b 0. This is shown in figures 7 and 8.

Example 2: Bounds with fixed relation between g 2 and β 2
Now we will instead look at a functional of the form (3.51) We will consider bounds along rays with fixed ratio between β 2 and g 2 . Specifically, we will define and maximize the parameter α for a given θ. We can normalize the functional so that the low-energy part gives where c 0 is a constant. This sets c 3 = − 3 (3c 1 (4 sin θ + 7 cos θ) + c 2 (6 sin θ + 7 cos θ) + 420) 10 sin θ + 9 cos θ , (3.54) and for the constant we find c 0 = − c 1 (1179 cos θ + 1208 sin θ) − 24c 2 (3 cos θ + 4 sin θ) + 3780 60(9 cos θ + 10 sin θ) . (3.55) The optimal upper bound on α is obtained for the smallest value of c 1 . The algorithm will then be to minimize c 1 > 0 while varying c 1 , c 2 , d 1 , d 2 and d 3 . The results are given in table 2. One can see that they satisfy (3.54).
With the results found so far we get, in the limit log(M/m IR ) → ∞, the results (3.50) and the results in table 2. The combined effect of these found are shown in figure 9. 15 We obtained this number numerically. For smaller values of δ the subleading oscillating powers in (3.49) produces a negative region at some finite b. Table 2. Functionals of the form (3.51), and the corresponding bounds. c 0 is given by (3.55).

More results
In this section we present our best bounds on some of the couplings appearing in the lowenergy part of the dispersion relations (3.4). We already showed in figure 5 the constraint on the parameters g 2 and β 2 obtained acting with a functional on the dispersion relations I g , I 0 and I β 2 . Next, we include in our analysis the dispersion relation I f , which allows us to consider the coefficient f 2 , also appearing in the black hole WGC. More precisely, the black hole WGC would require g 2 ± f 2 0, but, as shown in figure 10, the presence of gravity in our setup allows again a violation of the inequality of order (M/M P ) 2 log(M/m IR ).
In section 2.3.1 we introduced a method to get bounds on the coefficients h 2 = β/M 2 P and h 3 appearing in the inelastic scattering amplitude such as M +++− , however until now we have not fully exploited this technology, except in absence of gravity in section 2.5. This because the dispersion relations considered so far only depend on positive spectral densities |c − ,X | 2 , |c +,1 ,X | 2 , |c +,2 ,X | 2 , and |c o ,X | 2 . Thus, as a final application we include the dispersion relation I h (and drop I f ) and consider again bounds on g 2 and β. Moreover, we fix m IR to a finite value and inspect the dependence of the bounds on such value. The results are shown in figure 11, for m IR = 10 −6 M , 10 −10 M . The inclusion of I h does not substantially improve the bounds, while the finite value of m IR corresponds to a finite shift (which is less and less important since we are plotting the bounds divided by log(M/m IR ).
The fact that the bound on β does not change in an appreciable way when including the new dispersion relation I h is a bit surprising. This however is a consequence of the fundamental input from the low-energy EFT which allowed us to relate β ∈ I h and β 2 ∈ I g . If we insisted on being agnostic about the interpretation of the low-energy couplings, the inclusion of I h would still let us bound them separately. 16 16 As an example, by assuming β 2 ≥ 0 but relaxing the relation between β 2 and the linear term contained in I h , we obtained a bound on the linear term alone: M 2 |β| 267.1 log (M/mIR), when g2 = 0 and in the limit where the logarithmic term dominates. This bound is definitively weaker than the one showed in Figure 10. Exclusion plot in the plane (g 2 , f 2 ) normalized to M 2 M 2 P log(M/m IR ). The shaded regions correspond to assuming f 3 ≤ 0 (light blue) or f 3 ≥ 0 (dark blue). The former contains the latter. The dashed gray lines indicate the bound without gravity g 2 = ±f 2 . This plot used the dispersion relations I g , I f , I 0 , I β 2 . The minimal value of g 2 is the same as in Fig. 5 and forces f 2 = 0, f 3 0.

Violations to the weak gravity conjecture
The purpose of this section has been to derive bounds on the four-derivative corrections to Einstein-Maxwell theory. The most interesting bound, in our opinion, is the lower bound on g 2 , which we find is given by This bound is determined with the optimization procedure described above and in the appendix, so our conclusion is the following: The assumptions of this paper, including unitarity, causality, and weak coupling, are not enough to prove the black hole WGC.
In our opinion, this conclusion is not surprising. As we stated in the introduction, it was already anticipated by [29,59] that gravity might weaken causality bounds by introducing time delays. Furthermore, it is consistent with the gravitational weakening of the dispersionrelation bounds for scalars in d 5 reported in [65].
Using this bound requires that we make sense of the logarithmic divergence. Strictly speaking, if we demand that the cutoff may be taken to 0, this bound simply tells us that no constraint may be placed on g 2 . However, we believe that it is possible to do better than this -for instance, it was pointed out in [66] that even with the conservative estimates M ∼ 1 TeV and m IR near the Hubble scale, the resulting log 10 77 is not very large. Still, it would be nice to understand what the sharpest possible bounds are, but this will require further assumptions. We comment more on this direction in the conclusion.
Another important assumption is weak coupling, which ensures that EFT loops are suppressed in the amplitudes. This is a rather typical assumption and simply means that we are bounding classical, or tree-level amplitudes. However, in the presence of gravity, it becomes more subtle, because the coupling at low-energy might include factors of the high-energy coupling, or the mass of the high-energy particle. To make this more concrete, consider the EFT which arises from integrating out a charged particle such as an electron or charged scalar. The high-energy loop diagrams which contribute to this may include EM and gravitational couplings, and give rise to four-derivative coefficients which take the form where the hatted variables are completely numerical constants, and the O(1/M 4 P ) terms correspond to pure gravitational loops. Here α represents the strength of the electromagnetic coupling, for QED α = e 2 4π . For the cases of a spin-1 2 or scalar particle, the values are referred to as QED or scalar QED (sQED), and for these cases the constants take known values [54,60], , (3.60) sQED :ĝ 2 = 32 . (3.61) Our assumptions require that the entire amplitude is weakly coupled, meaning that α 1, but the meaning of the bounds (3.1)-(3.2) depends on the relative size of α and µ = M 2 M 2 P . In this language, the bounds are so let us comment on their meaning in the following regimes: Regime µ α 2 . In this case, theĝ 2 term is larger than everything else in (3.62), so the bound reduces to the familiarĝ 2 0 . (3.63) This is equivalent to the regime where gravity decouples, so we recover those bounds from section 2.5. If we integrate out a particle that is light relative to the coupling, i.e.
then our results mean that the WGC bounds will be satisfied. This is not so surprising, as such particles already (easily) satisfy the particle form of the WGC. 17 Regime µ ∼ α 2 . Let us set α 2 = µ and absorb any additional factor in the numerical constants. Then the bound (3.62) that determines the minimum of g 2 takes the form In this case, theĝ 2 term is suppressed relative to the others, so our bound only effectively constrainsĝ 2 . As a result, we find • For any fixed M/m IR , we cannot rule out the possibility thatĝ 2 be negative by an amount given by the right-hand side of (3.65).
• Ifĝ 2 has a component that runs logarithmically with log(M/m IR ), we cannot rule out the possibility that this term can be negative, but it must be > −c 1 .
Regime µ α 2 . We are unable to probe this regime, which includes µ ∼ α, simply because the left-hand side of (3.65) becomes suppressed compared to the right-hand side. Therefore the bounds are trivially satisfied. 17 The connection between the particle form of the WGC and black hole WGC deep in the IR was also discussed in [55].

Constraints on β 2 and a species bound
Let us also comment on the meaning of our bounds on β 2 in terms of g 2 . 18 The allowed region in the space of these two parameters, visible in figure 5, is quite irregular, but at larger values of g 2 , the slope appears to approach about 0.17. However, in what follows we will ignore all numbers to focus on the scaling. We find In the spirit of the discussion above, let us assume for the moment that these coefficients are dominated by integrating out charged particles at 1-loop level. Then, again ignoring order one numbers, we have In this case, β comes from triangle-type diagrams with two photons and one graviton attached to the charged loop, and g 2 gets contributions from those diagrams as well as from simple boxes with all four photons attached to the charged loop. 19 Let us make the simplifying assumption that there are n different species of charged scalars and they all have an equal value of z = √ 2qM P /m. Then our schematic bound (3.66) becomes (3.68) Here it is possible that z 1 (incidentally, this means that the species satisfy the particle WGC, though that is not relevant), in which case the bound on β 2 leads to a simple bound on the number of charged species, n M 2 P /M 2 . If z 1, then a bound on the number of charged species still follows, but the actual value of z begins to matter as well.
It is interesting that this bound is highly analogous to the "species bound" [83,84], which roughly states that the cutoff scale in a EFT with gravity and a large number of species, is given by Our bound may be interpreted as an analogous bound for charged particles. Adding an extra species with z > 1 contributes more to the β 2 term than to the g 2 term, so an upper bound on β 2 in terms of g 2 gives a limit on the number of such species. If we allow for both bosons and fermions, the bound is weaker because their contributions to β have the opposite sign. However, some scenarios may still be ruled out this way. For instance, the Standard Model has charged bosons and fermions, but the fermions dominate due to the low mass of the electron. Thus if we imagine n copies of the Standard Model coupled only through gravity and electromagnetism, then our upper bound on β 2 implies a bound on n. More generally, since the contribution to β from a Dirac fermion is (−2) times the contribution from a complex scalar, we see that bosonic and fermionic degrees of freedom have exactly equal and opposite contributions in this case. This suggests that an upper bound on β 2 might have an interpretation as a bound on fermion-boson asymmetry. It would be interesting to try to make this speculation more precise in the future.

Conclusion
In this paper, we have applied dispersion relations to 2 → 2 scattering amplitudes of photons in order to derive bounds on higher-derivative corrections to Einstein-Maxwell theory. In doing so, we overcame two main technical challenges. First, using an approach similar to [46,50], we arranged the helicity amplitudes in a 4 × 4 matrix indexed by their ingoing and outgoing states. This allowed us to derive bounds on inelastic amplitudes in terms of the elastic ones. This is important because the WGC inequalities given in (1.2) depend linearly on β, but it is clear from (2.12) that the only amplitude which depends linearly on β is M +++− , which is inelastic and cannot be bounded on its own.
The second, and more significant, technical issue addressed here is the so-called graviton pole: the appearance of terms in the low-energy amplitudes which diverge in the limit of small transverse momenta. These terms invalidate bounds derived by taking the forward limit of doubly-subtracted dispersion relations. One possible strategy is to include more subtractions to remove the pole from the sum rules, but this has the undesirable side-effect of also removing the four-derivative coefficients, which are relevant to the black hole WGC. In this paper, we used the doubly-subtracted dispersion relation, but we acted on it with more general functionals, rather than simply expanding in the forward limit. This method, developed in [65] for scalars coupled to gravity, yields bounds on four-derivative coefficients.
However, these bounds are, in general, weaker than the bounds which may be derived without gravity, i.e. the forward limit bounds. This is exactly what we found here: as reviewed in section 2.5, it is easy to prove the WGC bound in the limit where gravity decouples, but in the presence of the graviton pole, the strongest bounds we are able to derive allow for some violation of the WGC. This violation is proportional to the ratio M 2 /M 2 P , so it vanishes in the M P → ∞ limit where gravity decouples, as it should. This "allowed violation" also includes a logarithmic dependence on an IR cutoff, included to eliminate divergences associated with the well-known IR divergences plaguing massless amplitudes in four-dimensions. This does not seem like a fundamental issue, simply because gravitational scattering in four dimensions happens in the real world. Still, it would be nice to understand what if further assumptions might allow us to remove its dependence from our bounds. One promising possibility, used recently in [76] to derive Froissart-like bounds in for gravitational amplitudes in d 5, is to add assumptions about the behavior of the amplitude in particular semiclassical regimes. Specifically, it may be possible to derive rigorous bounds using functionals that are negative in a regime, if that regime is where the amplitude is controlled by semiclassical physics, such as the eikonal regime at large b or the black hole regime at large s. Perhaps these, or other assumptions, will tame the divergences. Ultimately, a complete understanding may require reconsidering the meaning of the S-matrix for massless particles, perhaps along the lines of [85], which defines the physical asymptotic states by dressing the free states with a cloud of soft photons / gravitons.
It is also interesting to consider situations where the cutoff is meaningful. The classic example of this is in AdS, where the role of the IR cutoff is played by the AdS radius L. Indeed, it was shown in [86] how flat space bounds may be uplifted to AdS, where the divergences are naturally regulated. This raises some interesting possibilities. The EFT inequalities for the black hole WGC in AdS were explored in [87], and also recently addressed in [88]. Relatedly, CEMZ-like bounds on the W µνρσ F µν F ρσ were obtained in AdS using the analytic bootstrap [89] and boundary causality [90], the latter of which also considered AdS 4 and found parametric bounds depending on the log ∆ gap . Our results might be used to make these constraints precise. It would be very interesting to translate our bounds to AdS in order to do a more careful comparison with those works.
A somewhat more speculative idea is that the IR cutoff may be bounded by basic properties of quantum gravity. This idea is based on the observation, due to Bekenstein [91][92][93], that the entropy contained in a volume is bounded by the region's surface area. The result is that any local EFT description must breakdown at very large length scales. In [94] it is argued that, to satisfy this bound, EFTs should satisfy (4.1) In principle, this could be applied to the IR cutoff scale in this work, giving a natural way to bound it from below by the other two scales. It might be interesting to try to pursue this line of reasoning further. Of course, these divergences may also be removed by working in more than four dimensions. This introduces new technical issues such as determining the higher-dimensional spinning partial waves, but it seems to us that this can be overcome. Another issue is that in higher dimensions, there are also curvature squared corrections such as R µνρσ R µνρσ , which are related to topological terms in 4d. In general, these terms will appear in the electric 20 WGC bounds [53] but not in the the photon four-point function. Therefore we expect that one would need to consider graviton amplitudes as well to relate causality bounds to the WGC in d > 4. Bounding R µνρσ R µνρσ could also have significant interest beyond the WGC, for instance for corrections to the ratio of shear viscosity over entropy [96] (see [97] for a review).
More generally, it would be interesting to try to understand if quantum gravity requires more stringent assumptions about the S-matrix than does simple QFT. Indeed, this is related to the basic idea of the Swampland, which is that there are some consistent EFTs which nonetheless cannot arise as a low-energy limit of a theory of quantum gravity. In this paper, we show how including quantum gravity weakens the possible bounds on scattering amplitudes, so one might wonder if or how quantum gravity can introduce stronger constraints than those of the traditional S-matrix program. One promising hint discussed in [76] is that certain smeared amplitudes admit singly subtracted dispersion relations if one adds assumptions about the behavior of the amplitudes in certain semi-classical limits. Exploring whether these or other assumptions can lead to stronger bounds is an important question that we leave to the future.

A More details on the derivation of sum rules
The goal of this appendix is to explain in detail the crucial steps that lead to equation (2.25), which we for completeness repeat here: where the sum over X is a sum over any additional labels that index the states with spin and parity indicated by Q. In this equation, h IJ are the components of a matrix of (the imaginary part of) partial wave densities, and the high-energy contribution is given by summing over spin and integrating along the positive cut, see (2.20). Explicitly where θ = arccos(1 + 2u m 2 ). The first of these terms comes from the direct-channel cut, and the second term comes from the crossed-channel cut.
We will use the following simple expression for the Wigner d functions, given in [66]: Following the logic in the main text, we will make use of the generalized optical theorem to write the imaginary part ρ λ 1 λ 2 λ 3 λ 4 of the partial wave densities as a sum over three-point functions of exchanges states In our notation, c Boson exchange symmetry and parity symmetry imposes the following constraints on the c λ i λ j ,X : Here we demanded that the theory respects parity invariance, and hence the exchanged states can be assigned a definite parity P X in addition to spin = X . The constraints from (A.6) and (A.7) take different solutions depending on the assumptions on P a and a . 1) P X = 1, even, 2) P X = −1, even, 3) P X = 1, odd. Note that the fourth possibility, odd parity and odd spin, admits no solution.

Even parity and even spin
Here the solutions take the form forg p,0 , where we definedg p,q to be the term in g(s|t, u) that is proportional to s 2+p−q u q : 21 The rule for p = 0, withg 0,0 = g 2 , is reproduced in (2.43) in the main text, and is not valid in the presence of gravity. Note that the sum rule (A.25) immediately implies that the bound valid without gravity.
In a similar way, by picking the same v and looking at a suitable linear combination of the powers s p−1 u 1 and s p u 0 , one finds To systematically derive a basis of sum rules, we note that for a matrix M, we have v T Mv = Tr(wM) for w = vv T . Using this fact, a basis of sum rules can be found by considering all linearly independent symmetric matrices w. For any given s p u q , this would give ten different sum rules, however typically not all of the sum rules are linearly independent and one has to find a basis among the sum rules. Any sum rule in such a basis with the low-energy side being zero constitutes a null constraint.
Before proceeding to integral sum rules, let us explain how to make contact with the formalism used in [49]. In that paper, no dispersion relation for the h-type amplitude was used, which means that V IJ + is diagonal, i.e. [V IJ + ] 12 = 0 in all sum rules. Then (A.24) takes the form For a given sum rule for g-type and f -type Wilson coefficients α, the entries α and V o α agree exactly with the expressions for 2V + α , V e α , 2V − α and V o α in [49]. 21 This parametrization of the g amplitude agrees with the one used in [37], wheregp,q was denoted ap,q.

Improved integral sum rules
Consider taking the contraction with v = (1, 0, 0, 0) T , and picking the power s 0 , keeping u general. This gives for the low-energy side with a corresponding high-energy expressions . In principle, one could construct integral sum rules by integrating this expression at u = −p 2 against a function φ(p). A much more practical method is to first subtract an infinite tower of sum rules for (−1) k u kg k,0 with k 1, using (A.25). This idea was advocated in [65]. In this manner, we find the formal equality  Figure 12. Allowed regions in the space (M 2 f 3 /g 2 , f 2 /g 2 ) on the left, (M 2 g 3 /g 2 , f 2 /g 2 ) on the right. They both give stronger bounds on the vertical direction w.r.t. [49].  Figure 13. Allowed regions in the space (M 4 g 4,1 /g 2 , f 2 /g 2 ) on the left, (M 4 g 4,2 /g 2 , f 2 /g 2 ) on the right.  Figure 14. On the left, allowed region in the space (M 4 (g 4,1 + 2g 4,2 )/g 2 , M 2 g 3 /g 2 ). We notice that it rules out a kink we were speculating about in [49]. On the right, a bound in the (M 4 (g 4,1 + 2g 4,2 )/g 2 , M 2 h 3 /g 2 ) space.  Figure 15. Allowed regions in the space (M 2 h 3 /g 2 , f 2 /g 2 ) on the left, (M 2 g 3 /g 2 , M 2 f 3 /g 2 ) on the right.  We checked that decreasing the interval of the discretization does not change the bounds substantially.
2. Large m 2 , finite : we computed a polynomial approximation of the high-energy part of the dispersion relation (3.10) integrated against p n for any as in (C.1). This is done by Taylor expanding these expressions around m 2 = ∞ up to a certain order n M , multiplying by the appropriate power of m 2 and substituting m 2 = m 2 max (1 + x). The positivity condition is equivalent to the positivity of the resulting polynomial for any x 0. This condition can be easily implemented in SDPB. We also checked that our approximation reproduces the original function to high precision for all m m max .
3. Finite m 2 , large : we computed a polynomial approximation of the high-energy part of the dispersion relation (3.10) integrated against p n for fixed values of m 2 . This is done by Taylor expanding these expressions around = ∞ up to a certain order n L , multiplying by the appropriate power of and substituting = (500 + x). Since we are interested in the large behavior, in all expressions (3.10), we can neglect hypergeometric functions, which are rapidly oscillating in this limit. We then require positivity of the resulting polynomial for any To take into account the large b regime, we Taylor expanded (3.22) around b = ∞ up to a certain order n b as in (3.26). Following [65,66], we rewrite an expression containing oscillating terms in terms of a 2 × 2 matrix. For instance , where A n , B n , D n are polynomials in b. Given our choices of functional, they only contain integer powers of b. 22 We then replace the positivity condition with the stronger requirement a n C 0,n (b) 0 −→ a n P (n) A similar replacement can be done for any combination of C ν,n (b). On the other hand the positivity conditions (3.13) also involve semi-definite conditions on linear combination of matrices. In the large m 2 , limit these matrices also contain oscillating terms, as shown in (3.22). Again we can replace each element of the matrices with a two by two matrix. For instance: (C.7) We then demand positivity of linear combinations of the resulting 4 × 4 matrices, which is again a stronger condition but has the advantage of being in the form of a polynomial matrix, which can be fed to SDPB. 22 If one were to use different functionals, then the polynomials An, Bn, Dn would contain also fractional powers of b, One could deal with this issue by redefining b → (b) k , with appropriate k. The only complication is an increase of the degree of the polynomials.
Finally, let us list the procedures followed to obtain each figure: • Figure 5: we used the dispersion relations I g , I 0 and I β 2 . This is equivalent to considering Λ[ v] = i Λ i [v i ], with i = 1, 4, 5. The low-energy part only depends on g 2 and β. Just as we did in section 3.2.2, obtained bounds along rays in the (g 2 , β 2 ) plane: we looked for a functional satisfying (see (3 The second and third condition ensures that we can discard the contribution of β 2 and f 3 , provided that the latter has fixed sign. Hence the two distinct allowed regions. The overall allowed region is the union of the two regions. • Figure 11: we used the dispersion relations I g , I h , I 0 and I β 2 . This is equivalent to considering Λ[ v] = i Λ i [v i ], with i = 1, 3, 4, 5. The low-energy part only depends on g 2 , β, β and h 3 . This time we cannot get bounds along rays since both β and β 2 appear. Hence we scanned over β, and got bounds on g 2 . We looked for a functional satisfying (see (3.4)) Λ 1 [1] = 1 , (C.14) The second condition ensures that we can discard the contribution of h 3 , provided that it has fixed sign. It turns out that the allowed region is independent of the choice of the sign.