New Axion Searches at Flavor Factories

We assess the impact of searches at flavor factories for new neutral resonances that couple to both photons and gluons. These are well motivated by"heavy axion"solutions of the strong CP problem and by frameworks addressing both Dark Matter and the Higgs hierarchy problem. We use LHCb public diphoton data around the Bs mass to derive the current best limit on these resonances for masses between 4.9 and 6.3 GeV. We estimate that a future LHCb dedicated search would test an axion decay constant of O(TeV) for axion masses in the few-to-tens of GeV, being fully complementary to the low mass ATLAS and CMS searches. We also derive the impact of BABAR searches based on Upsilon decays and the future Belle-II reach.


I. INTRODUCTION
The lack of new physics at the Large Hadron Collider (LHC) and the lack of direct detection signal of dark matter (DM) at present experiments make it necessary to rethink the theoretical questions in the SM from a wider viewpoint and trigger broader experimental searches for new physics (NP). In this paper we make a step in this direction by presenting a NP case for flavor factories at the intensity frontier. These are light resonances below the EW scale which are neutral under the SM gauge group and couple to both gluons and photons. We show that flavor experiments have an unexploited potential to probe these states in a complementary mass range to previously proposed low-mass resonance searches at ATLAS and CMS [1]. Pointing out these gaps in the search program at flavor facilities is now a particularly important question in view of the upcoming LHCb upgrade and the Belle II data taking.
The possibility we consider here is that the new physics scale M NP lies beyond the reach of the LHC. If that is the case, NP signals might still arise from pseudo-Nambu-Goldstone bosons (pNGBs) associated to spontaneously broken approximate symmetries. These are often called axion-like particle (ALP) in the literature, they can be sensibly lighter than the NP scale (m a M NP ) and their couplings to the SM are controlled by the inverse of the decay constant 1/f . Generically, one has M NP = g * f with g * being the typical size of the couplings in the NP sector, so that probing weak enough couplings of the pNGB gives an indirect probe of the scale of new physics.
The focus of this paper will be on pNGBs with m a between 2 and 20 GeV, a mass window within the reach of flavor experiments. The driving question is whether flavor experiments can be sensitive to couplings of pNGBs small enough to probe new physics beyond the LHC reach. This question has been partially addressed for ALPs which couple to the SM by mixing with the Higgs sector [2,3] but it is surprisingly unexplored for ALPs with only gluon and photon couplings.
In the large theory space of all the possible couplings of the ALP to the SM, having a non-zero coupling to gluons is particularly well motivated from the theory perspective. In this paper we will discuss in detail two particularly compelling examples: "heavy" QCD axions [4][5][6][7][8][9][10][11][12][13][14][15][16] and the R-axion [17][18][19] in low energy SUSY-breaking. As we will show, in these two classes of models the gluon coupling is unavoidable, the photon coupling generic, the mass range of interest for this paper can be easily achieved. A TeV decay constant is theoretically favoured by ensuring the quality of the axion potential [20][21][22][23] or by explaining the DM relic abundance via thermal freeze out. Besides these two examples, ALPs with both gluon and photons couplings arise for instance as new pions in Composite Higgs models [24][25][26] or in theories with vector-like confinement [27].
The first observation of this paper is that many existing search strategies for light resonances in the 2−20 GeV range [28][29][30][31][32][33][34] lose sensitivity as soon as the gluon coupling is switched on. The main reason is that the decay width into gluons dominates over the one into photons arXiv:1810.09452v1 [hep-ph] 22 Oct 2018 unless a non-generic hierarchy of couplings is assumed, therefore strongly suppressing the signals expected in the existing strategies.
The dominant di-jet final states are much more difficult to distinguish from the SM background than diphotons. 1 As a way to overcome this issue, we show that the large production rate in pp collisions induced by the non-zero gluon coupling can be exploited at LHCb, which already has a low mass diphoton trigger designed to look for the rare decay B s → γγ. To substantiate this point, we use 80 pb −1 of public LHCb diphoton data [37] around the B s mass to derive a limit of O(100) pb on the signal strength of new diphoton resonances. This limit already constitutes the strongest existing probe for ALPs in the mass range between 4.9 and 6.3 GeV and motivates a dedicated LHCb search for diphoton resonances in a broader mass range. We estimate the sensitivity of such a search and show that decay constants at around the TeV scale are within reach of the high-luminosity phase of LHCb. This extends the coverage of low-mass resonance searches down to masses as low as 2 GeV and constitutes a new probe of multi-TeV scale NP which could be difficult to produce directly at the LHC. A similar point was made in Ref. [1] with ATLAS, CMS, and Tevatron diphoton searches, that are however limited by trigger issues to masses roughly above 10 GeV.
We finally discuss bounds on light resonances produced from SM meson decays. We estimate the BABAR constraint on Υ(1, 2, 3S) → γa(jj) and assess the future Belle-II sensitivity. This production channel currently constitutes the best probe of ALPs below ∼ 3 GeV.

II. RESULTS
We consider a spontaneously broken approximate U (1) symmetry in the UV. Integrating out the new physics sector at the scale M NP , we write down the effective interactions between the pNGBs and the SM where i runs over the hypercharge, weak and strong gauge groups,F µν i = µνρσ F i,ρσ /2, α i = g 2 i /4π and α 1 is GUTnormalised (α 1 = 5α y /3). The constants c i are anomaly coefficients which depend on the number of degrees of freedom chiral under the U (1) symmetry and carrying a non-zero charge under the SM gauge group. 2 1 As an example the LEP limit on BR(Z → γa) is 1.7 · 10 −5 from 36.9 pb −1 of data if a is a diphoton resonance [35] and 4.7 · 10 −4 from 5.5 pb −1 of data if a is a dijet resonance [36]. 2 If the SM fermions and the Higgs doublet are uncharged under the U (1) symmetry, the couplings of the pNGB to them arise only from loops of SM gauge bosons and can safely be neglected.  [30,37], projections are given for Belle II and future LHC stages. Details are given in Sec. IV. The other bounds are derived from Z width measurements [28,38], heavy ion collisions [39,40], Z → γa(jj) decays at LEP I [29] and diphoton cross section measurements at CDF (relevant only for ma 10 GeV), CMS and ATLAS [1]. For the latter we also give sensitivities up to the HL stage as derived in Ref. [1]. The thin dashed lines indicate theory benchmarks motivated by heavy QCD axion models and by ALPportal Dark Matter described in Sec. III. New coloured and EW states are expected to have masses of order g * f , where g * = 4π/ √ Nmess = 4π/ √ 2 ci.
In the NP sector, the strength of the interaction g * generically limits the maximal number of degrees of freedom to be below ≈ (4π) 2 /g 2 * . Therefore, a lower g * allows for large couplings of the ALP to the SM but at the same time it lowers the scale of new physics M NP g * f .
For m a M Z , we can write the ALP couplings to photons and gluons below EWSB using the same notation of the QCD axion where we have where g aγγ agrees with the standard formula for the QCD axion after normalizing the decay constant with respect to the QCD coupling f = 2N f PQ . The relevant decay widths of the pNGB are where we include NNLO corrections to the gluon width [41] in K gg (see Appendix A for more details). Note that (0.1 mm) −1 Γ tot = Γ gg + Γ γγ m bin γγ over the mass range of our interest. The new resonance decays promptly and has a very narrow width compared to its mass.
The LHCb constraint and sensitivities derived in Section IV are displayed on the ALP parameter space in Figure 1, for the benchmark c 1 = c 2 = c 3 = 10. We compute σ(pp → a) with ggHiggs v4 [42][43][44][45] using the mstw2008nnlo pdf set. We compare it with that obtained by the use of different pdf sets and of MadgraphLO v2 6 [46,47] upon implementing the ALP model in FeynRules [48], finding differences from 20% at m a = 20 GeV to a factor of 2 or larger for m a < 5 GeV. As detailed in Appendix A, a more precise determination of the signal would be needed, especially for m a 5 GeV.
vi) the Belle-II sensitivity in the same channel, that we determine simply by rescaling the expected sensitivities in [30] by a factor of 10. This assumes that the Belle-II reach will be statistics-dominated, and that it will be based on a factor of 100 more Υ(3S) than the BABAR one (i.e. on 1.2 × 10 10 Υ(3S) in total). The current Belle-II run plan for the first years assumes only a factor of 10 for the above ratio [54,55], corresponding to a few weeks of dedicated run at the Υ(3S) threshold. An extra factor of 10 could be obtained in a comparable time with dedicated later runs, because a higher instantaneous luminosity is foreseen [55]. An analogous search could be effectively performed, at Belle-II, also analysing the decays of Υ(1S, 2S).
vii) limits from the diphoton final state from heavy ion collisions are extracted from the recent CMS analysis in Ref. [40] and the reinterpretation of the AT-LAS light by light scattering data [39] of Ref. [56]. The lower reach of these measurements is set to m a 5 GeV as a consequence of the minimal cuts on the two photons transverse momenta.
ATLAS limits from Z → γa(γγ) [57] are not displayed in Fig 1. They imply BR(Z → γa(γγ)) < 2.2 · 10 −6 and turn out to be comparable to the heavy ions bound for our benchmark in Fig. 1. Similar constraints can be derived from the ATLAS inclusive search in pp → γa(γγ) [57]. The lower invariant mass reach of these ATLAS searches is set by the diphoton isolation requirement of [57], ∆R γγ = 0.15. This corresponds to an ALP mass of 4 GeV as discussed in Ref. [58]. Notice that LEP searches for Z → γa(γγ) [31] are weaker than the ATLAS bound. Future sensitivities from e + e − → γa(γγ) [33,34] do not reach values of f larger than 50 GeV and are not shown. Finally, the proposed search in B → K ( * ) a(γγ)) [33] at Belle-II has some sensitivity in a very limited portion of our mass range and it is not shown to avoid clutter.
In Fig. 2 we fix the ALP masses to two representative values m a = 5, 15 GeV and show the impact of the various searches in the plane (N/f, E/f ) which control the ALP's gluon and photon coupling respectively. As one can see from Fig. 2, diphoton searches for a ALP produced in gluon fusion both at ATLAS/CMS (see Ref. [1]) and at LHCb (see Sec. IV) can be sensitive to N/f as small as 10 −4 GeV −1 as long as the coupling to the photons is large enough. Moreover they can cover significant portion of the parameter space where the couplings are of their natural size.
Searches taking advantage of uniquely the photon coupling such as the ones in Refs. [32,34,57] become relevant only in the upper left corner of the plane where E/N 50. Such a hierarchy can be realized in clockwork constructions where the photon coupling is enhanced with respect to the gluon one (see for example Ref. [59]).
The ATLAS, CMS and LHCb limits and sensitivites shown in Fig. 2 are derived assuming gluon fusion as the ALP production process, so they sharply stop at a given small gluon coupling. If other production processes like vector-boson-fusion are taken into account, the limits and sensitivities would be slightly improved in the upper left corner of Fig. 2. Practically, the Heavy Ion results that we are including will always lead to stronger constraint because of the enhanced photon-fusion production and the loop suppressed background from light-by-light scattering.
The bottom right corner where the new resonance mostly couples to gluons is challenging to constrain in this mass range, even though boosted dijet searches at the LHC were recently able to go down to invariant masses of 50 GeV (see Ref. [60]). Of course for N/f (100GeV) −1 one expects color states generating the ALP coupling to be within the reach of the LHC.

III. PHYSICS CASES
In this section we expand on the two theory lines displayed in Fig. 1. We would like to motivate: 1) the coupling of the axion to gluons and photons, 2) the TeV decay constant, 3) the mass range considered here.
a. Heavy axions As a first example, we consider a particular class of axion solutions to the strong CP problem in the SM. First of all, introducing a spontaneously broken Peccei-Quinn symmetry U (1) PQ which is anomalous under QCD [61,62] leads unavoidably to a light axion with non-zero couplings to gluons [63,64]. In this sense, the axion coupling to gluons is deeply connected to its role in solving the strong CP problem. Taking the SM fields to be uncharged under the U (1) PQ , the QCD anomaly is generated by heavy vector-like fermions like in KSVZ type of models [65,66] where the fermion charges should satisfy |q PQ ψ − q PQ ψ | = q PQ Φ and by writing Eq. (6) we take q PQ Φ = 1. After U (1) PQ gets spontaneously broken by the VEV of Φ, the fermion mass is at M NP = g * f / √ 2. Below the PQ breaking scale we can integrate out the heavy fermions and match to the effective Lagrangian in Eq. (2): The vector-like fermions are often assumed to carry a non-zero hypercharge in order to allow a non-zero mixing with the SM quarks, to make them decay avoiding cosmological problems. This induces an anomaly of U (1) PQ with respect to the hypercharge, which leads to a non-zero coupling of the axion to photons: E = 0. To fix a benchmark, we add N mess complete SU (5) fundamental representations, that lead to N = N mess /2 and E = 4/3 N mess . This is the scaling assumed in Fig. 1, where we also take N mess = (4π/g * ) 2 to ensure calculability below M NP . In Fig. 2 we go beyond this benchmark and show how E/N can be modified changing the SM representation of the fermions in Eq. (6) (see Ref. [67] for a discussion). Operators breaking U (1) PQ other than the QCD anomaly would in general spoil the axion solution of the strong CP problem [20][21][22][23]. We can parametrize these contributions as new terms in the potential for the scalar Φ: In the presence of these new contributions the axion potential below the QCD phase transition is Since the new phase α ∆ is in general not aligned with the contribution given by the QCD anomaly, the presence of the UV operator shifts the axion VEV away from the origin, jeopardizing the solution to the strong CP problem. Note that this holds even if the NP sector inducing Eq. (8) preserves CP, because a new phase α ∆ ∼ O(1) is induced by rotating away the phase in the quark mass matrices. Requiring 2 N a/f 10 −10 to satisfy the present bound on the neutron dipole moment [68,69] gives an upper bound on the axion decay constant where we have assumed |λ ∆ | ∼ α ∆ ∼ O(1) and neglected other O(1) factors for simplicity. The upper bound on f depends on the scale of the UV completion Λ U V g * f and on the "quality" of the U (1) PQ , i.e. the dimension ∆ of the lowest dimension operator breaking the symmetry.
In the best case scenario, first discussed in Refs [20][21][22][23], the U (1) PQ is only broken by Planck suppressed operators 3 but, more generally, one might argue that all the global symmetries should be an accidental consequence of the gauge and matter content of the theory, exactly like in the SM. In the latter case the Λ UV in Eq. (10) will be below M Pl . Taking Eq. (10)  In the usual QCD axion where m a 6 keV · TeV/f (see e.g. [72]), values of the decay constant motivated by the axion quality problem are abundantly excluded by star cooling bounds [73] and K → πa transitions [3,74]. A common solution to this problem is to go to higher values of f and require a U (1) PQ with higher quality. Such a U (1) PQ can be made accidental in extra-dimensions or with more complicated UV completions in 4 dimensions (we refer to Refs [75][76][77][78] for an illustration of the challenges involved in constructing gauge theories with a U (1) PQ with arbitrarily high quality).
Alternatively, one can construct QCD axion models where the axion mass is heavier than its QCD value. The idea is to introduce new contributions to the axion potential which are aligned to the QCD one, so that the axion mass gets larger without spoiling the solution to the strong CP problem. A larger m a then relaxes the experimental constraints on f , potentially allowing to satisfy Eq. (10). There are several classes of models of this type which differ from the way the alignment is achieved: mirror axion models with one axion and two mirror QCD's [4][5][6][7][8], models where the QCD running is modified at high energies [9][10][11][12][13][14], and a more recent proposals [15] where the QCD group is embedded in SU (3) N with N axions relaxing each one of the allowed θ-angles.
All the solutions of the strong CP problem mentioned above can easily achieve the 2-20 GeV mass range, and 3 Gravity is expected to break global symmetries at the non perturbative level via wormhole solutions swallowing the PQ charge [70]. In this case the Wilson coefficient of the operators in Eq. (8) can be very suppressed for a large enough wormhole action: |λ ∆ | ∼ e −S Eucl . The latter has been shown in Ref. [71] to be too small in the Einstein theory of gravity but large enough in theories where the Einstein theory is suitably modified at Planckian distances.
result in an axion which generically couples to both gluons and photons with a decay constant at the TeV scale or lower. These are a perfect benchmark for the collider searches discussed here. For illustrative purposes we show in Fig. 1 the value of f corresponding to a U (1) PQ broken by ∆ = 6 operators generated at M GUT = 10 15 GeV. b. ALP-mediated Dark Matter The second example of ALP with coupling and masses of interest for this study comes from demanding it to be the mediator that couples the SM to fermion DM, singlet under the SM gauge group. This possibility has particular interest for colliders because direct detection constraints are totally irrelevant, see e.g. [79].
We write the ALP coupling to DM as in equation (6) and identify the DM as the Dirac fermion (ψ,ψ † ), so that m ψ = g * f / √ 2. The DM annihilation cross section into SM particles, mediated by the ALP, is dominated by final state gluon pairs and reads where α s is evaluated at the scale µ = 2 m ψ . The cross section for t-channel annihilation into a pair of mediators is p-wave and reads [80] ( therefore it is negligible with respect to the annihilation into gluons for the parameter values we are interested in, even for relativistic v rel . Requiring Eq. (11) to match 4.8 × 10 −26 cm 3 /sec, which is the value needed for heavy Dirac DM to reproduce the correct DM abundance via thermal freeze-out [81], we find m ψ 4.6 TeV c 3 10 where in the second equality we have assumed the scaling c 3 8π 2 /g 2 * . This is the benchmark value we display in Figure 1. It is interesting to note that indirect detection is still far from probing thermal values of the annihilation cross section for DM in this mass range (see e.g. [82][83][84]), thus adding further motivation to test this scenario with colliders.
Note that we have neglected the possible Sommerfeld enhancement from exchange of the ALP in the initial state. The precise computation of this effect is still the object of some debate, see e.g. [85] for a recent study with references, so that for simplicity we do not include it here. Its inclusion would result in an O(1) change in the favoured value of f , but would not affect our physics point that pseudoscalar mediated DM motivates ALP searches at flavor factories.
c. R-axion in Supersymmetry. We finally notice that the simplified DM model presented above arises naturally in theories of low-scale SUSY breaking. These predict that the lightest supersymmetric particle (LSP) is the Gravitino, whose mass m 3/2 is generically too small to account for the observed DM abundance. Indeed, using the power counting described in [19], one gets m 3/2 = F/( √ 3M Pl ) 11 meV·(g * /3)·(f /4 TeV) 2 . While not reproducing the observed value of DM, Gravitino masses in this ballpark are safe both from collider [86][87][88] and cosmological [89] constraints.
In the absence of stable superpartners, the natural DM candidate in these SUSY theories are particles belonging to the messenger or SUSY breaking sectors, see [90] for a first study of this possibility. In this case, as first noted in [91] (see [92] for further model building), the DM phenomenology may be dominated by its interactions with a pseudoscalar that is naturally present in the theory, the R-axion.
This arises as the pNGB of the U (1) R symmetry, defined as the only abelian global symmetry which does not commute with the SUSY generators. The spontaneous breaking of the U (1) R is intimately related to SUSYbreaking according to the general results of [17,93]. The R-axion couplings to gluons and photons are unavoidably generated by loops of gauginos, whose Majorana masses are chiral under the U (1) R , and possibly by UV messengers. Couplings to fermions and to the Higgs are less generic and can be suppressed by suitable charge assignment (see [19] for more details). Under these circumstances, the R-axion matches perfectly the Lagrangian in Eq. (1).
For f = O(TeV), motivated here not only by DM but also by the naturalness of the Fermi scale, i) its mass is expected to lie in the MeV range [94] or above [19,93], thus motivating searches at flavor factories, ii) superpartners can be taken outside the LHC reach, thus making it potentially the first sign of SUSY at colliders [19].

IV. DIPHOTON SEARCHES AT LHCb
LHCb detects photons either as "unconverted", i.e. they reach the electromagnetic calorimeter (ECAL), or as "converted", i.e. they convert to an e + e − pair upon interacting with the detector material before reaching the ECAL. The public LHCb note [37] presents the trigger and cut strategy that will be used to look for B s → γγ, and classifies diphoton events into two unconverted (0CV), one unconverted and one converted (1CV LL and DD, corresponding to conversions occurring in the Vertex Locator region or after it) and two converted (2CV) samples.
Searches for B s → γγ benefit from requiring the γγ vertex to be displaced from the pp interaction point, while the resonances we are interested in typically have a lifetime much shorter than the B s one. A displaced γγ vertex is however not imposed on the 0CV sample, because the resolution on the directions of the photons does not allow for a precise enough vertex reconstruction. Therefore this sample can be used to derive a bound on prompt diphoton resonances.
Measured diphoton events that pass the cuts are reported in [37] for L = 80 pb −1 of data, for each conversion category, in a diphoton invariant mass interval 4.9 GeV < m γγ < 6.3 GeV and in bins of 14.5 MeV. No known QCD or SM resonance is expected to give a signal within the LHCb reach, explaining why the event distributions in m γγ are very smooth in all categories, so that they constitute an ideal avenue to look for BSM resonances. Therefore, we place an upper limit on the signal cross section of a resonance a decaying to diphotons as where with A the geometrical acceptance of the signal in the LHCb detector and the total efficiency of the cuts plus detector effects in a given diphoton category. We use where the latter is given in [37] for the SM "signal" B s → γγ, and we determine the former by simulating the signal (see Appendix D for details) and imposing 2 < η < 5 at truth level.
Coming to the right-hand side of Eq. (14), N bkg is the number of background events in the 14.5 MeV bin reported in [37], which we take constant as the distribution in m γγ is actually flat well within its statistical uncertainties. 4 m bin γγ is the size of the bin centered on m γγ = m a that we expect to contain most of the signal from the resonance, which we assume to be narrow. In practice we use where δm γγ is the invariant mass resolution for the 0CV category which can be derived from the energy resolution and the granularity of the LHCb ECAL (see Appendix C). Fixing for definiteness m bin γγ /m γγ = 13%, we obtain 4 While this holds for the 1CV and 2CV categories, the distribution in the 0CV category is flat up to mγγ 5.7 GeV, and then drops smoothly. A possible origin of this drop is the use of 2 × 2 ECAL cells to measure the photon energy deposition at the first level of the software trigger (HLT1) [37]. In Appendix B we verified that imposing invariant mass cuts at HLT1 can cause a flat background at HLT1 to develop a dropping shape at higher level, where the invariant mass is defined using 3 × 3 cells.
The sensitivities that could be achieved by the current full dataset of 8 fb −1 and by the High Luminosity phase of LHCb with 300 fb −1 of data can be easily obtained from the above equation. 5 We also extend the mass range of the search to 3 < m γγ /GeV < 20, where the lower bound is chosen to make the computation of the signal strength reliable (see also Appendix A) and the upper bound is chosen somehow arbitrarily at 20 GeV, where the reach of the current ATLAS/CMS inclusive diphoton dataset [1] is already stronger than the projections of LHCb. For simplicity we take the signal acceptance and the efficiency to be constant and equal to the ones in Eq. (16). We discuss in Appendix D the motivations for this simplified assumption. Moreover we assume that the background is also constant in the extended mass range and equal to the one in Eq. (17). This simple procedure sets a useful benchmark for the actual search, which is good enough for the purpose of this paper. The resulting reach in the ALP parameter space is shown in Fig. 1.
We finally speculate about the limit and reach obtainable if the 1CV photon categories could be used. To set an optimistic reach, we do not take into account the signal loss because of the requirement of vertex displacement in present LHCb search. With this assumption, we repeat the procedure described above, with constant background N 1CV,DD bkg = 1600 and N 1CV,LL bkg = 1300 and constant efficiencies 1CV,DD = 1.35% and 1CV,LL = 1.32% as reported in Ref. [37]. Concerning the mass resolution, we take the one of the 0CV category divided by √ 2 to roughly account for the much better energy resolution of the converted photon. With all these assumptions we combine in quadrature the exclusions from the LL and DD single-converted categories and get which is almost a factor of 3 weaker than the 0CV bound.
In more realistic conditions we expect a sensible loss of signal from the requirement of displacement, although better background discrimination might be also achieved thanks to the converted photon. We do not even study the 2CV photon category because it is plagued by a very small signal efficiency. As a useful input for future more detailed studies, we collect here some considerations about the LHCb reach outside the interval 4.9 GeV < m γγ < 6.3 GeV: As far as the signal is concerned, we do not expect a significant drop in the efficiency going at higher invariant masses. As detailed in Appendix D at higher invariant masses the diphoton final state will be less forward, reducing the geometric acceptance. However, the decreasing boost of the produced particle is more than compensated by the higher efficiency of the photon p T cuts. Practically, the ultimate high mass reach of LHCb is not very relevant for the purposes of discovering new physics, since above 10-20 GeV it is likely to be superseded by the ATLAS/CMS diphoton searches (see [1] for details).
The most stringent limitation for scanning masses above ∼12 GeV at LHCb is the current dynamic range of the ECAL. This range, which depends on the electronics and not on the actual configuration of the detector, limits at the moment reconstructing photons with E T above ∼10 GeV (∼6 GeV at the level of the first level of the software trigger HLT1). Therefore, a potential increase in the dynamic range of the ECAL after the LHCb Upgrade would be very benificial to increase LHCb's sensitivity to higher masses. For instance, modifying the electronics to increase the range to 15− 20 GeV would be enough to cover all the mass range for which ATLAS and CMS have a poor sensitivity.
As already mentioned, the invariant mass distribution in the 0CV category from the data in Ref. [37] displays a drop for masses larger than approximately m γγ 5.7 GeV. In Appendix B, we argue that such drop is a consequence of the use of 2 × 2 ECAL cells to measure the photon energy deposition at HLT1. If our guess is correct there should be another drop of the background at low invariant masses in a region not showed by the plot of Ref. [37].
Understanding the composition of the diphoton background given in Ref. [37] would require a detailed MC simulation, including detector effects, which is beyond the scope of this paper. In Appendix C we provide a simple kinematical argument which shows that the background from boosted π 0 faking photons is likely to dominate over the one from real photons. A categorization of the data in different η regions would help suppressing this background at small η. This could be used to maximize the reach. A quantitative assessment of this is left for future studies.
The precise assessment of the 1CV limit and sensitivities would require a dedicated search for promptly decaying resonances without the requirement of a displaced vertex. In this case one could get an even better reach than the one presented here by combining the 0CV and the 1CV category.
We hope that this work could provide enough motivation to explore further the open issues described above and in general the possibility of performing bump hunts on the top of the diphoton background at such low invariant masses.

V. CONCLUSIONS
The LHC has pushed the energy scale of many motivated SM extensions beyond the TeV range. How to experimentally test NP models at and beyond those scales? A possibility is to look for low energy remnants of such theories, like pseudo-Goldstone bosons (aka ALPs) from an approximate global symmetry.
In Section III we showed that ALPs with masses and decay constants of interest for flavor factories arise as a solution to the strong CP problem ("heavy QCD axions") and in frameworks motivated by Dark Matter freeze-out and the Higgs hierarchy problem (e.g. the SUSY R-axion as mediator of DM interactions). These scenarios share the prediction of ALP couplings to gluons and photons, that are currently tested in a particularly poor way for masses below O(10) GeV.
In Section IV, we have used 80 pb −1 of public LHCb data to set a bound on diphoton resonances of σ(pp → Xa(γγ)) 100 pb, and we have performed a first study to assess future LHCb sensitivities. This bound is already the strongest existing one on the ALPs discussed above, and shows that LHCb has a very promising potential to test unexplored territory of well-motivated BSM extensions. Technical results that might be useful for future LHCb studies are provided in Appendices C and D. We have also recasted BABAR limits on Υ → γa(jj) on this model, and estimated the associated future capabilities of Belle-II, finding they would be particularly relevant for masses below ≈ 3 GeV. These results are summarised in Figure 1.
Our findings provide a strong motivation to pursue the phenomenological and experimental program of testing this class of ALPs at LHCb and Belle-II, thus enriching the physics case of both machines.

A. Acknowledgements
We thank Sean Benson and Albert Puig Navarro for many useful discussions, in particular about the LHCb note [37], and Marco Bonvini for clarifications about ggHiggs. D.R. thanks Simon Knapen for discussion and clarifications on Ref. [56]. D.R. thanks CERN and the Galileo Galilei Institute for theoretical physics (GGI) for kind hospitality during the completion of this work. F.S. is grateful to the Mainz Institute for Theoretical Physics (MITP), CERN and GGI for kind hospitality at various stages of this work. K.T. thanks MITP for kind hospitality during the completion of this work. Appendix A: More on the Signal We compute the gluon fusion production cross section at N3LO using ggHiggs v4 [42][43][44][45] and at LO using Mad-GraphLO. We compare the two predictions in Figure 3 left, for different choices of the pdf sets, and rescaling the ggHiggs cross section using that 246 GeV (anomaly coefficient coming from a top loop). The agreement between these determinations goes from the 20% level at m a = 20 GeV, down to a factor of 2 and worse for m a ≤ 4 GeV. We mention that at such low values the ggHiggs output should be taken with extra care, as it also yields some negative LO and NLO cross sections. This comparison underlines the need for a more precise determination of the production cross section, especially for ALP masses below 5 GeV or so. This task goes however beyond the purpose of this paper. We use the ggHiggs prediction with the mstw2008nnlo pdf set for all the LHC phenomenology in Section II.
Coming now to the ALP branching ratios, we use the NNLO QCD correction to the width of a pseudoscalar into gluons from [41]. In the notation of Eq. (4), it Figure 3 right we plot the resulting diphoton branching ratio together with its NLO and LO value and with the one given by Madgraph. NNLO corrections to the diphoton branching ratio reduce its LO value by a factor of 2, over the whole mass range we consider. We use the NNLO expression for all the limits and sensitivities described in Section II.
Appendix B: mγγ distribution of the 0CV category In our analysis, we assume the background yield to be roughly constant with respect to the diphoton invariant mass even outside the mass range reported in Ref. [37]. This is to provide an order-of-magnitude estimate of the background for the LHCb sensitivities to ALPs. The flatness of the data is actually seen in the 1CV and 2CV categories of Fig. 4(b-c) of Ref. [37]. However, in the 0CV category ( Fig. 4(a)), a kink is observed at large invariant masses. In what follows, we argue that this is an artifact due to the trigger level invariant mass cut.
In the invariant mass calculation at the trigger level of the 0CV category, two approximations are employed to speed up the calculation: 1) the photon energy is calculated from the energy deposition in 2 × 2 ECAL ∆θ γ1γ2 . We examined these two approximation and concluded that 1) could be the reason for the kink.
It is easy to show that the approximate mass formula is equivalent to the full mass formula with O(0.01) accuracy. This comes from the fact that the diphoton events within the LHCb fiducial volume have a small opening angle ∆θ γ1γ2 = O(0.1), after the E T trigger cuts are imposed. On the other hand one needs to use 3 × 3 cells to capture full energy deposit of a photon, so the information based on 2 × 2 cells underestimates the photon energy, which leads to the lower invariant mass, m trigger γγ < m full γγ . Because the first invariant mass cut is made at the trigger level, 3.5 GeV < m trigger γγ < 6 GeV, bins with a given m trigger γγ migrate to bins with m full γγ > m trigger γγ .This could explain why the reduction of the yield appears above m full γγ ∼ 6 GeV. This argument is confirmed by Fig. 1 bottom of [37], that shows how the trigger level mass distribution of the B s signal shifts to higher values of off-line invariant mass.
We further validate the argument modelling the energy smearing of the LHCb ECAL. For simplicity, we focus on the inner ECAL and approximate 2×2 cells as a circle of radius 4 cm. Because the Molière radius of a photon in the LHCb ECAL is 3.5cm, 6 the energy deposit inside the 2×2 cells is expected to be 95% of the total energy deposit on average. In order to model a realistic environment we include a stochastic gaussian smearing from the average value. We choose a standard deviation of 10% 7 such that the shift of the signal at m a = m Bs reproduces Fig. 1 bottom of [37]. This result is shown in Fig. 4 left. Then, we use the same prescription for the background-like events. The result is shown in Fig. 4 right. The invariant mass distribution in terms of m trigger γγ is normalized to be rectangular after the invariant mass cut. When the same dataset is plotted in terms of m full γγ we can see that a kink is induced.

Appendix C: Details on the LHCb Calorimeter
The Electromagnetic Calorimeter (ECAL) of LHCb has three layers with different granularities and is placed vertically with respect to the beam axis at z Ecal =12.52 m away from the collision point. The ECAL square cells have side lengths of ∆x cell =4.04, 6.06cm, 12.12cm for inner, middle, and outer layer, respectively [95]. The photon reconstruction algorithm uses patterns of 3×3 cells in each layer. Therefore, the inner layer, where most of the energy is expected to be deposited, has the best angular resolution.
a. Invariant mass resolution The invariant mass can be written as where E γ1,2 are the energies of the two photons and θ γγ is the angular separation between them. Using the above formula, we can relate the invariant mass smearing to the photon energy smearing and the ECAL granularity In the second line we assumed for simplicity E γ1 E γ2 E γ and approximated our result at the first order in θ 1. To obtain the second expression in the second line, we used the LHCb ECAL energy resolution δE/E 9% GeV/E ⊕ 0.8% reported in Ref. [96] and the granularity of the inner layer of the ECAL δθ = ∆x cell /z Ecal 0.003. Moreover, we have approximated θ γγ m γγ /E γ to get an expression of the typical energy smearing as a function of the typical photon energy. In computing the invariant mass resolution in the text, we take E γ = 50GeV. We believe this is a realistic benchmark value for this analysis because Separation of photon pairs from π 0 decay as a function of the pion total energy E π 0 . If this photon pair is misidentified as a single (fake) photon, E π 0 is the energy of the fake photon. Cases of inner, middle, and outer layers are plotted in blue, red and magenta respectively. E γ = E T γ cosh η and the LHCb analysis in Ref. [37] imposes E T γ > 3.5 GeV and E T γ1 + E T γ2 > 8 GeV on 2 × 2 cell clusters.
b. Background from π 0 faking single photon One of the advantages to study low mass diphoton resonances at LHCb is that low energy fake photons from QCD can be distinguished from real photon candidates. Here we focus on fake photons from π 0 decays whose collimated diphoton decay can mimick a single photon candidate.
Photon pairs from π 0 decay have angular separation θ π 0 γγ m π 0 /E γ 2m π 0 /E π 0 . The corresponding separation on a given ECAL layer is then If the π 0 is very energetic, the diphoton separation ∆r π 0 γγ is smaller than a single cell size and the object is mostly misidentified as a single photon candidate of energy E π 0 . Viceversa, when a pion is less energetic and the diphoton separation is large, ∆r π 0 γγ > O(2)∆x cell , two photon clusters are separately formed and a pion is resolved. In a regime where 1.8∆x cell > ∆r π 0 γγ 0.5x cell , the shower shape information makes a single energy cluster identified as a π 0 , which is called merged π 0 [97]. The identification efficiency using both resolved and merged π 0 is O(50%) for p T π 0 10 GeV (Fig. 21 left of Ref. [97]). As shown in Fig. 5, the final energy thresholds vary depending on the ECAL layer. For example, in the inner ECAL diphotons with E π 0 < 28 GeV corresponding to a large separation of ∆r π 0 γγ > 3∆x cell can be reconstructed as resolved π 0 s, while the ones with 46 GeV < E π 0 160 GeV could be seen as merged π 0 s. The planned LHCb B s → γγ analysis uses a photon energy threshold of E T γ >3.5 GeV which corresponds to E γ = 13 (260) GeV at η =2 (5). Comparing with the threshold determined above for the pions to be detected as fake photons, one learns that i) the background to the current search contains a non-negligible amount of fake photons; ii) a categorization in η of the data could help in reducing photon fakes.

Appendix D: Signal Acceptance and Efficiency
In this Appendix we discuss the strategy that we adopted to estimate the acceptance and efficiency of the signal. As mentioned in the main text, we eventually consider a constant value for the product of acceptance times efficiency on the mass range of interest for this paper. As reference value, we have chosen the one at the invariant mass of 5 GeV, corresponding to the B s signal considered in the LHCb note [37].
In order to estimate the acceptance and efficiency of the signal at LHCb, we implement the axion model in FeynRules [48], we generate events with MadgraphLO v2 6 [46,47] and shower them with Pythia 8.1 [98,99], matching up to 1 extra jets [100]. We then perform a simple analysis of the resulting samples using MadAnalysis5 [101]. Note that the signal events which are inside the acceptance of LHCb contain topologies where the axion has acquired a significant longitudinal boost, without the need of extra hard radiation. As a consequence the signal efficiency is essentially not changed by including extra jets (the minimal E T cuts of the LHCb selection can be satisfied with just a small transverse boost). This has to be contrasted with the low invariant mass searches at ATLAS/CMS where the recoil of the resonance against the extra jet increases the signal efficiency of the p T cuts significantly, as it was shown in Ref. [1].
In Table I we report the acceptance and the efficiency that we find in the mass range 5 − 15 GeV by following the selection cuts of refs. [37], that is A : 2 < η(γ) < 5 (D1) : We first observe that the value we find for the product A × , though in the same ballpark than the number reported by the LHCb note (see eqn. (16)), differs by around a factor of 2 on the case of m a = 5 GeV. In order to check wheter the discrepancy could be caused by detector effects, we also processed the same samples using Delphes as fast LHCb detector simulator, but we did not find a substantial improvement in the agreement.
However, besides the discrepancy on the benchmark of 5 GeV, our simple analysis provides indications on what could be the expected product of A × for the selection cuts (D2) for different mass values. As one can observe from Table I, increasing the mass of the axion the acceptance generically decreases. This is due to the fact that a heavier resonance will more likely be produced with less boost on the longitudinal axis, and hence the resulting photons will be less into the forward region which is covered by the LHCb detector. On the other hand, for larger values of the axion mass the outgoing photons will be more energetic and will more likely pass the energy and p T cuts, hence resulting in an increase in the signal efficiency. The combination of these two effects result in a product of acceptance times efficiency which actually slightly grows along the mass interval 5 − 15 GeV, but does not changes significantly. This justifies the simplified choice that we have adopted in the main part of the paper.