Axion-Like Particles at Future Colliders

Axion-like particles (ALPs) are pseudo Nambu-Goldstone bosons of spontaneously broken global symmetries in high-energy extensions of the Standard Model (SM). This makes them a prime target for future experiments aiming to discover new physics which addresses some of the open questions of the SM. While future high-precision experiments can discover ALPs with masses well below the GeV scale, heavier ALPs can be searched for at future high-energy lepton and hadron colliders. We discuss the reach of the different proposed colliders, focusing on resonant ALP production, ALP production in the decay of heavy SM resonances, and associate ALP production with photons, Z bosons or Higgs bosons. We consider the leading effective operators mediating interactions between the ALP and SM particles and discuss search strategies for ALPs decaying promptly as well as ALPs with delayed decays. Projections for the high-luminosity run of the LHC and its high-energy upgrade, CLIC, the future $e^+e^-$ ring-colliders CEPC and FCC-ee, the future pp colliders SPPC and FCC-hh, and for the MATHUSLA surface array are presented. We further discuss the constraining power of future measurements of electroweak precision parameters on the relevant ALP couplings.


Introduction
into photons, charged leptons, light hadrons or jets. These decays can be prompt or displaced if the width of the ALP is sufficiently small. We present bounds from current and future highenergy collider searches for ALPs decaying into photons, charged leptons and jets, including the case where the ALP couples dominantly to gluons. Existing constraints on the ALP-gluon coupling come from mono-jet [32] and di-jet [36] searches at the LHC and the rare kaon decay K + → π + a mediated by ALP-pion mixing [37].
Future hadron colliders can operate at unprecedented center-of-mass energies, whereas future lepton colliders benefit from their clean collision environment and the large production rates of on-shell Z bosons and tagged Higgs bosons. Two current proposals for circular electron-positron colliders are the Circular Electron-Positron Collider (CEPC) based in China [38] and the e + e − Future Circular Collider (FCC-ee) based at CERN [39]. CEPC is envisioned to have a 50 km tunnel and operate both at the Z pole and as a Higgs factory (at √ s = 250 GeV). At the Z pole the target is to produce 10 10 Z bosons per year. Over a period of 10 years an integrated luminosity of 5 ab −1 should be accumulated at two interaction points, which corresponds to one million Higgs events [38]. The FCC-ee is a proposed ring collider with 80 -100 km circumference operating at center-of-mass energies between 90 and 400 GeV. At the FCC-ee, more than 10 12 Z bosons would be produced at four interaction points within one year [40]. Roughly three million Higgs bosons would be produced in five years. Linear lepton colliders such as the ILC or CLIC loose in luminosity compared to their circular counterparts. The ILC is proposed to operate at 250, 350 or 500 GeV, accumulating an integrated luminosity of 2, 0.2 and 4 ab −1 , respectively [41,42]. CLIC is designed to collect 0.5, 1.5 and 3 ab −1 at 380 GeV, 1500 GeV and 3 TeV center-of-mass energy, respectively [43].
Current proposals for high-energy proton colliders include the High-Energy LHC (HE-LHC) operating at 27 TeV in the existing LHC tunnel and accumulating 15 ab −1 [44], the FCC-hh based at the proposed CERN FCC-ee tunnel operating at a center-of-mass energy of 100 TeV with a target luminosity in the range of 10 -20 ab −1 per experiment [45], and the Super-Proton-Proton-Collider (SPPC) based in the CEPC tunnel in China operating at 70 -100 TeV [38] accumulating 3 ab −1 .
Comparing the regions of ALP parameter space that can be probed with these future hadron and lepton colliders is particularly interesting and contributes to corroberating the physics case for these various machines. In this work we also consider proposed new experiments searching for long-lived particles, such as FASER [46], Codex-B [47] and MATHUSLA [48], which can access the ALP parameter space between the regions covered by LHC experiments and bounds from cosmology. This paper is structured as follows: In Section 2 we review the effective Lagrangian for an ALP interacting with SM fields and introduce the formalism for our ALP detection strategy. In Section 3 we discuss the reach of ALP searches at future colliders. We focus on existing LEP and LHC limits in Section 3.1, ALP searches at lepton colliders in Section 3.2, and move on to ALP searches at hadron colliders in Section 3.3. In Section 3.4 we discuss the reach of the future surface detector MATHUSLA at the LHC. Section 4 contains our conclusions.

ALP production and decays 2.1 Effective Lagrangian
An ALP is a light scalar which is a singlet under the SM gauge group and odd under CP. The ALP Lagrangian respects a shift symmetry, which is only softly broken by a mass term. Its leading interactions with the SM particles are described by dimension-5 operators [49] where the couplings to fermions c f f are assumed to be flavor universal, and Λ sets the characteristic scale of global symmetry breaking. The commonly used axion decay constant f a is related to our new-physics scale by Λ/|C eff GG | = 32π 2 f a . ALPs can obtain part of their mass from non-perturbative dynamics but need additional explicit breaking of the shift symmetry to be heavier than the QCD axion. 1 In the absence of an explicit breaking term, the QCD axion is defined by a strict relation between its mass and decay constant, m a ∝ f π m π /f a , with f π and m π the pion decay constant and mass, respectively. For ALPs such a strict relation does not apply, since m a and f a are independent parameters.
In the broken phase of the electroweak symmetry, the ALP couples to the photon and the Z boson as The relevant Wilson coefficients are given by where s w and c w are the sine and cosine of the weak mixing angle, respectively. The exotic decay Z → γa is governed by the Wilson coefficient C γZ .
Note that the anomaly equation for the divergence of the axial-vector current allows us to rewrite the ALP-fermion couplings in (1) in the form where the dots represent similar terms involving gluons and weak gauge fields [25]. This is instructive to relate results obtained for the ALP with analogous, and maybe more familiar, results derived for a CP-odd Higgs boson. E.g. the first term on the right-hand side is now of the same form as the coupling of a CP-odd Higgs to fermions. Interactions with the Higgs boson, φ, appear only at dimension-6 and higher, where the first operator mediates the decay h → aa, while the second one is responsible for h → Za. Note that a possible dimension-5 operator coupling the ALP to the Higgs current vanishes by the equations of motion. However, in theories where a heavy new particle acquires most of its mass through electroweak symmetry breaking, the non-polynomial dimension-5 operator C can be present [25,26,53,54]. In our analysis we allow for the presence of such an operator. We now summarize the relevant partial widths needed for the remainder of this paper. We express the relevant decay rates in terms of effective Wilson coefficients, which take into account loop-induced contributions, that have been calculated in [25]. In the case of h → Za decay the effective coefficient is defined as C eff Zh = C Zh +C Zh v 2 /2Λ 2 +loop effects. The relevant ALP decay rates are where the latter expression is only valid if m a Λ QCD . The exotic Higgs and Z-boson decay rates into ALPs are given by where λ(x, y) = (1 − x − y) 2 − 4xy.

ALP production at colliders
At high-energy colliders, ALPs can be produced in different processes. We distinguish resonant production through gluon or photon fusion and e + e − annihilation, the production in association with photons, Z bosons, Higgs bosons or jets [32][33][34][35], and the production via exotic decays of on-shell Higgs or Z bosons [25,26].  Figure 1: Tree-level Feynman diagrams for the processes e + e − → Xa with X = γ, Z, h.

Resonantly produced ALPs
At high-energy colliders, ALPs can be produced resonantly through gluon-fusion gg → a (GGF), photon fusion γγ → a (γγF), or electron-positron annihilation e + e − → a. An important difference between resonant production and ALP production through exotic decays or associated ALP production is that the resonant production cross section is always suppressed by the ALP mass, m a , over the new physics scale Λ. Resonant production is therefore mostly relevant for large ALP masses. At hadron colliders large ALP masses are also important to suppress backgrounds. The cross sections for the resonant ALP production processes are where f f gg (y) = 1 y dx x f g/p (x)f g/p (y/x) is the gluon luminosity function (the photon luminosity function is defined analogously) and K a→gg ≈ 3.3 − 2.4 for m a = 100 − 1000 GeV accounts for higher-order QCD corrections [55,56]. In the last equation we set m 2 e /s → 0. Both σ(e + e − → a) as well as the quark contribution to σ(pp → a) are strongly suppressed by the light fermion masses and these processes are therefore not the dominant production modes. ALP production in photon fusion with a subsequent di-photon decay of the ALP is particularly interesting, because the production times decay rate only depends on the ALP mass and the single coupling C eff γγ . Furthermore, the uncertainty of the photon distribution function in the proton has recently been considerably improved allowing for more robust limits [57]. For resonantly produced ALPs finite-lifetime effects do not play any role because the sizeable couplings and ALP masses required to obtain appreciable production cross sections lead to prompt ALP decays.
ALP production in association with a photon, Z or Higgs boson An important production mechanism especially at e + e − colliders is associated ALP production. The relevant Feynman diagrams are shown in Figure 1. Additional diagrams with ALPs radiated off an initial-state electron are suppressed by m 2 e /s relative to the shown graphs and hence neglected here. ALPs can be radiated of a photon or a Z boson and thereby be produced in association with a γ, a Z or a Higgs. The differential cross sections for ALPs produced in association with a γ, a Z or a Higgs boson are given by and g V = 2s 2 w − 1/2 and g A = −1/2. Note that the cross sections with a gauge boson in the final state become independent of s in the high-energy limit m 2 a , m 2 Z s < Λ, while the cross section for e + e − → ha decreases as 1/s in this limit.
Light or weakly coupled ALPs can be long-lived, and thus only a fraction of them decays inside the detector and can be reconstructed. The average ALP decay length perpendicular to the beam axis is given by where Γ a denotes the total width of the ALP, θ is the scattering angle (in the center-of-mass frame) and γ a specifies the relativistic boost factor. For the case of associated ALP production with a boson X = γ, Z, h, we have In order to obtain the total cross sections for ALPs produced in associated production, we integrate the differential distributions (16) - (18) with the non-decay probability, i.e.
where L det is the transverse distance from the beam axis to the detector component relevant to the reconstruction of the ALP. Associated production at hadron colliders will not be considered here. For long-lived or invisibly decaying ALPs such processes have been explored recently in [32,35].

ALP production in exotic decays of on-shell Higgs and Z bosons
Exotic decays are particularly interesting, because even small couplings can lead to appreciable branching ratios. In the case of the Higgs boson, the SM decay widths are strongly suppressed, and consequently the branching ratios for Higgs decays into ALPs can be as large as several percent [25,26]. In the case of the Z boson, the huge samples of Z events expected at future colliders provide sensitivity to Z → γa branching ratios much below current bounds. This allows us to probe large new-physics scales Λ, as illustrated in Figure 2, where we show the cross sections of the processes pp → Z → γa, pp → h → Za and pp → h → aa at the LHC with √ s = 14 TeV. The figure nicely reflects the different scalings of the dimension-5, 6, and 7 operators in the effective ALP Lagrangian. The shaded region in the left plot is excluded by Higgs coupling measurements constraining general beyond the SM decays of the Higgs boson, Br(h → BSM) < 0.34 [58]. The shaded area in the right plot is derived from the measurement of the total Z width, which corresponds to Br(Z → BSM) < 0.0018 [59]. This leads to constraints on the coefficients |C eff Zh | < 0.72 (Λ/TeV), |C eff ah | < 1.34 (Λ/TeV) 2 and |C eff γZ | < 1.48 (Λ/TeV). The Higgs and Z-boson production cross sections at 14 TeV are given by σ(pp → h) = 54.61 pb [60] and σ(pp → Z) = 60.59 nb, computed at NNLO using tools provided in [61,62].
As discussed above, it is important to include the effects of a possible finite ALP decay length. Using the fact that most Higgs and Z bosons are produced in the forward direction at the LHC and approximating the ATLAS and CMS detectors (as well as future detectors) by infinitely long cylindrical tubes, we first perform a Lorentz boost to the rest frame of the decaying boson. In this frame the relevant boost factor for the Higgs or Z decay into ALPs are given by We can compute the fraction of ALPs decaying before they have travelled a certain distance L det from the beam axis, finding where f a dec is relevant for h → Za and Z → γa decays, and f aa dec applies to h → aa decays. For Higgs bosons produced at e + e − colliders the assumption of forward production is no longer justified. Rather, the angular distribution in the scattering angle ϑ of the Higgs boson in the center-of-mass frame are approximately given by [64] with x i = m 2 i /s. The approximation s m 2 h for the Vector Boson Fusion (VBF) process is justified, because the VBF cross section becomes the dominant production cross section for √ s 500 GeV [64,65]. This fact is illustrated in Figure 3, which depicts the cross section of various Higgs production modes at lepton colliders as functions of the center-of-mass energy. Even though in the Higgs rest frame, the angular distribution of the produced ALPs will be isotropic, the corresponding distribution in the center-of-mass frame is more complicated in this case. Since the Higgs bosons are predominantly produced with ϑ ≈ 90 • , we will for simplicity make the conservative assumption that the ALPs are also produced at maximum scattering angle in the center-of-mass frame, corresponding to sin θ = 1 in (21). For the resonant process e + e − → Z → γa on the Z pole, no such difficulty arises. The corresponding differential branching ratio can be obtained from (16) by setting s = m 2 Z , and the decay-length effect can be taken into account as shown in (23).
For prompt ALP decays, we demand all final state particles to be detected in order to reconstruct the decaying SM particle. For the decay into photons we require the ALP to decay before the electromagnetic calorimeter which, at ATLAS and CMS, is situated approximately 1.5 m from the interaction point, and we thus take L det = 1.5 m. Analogously, the ALP should decay before the inner tracker, L det = 2 cm, for an e + e − final state to be detected. We also require L det = 2 cm for muon and tau final states in order to take full advantage of the tracker information in reconstructing these events. For CLIC, we use L det = 0.6 m for lepton reconstruction [66]. We define the effective branching ratios where X = γ, e, µ, τ, jet and Y = , hadrons. Multiplying the effective branching ratios by the appropriate Higgs or Z production cross sections and luminosity allows us to derive results for a specific collider. At hadron colliders like the LHC, we require 100 signal events, since this is what is typically needed to suppress backgrounds in new-physics searches with prompt decays of Higgs and Z bosons [58,67,68] (see also [25] for further discussion). At lepton colliders we assume a much cleaner environment and show the reach for 4 signal events. We do not take advantage of the additional background reduction obtained by cutting on a secondary vertex in the case where the ALP lifetime becomes appreciable. A dedicated analysis by the experimental collaborations including detailed simulations of the backgrounds is required to improve on our projections.

Collider reach for ALP searches
The reach of ALP searches at current and future colliders depends on the type of collider, the ALP production mechanism, and the center-of-mass energy of the experiment. For the LHC and the most advanced proposals for future colliders, we use the benchmark specifications collected in Table 1. In the following, we determine the reach of future colliders in comparison to the high-luminosity phase of the LHC with √ s = 14 TeV and an integrated luminosity of L = 3 ab −1 .

ALP searches at the LHC and LEP
Constraints from ALP searches at LEP have been discussed for the associated production of ALPs with a photon and the subsequent ALP decay into photon pairs (e + e − → γa → 3γ) [32], as well as for on-shell Z decays (e + e − → Z → γa → 3γ) [33]. The excluded parameter space in the m a − |C eff γγ |/Λ plane is shown in blue in Figure 4. At the LHC, exotic Higgs and Z boson decays are the most promising search channels. Decays of on shell Z bosons at the LHC have been discussed in [25,32,33,35]. The constraints from these searches can be mapped onto the m a − |C eff γγ |/Λ plane under the assumption that the two couplings C eff γγ and C eff γZ are related to each other. For example, if the ALP couples to hypercharge but not to SU (2) L , then (3) implies C γZ = −s 2 w C γγ , since C W W = 0. The corresponding constraint is shown in Figure 4: Left: Summary plot of constraints on the parameter space spanned by the ALP mass and ALP-photon coupling. Right: Enlarged display of the constraints from collider searches: LEP (light blue and blue), CDF (purple), LHC from associated production and Z decays (orange), LHC from photon fusion (light orange), and from heavy-ion collisions at the LHC (green).
orange in Figure 4. 2 The purple region is excluded by Tevatron searches for pp → 3γ [69], again assuming C W W = 0. The dark green area in Figure 14 in Section 3.3 below depicts the region where 100 events are expected in the process pp → Z → γa → 3γ at the LHC with √ s = 14 TeV and L = 3 ab −1 . We demand that the ALPs decay before they reach the electromagnetic calorimeter L det = 1.5 m. Note that for a part of this parameter space the photons from the ALP decay are very boosted and hard to distinguish from a single photon in the detector [70]. Searches for the exotic Higgs decays pp → h → Za → Zγγ and pp → h → aa → 4γ cannot be translated into constraints in the m a − |C eff γγ |/Λ plane, because the ALP-Higgs couplings governed by the coefficients C eff Zh and C eff ah are generally not related to C eff γγ . Instead, we show the reach of the high-luminosity LHC in the |C eff Figure 15 in Section 3.3.
Besides ALP production in exotic decays of Higgs and Z bosons, ALP production through photon fusion plays an important role at the LHC. This process was first considered in a VBF-type topology in [71], and the excluded region is part of the orange shaded region in Figure 4. For GeV-scale ALPs produced in photon-fusion, (quasi-)elastic heavy-ion collisions can provide even stronger constraints due to the large charge of the lead ions (Z = 82) used in the LHC heavy-ion collisions [34,72]. The parameter space probed by this process is shown in green in Figure 4. Figure 5: Left: Existing constraints on the ALP mass and coupling to gluons by mono-jet searches at the LHC (light blue), rare kaon decays (light red) and three-jet events (purple). Right: Constraints on the ALP mass and coupling to leptons from searches for solar axions (purple), the evolution of red giants (light red), beam dump searches for ALP decays into muons (blue) and BaBar searches for e + e − → 4µ.
Recently, the parton distribution function of the photon has been determined with significantly improved accuracy [57], and searches for di-photon resonances at the LHC can be recast to give bounds on heavy pseudoscalar particles with couplings to photons [73]. We have computed the constraints based on the most recent ATLAS analysis with 39.6 fb −1 of data [74] and show the corresponding sensitivity regions in light orange in Figure 4. A recent proposal to search for ALPs in elastic photon scattering at the LHC allows for a similar reach in the m a − |C eff γγ |/Λ plane [75]. Searches for ALPs decaying into photons are motivated by the relation between the ALP coupling to gluons C eff GG and to photons C eff γγ in models addressing the strong CP problem, and from a practical point of view by the difficulty of observing light ALPs decaying into jets at hadron colliders. On the other hand, if the coupling to gluons is present in the effective ALP Lagrangian (1), constraints arise from searches for mono-jets at ATLAS and CMS [32], as well as from the rare kaon decay K + → π + a mediated by ALP-pion mixing [37]. Di-jet searches at the LHC can provide bounds on heavy ALPs with masses m a > 1 TeV, whereas recent analyses of di-jet resonances accompanied by hard initial state radiation pp → ja → 3j allow the experiments to probe ALP masses below the TeV scale [36]. 3 We show these limits in the m a − |C eff GG |/Λ plane in the left panel of Figure 5. Another promising signature are leptonically decaying ALPs: a → + − with = e, µ, τ . In the right panel of Figure 5 we show a compilation of current limits in the m a − |c eff |/Λ plane taken from [25]. We assume universal couplings to leptons, such that lepton flavor changing couplings mediated by ALP exchange are absent at tree level. Lepton colliders are sensitive to the resonant production of ALPs with subsequent decays into leptons. In general, however, the loop-induced couplings to Zγ and γγ are more important than the tree-level coupling to electrons because the latter is suppressed by m e /Λ. Even for ALPs coupling only to leptons at tree level the associated production cross sections via the processes shown in Figure 1 dominate over the e + e − annihilation cross section. Projections for additional signatures, such as pp → aW ± (γ), pp → ajj(γ), pp → ha and pp → tta with stable ALPs or invisible ALP decays have been considered in [35].

ALP searches at future lepton colliders
Future lepton colliders have the potential to precisely measure the properties of the Higgs boson and search for new physics effects in electroweak observables. In addition they offer qualitatively new ways to search for ALPs. In contrast to hadron colliders, e + e − machines offer a much cleaner detector environment allowing one to identify ALPs produced in association with a Z boson, a photon or a Higgs boson. Therefore, in addition to ALPs produced in exotic decays of on-shell Z and Higgs bosons, we also discuss the associated production of ALPs. On the contrary, barring a fine-tuning of the collider energy, the resonant production of ALPs cannot be observed in e + e − collisions. 4 Of particular interest are processes governed by a single non-vanishing Wilson coefficient at tree-level that allow us to compare the projected sensitivity reach of the future lepton colliders with the results of previous experiments, see Figures 4 and 5. Studying these processes at a lepton collider allows one in particular to probe benchmark models in which the ALP couples only to electroweak gauge bosons or only to charged leptons. Other processes involve different couplings for the production and the decay of the ALP. Among these, the rich Higgs program of all proposed future lepton colliders motivates the search for ALPs produced in association with Higgs bosons or in exotic Higgs decays. For these channels, in order to compare the reach of the various proposed experiments, we focus on the di-photon and di-lepton ALP decay channels. Following [25], we present the corresponding sensitivity regions in a twodimensional plane spanned by these two couplings. We emphasize that analogous studies could be performed for different ALP decay channels, such as a → bb or a → jj.

ALP production in association with a photon, Z or Higgs boson
For e + e − → γa → 3γ and e + e − → Za → Zγγ, the process only depends on the photon coupling |C eff γγ |/Λ once a specific relation between C W W and C BB is assumed, see (3). The projected reach can therefore be compared to the limits in Figure 4. If the FCC-ee will operate at different values of the center-of-mass energy, it is in principle possible to measure the two coefficients C eff γZ and C eff γγ independently, as pointed out in [25]. Also, for the proposed Z-pole run of the FCC-ee, the process e + e − → γa → 3γ would correspond to on-shell decay of the Z boson to an ALP, Z → γa, which will be discussed below.

FCC-ee
e + e ! a e + e ! Za We show the projections for the various versions of the CLIC collider and the FCC-ee in Figure 6, assuming C W W = 0 which implies C γZ = −s 2 w C γγ . The parameter space corresponds to at least 4 expected signal events with the ALP decaying before it has reached the electromagnetic calorimeter (ECAL) which is assumed to be within a radius of ∼ 1.5 m of the beam axis. We consider only visible decays of the Z boson with Br(Z → visible) = 0.80. We also impose the constraint |C eff γZ | < 1.48 Λ/TeV from the LEP measurement of the total width of the Z boson.
The contours for the FCC-ee in Figure 6 combine the luminosities for the run at the Z-pole (in case of e + e − → γa), at √ s = 2m W and at √ s = 250 GeV, whereas for CLIC we show separate limits for three different versions of this collider. Note that the large luminosity of the FCC-ee run at the Z pole leads to a significantly larger sensitivity in the e + e − → γa channel compared to the e + e − → Za projection. Further, CLIC 1500 and CLIC 3000 allow to probe considerably higher ALP masses compared to both CLIC 380 and the FCC-ee. In this and the following figures, the relevant ALP branching ratio into the observed final state is set to a 100%. As we have shown in [25], the left boundary of the sensitivity region is largely independent of this assumption. For branching ratios smaller than Br(a → γγ) = 1, the reach in C eff γγ however is reduced by a factor Br(a → γγ) 1/2 . This follows from the cross sections (16) and (17), which imply the scaling σ(e + e − → γa → 3γ) ∼ |C eff γγ | 2 Br(a → γγ) and σ(e + e − → Za → Zγγ) ∼ |C eff γγ | 2 Br(a → γγ), respectively. 5 ALPs can also be produced in association with a Higgs boson. The rate for the process e + e − → ha depends on the Wilson coefficient C eff Zh in (5). The constraint Γ(h → BSM) < 2.1 MeV on the partial Higgs decay width into non-SM final states implies the upper bound |C eff Zh | < 0.72 Λ/TeV [58]. Assuming that the Higgs boson is reconstructed in the bb final states FCC-ee e + e ! ha e + e ! ha e + e ! ha e + e ! ha with Br(h → bb) = 0.58, we derive the sensitivity to C eff γγ and m a displayed in the upper left panel of Figure 7. In the upper right panel of Figure 7 we show how these projected sensitivity regions vary for different values of C eff Zh . The expected sensitivity remains the same down to a critical value of the branching ratio Br(a → γγ) < 1. Below this critical value less than 4 events are produced and the discovery reach is lost. For the FCC-ee, these critical values are Br(a → γγ) = 2 × 10 −4 for C eff Zh = 0.72 Λ/TeV, Br(a → γγ) = 10 −2 for C eff Zh = 0.1 Λ/TeV and Br(a → γγ) = 0.4 for C eff Zh = 0.015 Λ/TeV. For the case of leptonic ALP decays these values do not change, and they are only slightly different in the case of CLIC. In that case,  searches for other final states can become more promising. This includes searches for invisibly decaying (or stable) ALPs [77]. Lepton colliders are particularly powerful in constraining ALPlepton couplings. In order to avoid large lepton-flavor changing ALP couplings, we choose a benchmark with ALP couplings to leptons, The lower panels of Figure 7 show the regions of sensitivity for ALP searches in the process e + e − → ha → bb + − . The jumps in the sensitivity region appear at the thresholds for the production of muon and tau pairs. The ALP decays predominantly into the heaviest lepton that is kinematically accessible. The graphical representation in Figure 7 is suboptimal, because it highlights the dependence on one ALP coupling (|C eff γγ | or |c eff |), while the dependence on the other coupling (C eff Zh ) is only reflected by the different contours. In Figure 8 we show an alternative representation of the results in the plane of the two relevant ALP couplings, but for fixed values of the ALP mass. The sensitivity reach of the FCC-ee and the three versions of the CLIC collider for an ALP branching ratio of Br(a → γγ) = 1 (upper panels) and Br(a → + − ) = 1 (lower panels) is bounded by the colored contours. With decreasing ALP mass, the lifetime of the ALP increases and the sensitivity reach in C eff γγ and c eff is reduced. The fact that the sensitivity region for CLIC is maximal for the lowest center-of-mass energy is a consequence of the 1/s behavior of the e + e − → ha cross section in (18).
For the example of the FCC-ee, we also indicate the dependence of the sensitivity regions on the a → γγ or a → + − branching ratios, which in Figure 7 were assumed to be maximal. The parameter space to the right of the dotted contours corresponds to the sensitivity reach of the FCC-ee with the indicated ALP branching ratios. Smaller branching ratios reduce the sensitivity to C eff Zh , because the total number of signal events decreases. However, the values of C eff γγ and c eff for which sensitivity is lost are almost independent of the ALP branching ratio, as long as this branching ratio exceeds a critical value. Consider, for example the process e + e − → ha → bbγγ for m a = 10 GeV (upper left panel of Figure 8). If C eff Zh /Λ = 0.1 TeV −1 , the sensitivity reach in C eff γγ /Λ extends down to ≈ 10 −5 TeV −1 irrespective of Br(a → γγ), as long as this branching ratio exceeds 1%. The reason for this behavior is that the total width of the ALP increases for smaller ALP branching ratios and therefore the lifetime decreases. Smaller ALP lifetimes lead to more ALP decays in the detector volume, canceling the effect of the reduced branching ratio near the lower boundary of the sensitivity region [25]. In order to not clutter the plots we do not show the corresponding contours for CLIC.
From now on, whenever ALP production and decay are governed by unrelated Wilson coefficients, we will use the graphical representation in Figure 8.
A particularly interesting benchmark scenario is the model in which at tree-level the ALP only couples to charged leptons. In this case the production and decay are governed by the same parameter c . The ALP decays are dominated by Br(a → e + e − ) ≈ 1 for m a < 2m µ , Br(a → µ + µ − ) ≈ 1 for 2m µ < m a < 2m τ , and Br(a → τ + τ − ) ≈ 1 for m a > 2m τ . Interestingly, the most relevant production mode at e + e − colliders is still the associated production with photons and Z bosons, which proceeds through the loop-induced Wilson coefficients [25] with τ = 4m 2 /m 2 a , and τ /Z = 4m 2 /m 2 Z . In the last step in the second equation we have neglected terms of order m 2 /m 2 Z . Because of the anomaly equation, B 1 (τ ) ≈ 1 for m a > m and B 1 (τ ) ≈ − m 2 a 12m 2 for m m a and the relative size of the resonant production cross section and the associated ALP+γ production cross section is given by where N denotes the number of charged leptons lighter than the ALP, and Γ a ≈ keV is a typical width for a → τ + τ − , assuming |c |/Λ ≈ 1/TeV. For N < 3, the total width is reduced by m 2 µ /m 2 τ , and the associated ALP+γ production is even more dominant. The ratio of the partial decay widths on the other hand is given by with m the mass of the heaviest lepton in which the ALP can decay. For ALP masses below 720 GeV (2300 GeV) this ratio is larger than 1 (0.1), justifying the assumption of Br(a → + − ) = 1 for almost all of the relevant parameter space. We show projections for future e + e − colliders for flavor universal ALP-lepton couplings in Figure 9. An increase in sensitivity occurs at the di-muon and di-tau thresholds. Note that while the advantage of a high-luminosity run on the Z-pole of the FCC-ee accounts for an increase in sensitivity on C eff γγ of up to ∼ 2.5 orders of magnitude in Figure 9, for purely leptonic ALP couplings the Z-pole run only increases the sensitivity by about one order of magnitude in e + e − → γa, because the loop-induced Wilson coefficient C eff γZ is suppressed by the accidentally small vector coupling of the Z boson to charged leptons. CLIC can again constrain higher ALP masses.

ALP production in exotic decays of on-shell Higgs bosons
Beyond searches for ALPs produced in association with a photon, a Z boson or a Higgs boson, ALPs can also be searched for in exotic Higgs decays. The Higgs production cross section at lepton colliders is typically at least one order of magnitude smaller compared to the LHC. This implies that lepton colliders are most powerful for light ALPs with dominant decay channels for which backgrounds at hadron colliders are large. In Figure 10, we show the reach of the different stages of CLIC and the FCC-ee for ALPs produced in e + e − → h + X → aZ + X → γγZ vis + X and e + e − → h + X → aa + X → 4γ + X for three different ALP masses m a = 100 MeV, 1 GeV and 10 GeV. We do not distinguish between vector-boson fusion or associated Higgs production and demand four signal events. In order to reconstruct the Higgs, we further demand the Z boson to originate from the Higgs decay as well as all Zs to decay into visible final states with Br(Z → visible) = 0.8 and Br(a → γγ) = 1.   This condition can be relaxed if the electrons in ZZ-fusion or the additional Z in associated Higgs production are detected. Since the reach in searches for exotic Higgs decays is directly proportional to the number of Higgses produced, high-luminosity machines lead to the best sensitivity. In Figure 10 we further show the reach of the FCC-ee for different values of Br(a → γγ) = 10 −5 − 10 −1 given by the respective dotted lines. For leptonic ALP decays, the analagous plots are shown in Figure 11, where, in contrast to Figure 9, no connection between C eff ah , C eff Zh and c eff has been assumed. CLIC has a larger reach than the FCC-ee for leptonic ALP decays due to the larger detector volume, L det = 0.6 m at CLIC, compared to L det = 0.02 m at the FCC-ee. Since C eff ah and C eff Zh are not controlled by the anomaly equation, the one-loop contribution from a tree-level c eff coupling is proportional to m 2 /v 2 [25]. The gray regions in Figures 10 and 11 correspond to |C eff Zh | > 0.72Λ/TeV and |C eff ah | > 1.34 Λ 2 /TeV 2 excluded by the current upper limit on Br(h → BSM) < 0.34 (at 95% CL) [58]. Figure 12: Allowed regions in the parameters space of the Wilson coefficients C eff γγ − C eff γZ obtained from projections for the two-parameter global electroweak fit at 68% CL, 95% CL and 99% CL at FCC-ee (violet) and at 95% CL for the LHC at √ s = 14 TeV (green). For the parameter space within the dashed black contour, the FCC-ee measurement of α(m Z ) is within its projected errors at 95% CL [78]. The red dots represent the best fit points based on the current electroweak fit.

Electroweak precision constraints on ALP couplings
Besides direct measurements, lepton colliders will be able to measure electroweak observables with unprecedented precision, which allows us to set bounds on the ALP contributions to these observables [25]. The measurement of the oblique parameters will improve current constraints by roughly one order of magnitude [79], while the running of the electromagnetic coupling constant, α(m Z ), can be determined with an uncertainty of about 10 −5 [78]. In Figure 12, we show the projected electroweak fit for the FCC-ee, where we assume the central values to correspond to the SM prediction, in the C eff γγ − C eff γZ plane at 68% , 95% and 99% CL (violet), together with the expected sensitivity of the LHC at √ s = 14 TeV (green). Superimposed is the expected 95% CL bound derived from the measurement of α(m Z ) (black dashed contour), assuming that the theoretical error on this quantity will have decreased below the experimental uncertainty by the time the measurement can be performed. In deriving these projections we have set the ALP mass to zero. By combining the future measurements of α(m Z ) and of electroweak precision pseudo-observables one will be able to constrain |C eff γγ |/Λ 2.5 TeV −1 and |C eff γZ |/Λ 1.5 TeV −1 (at 95% CL). The current global fit has a slight tension with the SM prediction and the best fit point is at (S, T ) = (0.096, 0.111). If this effect is solely due to the ALP couplings C eff γγ and C eff γZ , the corresponding best fit points are indicated by the red dots in Figure 12. Such sizable coefficients are however strongly constrained by LHC searches for pp → γa and pp → γZ.

ALP searches at future hadron colliders
Future hadron colliders can significantly surpass the reach of the LHC in searches for ALPs. In particular, searches for ALPs produced in exotic Higgs and Z decays profit from the higher center-of-mass energies and luminosities of the proposed high-energy LHC (HE-LHC), planned to replace the LHC in the LEP tunnel with √ s = 27 TeV, and the ambitious plans for a new generation of hadron colliders with √ s = 100 TeV at CERN (FCC-hh) and in China (SPPC). At hadron colliders, ALP production in association with electroweak bosons suffers from large backgrounds. Previous studies of these processes have therefore focussed on invisibly decaying (or stable) ALPs, taking advantage of the missing-energy signature [32,35]. In contrast, here we focus our attention on resonant ALP production in gluon-fusion and photon-fusion, as well as on ALPs produced in the decays of Z and Higgs bosons.

Resonant ALP production
At hadron colliders ALPs can be produced resonantly in gluon-gluon fusion. A gluon coupling implies the presence of di-jet final states, which are hard to distinguish from the background for masses m a < 1 TeV. A more promising strategy is the search for di-photon events. Assuming non-vanishing couplings to photons and gluons, we show in Figure 13 the sensitivity reach for the LHC, LHC 27 and FCC-hh in the C eff GG − C eff γγ plane. This reach is obtained by a rescaling of the constraint derived in the ATLAS analysis with 39.6 fb −1 of data [74]. The ALP production cross section is computed with MadGraph5 [63] and corrected for N 3 LO corrections using the K factors K gg = 2.7 at m a = 200 GeV, K gg = 2.45 at m a = 500 GeV and K gg = 2.35 at m a = 1 TeV [56].

ALP production in exotic decays of Z or Higgs bosons
In analogy with the LHC specifications, we demand ALPs produced at pp colliders and decaying into photons to decay inside the detector and before the electromagnetic colorimeter, L det = 1.5 m, and for ALPs decaying into leptonic final states to decay before they reach the inner tracker, L det = 2 cm. Our sensitivity reach is defined by requiring at least 100 signal events. We use the reference cross sections σ(gg → h) = 146.6 pb [80] and σ(pp → Z) = 118.76 nb at √ s = 27 TeV, computed at NNLO [61,62]. At √ s = 100 TeV, the relevant cross sections are σ(gg → h) = 802 pb and σ(pp → Z) = 0.4 µb [81].
In Figure 14 we show the reach of the LHC, the HE-LHC (LHC 27 ) and the FCC-hh in searches for pp → Z → γa → 3γ, assuming as before that C W W = 0 and Br(a → γγ) = 1. The reach of the HE-LHC extends beyond the reach of the LHC at √ s = 14 TeV by a factor of about 3.2 assuming an integrated luminosity of 15 ab −1 . Colliders with √ s = 100 TeV and 20 fb −1 can improve this reach by a factor of about 6.7 compared with the LHC. However, a high-luminosity run of an e + e − collider on the Z-pole, as for example proposed for the FCC-ee, can probe the same couplings with even higher precision, as becomes clear by comparing the left upper panel of Figure 7 with Figure 14.
The situation is different for the case of exotic Higgs decays, because the Higgs production cross sections at hadron colliders with √ s = 14 − 100 TeV are larger by orders of magnitude compared to the proposed future lepton colliders. In Figure 15, we display the reach for observing 100 events at the LHC, HE-LHC and FCC-hh for searches for pp → h → Za →   Figure 15: Projected reach in searches for h → Za → + − + 2γ and h → aa → 4γ decays with the LHC (green), HE-LHC (light green) and a 100 TeV collider (blue). The parameter region with the solid contours correspond to a branching ratio of Br(a → γγ) = 1, and the contours showing the reach for smaller branching ratios are dotted. + − γγ (upper panels) and pp → h → aa → 4γ (lower panels) for m a = 100 MeV, 1 GeV and 10 GeV and Br(a → γγ) = 1. We further indicate the reach obtained in the case that Br(a → γγ) < 1 by the dotted lines. Even though we rely on leptonic Z decays with Br(Z → + − ) = 0.0673 to account for the more challenging environment at hadron colliders, a future 100 TeV collider significantly improves beyond the projected reach in C eff Zh and C eff ah of the FCCee shown in Figure 10. The sensitivity to C eff γγ , however, is comparable between the FCC-ee and FCC-hh, and the projections for searches for e + e − → ha → bbγγ at the second and third stage of CLIC even surpass the FCC-hh sensitivity in C eff γγ . For all considered ALP masses, the h → Za decay could be observed at a 100 TeV collider for Br(a → γγ) 10 −6 and the h → aa decay could be fully reconstructed for Br(a → γγ) 0.01. The results are similar for leptonic ALP decays. In Figure 16 we show the reach in the c eff −C eff  plane (upper row) and c eff − C eff ah plane (lower row). The results are again comparable with the projections for searches at future lepton colliders shown in Figure 11.

Searches for ALPs with macroscopic lifetime
For small couplings and light ALPs produced in Higgs or Z decays, the ALP decay vertex can be considerably displaced from the production vertex. For ALPs still decaying in the detector volume, this secondary vertex can be used to further suppress backgrounds. Very long-lived ALPs, which leave the detector before they decay, only leave a trace of missing energy. A detector further away from the interaction point can detect the decay products of these ALPs and reconstruct the ALP mass and direction. Recent proposals include the MATHUSLA largevolume surface detector [48,82]  detector [47] build in a shielded part of the LHCb cavern, and a set of detectors called FASER [46] build along the beam line, ∼ 150 m and ∼ 400 m from the interaction point of ATLAS or CMS. Since long lived ALPs are mostly produced in Higgs and Z decays at the LHC, we will consider the reach of the surface detector MATHUSLA for ALPs produced in the decays Z → γa, h → Za and h → aa. For MATHUSLA, it is impossible to detect both final state particles in h → Za and Z → γa decays and highly unlikely to see both ALPs from h → aa decays in the decay volume. However, because of the much lower background, single ALPs can be detected irrespective of their origin. The fraction of ALPs decaying in the MATHUSLA detector is then given by where Ω M describes the area in solid angle covered by the MATHUSLA detector, dσ/dΩ denotes the differential cross section for ALPs produced in the decay of a Z or Higgs boson in the laboratory frame, and L a = p a /(Γ a m a ), where p a is the ALP momentum in that frame. At fixed solid angle, the radii r in and r out denote the distances between the interaction point and the intersections of the ALP line of flight with the MATHUSLA detector. The MATHUSLA detector with a volume of 20 m × 200 m × 200 m will be placed 100 m above the beam line and 100 m shifted from the interaction point along the beam line and has a considerably smaller coverage in solid angle: approximately 5% at MATHUSLA compared to 100% at ATLAS and CMS. Nevertheless, as Figure 17 shows, for long-lived ALPs, the number of ALPs decaying in the MATHUSLA volume is comparable to the number of ALPs decaying within a radius of 1.5 m from the interaction point. However, for ALPs with masses m a > 1 GeV backgrounds at MATHUSLA are negligible, whereas for example for h → Za decays the Z boson needs to be reconstructed and more events are required to distinguish the signal from the background. As in Section 3.3, we therefore demand at least 100 events with leptonically decaying Z boson to determine the LHC reach, and at least 4 reconstructed ALP decays to determine the reach of MATHUSLA. In the left panel of Figure 17 we illustrate the geometry of the proposed MATHUSLA experiment. The right panel shows the percentage of ALPs produced via pp → h → Za that decay before reaching the electromagnetic calorimeter (green), the percentage of ALPs decaying within the detector together with a leptonically decaying Zboson (dashed green), and the percentage of ALPs decaying within the MATHUSLA detector volume (red) as a function of the ALP decay length. Taking into account the additional relative factor of ∼ 1/20 between the number of events we expect to determine the reach of LHC and MATHUSLA, the MATHUSLA detector performs significantly better than the LHC for ALPs with a decay length exceeding 100 m.
Using (35), we can define the corresponding effective branching ratios for ALP decays in MATHUSLA in analogy with (29), The expressions for ALP decays into leptons are analogous with the ALP decay into photons with Br(a → γγ) replaced by Br(a → + − ). In order to fully capture the geometric acceptance of the MATHUSLA detector, we use MadGraph5 to simulate the signal events at parton level and the code provided by the MATHUSLA working group to compute the acceptance [82]. We illustrate the reach of the LHC and the MATHUSLA detector for discovering ALPs decaying into photons from h → Za (upper panels) and h → aa (lower panels) decays in Figure 18. For the green region with solid contours, the LHC would see 100 events with a branching ratio of Br(a → γγ) = 1. For smaller branching ratios, larger couplings |C eff hZ | and |C eff ah | are required to obtain the same number of events. Dotted lines show the lower limit for Br(a → γγ) = 0.1 and Br(a → γγ) = 0.01. The red region with solid contours shows the parameter space for which 4 ALP decays are expected within the MATHUSLA detector volume for Br(a → γγ) = 1. Smaller branching ratios with constant partial width for ALP decays into photons imply a larger total decay width of the ALP and therefore smaller decay lengths. For Br(a → γγ) = 0.1 and Br(a → γγ) = 0.01, MATHUSLA therefore looses sensitivity for larger values of |C eff γγ |/Λ. In the case of h → aa decays, MATHUSLA will be able to probe smaller branching ratios than ATLAS and CMS. This underlines the complementarity between searches for prompt decays with ATLAS/CMS and searches for displaced ALP decays with MATHUSLA. We stress that a discovery of a resonance with MATHUSLA alone cannot be used to determine the production mode of the ALP. However, one can use the reconstructed mass of the ALP and the number of observed events to guide future searches at the LHC, for example searches for invisible ALPs in the final state.
10% m a = 1 GeV m a = 10 GeV Figure 18: Projected reach in searches for h → Za → + − + 2γ (top) and h → aa → 4γ (bottom) decays at the LHC (green) and MATHUSLA (red) with √ s = 14 TeV center-of-mass energy and 3 ab −1 integrated luminosity. The parameter region with solid contours correspond to a branching ratio of Br(a → γγ) = 1, and contours showing the reach for smaller branching ratios are dotted.
In Figure 19, we show the reach of h → Za and h → aa for ALPs decaying into muons. Since at least approximate lepton-flavor universality is expected for the couplings of the ALP, the muon decay mode is particularly well motivated for 2m µ < m a < 2m τ . Also here, MATH-USLA can probe much smaller couplings |c eff µµ | than the LHC. In the case of Z → γa decays, we show the reach of MATHUSLA in the m a −|C eff γγ |/Λ plane in Figure 20, again assuming C W W = 0. In principle, for non-vanishing C γZ , searches for exotic Z decays with MATHUSLA compete with the reach of future beam-dump experiments such as ShiP [20]. However for light ALPs, the reach shown in Figure 20 is probably overestimated. Whether the MATHUSLA detector will be able to resolve photon pairs for m a < 1 GeV Figure 19: Projected reach in searches for h → Za → + − + µ + µ − (left) and h → aa → µ + µ − + µ + µ − (right) decays with ATLAS/CMS (green) and MATHUSLA (red) with √ s = 14 TeV center-of-mass energy and 3 ab −1 integrated luminosity. The parameter region with solid contours correspond to a branching ratio of Br(a → µ + µ − ) = 1, and contours showing the reach for smaller branching ratios are dotted. Z ! a Figure 20: Projected reach in searches for Z → γa → 3γ with MATHUSLA for √ s = 14 TeV center-of-mass energy, 3 ab −1 integrated luminosity and Br(a → γγ) = 1, together with the expected sensitivity of FASER taken from [83] and SHiP [20]. will depend on the angular resolution of the final detector proposal. Interestingly, FASER can take advantage of the large Primakoff cross section for photons producing ALPs through interaction with the detector material (γN → aN ) in the forward region to set limits on C eff γγ independently [83]. The corresponding projected sensitivity reach of FASER is slightly better than that of MATHUSLA. A unique strength of surface detectors is the possibility to constrain hadronic ALP decays, whereas light ALPs (m a < 500 GeV) decaying into jets are hard to detect at the LHC because of the large QCD background. For ALPs produced in gluon fusion or through ALP-quark couplings, a sizable production cross section corresponds to couplings too large to produce any signal in the MATHUSLA detector. ALPs produced in resonant Higgs or Z decays can be detected in MATHUSLA by reconstructing di-jet (or multi-jet) events. Particularly well motivated are ALPs with only couplings to gluons, because in models addressing the strong CP problem the ALP-gluon coupling is the only ALP coupling that cannot be avoided. We show the parameter space for which at least four a → jj events are expected within the MATHUSLA volume in the m a − C eff GG plane in Figure 21 for different values of C eff Zh (left) and C eff ah (right). The expected minimal mass resolution of the MATHUSLA detector for ALPs in Higgs decays is of the order of m a ≈ 100 MeV, assuming a spatial resolution of 1 cm. In Figure 21 the lowest ALP mass is m a = 600 MeV. 6

Conclusions
Any ultraviolet completion of the SM in which an approximate global symmetry is broken gives rise to pseudo-Nambu-Goldstone bosons, which are light with respect to the symmetry breaking scale m a Λ. The discovery of such ALPs at the LHC or future colliders could therefore be the first sign of a whole sector of new physics, and measuring its properties could reveal important hints about the UV theory.
We consider the most general effective Lagrangian including the leading operators in the 1/Λ expansion that couple the ALP to SM particles. Whereas couplings to SM fermions and gauge bosons can arise at mass dimension-5, the Higgs portal only arises at dimension-6. We derive projections for the most promising ALP search channels for the LHC, its potential future high-energy upgrade, as well as a variety of possible future high-energy hadron and lepton colliders.
At lepton colliders, ALP production in association with a photon, a Z boson or a Higgs boson provide the dominant production processes, provided the ALP couplings to either hypercharge, SU (2) L gauge bosons or to the Higgs boson are present in the Lagrangian. Even if only ALP-fermion couplings are present at tree-level, ALP couplings to gauge bosons are generated at one-loop order through the anomaly equation. We point out that a high-luminosity run at the Z pole would significantly increase the sensitivity to ALPs produced in e + e − → γa with subsequent decays a → γγ or a → + − . This favors the FCC-ee proposal over CLIC in these particular searches, whereas CLIC, operating at √ s = 1.5 TeV or √ s = 3 TeV, can discover significantly heavier ALPs. At hadron colliders the largest production cross section for ALPs with masses below the corresponding thresholds are the exotic Z → aγ, h → aZ and h → aa decays. Searches for exotic Z decays at a future 100 TeV collider are less sensitive to ALP-photon couplings than a high-luminosity run of the FCC-ee at the Z pole. For the exotic Higgs decays h → Za and h → aa already the LHC at √ s = 14 TeV and 3 ab −1 provides a better reach compared to future e + e − colliders in the corresponding Wilson coefficients C eff ah and C eff Zh . The sensitivity of a future 100 TeV collider in both C eff Zh and C eff ah is about an order of magnitude larger than at the LHC, and about a factor of 3 in the coefficients C eff γγ (for a → γγ) and c eff (for a → + − ). A future dedicated detector searching for long-lived particles at the LHC, such as MATH-USLA, FASER or Codex-B could provide sensitivity for even smaller ALP couplings to photons, charged leptons or jets. MATHUSLA has unique capabilities to search for long-lived ALPs with a mean decay length of 100 m and more, corresponding to couplings 2-3 orders of magnitude smaller than the ones that can be probed with ATLAS and CMS. Such ALPs cannot be produced resonantly with a significant cross section, but large numbers of ALPs with small widths can be produced in exotic decays of Higgs or Z bosons. The main backgrounds at MATHUSLA are cosmic rays, allowing for a cleaner environment for observing ALPs in the O(1) − O(10) GeV range. This is particularly powerful for hadronically decaying ALPs, where MATHUSLA can overcome the large QCD background at the LHC and thus provide the opportunity to constrain light ALPs decaying into jets, which are otherwise difficult due to the large QCD background at hadron colliders.