1 Introduction

Axion-like particles (ALPs) are light, gauge-singlet pseudoscalar particles with derivative couplings to the Standard Model (SM). The name is inspired by the QCD axion, which is the pseudo-Nambu–Goldstone boson associated with the breaking of the Peccei–Quinn symmetry [1,2,3,4], proposed to address the strong CP problem. More generally, ALPs appear in any theory with a spontaneously broken global symmetry, and the possible ALP masses and couplings to SM particles range over many orders of magnitude. In certain regions of parameter space ALPs can be non-thermal candidates for Dark Matter [5] or, in other regions where they decay, mediators to a dark sector. For large symmetry breaking scales, the ALP can be a harbinger of a new physics sector at a scale \(\Lambda \) which would otherwise be experimentally inaccessible. Since the leading ALP couplings to SM particles scale as \(\Lambda ^{-1}\), ALPs become weakly coupled for large new-physics scales. Accessing the smallest possible couplings is thus crucial to reveal non-trivial information about a whole new physics sector.

Depending on the region in parameter space spanned by the ALP mass and couplings, the search strategies vary greatly. For masses below twice the electron mass, the ALP can only decay into photons and the corresponding decay rate scales like the third power of the ALP mass. Thus, light ALPs are usually long-lived and travel long distances before decaying. Experiments probing long-lived ALPs include helioscopes such as CAST [6], SUMICO [7, 8], as well as observations from the evolution of red giant stars [9,10,11] and the supernova SN1987A [12, 13]. In addition, a set of cosmological constraints from the modification to big-bang nucleosynthesis, distortions of the cosmic microwave background and extragalactic background light measurements exclude a large region of this parameter space and are sensitive to very small ALP-photon couplings [14, 15]. For intermediate ALP masses up to the GeV scale, collider experiments such as BaBar, CLEO, LEP and the LHC searching for missing-energy signals probe long-lived ALPs with non-negligible couplings to SM particles [16, 17]. Current and future beam-dump searches are sensitive to ALPs with masses below \(\sim 1\,\)GeV radiated off photons and decaying outside the target [18,19,20,21]. ALP couplings to other SM particles are generally less constrained than the ALP-photon coupling. ALP couplings to charged leptons are constrained by searches for ALPs produced in the Sun [22], the evolution of red giants [11], by beam-dump experiments [23], and through associated ALP production at BaBar [24, 25]. Proposals for future experiments suggest measuring the ALP-electron coupling in Compton scattering of an electron in the background of low- and high-intensity electromagnetic fields [26, 27].

High-energy colliders are sensitive to a large and previously inaccessible region in parameter space [25, 28]. Requiring the ALP to decay within the detector opens up a new region of parameter space. The different ALP production mechanisms at colliders offer a rich phenomenology, allowing us to probe a large range of ALP masses and couplings. Beyond resonant production, ALPs can be produced in decays of heavy SM particles [25, 28,29,30,31,32,33] or in association with gauge bosons, Higgs bosons or jets [34,35,36,37]. Resonant ALP production is particularly powerful for small new-physics scales \(\Lambda \), because the production rate is proportional to \(1/\Lambda ^2\). ALP production in Higgs and Z decays, on the other hand, is sensitive to large new-physics scales \(\Lambda \), because the corresponding exotic Higgs or Z branching fractions are enhanced by the small widths of these bosons. Interesting channels at the LHC are the on-shell decays \(h\rightarrow a a\), \(h\rightarrow Z a\) and \(Z\rightarrow \gamma a\). Dedicated analyses by the LHC experiments will provide new and complementary ALP searches. ALPs can also be produced in the decay of B mesons [38,39,40,41,42,43,44]. These decays are sensitive to flavor-changing ALP couplings, which we will not consider in this work. In an upcoming publication we will discuss constraints from flavor-changing ALP couplings including ALPs produced in the decay of B mesons [45].

Depending on the ALP mass and coupling structure, ALPs produced at colliders can decay into photons, charged leptons, light hadrons or jets. These decays can be prompt or displaced if the width of the ALP is sufficiently small. We present bounds from current and future high-energy collider searches for ALPs decaying into photons, charged leptons and jets, including the case where the ALP couples dominantly to gluons. Existing constraints on the ALP-gluon coupling come from mono-jet [34] and di-jet [46] searches at the LHC and the rare kaon decay \(K^+\rightarrow \pi ^+ a\) mediated by ALP-pion mixing [47].

Future hadron colliders can operate at unprecedented center-of-mass energies, whereas future lepton colliders benefit from their clean collision environment and the large production rates of on-shell Z bosons and tagged Higgs bosons. Two current proposals for circular electron-positron colliders are the Circular Electron-Positron Collider (CEPC) based in China [48] and the \(e^+ e^-\) Future Circular Collider (FCC-ee) based at CERN [49]. CEPC is envisioned to have a \(50\,\)km tunnel and operate both at the Z pole and as a Higgs factory (at \(\sqrt{s} = 250\,\)GeV). At the Z pole the target is to produce \(10^{10}\) Z bosons per year. Over a period of 10 years an integrated luminosity of \(5\,\)ab\(^{-1}\) should be accumulated at two interaction points, which corresponds to one million Higgs events [48]. The FCC-ee is a proposed ring collider with 80–100 km circumference operating at center-of-mass energies between 90 and \(400\,\)GeV. At the FCC-ee, more than \(10^{12}\) Z bosons would be produced at four interaction points within one year [50]. Roughly three million Higgs bosons would be produced in five years. Linear lepton colliders such as the ILC or CLIC deliver lower luminosities than their circular counterparts. The ILC is proposed to operate at 250, 350 or \(500\,\)GeV, accumulating an integrated luminosity of 2, 0.2 and \(4\,\)ab\(^{-1}\), respectively [51, 52]. CLIC is designed to collect 0.5, 1.5 and \(3\,\)ab\(^{-1}\) at \(380\,\)GeV, \(1500\,\)GeV and \(3\,\)TeV center-of-mass energy, respectively [53].

Current proposals for high-energy proton colliders include the High-Energy LHC (HE-LHC) operating at \(27\,\)TeV in the existing LHC tunnel and accumulating \(15\,\)ab\(^{-1}\) [54], the FCC-hh based at the proposed CERN FCC-ee tunnel operating at a center-of-mass energy of \(100\,\)TeV with a target luminosity in the range of 10–20 ab\(^{-1}\) per experiment [55], and the Super-Proton-Proton-Collider (SPPC) based in the CEPC tunnel in China operating at 70–100 TeV and accumulating \(3\,\)ab\(^{-1}\) [48].

Comparing the regions of ALP parameter space that can be probed with these future hadron and lepton colliders is particularly interesting and helps to corroborate the physics case for these various machines. In this work we also consider proposed new experiments searching for long-lived particles, such as FASER [56], Codex-B [57] and MATHUSLA [58], which can access the ALP parameter space between the regions covered by LHC experiments and bounds from cosmology.

This paper is structured as follows: In Sect. 2 we review the effective Lagrangian for an ALP interacting with SM fields and introduce the formalism for our ALP detection strategy. In Sect. 3 we discuss the reach of ALP searches at future colliders. We focus on existing LEP and LHC limits in Sect. 3.1, ALP searches at lepton colliders in Sect. 3.2, and move on to ALP searches at hadron colliders in Sect. 3.3. In Sect. 3.4 we discuss the reach of the future surface detector MATHUSLA at the LHC. Section 4 contains our conclusions.

2 ALP production and decays

2.1 Effective Lagrangian

An ALP is a light scalar which is a singlet under the SM gauge group and odd under CP. The ALP Lagrangian respects a shift symmetry, which is only softly broken by a mass term. Its leading interactions with the SM particles are described by dimension-5 operators [59]

$$\begin{aligned} {{\mathcal {L}}}_\mathrm{eff}= & {} \frac{1}{2} \left( \partial _\mu a\right) \!\left( \partial ^\mu a\right) - \frac{m_{a}^2}{2}\,a^2 + \sum _f \frac{c_{ff}}{2}\,\frac{\partial ^\mu a}{\Lambda }\,{\bar{f}}\gamma _\mu \gamma _5 f \nonumber \\&\quad +g_s^2\,C_{GG}\,\frac{a}{\Lambda }\,G_{\mu \nu }^A\,{\tilde{G}}^{\mu \nu ,A} +g^2\,C_{WW}\,\frac{a}{\Lambda }\,W_{\mu \nu }^A\,{\tilde{W}}^{\mu \nu ,A}\nonumber \\&\quad +g^{\prime \,2}\,C_{BB}\,\frac{a}{\Lambda }\,B_{\mu \nu }\,{\tilde{B}}^{\mu \nu }, \end{aligned}$$
(1)

where the couplings to fermions \(c_{ff}\) are assumed to be flavor universal, and \(\Lambda \) sets the characteristic scale of global symmetry breaking. The commonly used axion decay constant \(f_a\) is related to our new-physics scale by \(\Lambda /|C_{GG}^\mathrm{eff}|=32\pi ^2 f_a\). ALPs can obtain part of their mass from non-perturbative dynamics but need additional explicit breaking of the shift symmetry to be heavier than the QCD axion (footnote 1). In the absence of an explicit breaking term, the QCD axion is defined by a strict relation between its mass and decay constant, \(m_a \propto f_\pi m_\pi /f_a\), with \(f_\pi \) and \(m_\pi \) the pion decay constant and mass, respectively. For ALPs such a strict relation does not apply, since \(m_a\) and \(f_a\) are independent parameters.

In the broken phase of the electroweak symmetry, the ALP couples to the photon and the Z boson as

$$\begin{aligned} {{\mathcal {L}}}_\mathrm{eff}\ni&e^2\,C_{\gamma \gamma }\,\frac{a}{\Lambda }\,F_{\mu \nu }\,{\tilde{F}}^{\mu \nu } + \frac{2e^2}{s_w c_w}\,C_{\gamma Z}\,\frac{a}{\Lambda }\,F_{\mu \nu }\,{\tilde{Z}}^{\mu \nu } \nonumber \\&+\frac{e^2}{s_w^2 c_w^2}\,C_{ZZ}\,\frac{a}{\Lambda }\,Z_{\mu \nu }\,{\tilde{Z}}^{\mu \nu }. \end{aligned}$$
(2)

The relevant Wilson coefficients are given by

$$\begin{aligned} C_{\gamma \gamma }= & {} C_{WW}+C_{BB},\nonumber \\ C_{\gamma Z}= & {} c^2_w\,C_{WW}-s^2_w\,C_{BB}, \nonumber \\ C_{ZZ}= & {} c^4_w\,C_{WW}+s^4_w\,C_{BB}, \end{aligned}$$
(3)

where \(s_w\) and \(c_w\) are the sine and cosine of the weak mixing angle, respectively. The exotic decay \(Z\rightarrow \gamma a\) is governed by the Wilson coefficient \(C_{\gamma Z}\).
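As a small numerical illustration of (3), the following Python sketch maps the unbroken-phase coefficients \(C_{WW}\) and \(C_{BB}\) onto \(C_{\gamma \gamma }\), \(C_{\gamma Z}\) and \(C_{ZZ}\). The function name and the input value \(s_w^2 = 0.2312\) are illustrative assumptions and are not taken from the analysis above.

```python
def broken_phase_coefficients(c_ww, c_bb, sw2=0.2312):
    """Return (C_gamma_gamma, C_gamma_Z, C_ZZ) following Eq. (3).

    sw2 is the squared sine of the weak mixing angle (assumed input value).
    """
    cw2 = 1.0 - sw2
    c_gamgam = c_ww + c_bb
    c_gamz = cw2 * c_ww - sw2 * c_bb
    c_zz = cw2**2 * c_ww + sw2**2 * c_bb
    return c_gamgam, c_gamz, c_zz


# ALP coupled to hypercharge only (C_WW = 0)
c_gamgam, c_gamz, c_zz = broken_phase_coefficients(c_ww=0.0, c_bb=1.0)
print(c_gamgam, c_gamz, c_zz)
```

For \(C_{WW}=0\) the output satisfies \(C_{\gamma Z} = -s_w^2\,C_{\gamma \gamma }\), so in this scenario a bound on one coefficient translates directly into a bound on the other.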

Note that the anomaly equation for the divergence of the axial-vector current allows us to rewrite the ALP-fermion couplings in (1) in the form

$$\begin{aligned} \frac{c_{ff}}{2}\,\frac{\partial ^\mu a}{\Lambda }\,{\bar{f}}\gamma _\mu \gamma _5 f= & {} - c_{ff}\,\frac{m_f}{\Lambda }\,a\,{\bar{f}}\,i\gamma _5 f\nonumber \\&+ c_{ff}\,\frac{N_c^f Q_f^2}{16\pi ^2}\,\frac{a}{\Lambda }\,e^2 F_{\mu \nu }\,{\tilde{F}}^{\mu \nu } + \cdots , \end{aligned}$$
(4)

where the dots represent similar terms involving gluons and weak gauge fields [25]. This form is useful for relating results obtained for the ALP to analogous, and perhaps more familiar, results derived for a CP-odd Higgs boson. For example, the first term on the right-hand side has the same form as the coupling of a CP-odd Higgs boson to fermions.

Interactions with the Higgs boson, \(\phi \), appear only at dimension-6 and higher,

$$\begin{aligned} {{\mathcal {L}}}_\mathrm{eff}^{D\ge 6}= & {} \frac{C_{ah}}{\Lambda ^2} \left( \partial _\mu a\right) \!\left( \partial ^\mu a\right) \phi ^\dagger \phi \nonumber \\&+ \frac{C_{Zh}}{\Lambda ^3} \left( \partial ^\mu a\right) \left( \phi ^\dagger \,iD_\mu \,\phi + \text{ h.c. } \right) \phi ^\dagger \phi + \cdots , \end{aligned}$$
(5)

where the first operator mediates the decay \(h\rightarrow aa\), while the second one is responsible for \(h\rightarrow Za\). Note that a possible dimension-5 operator coupling the ALP to the Higgs current vanishes by the equations of motion. However, in theories where a heavy new particle acquires most of its mass through electroweak symmetry breaking, the non-polynomial dimension-5 operator

$$\begin{aligned} \frac{C_{Zh}^{(5)}}{\Lambda } \left( \partial ^\mu a\right) \left( \phi ^\dagger \,iD_\mu \,\phi + \text{ h.c. } \right) \ln \frac{\phi ^\dagger \phi }{\mu ^2} \end{aligned}$$
(6)

can be present [25, 28, 66, 67]. In our analysis we allow for the presence of such an operator.

We now summarize the partial widths needed for the remainder of this paper. We express the decay rates in terms of effective Wilson coefficients, which take into account the loop-induced contributions calculated in [25]. In the case of the \(h \rightarrow Z a\) decay, the effective coefficient is defined as \(C_{Zh}^\mathrm{eff} = C_{Zh}^{(5)} + C_{Zh}\,v^2/(2 \Lambda ^2) + \text {loop effects}\). The relevant ALP decay rates are

$$\begin{aligned} \Gamma (a\rightarrow \gamma \gamma )&= \frac{4\pi \alpha ^2 m_a^3}{\Lambda ^2}\,\big | C_{\gamma \gamma }^\text {eff} \big |^2 , \end{aligned}$$
(7)
$$\begin{aligned} \Gamma (a\rightarrow \ell ^+ \ell ^-)&=\frac{m_a m_\ell ^2}{8\pi \Lambda ^2} \left| c_{\ell \ell }^\text {eff}\right| ^2\sqrt{1-\frac{4m_\ell ^2}{m_a^2}}, \end{aligned}$$
(8)
$$\begin{aligned} \Gamma (a\rightarrow gg)&= \frac{32\pi \,\alpha _s^2(m_a)\,m_a^3}{\Lambda ^2} \left[ 1 + \frac{83}{4}\,\frac{\alpha _s(m_a)}{\pi } \right] \left| C_{GG}^\text {eff} \right| ^2, \end{aligned}$$
(9)

where the latter expression is only valid if \(m_a \gg \Lambda _\text {QCD}\). The exotic Higgs and Z-boson decay rates into ALPs are given by

$$\begin{aligned} \Gamma (h \rightarrow Za)&=\frac{m_h^3}{16\pi \,\Lambda ^2}|C_{Zh}^\text {eff}| ^2\lambda ^{3/2}\Big (\frac{m_Z^2}{m_h^2},\frac{m_a^2}{m_h^2}\Big ), \end{aligned}$$
(10)
$$\begin{aligned} \Gamma (h \rightarrow aa)&=\frac{m_h^3\,v^2}{32\pi \,\Lambda ^4}|C_{ah} ^\mathrm{eff}|^2\left( 1-\frac{2m_a^2}{m_h^2}\right) ^2 \sqrt{1-\frac{4m_a^2}{m_h^2}}, \end{aligned}$$
(11)
$$\begin{aligned} \Gamma (Z\rightarrow \gamma a)&=\frac{8\pi \alpha \,\alpha (m_Z)\,m_Z^3}{3s_w^2 c_w^2\Lambda ^2}\,\big | C_{\gamma Z}^\text {eff} \big |^2 \left( 1 - \frac{m_a^2}{m_Z^2} \right) ^3, \end{aligned}$$
(12)

where \(\lambda (x,y)=(1-x-y)^2-4xy\).
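The ALP decay rates (7) and (8) are simple enough to evaluate directly. The sketch below is a minimal Python illustration that also converts a total width into a proper decay length. All function names, the muon mass, the fixed value of \(\alpha \) and the conversion constant \(\hbar c\) are assumed inputs chosen for illustration, and the gluonic rate (9) is left out because it requires a running \(\alpha _s(m_a)\).

```python
import math

ALPHA = 1.0 / 137.036        # fine-structure constant (assumed fixed input)
HBAR_C = 1.973269804e-16     # hbar*c in GeV*m, converts 1/GeV to metres
M_MU = 0.1056584             # muon mass in GeV (assumed input)


def width_a_to_gamma_gamma(m_a, lam, c_gamgam_eff, alpha=ALPHA):
    """Eq. (7): Gamma(a -> gamma gamma) in GeV, with m_a and lam in GeV."""
    return 4.0 * math.pi * alpha**2 * m_a**3 / lam**2 * abs(c_gamgam_eff)**2


def width_a_to_leptons(m_a, lam, c_ll_eff, m_lep):
    """Eq. (8): Gamma(a -> l+ l-) in GeV; zero below the 2*m_lep threshold."""
    if m_a <= 2.0 * m_lep:
        return 0.0
    phase_space = math.sqrt(1.0 - 4.0 * m_lep**2 / m_a**2)
    return m_a * m_lep**2 / (8.0 * math.pi * lam**2) * abs(c_ll_eff)**2 * phase_space


def proper_decay_length(total_width):
    """Proper decay length c*tau in metres for a total width given in GeV."""
    return HBAR_C / total_width


# Example: 1 GeV ALP, Lambda = 1 TeV, unit couplings to photons and muons
m_a, lam = 1.0, 1000.0
total = (width_a_to_gamma_gamma(m_a, lam, 1.0)
         + width_a_to_leptons(m_a, lam, 1.0, M_MU))
print(total, proper_decay_length(total))
```

The cubic dependence of (7) on \(m_a\) is what makes light ALPs long-lived: halving the mass reduces the di-photon width by a factor of eight.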

Fig. 1 Tree-level Feynman diagrams for the processes \(e^+ e^-\rightarrow X a\) with \(X=\gamma , Z, h\)

2.2 ALP production at colliders

At high-energy colliders, ALPs can be produced in different processes. We distinguish resonant production through gluon or photon fusion and \(e^+e^-\) annihilation, the production in association with photons, Z bosons, Higgs bosons or jets [34,35,36,37], and the production via exotic decays of on-shell Higgs or Z bosons [25, 28].

2.2.1 Resonantly produced ALPs

At high-energy colliders, ALPs can be produced resonantly through gluon-fusion \(gg\rightarrow a\) (GGF), photon fusion \(\gamma \gamma \rightarrow a\) (\(\gamma \gamma \)F), or electron-positron annihilation \(e^+e^-\rightarrow a\). If an ALP coupling to heavy gauge bosons is present, ALPs can also be produced in vector-boson fusion [68]. An important difference between resonant production and ALP production through exotic decays or associated ALP production is that the resonant production cross section is always suppressed by the ALP mass, \(m_a\), over the new physics scale \(\Lambda \). Resonant production is therefore mostly relevant for large ALP masses. At hadron colliders large ALP masses are also important to suppress backgrounds. The cross sections for the resonant ALP production processes are

$$\begin{aligned} \sigma _\text {GGF}(pp \rightarrow a)&=\frac{4\pi ^3 \alpha _s^2(m_a)}{s}\frac{m_a^2}{\Lambda ^2} |C_{GG}^\text {eff}|^2\,K_{a\rightarrow gg}f f_{gg} \left( \frac{m_a^2}{s} \right) , \end{aligned}$$
(13)
$$\begin{aligned} \sigma _{\gamma \gamma \text {F}}(pp \rightarrow a)&=\frac{\pi ^3 \alpha ^2(m_a)}{2s}\frac{m_a^2}{\Lambda ^2} |C_{\gamma \gamma }^\text {eff}|^2\, f f_{\gamma \gamma } \left( \frac{m_a^2}{s} \right) , \end{aligned}$$
(14)
$$\begin{aligned} \sigma (e^+ e^- \rightarrow a)&{\mathop {=}\limits ^{s\, \approx \, m_a^2}}\frac{4\pi \Gamma _a}{(s-m_a^2)^2+m_a^2\Gamma _a^2}\frac{\sqrt{s} m_e^2}{8\pi \Lambda ^2}|c_{ee}^\text {eff}|^2 \end{aligned}$$
(15)

where \(ff_{gg} (y) = \int _y^1 \frac{dx}{x}f_{g/p}(x)f_{g/p}(y/x)\) is the gluon luminosity function (the photon luminosity function is defined analogously) and \(K_{a\rightarrow gg}\approx \) 3.3–2.4 for \(m_a=\) 100–1000 GeV accounts for higher-order QCD corrections [69, 70]. In the last equation we set \(m_e^2/s\rightarrow 0\). Both \(\sigma (e^+e^-\rightarrow a)\) and the quark contribution to \(\sigma (pp\rightarrow a)\) are strongly suppressed by the light fermion masses, and these processes are therefore not the dominant production modes. ALP production in photon fusion with a subsequent di-photon decay of the ALP is particularly interesting, because the product of production cross section and branching ratio depends only on the ALP mass and the single coupling \(C_{\gamma \gamma }^\mathrm{eff}\). Furthermore, the uncertainty of the photon distribution function in the proton has recently been reduced considerably, allowing for more robust limits [71]. For resonantly produced ALPs finite-lifetime effects do not play any role, because the sizeable couplings and ALP masses required to obtain appreciable production cross sections lead to prompt ALP decays.

Fig. 2 Production cross sections of ALPs produced in the decays of Higgs and Z bosons at the LHC (\(\sqrt{s} = 14\,\)TeV) versus the new-physics scale \(\Lambda \). We set \(m_a=0\) and fix the relevant Wilson coefficients to 1. For the green contour in the left plot, we fix \(C_{Zh}^{(5)}=0\) and only consider the dimension-7 coupling in (5). The grey regions in the two plots are excluded by Higgs coupling measurements and the measurement of the total Z width, respectively

2.2.2 ALP production in association with a photon, Z or Higgs boson

An important production mechanism, especially at \(e^+e^-\) colliders, is associated ALP production. The relevant Feynman diagrams are shown in Fig. 1. Additional diagrams with ALPs radiated off an initial-state electron are suppressed by \(m_e^2/s\) relative to the shown graphs and hence neglected here. ALPs can be radiated off a photon or a Z boson and thereby be produced in association with a \(\gamma \), a Z or a Higgs. The differential cross sections for ALPs produced in association with a \(\gamma \), a Z or a Higgs boson are given by

$$\begin{aligned} \frac{d \sigma (e^+ e^-\rightarrow \gamma a)}{d\Omega }&= \, 2 \pi \alpha \alpha ^2(s) \frac{s^2}{\Lambda ^2} \left( 1 - \frac{m_a^2}{s}\right) ^3\nonumber \\&\quad \times \left( 1 + \cos ^2 \theta \right) \left( |V_\gamma (s)|^2 + |A_\gamma (s)|^2\right) , \end{aligned}$$
(16)
$$\begin{aligned} \frac{d \sigma (e^+ e^-\rightarrow Z a)}{d \Omega }&= \, 2 \pi \alpha \alpha ^2(s) \frac{s^2}{\Lambda ^2} \, \lambda ^{\frac{3}{2}}\left( x_a, x_Z \right) \nonumber \\&\quad \times \left( 1 + \cos ^2 \theta \right) \left( |V_Z(s)|^2 + |A_Z(s)|^2\right) , \end{aligned}$$
(17)
$$\begin{aligned} \frac{d \sigma (e^+ e^-\rightarrow h a)}{d\Omega }&= \, \frac{\alpha }{128\pi \, c_w^2 s_w^2} \frac{|C_{Zh}^\mathrm{eff}|^2}{\Lambda ^2}\frac{s \, m_Z^2}{(s-m_Z^2 )^2}\, \lambda ^{\frac{3}{2}}\nonumber \\&\quad \times \left( x_a,x_h \right) \sin ^2 \theta \, (g_V^2 + g_A^2) , \end{aligned}$$
(18)

where \(x_i =m_i^2/s\) and

$$\begin{aligned} V_\gamma (s)&= \frac{C_{\gamma \gamma }^\text {eff}}{s} + \frac{g_V}{2 c_w^2 s_w^2}\frac{C_{\gamma Z}^\text {eff}}{s - m_Z^2+ i m_Z \Gamma _Z}, \nonumber \\ A_\gamma (s)&= \frac{g_A}{2 c_w^2 s_w^2}\frac{C_{\gamma Z}^\text {eff}}{s - m_Z^2+ i m_Z \Gamma _Z} , \end{aligned}$$
(19)
$$\begin{aligned} V_Z(s)&= \frac{1}{c_w s_w}\frac{C_{\gamma Z}^\text {eff}}{s} + \frac{g_V}{2 c_w^3 s_w^3}\frac{C_{Z Z}^\text {eff}}{s - m_Z^2+ i m_Z \Gamma _Z} , \nonumber \\ A_Z(s)&=\frac{g_A}{2 c_w^3 s_w^3}\frac{C_{Z Z}^\text {eff}}{s - m_Z^2+ i m_Z \Gamma _Z}, \end{aligned}$$
(20)

and \(g_V = 2 s_w^2 - 1/2\) and \(g_A=-1/2\). Note that the cross sections with a gauge boson in the final state become independent of s in the high-energy limit \(m_a^2, m_Z^2 \ll s < \Lambda \), while the cross section for \(e^+ e^-\rightarrow h a\) decreases as \(1/s\) in this limit.
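The energy dependence of (16) can be checked numerically. The following Python sketch integrates (16) over the full solid angle, using \(\int d\Omega \,(1+\cos ^2\theta ) = 16\pi /3\), with the form factors of (19). It is only an illustration under stated assumptions: the running coupling \(\alpha (s)\) is replaced by a fixed value, the electroweak inputs are typical reference values, finite-lifetime effects are ignored, and all names are hypothetical.

```python
import math

ALPHA = 1.0 / 137.036      # alpha at low energies (assumed input)
ALPHA_RUN = 1.0 / 127.9    # stand-in for alpha(s); held fixed here (assumption)
SW2 = 0.2312               # squared sine of the weak mixing angle (assumed input)
M_Z, GAMMA_Z = 91.1876, 2.4952   # Z mass and width in GeV (assumed inputs)
G_V, G_A = 2.0 * SW2 - 0.5, -0.5
GEV2_TO_PB = 3.893794e8    # (hbar*c)^2 in GeV^2 * pb


def form_factors_gamma(s, c_gamgam, c_gamz):
    """V_gamma(s) and A_gamma(s) from Eq. (19); complex, in GeV^-2."""
    z_prop = 1.0 / complex(s - M_Z**2, M_Z * GAMMA_Z)
    norm = 1.0 / (2.0 * (1.0 - SW2) * SW2)
    v = c_gamgam / s + G_V * norm * c_gamz * z_prop
    a = G_A * norm * c_gamz * z_prop
    return v, a


def sigma_gamma_a(sqrt_s, m_a, lam, c_gamgam, c_gamz, alpha_run=ALPHA_RUN):
    """Eq. (16) integrated over solid angle; returns the cross section in pb."""
    s = sqrt_s**2
    if m_a**2 >= s:
        return 0.0  # kinematically closed
    v, a = form_factors_gamma(s, c_gamgam, c_gamz)
    prefactor = (2.0 * math.pi * ALPHA * alpha_run**2 * s**2 / lam**2
                 * (1.0 - m_a**2 / s)**3)
    angular = 16.0 * math.pi / 3.0  # integral of (1 + cos^2 theta) over dOmega
    return prefactor * angular * (abs(v)**2 + abs(a)**2) * GEV2_TO_PB


for sqrt_s in (250.0, 500.0, 3000.0):
    print(sqrt_s, sigma_gamma_a(sqrt_s, 1.0, 1000.0, 1.0, 1.0))
```

Because \(|V_\gamma |^2\) and \(|A_\gamma |^2\) fall like \(1/s^2\) far above the Z pole, they cancel the explicit factor \(s^2\), which is the origin of the energy-independent behavior noted above.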

Light or weakly coupled ALPs can be long-lived, and thus only a fraction of them decays inside the detector and can be reconstructed. The average ALP decay length perpendicular to the beam axis is given by

$$\begin{aligned} L_a^\perp (\theta ) = \frac{\sqrt{\gamma _a^2 - 1}}{\Gamma _a}\,\sin \theta , \end{aligned}$$
(21)

where \(\Gamma _a\) denotes the total width of the ALP, \(\theta \) is the scattering angle (in the center-of-mass frame) and \(\gamma _a\) specifies the relativistic boost factor. For the case of associated ALP production with a boson \(X = \gamma , Z, h\), we have

$$\begin{aligned} \gamma _a=\frac{s-m_X^2+m_a^2}{2m_a\sqrt{s}}. \end{aligned}$$
(22)

In order to obtain the total cross sections for ALPs produced in associated production, we integrate the differential distributions (16)–(18) weighted by the probability that the ALP decays within the detector, i.e.

$$\begin{aligned} \sigma (e^+e^-\rightarrow X a)=\int d\Omega \, \frac{d\sigma (e^+e^-\rightarrow X a)}{d\Omega } \left( 1-e^{-L_\text {det}/L_a^\perp (\theta )}\right) , \end{aligned}$$
(23)

where \(L_\text {det}\) is the transverse distance from the beam axis to the detector component relevant to the reconstruction of the ALP.
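The interplay of (21)–(23) can be sketched numerically as follows. The Python code below computes the boost factor (22), the transverse decay length (21) converted to metres, and the fraction of events in which the ALP decays within \(L_\text {det}\) for a given angular shape. The midpoint integration, the function names and the conversion constant \(\hbar c\) are illustrative assumptions and do not represent the numerical procedure actually used for our results.

```python
import math

HBAR_C = 1.973269804e-16  # hbar*c in GeV*m, converts 1/GeV to metres


def boost_factor(sqrt_s, m_a, m_x):
    """Eq. (22): boost of the ALP in e+e- -> X a (all masses in GeV)."""
    s = sqrt_s**2
    return (s - m_x**2 + m_a**2) / (2.0 * m_a * sqrt_s)


def transverse_decay_length(gamma_a, total_width, theta):
    """Eq. (21) in metres; total_width in GeV, theta in radians."""
    return math.sqrt(gamma_a**2 - 1.0) * HBAR_C / total_width * math.sin(theta)


def detected_fraction(gamma_a, total_width, l_det, shape, n_steps=4000):
    """Ratio of Eq. (23) to the cross section without the decay factor.

    shape(theta) is the unnormalised angular dependence of dsigma/dOmega.
    """
    num = den = 0.0
    d_theta = math.pi / n_steps
    for i in range(n_steps):
        theta = (i + 0.5) * d_theta          # midpoint rule avoids sin(theta) = 0
        weight = shape(theta) * math.sin(theta) * d_theta
        l_perp = transverse_decay_length(gamma_a, total_width, theta)
        num += weight * -math.expm1(-l_det / l_perp)  # 1 - exp(-L_det / L_perp)
        den += weight
    return num / den


def shape_gauge(theta):
    """Angular shape of Eqs. (16) and (17)."""
    return 1.0 + math.cos(theta)**2


def shape_higgs(theta):
    """Angular shape of Eq. (18)."""
    return math.sin(theta)**2


gamma_a = boost_factor(250.0, 1.0, 0.0)   # e+e- -> gamma a at 250 GeV, m_a = 1 GeV
print(detected_fraction(gamma_a, 1.0e-12, 1.5, shape_gauge))
```

The fraction tends to one for prompt decays and falls linearly with \(L_\text {det}\,\Gamma _a/\sqrt{\gamma _a^2-1}\) once the decay length greatly exceeds the detector size.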

Associated production at hadron colliders will not be considered here. For long-lived or invisibly decaying ALPs such processes have been explored recently in [34, 37].

2.2.3 ALP production in exotic decays of on-shell Higgs and Z bosons

Exotic decays are particularly interesting, because even small couplings can lead to appreciable branching ratios. In the case of the Higgs boson, the SM decay widths are strongly suppressed, and consequently the branching ratios for Higgs decays into ALPs can be as large as several percent [25, 28]. In the case of the Z boson, the huge samples of Z events expected at future colliders provide sensitivity to \(Z \rightarrow \gamma a\) branching ratios much below current bounds. This allows us to probe large new-physics scales \(\Lambda \), as illustrated in Fig. 2, where we show the cross sections of the processes \(pp \rightarrow Z \rightarrow \gamma a\), \(pp \rightarrow h \rightarrow Z a\) and \(pp \rightarrow h \rightarrow aa\) at the LHC with \(\sqrt{s} = 14\,\)TeV. The figure reflects the different scalings of the dimension-5, 6, and 7 operators in the effective ALP Lagrangian. The shaded region in the left plot is excluded by Higgs coupling measurements constraining generic beyond-the-SM decays of the Higgs boson, \(\text {Br}(h\rightarrow \text {BSM})<0.34\) [72]. The shaded area in the right plot is derived from the measurement of the total Z width, which corresponds to \(\text {Br}(Z\rightarrow \text {BSM})<0.0018\) [73]. This leads to constraints on the coefficients \(|C_{Zh}^{\text {eff}}| < 0.72\,(\Lambda /\text {TeV})\), \(|C_{ah}^\mathrm{eff}| < 1.34\,(\Lambda /\text {TeV})^2\) and \(|C_{\gamma Z}^\mathrm{eff}| < 1.48\,(\Lambda /\text {TeV})\). The Higgs and Z-boson production cross sections at \(14\,\)TeV are given by \(\sigma (pp\rightarrow h)= 54.61\,\)pb [74] and \(\sigma (pp\rightarrow Z)=60.59\,\)nb, computed at NNLO using tools provided in [75, 76].
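The quoted bounds on the Wilson coefficients can be cross-checked against the rates (10)–(12) with a few lines of Python. The sketch below assumes \(m_a = 0\), \(\Lambda = 1\,\)TeV, an SM Higgs width of \(4.07\,\)MeV, a total Z width of \(2.4952\,\)GeV and typical reference values for the remaining electroweak inputs; none of these inputs is specified in the text, so the agreement should be read as approximate. For the Higgs boson the branching-ratio limit is translated via \(\Gamma _\text {BSM} < \text {Br}/(1-\text {Br})\,\Gamma _h^\text {SM}\), and for the Z boson via \(\Gamma _\text {BSM} < \text {Br}\,\Gamma _Z\).

```python
import math

M_H, M_Z, VEV = 125.0, 91.1876, 246.22       # GeV (assumed inputs)
GAMMA_H_SM, GAMMA_Z = 4.07e-3, 2.4952        # GeV (assumed inputs)
ALPHA, ALPHA_MZ = 1.0 / 137.036, 1.0 / 127.94
SW2 = 0.2312


def kallen(x, y):
    """lambda(x, y) = (1 - x - y)^2 - 4xy, as defined below Eq. (12)."""
    return (1.0 - x - y)**2 - 4.0 * x * y


def width_h_to_Za(c_zh, lam, m_a=0.0):
    """Eq. (10) in GeV; zero if the decay is kinematically closed."""
    if m_a + M_Z >= M_H:
        return 0.0
    lam_k = kallen(M_Z**2 / M_H**2, m_a**2 / M_H**2)
    return M_H**3 / (16.0 * math.pi * lam**2) * abs(c_zh)**2 * lam_k**1.5


def width_h_to_aa(c_ah, lam, m_a=0.0):
    """Eq. (11) in GeV; zero if the decay is kinematically closed."""
    if 2.0 * m_a >= M_H:
        return 0.0
    r = m_a**2 / M_H**2
    return (M_H**3 * VEV**2 / (32.0 * math.pi * lam**4) * abs(c_ah)**2
            * (1.0 - 2.0 * r)**2 * math.sqrt(1.0 - 4.0 * r))


def width_Z_to_gamma_a(c_gz, lam, m_a=0.0):
    """Eq. (12) in GeV; zero if the decay is kinematically closed."""
    if m_a >= M_Z:
        return 0.0
    return (8.0 * math.pi * ALPHA * ALPHA_MZ * M_Z**3
            / (3.0 * SW2 * (1.0 - SW2) * lam**2)
            * abs(c_gz)**2 * (1.0 - m_a**2 / M_Z**2)**3)


LAM = 1000.0                                       # Lambda = 1 TeV
max_width_h = 0.34 / (1.0 - 0.34) * GAMMA_H_SM     # from Br(h -> BSM) < 0.34
max_width_z = 0.0018 * GAMMA_Z                     # from Br(Z -> BSM) < 0.0018

# Widths scale with |C|^2, so the bound follows from the unit-coefficient rate
bound_c_zh = math.sqrt(max_width_h / width_h_to_Za(1.0, LAM))
bound_c_ah = math.sqrt(max_width_h / width_h_to_aa(1.0, LAM))
bound_c_gz = math.sqrt(max_width_z / width_Z_to_gamma_a(1.0, LAM))
print(bound_c_zh, bound_c_ah, bound_c_gz)
```

With these assumed inputs the three numbers land within about one percent of the quoted limits 0.72, 1.34 and 1.48; small differences are expected from the choice of input parameters.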

As discussed above, it is important to include the effects of a possible finite ALP decay length. Using the fact that most Higgs and Z bosons are produced in the forward direction at the LHC and approximating the ATLAS and CMS detectors (as well as future detectors) by infinitely long cylindrical tubes, we first perform a Lorentz boost to the rest frame of the decaying boson. In this frame the relevant boost factors for the Higgs or Z decay into ALPs are given by

$$\begin{aligned} \gamma _a={\left\{ \begin{array}{ll} \displaystyle {\frac{m_h^2-m_Z^2+m_a^2}{2m_am_h}};&{} \text {for}\,\, h \rightarrow Z a,\\ \displaystyle \frac{m_h}{2m_a};&{}\text {for}\,\, h \rightarrow aa,\\ \displaystyle \frac{m_a^2+m_Z^2}{2m_Zm_a};&{}\text {for}\,\, Z \rightarrow \gamma a.\\ \end{array}\right. } \end{aligned}$$
(24)

We can compute the fraction of ALPs decaying before they have travelled a certain distance \(L_\mathrm{det}\) from the beam axis, finding

$$\begin{aligned} \begin{aligned} f_\mathrm{dec}^{a}&= \int _0^{\pi /2}\!d\theta \, \sin \theta \left( 1 - e^{-L_\mathrm{det}/L_a^\perp (\theta )} \right) , \\ f_\mathrm{dec}^{aa}&= \int _0^{\pi /2}\!d\theta \, \sin \theta \left( 1 - e^{-L_\mathrm{det}/L_a^\perp (\theta )} \right) ^2, \end{aligned} \end{aligned}$$
(25)

where \(f^a_\text {dec}\) is relevant for \(h \rightarrow Z a\) and \(Z\rightarrow \gamma a\) decays, and \(f^{aa}_\text {dec}\) applies to \(h\rightarrow aa\) decays.
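A minimal numerical sketch of (24) and (25) in Python is given below. It evaluates the two decay fractions with a midpoint rule, using (21) for the transverse decay length. The function names, the default boson masses, the number of integration steps and the conversion constant \(\hbar c\) are assumptions made for illustration.

```python
import math

HBAR_C = 1.973269804e-16  # hbar*c in GeV*m, converts 1/GeV to metres


def boost_h_to_Za(m_a, m_h=125.0, m_z=91.1876):
    """First line of Eq. (24)."""
    return (m_h**2 - m_z**2 + m_a**2) / (2.0 * m_a * m_h)


def boost_h_to_aa(m_a, m_h=125.0):
    """Second line of Eq. (24)."""
    return m_h / (2.0 * m_a)


def boost_Z_to_gamma_a(m_a, m_z=91.1876):
    """Third line of Eq. (24)."""
    return (m_a**2 + m_z**2) / (2.0 * m_z * m_a)


def decay_fractions(gamma_a, total_width, l_det, n_steps=2000):
    """Eq. (25): returns (f_dec^a, f_dec^aa) for l_det in metres, width in GeV."""
    l_max = math.sqrt(gamma_a**2 - 1.0) * HBAR_C / total_width  # Eq. (21) at sin(theta) = 1
    d_theta = 0.5 * math.pi / n_steps
    f_a = f_aa = 0.0
    for i in range(n_steps):
        theta = (i + 0.5) * d_theta          # midpoint rule avoids theta = 0
        sin_t = math.sin(theta)
        p_dec = -math.expm1(-l_det / (l_max * sin_t))  # 1 - exp(-L_det / L_perp)
        f_a += sin_t * p_dec * d_theta
        f_aa += sin_t * p_dec**2 * d_theta
    return f_a, f_aa


# Example: h -> aa with m_a = 1 GeV and a total ALP width of 1e-14 GeV
print(decay_fractions(boost_h_to_aa(1.0), 1.0e-14, 1.5))
```

Since both ALPs must decay inside the detector in \(h\rightarrow aa\), the squared decay probability makes \(f_\text {dec}^{aa}\) fall off faster than \(f_\text {dec}^{a}\) as the ALP lifetime grows.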

Fig. 3 Leading order Higgs production cross sections at \(e^+e^-\) colliders as a function of the center-of-mass energy, produced with MadGraph5 [77]

For Higgs bosons produced at \(e^+e^-\) colliders the assumption of forward production is no longer justified. Rather, the angular distributions in the scattering angle \(\vartheta \) of the Higgs boson in the center-of-mass frame are approximately given by [78]

$$\begin{aligned} \frac{d\sigma }{d\Omega }&\propto {\left\{ \begin{array}{ll}\dfrac{3}{2} \dfrac{\lambda (x_h,x_Z)\sin ^2\vartheta +8x_Z}{\lambda (x_h,x_Z) +12x_Z}\,~{\mathop {\longrightarrow }\limits ^{s\gg m_h^2}} ~ \dfrac{3}{2}\sin ^2\vartheta \,; &{} e^+e^-\rightarrow h Z,\\ \dfrac{3}{2}\sin ^2\vartheta \,; &{} \text {VBF},\\ \end{array}\right. } \end{aligned}$$
(26)

with \(x_i =m_i^2/s\). The approximation \(s \gg m_h^2 \) for the Vector Boson Fusion (VBF) process is justified, because the VBF cross section becomes the dominant production cross section for \(\sqrt{s}\gtrsim 500\,\)GeV [78, 79]. This fact is illustrated in Fig. 3, which depicts the cross sections of various Higgs production modes at lepton colliders as functions of the center-of-mass energy. Even though the angular distribution of the produced ALPs is isotropic in the Higgs rest frame, the corresponding distribution in the center-of-mass frame is more complicated in this case. Since the Higgs bosons are predominantly produced with \(\vartheta \approx 90^\circ \), we will for simplicity make the conservative assumption that the ALPs are also produced at maximum scattering angle in the center-of-mass frame, corresponding to \(\sin \theta =1\) in (21). For the resonant process \(e^+e^-\rightarrow Z\rightarrow \gamma a\) on the Z pole, no such difficulty arises. The corresponding differential branching ratio can be obtained from (16) by setting \(s=m_Z^2\), and the decay-length effect can be taken into account as shown in (23).

For prompt ALP decays, we demand all final state particles to be detected in order to reconstruct the decaying SM particle. For the decay into photons we require the ALP to decay before the electromagnetic calorimeter which, at ATLAS and CMS, is situated approximately \(1.5\,\)m from the interaction point, and we thus take \(L_\mathrm{det} = 1.5\,\)m. Analogously, the ALP should decay before the inner tracker, \(L_\mathrm{det} = 2\,\)cm, for an \(e^+e^-\) final state to be detected. We also require \(L_\text {det} =2\,\)cm for muon and tau final states in order to take full advantage of the tracker information in reconstructing these events. For CLIC, we use \(L_\text {det}=0.6\,\)m for lepton reconstruction [80]. We define the effective branching ratios

$$\begin{aligned}&\text {Br}(h\rightarrow Za\rightarrow Y\bar{Y}+X\bar{X}) \big \vert _\text {eff}\nonumber \\&\quad =\text {Br}(h\rightarrow Za)\,\text {Br}(a\rightarrow X\bar{X})f_\text{ dec }^a\,\text {Br}(Z\rightarrow Y\bar{Y}),\end{aligned}$$
(27)
$$\begin{aligned}&\text {Br}(h\rightarrow aa\rightarrow X\bar{X}+X\bar{X})\big \vert _\text {eff}\nonumber \\&\quad =\text {Br}(h\rightarrow aa)\,\text {Br}(a\rightarrow X\bar{X}) ^2f_\text {dec}^{aa}, \end{aligned}$$
(28)
$$\begin{aligned}&\text {Br}(Z\rightarrow \gamma a \rightarrow \gamma X\bar{X} )\big \vert _\text {eff}\nonumber \\&\quad =\text {Br}(Z\rightarrow \gamma a)\,\text {Br}(a\rightarrow X\bar{X})f_\text {dec}^a, \end{aligned}$$
(29)

where \(X=\gamma , e, \mu , \tau , \text {jet}\) and \(Y=\ell , \mathrm{hadrons}\). Multiplying the effective branching ratios by the appropriate Higgs or Z production cross sections and luminosity allows us to derive results for a specific collider. At hadron colliders like the LHC, we require 100 signal events, since this is what is typically needed to suppress backgrounds in new-physics searches with prompt decays of Higgs and Z bosons [72, 81, 82] (see also [25] for further discussion). At lepton colliders we assume a much cleaner environment and show the reach for 4 signal events.
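The event-count criterion translates into a minimum effective branching ratio once a production cross section and an integrated luminosity are fixed. The short Python sketch below illustrates this for the cross sections \(\sigma (pp\rightarrow h)= 54.61\,\)pb and \(\sigma (pp\rightarrow Z)=60.59\,\)nb quoted above, combined with an assumed integrated luminosity of \(3\,\)ab\(^{-1}\). The function name is hypothetical, and detector acceptance and efficiencies are ignored.

```python
PB_PER_INV_AB = 1.0e6  # 1 ab^-1 equals 1e6 pb^-1


def required_effective_br(sigma_pb, lumi_inv_ab, n_signal):
    """Smallest effective branching ratio that yields n_signal events."""
    n_produced = sigma_pb * lumi_inv_ab * PB_PER_INV_AB
    return n_signal / n_produced


# 100 signal events at a hadron collider with 3 ab^-1 (assumed luminosity)
br_min_h = required_effective_br(54.61, 3.0, 100)      # Higgs decays
br_min_z = required_effective_br(60.59e3, 3.0, 100)    # Z decays, sigma in pb
print(br_min_h, br_min_z)
```

Under these assumptions the required effective branching ratios are roughly \(6\times 10^{-7}\) for Higgs decays and \(6\times 10^{-10}\) for Z decays, which shows why the large Z samples give access to very small \(Z\rightarrow \gamma a\) rates.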

We do not take advantage of the additional background reduction obtained by cutting on a secondary vertex in the case where the ALP lifetime becomes appreciable. A dedicated analysis by the experimental collaborations including detailed simulations of the backgrounds is required to improve on our projections.

Table 1 Benchmark specifications of various future collider proposals. The number of Z and Higgs bosons indicated with a \(\sim \) have been computed with MadGraph5 [77]

3 Collider reach for ALP searches

The reach of ALP searches at current and future colliders depends on the type of collider, the ALP production mechanism, and the center-of-mass energy of the experiment. For the LHC and the most advanced proposals for future colliders, we use the benchmark specifications collected in Table 1. In the following, we determine the reach of future colliders in comparison to the high-luminosity phase of the LHC with \(\sqrt{s}=14\,\)TeV and an integrated luminosity of \(L=3\,\)ab\(^{-1}\).

Fig. 4 Left: summary plot of constraints on the parameter space spanned by the ALP mass and ALP-photon coupling. Right: enlarged display of the constraints from collider searches: LEP (light blue and blue), CDF (purple), LHC from associated production and Z decays (orange), LHC from photon fusion (light orange), and from heavy-ion collisions at the LHC (green)

3.1 ALP searches at the LHC and LEP

Constraints from ALP searches at LEP have been discussed for the associated production of ALPs with a photon and the subsequent ALP decay into photon pairs (\(e^+e^-\rightarrow \gamma a\rightarrow 3 \gamma \)) [34], as well as for on-shell Z decays (\(e^+e^- \rightarrow Z \rightarrow \gamma a\rightarrow 3 \gamma \)) [35]. The excluded parameter space in the \(m_a - |C_{\gamma \gamma }^\mathrm{eff}|/\Lambda \) plane is shown in blue in Fig. 4. At the LHC, exotic Higgs and Z-boson decays are the most promising search channels. Decays of on-shell Z bosons at the LHC have been discussed in [25, 34, 35, 37]. The constraints from these searches can be mapped onto the \(m_a - |C_{\gamma \gamma }^\mathrm{eff}|/\Lambda \) plane under the assumption that the two couplings \(C_{\gamma \gamma }^\text {eff}\) and \(C_{\gamma Z}^\text {eff}\) are related to each other. For example, if the ALP couples to hypercharge but not to \(SU(2)_L\), then (3) implies \(C_{\gamma Z}=-s_w^2\,C_{\gamma \gamma }\), since \(C_{WW}=0\). The corresponding constraint is shown in orange in Fig. 4 (footnote 2). The purple region is excluded by Tevatron searches for \(p{\bar{p}} \rightarrow 3 \gamma \) [83], again assuming \(C_{WW}=0\).

The dark green area in Fig. 14 in Sect. 3.3 below depicts the region where 100 events are expected in the process \(pp\rightarrow Z \rightarrow \gamma a \rightarrow 3 \gamma \) at the LHC with \(\sqrt{s}=14\,\)TeV and \(L=3\,\)ab\(^{-1}\). We demand that the ALPs decay before they reach the electromagnetic calorimeter, i.e. within \(L_\text {det}=1.5\,\)m. Note that for a part of this parameter space the photons from the ALP decay are highly boosted and hard to distinguish from a single photon in the detector [84]. Searches for the exotic Higgs decays \(pp\rightarrow h \rightarrow Z a\rightarrow Z\gamma \gamma \) and \(pp\rightarrow h \rightarrow a a\rightarrow 4\gamma \) cannot be translated into constraints in the \(m_a - |C_{\gamma \gamma }^\mathrm{eff}|/\Lambda \) plane, because the ALP-Higgs couplings governed by the coefficients \(C_{Zh}^\mathrm{eff}\) and \(C_{ah}^\mathrm{eff}\) are generally not related to \(C_{\gamma \gamma }^\mathrm{eff}\). Instead, we show the reach of the high-luminosity LHC in the \(|C_{Zh}^\mathrm{eff}|/\Lambda - |C_{\gamma \gamma }^\mathrm{eff}|/\Lambda \) or \(|C_{ah}^\mathrm{eff}|/\Lambda ^2 - |C_{\gamma \gamma }^\mathrm{eff}|/\Lambda \) planes for some fixed ALP masses in Fig. 15 in Sect. 3.3.

Besides ALP production in exotic decays of Higgs and Z bosons, ALP production through photon fusion plays an important role at the LHC. This process was first considered in a VBF-type topology in [85], and the excluded region is part of the orange shaded region in Fig. 4. For GeV-scale ALPs produced in photon-fusion, (quasi-)elastic heavy-ion collisions can provide even stronger constraints due to the large charge of the lead ions (\(Z=82\)) used in the LHC heavy-ion collisions [36, 86]. The parameter space probed by this process is shown in green in Fig. 4.

Recently, the parton distribution function of the photon has been determined with significantly improved accuracy [71], and searches for di-photon resonances at the LHC can be recast to give bounds on heavy pseudoscalar particles with couplings to photons [87]. We have computed the constraints based on the most recent ATLAS analysis with \(39.6\,\)fb\(^{-1}\) of data [88] and show the corresponding sensitivity regions in light orange in Fig. 4. A recent proposal to search for ALPs in elastic photon scattering at the LHC allows for a similar reach in the \(m_a - |C_{\gamma \gamma }^\mathrm{eff}|/\Lambda \) plane [89].

Searches for ALPs decaying into photons are motivated by the relation between the ALP coupling to gluons \(C_{GG}^\mathrm{eff}\) and to photons \(C_{\gamma \gamma }^\mathrm{eff}\) in models addressing the strong CP problem, and from a practical point of view by the difficulty of observing light ALPs decaying into jets at hadron colliders. On the other hand, if the coupling to gluons is present in the effective ALP Lagrangian (1), constraints arise from searches for mono-jets at ATLAS and CMS [34], as well as from the rare kaon decay \(K^+\rightarrow \pi ^+ a\) mediated by ALP-pion mixing [47].Footnote 3 Di-jet searches at the LHC can provide bounds on heavy ALPs with masses \(m_a> 1\,\)TeV, whereas recent searches for a new vector resonance decaying into di-jets accompanied by hard initial state radiation \(pp \rightarrow j Z'\rightarrow 3 j\) can be recast into limits on ALPs with masses below the TeV scale in the process \(pp \rightarrow j a\rightarrow 3 j\) [46, 91, 92].Footnote 4 As pointed out in [94], the hard cut on hadronic activity applied in the analyses [46, 91, 92] strongly reduces the efficiency for a gluon-fusion initiated signal compared to the \(q{\bar{q}}\)-initiated signal expected for a vector resonance. In Fig. 5, we show the limit derived in [94] (labeled LHC) in the \(m_a-|C_\text {GG}^\text {eff}|/\Lambda \) plane.

Fig. 5

Left: existing constraints on the ALP mass and coupling to gluons by mono-jet searches at the LHC (light blue), rare kaon decays (light red) and three-jet events (purple). Right: constraints on the ALP mass and coupling to leptons from searches for solar axions (purple), the evolution of red giants (light red), beam dump searches for ALP decays into muons (blue) and BaBar searches for \(e^+e^-\rightarrow 4 \mu \)

Another promising signature is provided by leptonically decaying ALPs: \(a\rightarrow \ell ^+\ell ^-\) with \(\ell =e,\mu ,\tau \). In the right panel of Fig. 5 we show a compilation of current limits in the \(m_a-|c_{\ell \ell }^\text {eff}|/\Lambda \) plane taken from [25]. We assume universal couplings to leptons, such that lepton-flavor changing couplings mediated by ALP exchange are absent at tree level. Lepton colliders are sensitive to the resonant production of ALPs with subsequent decays into leptons. In general, however, the loop-induced couplings to \(Z\gamma \) and \(\gamma \gamma \) are more important than the tree-level coupling to electrons because the latter is suppressed by \(m_e/\Lambda \). Even for ALPs coupling only to leptons at tree level, the associated production cross sections via the processes shown in Fig. 1 dominate over the \(e^+e^-\) annihilation cross section. Projections for additional signatures, such as \(pp \rightarrow a W^\pm (\gamma )\), \(pp \rightarrow a jj (\gamma )\), \(pp \rightarrow h a\) and \(pp \rightarrow t{\bar{t}} a\) with stable ALPs or invisible ALP decays, have been considered in [37]. The complementarity between di-photon and di-lepton final states has also been emphasized in the proposal for boosted di-tau resonances [63].

3.2 ALP searches at future lepton colliders

Future lepton colliders have the potential to precisely measure the properties of the Higgs boson and search for new physics effects in electroweak observables. In addition they offer qualitatively new ways to search for ALPs. In contrast to hadron colliders, \(e^+e^-\) machines offer a much cleaner detector environment allowing one to identify ALPs produced in association with a Z boson, a photon or a Higgs boson. Therefore, in addition to ALPs produced in exotic decays of on-shell Z and Higgs bosons, we also discuss the associated production of ALPs.Footnote 5 On the contrary, barring a fine-tuning of the collider energy, the resonant production of ALPs cannot be observed in \(e^+e^-\) collisions.Footnote 6

Of particular interest are processes governed by a single non-vanishing Wilson coefficient at tree-level that allow us to compare the projected sensitivity reach of the future lepton colliders with the results of previous experiments, see Figs. 4 and 5. Studying these processes at a lepton collider allows one in particular to probe benchmark models in which the ALP couples only to electroweak gauge bosons or only to charged leptons. Other processes involve different couplings for the production and the decay of the ALP. Among these, the rich Higgs program of all proposed future lepton colliders motivates the search for ALPs produced in association with Higgs bosons or in exotic Higgs decays. For these channels, in order to compare the reach of the various proposed experiments, we focus on the di-photon and di-lepton ALP decay channels. Following [25], we present the corresponding sensitivity regions in a two-dimensional plane spanned by these two couplings. We derive this sensitivity region by demanding 4 reconstructed events before the inner tracker for ALPs decaying into electrons and muons and before the ECAL for ALP decays into photons. The assumption that this number of events is sufficient for future lepton colliders to be sensitive to a signal is based on the very clean final states (photons or leptons) and the strong cuts that can be applied if several resonances appear in the signal, e.g. the ALP, the Z boson and the Higgs in the process \(h \rightarrow Z a \). For similar searches at LEP, cuts have reduced the background to 2–9 events [34, 96, 97]. We emphasise that these projected sensitivity regions therefore represent estimates that cannot replace a full analysis that should be performed by experimentalists. Analogous studies could be performed for different ALP decay channels, such as \(a\rightarrow b {\bar{b}}\) or \(a \rightarrow jj\).

Fig. 6

Projected sensitivity regions for searches for \(e^+e^-\rightarrow \gamma a \rightarrow 3\gamma \) (left) and \(e^+e^-\rightarrow Z a \rightarrow Z_\text {vis}\gamma \gamma \) (right) at future \(e^+e^-\) colliders for \(\text {Br}(a\rightarrow \gamma \gamma )=1\). The constraints from Fig. 4 are shown in the background. The sensitivity regions are based on 4 expected signal events

3.2.1 ALP production in association with a photon, Z or Higgs boson

For \(e^+e^-\rightarrow \gamma a\rightarrow 3 \gamma \) and \(e^+e^-\rightarrow Z a\rightarrow Z\gamma \gamma \), the process only depends on the photon coupling \(| C_{\gamma \gamma }^\mathrm{eff}|/\Lambda \) once a specific relation between \(C_{WW}\) and \(C_{BB}\) is assumed, see (3). The projected reach can therefore be compared to the limits in Fig. 4. If the FCC-ee will operate at different values of the center-of-mass energy, it is in principle possible to measure the two coefficients \(C^\text {eff}_{\gamma Z}\) and \(C^\text {eff}_{\gamma \gamma }\) independently, as pointed out in [25]. Also, for the proposed Z-pole run of the FCC-ee, the process \(e^+e^-\rightarrow \gamma a\rightarrow 3 \gamma \) would correspond to on-shell decay of the Z boson to an ALP, \(Z\rightarrow \gamma a\), which will be discussed below.

We show the projections for the various versions of the CLIC collider and the FCC-ee in Fig. 6, assuming \(C_{WW}=0\) which implies \(C_{\gamma Z}=-s_w^2C_{\gamma \gamma }\).Footnote 7 The parameter space corresponds to at least 4 expected signal events with the ALP decaying before it has reached the electromagnetic calorimeter (ECAL) which is assumed to be within a radius of \(\sim 1.5\,\)m of the beam axis. We consider only visible decays of the Z boson with \(\text {Br}(Z\rightarrow \text {visible})=0.80\). We also impose the constraint \(|C_{\gamma Z}^\mathrm{eff}| < 1.48\, \Lambda /\text {TeV} \) from the LEP measurement of the total width of the Z boson.

Fig. 7

Left: projected sensitivity regions for searches for \(e^+e^-\rightarrow ha \rightarrow b{\bar{b}} \gamma \gamma \) (upper panels) and \(e^+e^-\rightarrow ha \rightarrow b{\bar{b}} \ell ^+\ell ^-\) (lower panels) for future \(e^+e^-\) colliders, assuming that \(|C_{Zh}^\text {eff}|=0.72\,\Lambda /\text {TeV}\) and \(\text {Br}(a\rightarrow \gamma \gamma )=1\) (upper panels) and \(\text {Br}(a\rightarrow \ell ^+\ell ^-)=1\) (lower panels). Right: sensitivity regions for the example of the FCC-ee with \(|C_{Zh}^\text {eff}|=0.72\,\Lambda /\text {TeV}\) (solid contour), \(|C_{Zh}^\text {eff}|=0.1\,\Lambda /\text {TeV}\) (dashed contour), and \(|C_{Zh}^\text {eff}|=0.015\,\Lambda /\text {TeV}\) (dotted contour), which correspond to Br\((h \rightarrow Za)=34\%\), \(1\%\) and \(0.02\%\), respectively. The constraints from Fig. 4 are shown in the background. The sensitivity regions are based on 4 expected signal events

The contours for the FCC-ee in Fig. 6 combine the luminosities for the run at the Z-pole (in case of \(e^+e^-\rightarrow \gamma a\)), at \(\sqrt{s}=2m_W\) and at \(\sqrt{s}=250\,\)GeV, whereas for CLIC we show separate limits for three different versions of this collider. Note that the large luminosity of the FCC-ee run at the Z-pole leads to a significantly larger sensitivity in the \(e^+e^-\rightarrow \gamma a\) channel compared to the \(e^+e^-\rightarrow Z a \) projection. Further, CLIC\(_\text {1500}\) and CLIC\(_{3000}\) allow one to probe considerably higher ALP masses compared to both CLIC\(_{380}\) and the FCC-ee. In this and the following figures, the relevant ALP branching ratio into the observed final state is set to 100%. As we have shown in [25], the left boundary of the sensitivity region is largely independent of this assumption. For branching ratios smaller than \(\text {Br}(a\rightarrow \gamma \gamma )=1\), however, the reach in \(C_{\gamma \gamma }^\text {eff}\) is reduced by a factor \(\big [\text {Br}(a\rightarrow \gamma \gamma )\big ]^{1/2}\). This follows from the cross sections (16) and (17), which imply the scaling \(\sigma (e^+e^-\rightarrow \gamma a\rightarrow 3\gamma ) \sim |C_{\gamma \gamma }^\text {eff}|^2\,\text {Br}(a\rightarrow \gamma \gamma )\) and \(\sigma (e^+e^-\rightarrow Z a\rightarrow Z \gamma \gamma ) \sim |C_{\gamma \gamma }^\text {eff}|^2\,\text {Br}(a\rightarrow \gamma \gamma )\), respectively.Footnote 8
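The scaling of the coupling reach with the branching ratio can be made explicit in a short sketch. The symbols \({\mathcal {L}}\) (integrated luminosity), \({\hat{\sigma }}\) (production cross section per unit \(|C_{\gamma \gamma }^\text {eff}|^2/\Lambda ^2\)) and \(N_\text {min}=4\) (required number of signal events) are introduced here only for illustration, and the finite ALP decay length, which shapes the left boundary of the sensitivity region, is neglected:

$$\begin{aligned} N_\text {sig}&= {\mathcal {L}}\,{\hat{\sigma }}\,\frac{|C_{\gamma \gamma }^\text {eff}|^2}{\Lambda ^2}\,\text {Br}(a\rightarrow \gamma \gamma ) \ge N_\text {min} \nonumber \\&\Longrightarrow \quad \frac{|C_{\gamma \gamma }^\text {eff}|}{\Lambda }\bigg |_\text {min} = \sqrt{\frac{N_\text {min}}{{\mathcal {L}}\,{\hat{\sigma }}\,\text {Br}(a\rightarrow \gamma \gamma )}} \propto \big [\text {Br}(a\rightarrow \gamma \gamma )\big ]^{-1/2}\,. \end{aligned}$$

Under these assumptions, a branching ratio of \(\text {Br}(a\rightarrow \gamma \gamma )=0.25\) would raise the smallest accessible coupling by a factor of 2.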

ALPs can also be produced in association with a Higgs boson. The rate for the process \(e^+e^- \rightarrow h a\) depends on the Wilson coefficient \(C_{Zh}^\text {eff}\) in (5). The constraint \(\Gamma (h\rightarrow \text {BSM})< 2.1\,\)MeV on the partial Higgs decay width into non-SM final states implies the upper bound \(|C^\text {eff}_{Zh}|< 0.72\,\Lambda /\text {TeV}\) [72]. Assuming that the Higgs boson is reconstructed in the \(b{\bar{b}}\) final state with \(\text {Br}(h \rightarrow b{\bar{b}})= 0.58\), we derive the sensitivity to \(C_{\gamma \gamma }^\mathrm{eff}\) and \(m_a\) displayed in the upper left panel of Fig. 7. In the upper right panel of Fig. 7 we show how these projected sensitivity regions vary for different values of \(C^\text {eff}_{Zh}\). The expected sensitivity remains the same down to a critical value of the branching ratio \(\text {Br}(a\rightarrow \gamma \gamma )<1\). Below this critical value fewer than 4 events are produced and the discovery reach is lost. For the FCC-ee, these critical values are \(\text {Br}(a\rightarrow \gamma \gamma )=2\times 10^{-4}\) for \(C_{Zh}^\text {eff}=0.72 \, \Lambda /\text {TeV}\), \(\text {Br}(a\rightarrow \gamma \gamma )=10^{-2}\) for \(C_{Zh}^\text {eff}=0.1 \, \Lambda /\text {TeV}\) and \(\text {Br}(a\rightarrow \gamma \gamma )=0.4\) for \(C_{Zh}^\text {eff}=0.015 \, \Lambda /\text {TeV}\). For the case of leptonic ALP decays these values do not change, and they are only slightly different in the case of CLIC. Below the critical branching ratio, searches for other final states can become more promising. This includes searches for invisibly decaying (or stable) ALPs [98]. Lepton colliders are particularly powerful in constraining ALP-lepton couplings. In order to avoid large lepton-flavor changing ALP couplings, we choose a benchmark with flavor-universal ALP couplings to leptons,

$$\begin{aligned} c_{\ell \ell }\equiv c_{ee}=c_{\mu \mu }=c_{\tau \tau }\,. \end{aligned}$$
(30)

The lower panels of Fig. 7 show the regions of sensitivity for ALP searches in the process \(e^+e^-\rightarrow ha\rightarrow b{\bar{b}}\, \ell ^+\ell ^-\). The jumps in the sensitivity region appear at the thresholds for the production of muon and tau pairs. The ALP decays predominantly into the heaviest lepton that is kinematically accessible.
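The critical branching ratios quoted above for the FCC-ee can be cross-checked with a rough scaling argument. The following sketch assumes that the number of signal events in \(e^+e^-\rightarrow ha\) is simply proportional to \(|C_{Zh}^\text {eff}|^2\,\text {Br}(a\rightarrow \gamma \gamma )\), so that the critical branching ratio scales like \(1/|C_{Zh}^\text {eff}|^2\); decay-length effects are ignored and the function name is hypothetical.

```python
# Sketch: rescale the critical Br(a -> gamma gamma) with the Higgs coupling,
# assuming N_sig is proportional to |C_Zh|^2 * Br (decay-length effects ignored).

C_REF = 0.72         # |C_Zh^eff| in units of Lambda/TeV (value quoted in the text)
BR_CRIT_REF = 2e-4   # critical branching ratio quoted for C_REF at the FCC-ee


def critical_branching_ratio(c_zh, c_ref=C_REF, br_ref=BR_CRIT_REF):
    """Critical branching ratio for coupling c_zh, rescaled from the reference point.

    The result is capped at 1, since a branching ratio cannot exceed unity.
    """
    if c_zh <= 0:
        raise ValueError("c_zh must be positive")
    return min(1.0, br_ref * (c_ref / c_zh) ** 2)


if __name__ == "__main__":
    for c_zh in (0.72, 0.1, 0.015):
        br = critical_branching_ratio(c_zh)
        print(f"|C_Zh| = {c_zh:<6} Lambda/TeV -> critical Br ~ {br:.2e}")
```

The rescaled values come out close to the quoted \(10^{-2}\) and 0.4, which suggests that the simple proportionality captures the dominant effect.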

Fig. 8

Parameter regions which can be probed for \(e^+e^-\rightarrow ha \rightarrow b{\bar{b}} \gamma \gamma \) (upper panels) and \(e^+e^-\rightarrow ha \rightarrow b{\bar{b}} \ell ^+\ell ^-\) (lower panels) at future \(e^+e^-\) colliders. The grey shaded area is excluded by LHC Higgs measurements. The parameter space to the right of the dotted contours corresponds to the sensitivity reach of the FCC-ee with the indicated ALP branching ratios. The sensitivity regions are based on 4 expected signal events

The graphical representation in Fig. 7 is suboptimal, because it highlights the dependence on one ALP coupling (\(|C_{\gamma \gamma }^\text {eff}|\) or \(|c_{\ell \ell }^\text {eff}|\)), while the dependence on the other coupling (\(C_{Zh}^\text {eff}\)) is only reflected by the different contours. In Fig. 8 we show an alternative representation of the results in the plane of the two relevant ALP couplings, but for fixed values of the ALP mass. The sensitivity reach of the FCC-ee and the three versions of the CLIC collider for an ALP branching ratio of \(\text {Br}(a\rightarrow \gamma \gamma )=1\) (upper panels) and \(\text {Br}(a\rightarrow \ell ^+\ell ^-)=1\) (lower panels) is bounded by the colored contours. With decreasing ALP mass, the lifetime of the ALP increases and the sensitivity reach in \(C_{\gamma \gamma }^\text {eff}\) and \(c_{\ell \ell }^\text {eff}\) is reduced. The fact that the sensitivity region for CLIC is maximal for the lowest center-of-mass energy is a consequence of the \(1/s\) behavior of the \(e^+e^-\rightarrow ha\) cross section in (18).

For the example of the FCC-ee, we also indicate the dependence of the sensitivity regions on the \(a\rightarrow \gamma \gamma \) or \(a\rightarrow \ell ^+\ell ^-\) branching ratios, which in Fig. 7 were assumed to be maximal. The parameter space to the right of the dotted contours corresponds to the sensitivity reach of the FCC-ee with the indicated ALP branching ratios. Smaller branching ratios reduce the sensitivity to \(C_{Zh}^\text {eff}\), because the total number of signal events decreases. However, the values of \(C_{\gamma \gamma }^\text {eff}\) and \(c_{\ell \ell }^\text {eff}\) for which sensitivity is lost are almost independent of the ALP branching ratio, as long as this branching ratio exceeds a critical value. Consider, for example, the process \(e^+e^-\rightarrow ha\rightarrow b{\bar{b}}\gamma \gamma \) for \(m_a=10\,\)GeV (upper left panel of Fig. 8). If \(C_{Zh}^\text {eff}/\Lambda =0.1\,\text {TeV}^{-1}\), the sensitivity reach in \(C_{\gamma \gamma }^\text {eff}/\Lambda \) extends down to \(\approx 10^{-5}\,\text {TeV}^{-1}\) irrespective of \(\text {Br}(a\rightarrow \gamma \gamma )\), as long as this branching ratio exceeds 1%. The reason for this behavior is that the total width of the ALP increases for smaller ALP branching ratios and therefore the lifetime decreases. Smaller ALP lifetimes lead to more ALP decays in the detector volume, canceling the effect of the reduced branching ratio near the lower boundary of the sensitivity region [25]. In order not to clutter the plots, we do not show the corresponding contours for CLIC.
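This cancellation can be sketched in natural units under two simplifying assumptions: the ALP is produced with a fixed boost factor \(\beta \gamma \), and the detector acceptance is described by a single length \(L_\text {det}\). The symbols \(\ell _a\) (laboratory-frame decay length) and \(N_\text {prod}\) (number of produced ALPs at fixed \(C_{Zh}^\text {eff}\)) are introduced here only for illustration:

$$\begin{aligned} \ell _a&= \frac{\beta \gamma }{\Gamma _a}\,, \qquad \Gamma _a = \frac{\Gamma (a\rightarrow \gamma \gamma )}{\text {Br}(a\rightarrow \gamma \gamma )}\,, \nonumber \\ N_\text {sig}&= N_\text {prod}\,\text {Br}(a\rightarrow \gamma \gamma )\,\Big (1-e^{-L_\text {det}/\ell _a}\Big ) \approx N_\text {prod}\,\text {Br}(a\rightarrow \gamma \gamma )\,\frac{L_\text {det}\,\Gamma _a}{\beta \gamma } = N_\text {prod}\,\frac{L_\text {det}\,\Gamma (a\rightarrow \gamma \gamma )}{\beta \gamma } \quad \text {for } \ell _a\gg L_\text {det}\,. \end{aligned}$$

In the long-lifetime limit the branching ratio drops out, so that the lower boundary in \(C_{\gamma \gamma }^\text {eff}\) is controlled by the partial width \(\Gamma (a\rightarrow \gamma \gamma )\) alone.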

Fig. 9

Projected exclusion contours for searches for \(e^+e^-\rightarrow \gamma a \rightarrow \gamma \ell ^+\ell ^-\) (left) and \(e^+e^-\rightarrow Z a \rightarrow Z_\text {vis}\ell ^+ \ell ^-\) (right) for future \(e^+e^-\) colliders, and \(\text {Br}(a\rightarrow \ell ^+ \ell ^-)=1\). The constraints from Fig. 5 are in the background. The sensitivity regions are based on 4 expected signal events

From now on, whenever ALP production and decay are governed by unrelated Wilson coefficients, we will use the graphical representation in Fig. 8.

A particularly interesting benchmark scenario is the model in which at tree-level the ALP only couples to charged leptons. In this case the production and decay are governed by the same parameter \(c_{\ell \ell }\). The ALP decays are dominated by \(\text {Br}(a\rightarrow e^+e^-)\approx 1\) for \(m_a< 2m_\mu \), \(\text {Br}(a\rightarrow \mu ^+\mu ^-)\approx 1\) for \(2m_\mu<m_a< 2m_\tau \), and \(\text {Br}(a\rightarrow \tau ^+\tau ^-)\approx 1\) for \(m_a> 2m_\tau \). Interestingly, the most relevant production mode at \(e^+e^-\) colliders is still the associated production with photons and Z bosons, which proceeds through the loop-induced Wilson coefficients [25]

$$\begin{aligned}&C_{\gamma \gamma }^\text {eff} = \frac{1}{16\pi ^2}\, c_{\ell \ell }\, \sum _{\ell =e, \mu , \tau }B_1(\tau _\ell ), \end{aligned}$$
(31)
$$\begin{aligned}&C_{\gamma Z}^\text {eff}=\frac{1}{16\pi ^2}\Big (s_w^2-\frac{1}{4}\Big ) \,c_{\ell \ell }\, \sum _{\ell =e, \mu , \tau } B_3(\tau _\ell , \tau _{\ell /Z})\approx \Big (s_w^2-\frac{1}{4}\Big )\,C^\text {eff}_{\gamma \gamma }, \end{aligned}$$
(32)

with \(\tau _\ell =4m_\ell ^2/m_a^2\), and \(\tau _{\ell /Z}=4m_\ell ^2/m_Z^2\). In the last step in the second equation we have neglected terms of order \(m_\ell ^2/m_Z^2\). Because of the anomaly equation, \(B_1(\tau _\ell )\approx 1\) for \(m_a>m_\ell \) and \(B_1(\tau _\ell )\approx -\frac{m_a^2}{12m_\ell ^2}\) for \(m_\ell \gg m_a\) and the relative size of the resonant production cross section and the associated ALP+\(\gamma \) production cross section is given by

$$\begin{aligned} \frac{\sigma (e^+e^-\rightarrow \gamma a)}{\sigma (e^+e^-\rightarrow a )}&= \frac{\alpha \,\alpha (s)^2}{12\pi ^2}\,N_{\ell }^2\,\frac{s^2}{\Gamma _a m_a m_e^2 }\bigg (1-\frac{m_a^2}{s}\bigg )^5 \nonumber \\&\approx 1.3\times 10^{11}\,\bigg [\frac{N_\ell }{3}\bigg ]^2\bigg [\frac{s}{\text {TeV}}\bigg ]^2 \bigg [\frac{\text {GeV}}{m_a}\bigg ] \bigg [\frac{\text {keV}}{\Gamma _a}\bigg ], \end{aligned}$$
(33)

where \(N_\ell \) denotes the number of charged leptons lighter than the ALP, and \(\Gamma _a\approx \) keV is a typical width for \(a\rightarrow \tau ^+\tau ^-\), assuming \(|c_{\ell \ell }|/\Lambda \approx 1/\text {TeV}\). For \(N_\ell <3\), the total width is reduced by \(m_\mu ^2/m_\tau ^2\), and the associated ALP\(+\gamma \) production is even more dominant. The ratio of the partial decay widths on the other hand is given by

$$\begin{aligned} \frac{\Gamma (a\rightarrow \ell \ell )}{\Gamma (a\rightarrow \gamma \gamma )}&\approx \frac{8 \pi ^2 m_\ell ^2}{\alpha ^2 m_a^2 N_\ell ^2} \approx 4.1\times 10^4\,\bigg [\frac{3}{N_\ell }\bigg ]^2\frac{4m_\ell ^2}{m_a^2} , \end{aligned}$$
(34)

with \(m_\ell \) the mass of the heaviest lepton into which the ALP can decay. For ALP masses below 720 GeV (2300 GeV) this ratio is larger than 1 (0.1), justifying the assumption of \(\text {Br}(a\rightarrow \ell ^+\ell ^-)=1\) for almost all of the relevant parameter space.
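The numbers quoted after (34) can be reproduced with a few lines of code. The sketch below evaluates the approximate ratio in (34) for \(N_\ell =3\) and the tau as the heaviest accessible lepton. The numerical inputs (low-energy value of \(\alpha \), tau mass) and the function names are assumptions made for this illustration.

```python
import math

ALPHA = 1 / 137.036   # fine-structure constant at low energies (assumed input)
M_TAU = 1.77686       # tau mass in GeV (assumed input)


def width_ratio(m_a, m_lep=M_TAU, n_lep=3, alpha=ALPHA):
    """Approximate Gamma(a -> l l) / Gamma(a -> gamma gamma) from Eq. (34).

    m_a and m_lep are in GeV; n_lep is the number of charged leptons
    lighter than the ALP.
    """
    return 8 * math.pi**2 * m_lep**2 / (alpha**2 * m_a**2 * n_lep**2)


def mass_for_ratio(ratio, m_lep=M_TAU, n_lep=3, alpha=ALPHA):
    """ALP mass in GeV at which the width ratio of Eq. (34) equals `ratio`."""
    return m_lep * math.sqrt(8 * math.pi**2 / (alpha**2 * n_lep**2 * ratio))


if __name__ == "__main__":
    # Coefficient multiplying (3/N_l)^2 * 4 m_l^2 / m_a^2 in Eq. (34)
    coefficient = 8 * math.pi**2 / (4 * 9 * ALPHA**2)
    print(f"coefficient      ~ {coefficient:.2e}")
    print(f"m_a (ratio = 1)   ~ {mass_for_ratio(1.0):.0f} GeV")
    print(f"m_a (ratio = 0.1) ~ {mass_for_ratio(0.1):.0f} GeV")
```

With these inputs the coefficient and the two masses agree with the quoted \(4.1\times 10^4\), 720 GeV and 2300 GeV at the level of rounding.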

We show projections for future \(e^+e^-\) colliders for flavor universal ALP-lepton couplings in Fig. 9. An increase in sensitivity occurs at the di-muon and di-tau thresholds. Note that while the advantage of a high-luminosity run on the Z-pole of the FCC-ee accounts for an increase in sensitivity on \(C_{\gamma \gamma }^\text {eff}\) of up to \(\sim 2.5\) orders of magnitude in Fig. 6, for purely leptonic ALP couplings the Z-pole run only increases the sensitivity by about one order of magnitude in \(e^+e^-\rightarrow \gamma a\), because the loop-induced Wilson coefficient \(C_{\gamma Z}^\mathrm{eff}\) is suppressed by the accidentally small vector coupling of the Z boson to charged leptons. CLIC can again constrain higher ALP masses.

Fig. 10

Parameter regions which can be probed in the decay \(h \rightarrow Z a\) with \(a \rightarrow \gamma \gamma \) (upper row) and \(h \rightarrow a a\) with \(a \rightarrow \gamma \gamma \) (lower row) at future \(e^+e^-\) colliders. The grey shaded area is excluded by LHC Higgs measurements. The dotted contours correspond to the sensitivity region of the FCC-ee for ALP branching ratios smaller than 1. The sensitivity regions are based on 4 expected signal events

Fig. 11

Parameter regions which can be probed in the decay \(h \rightarrow Z a\) with \(a \rightarrow \ell ^- \ell ^+\) (upper row) and \(h \rightarrow a a\) with \(a \rightarrow \ell ^+ \ell ^-\) (lower row) at future \(e^+e^-\) colliders. The grey shaded area is excluded by LHC Higgs measurements. The dotted contours correspond to the sensitivity region of the FCC-ee for ALP branching ratios smaller than 1. The sensitivity regions are based on 4 expected signal events

3.2.2 ALP production in exotic decays of on-shell Higgs bosons

Beyond searches for ALPs produced in association with a photon, a Z boson or a Higgs boson, ALPs can also be searched for in exotic Higgs decays. The Higgs production cross section at lepton colliders is typically at least one order of magnitude smaller compared to the LHC. This implies that lepton colliders are most powerful for light ALPs with dominant decay channels for which backgrounds at hadron colliders are large. In Fig. 10, we show the reach of the different stages of CLIC and the FCC-ee for ALPs produced in \(e^+e^- \rightarrow h+X \rightarrow a Z +X \rightarrow \gamma \gamma Z_\text {vis}+X\) and \(e^+e^- \rightarrow h + X\rightarrow a a + X\rightarrow 4\gamma +X\) for three different ALP masses \(m_a= 100\,\)MeV, \(1\,\)GeV and \(10\,\)GeV. We do not distinguish between vector-boson fusion and associated Higgs production and demand four signal events. In order to reconstruct the Higgs, we further demand the Z boson to originate from the Higgs decay as well as all Z bosons to decay into visible final states with \(\text {Br}(Z\rightarrow \text {visible})=0.8\) and \(\text {Br}(a\rightarrow \gamma \gamma )=1\). This condition can be relaxed if the electrons in ZZ-fusion or the additional Z in associated Higgs production are detected. Since the reach in searches for exotic Higgs decays is directly proportional to the number of Higgs bosons produced, high-luminosity machines lead to the best sensitivity. In Fig. 10 we further show the reach of the FCC-ee for different values of \(\text {Br}(a\rightarrow \gamma \gamma )=10^{-5}-10^{-1}\) given by the respective dotted lines. For leptonic ALP decays, the analogous plots are shown in Fig. 11, where, in contrast to Fig. 9, no connection between \(C_{ah}^\text {eff}\), \(C_{Zh}^\text {eff}\) and \(c_{\ell \ell }^\mathrm{eff}\) has been assumed.
CLIC has a larger reach than the FCC-ee for leptonic ALP decays due to the larger detector volume, \(L_\text {det}=0.6\,\)m at CLIC, compared to \(L_\text {det}=0.02\,\)m at the FCC-ee. Since \(C_{ah}^\text {eff}\) and \(C_{Zh}^\text {eff}\) are not controlled by the anomaly equation, the one-loop contribution from a tree-level \(c_{\ell \ell }^\mathrm{eff}\) coupling is proportional to \(m_\ell ^2/v^2\) [25]. The gray regions in Figs. 10 and 11 correspond to \(|C_{Zh}^\text {eff}| >0.72 \Lambda /\text {TeV}\) and \(|C_{ah}^\text {eff} | >1.34\,\Lambda ^2/\text {TeV}^2\) excluded by the current upper limit on \(\text {Br}(h \rightarrow \text {BSM})< 0.34\) (at 95% CL) [72].
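The correspondence between \(|C_{Zh}^\text {eff}|\) and \(\text {Br}(h\rightarrow Za)\) used in this section can be illustrated with a minimal sketch. It assumes that \(\Gamma (h\rightarrow Za)\) scales like \(|C_{Zh}^\text {eff}|^2\) with all phase-space factors held fixed (reasonable only for \(m_a\ll m_h-m_Z\)), and it infers the SM Higgs width from the two bounds quoted in the text. The function and constant names are hypothetical.

```python
# Sketch: Br(h -> Z a) as a function of |C_Zh^eff|, assuming
# Gamma(h -> Z a) is proportional to |C_Zh^eff|^2 (phase space held fixed).

GAMMA_BSM_MAX = 2.1   # MeV, bound on Gamma(h -> BSM) quoted in the text
BR_BSM_MAX = 0.34     # bound on Br(h -> BSM) quoted in the text
C_ZH_MAX = 0.72       # |C_Zh^eff| in units of Lambda/TeV saturating the bound

# SM Higgs width implied by Br = Gamma_BSM / (Gamma_SM + Gamma_BSM), about 4.1 MeV
GAMMA_SM = GAMMA_BSM_MAX * (1.0 / BR_BSM_MAX - 1.0)


def br_h_to_za(c_zh):
    """Branching ratio Br(h -> Z a) for a given |C_Zh^eff| (in Lambda/TeV)."""
    gamma_za = GAMMA_BSM_MAX * (c_zh / C_ZH_MAX) ** 2
    return gamma_za / (GAMMA_SM + gamma_za)


if __name__ == "__main__":
    for c_zh in (0.72, 0.1, 0.015):
        print(f"|C_Zh| = {c_zh:<6} Lambda/TeV -> Br(h -> Za) ~ {100 * br_h_to_za(c_zh):.3f}%")
```

Under these assumptions the three benchmark couplings give branching ratios close to the 34%, 1% and 0.02% quoted in the caption of Fig. 7.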

Fig. 12

Allowed regions in the parameter space of the Wilson coefficients \(C_{\gamma \gamma }^\mathrm{eff}-C_{\gamma Z}^\mathrm{eff}\) obtained from projections for the two-parameter global electroweak fit at 68% CL, 95% CL and 99% CL at FCC-ee (violet) and at 95% CL for the LHC at \(\sqrt{s}=14\,\)TeV (green). For the parameter space within the dashed black contour, the FCC-ee measurement of \(\alpha (m_Z)\) is within its projected errors at 95% CL [99]. The red dots represent the best fit points based on the current electroweak fit

3.2.3 Electroweak precision constraints on ALP couplings

Besides direct measurements, lepton colliders will be able to measure electroweak observables with unprecedented precision, which allows us to set bounds on the ALP contributions to these observables [25]. The measurement of the oblique parameters will improve current constraints by roughly one order of magnitude [100], while the running of the electromagnetic coupling constant, \(\alpha (m_Z)\), can be determined with an uncertainty of about \(10^{-5}\) [99]. In Fig. 12, we show the projected electroweak fit for the FCC-ee, where we assume the central values to correspond to the SM prediction, in the \(C_{\gamma \gamma }^\mathrm{eff}-C_{\gamma Z}^\mathrm{eff}\) plane at 68% , 95% and 99% CL (violet), together with the expected sensitivity of the LHC at \(\sqrt{s}=14\,\)TeV (green). Superimposed is the expected 95% CL bound derived from the measurement of \(\alpha (m_Z)\) (black dashed contour), assuming that the theoretical error on this quantity will have decreased below the experimental uncertainty by the time the measurement can be performed. In deriving these projections we have set the ALP mass to zero. By combining the future measurements of \(\alpha (m_Z)\) and of electroweak precision pseudo-observables one will be able to constrain \(|C_{\gamma \gamma }^\mathrm{eff}|/\Lambda \lesssim 2.5\,\)TeV\(^{-1}\) and \(|C_{\gamma Z}^\mathrm{eff}|/\Lambda \lesssim 1.5\,\)TeV\(^{-1}\) (at 95% CL). The current global fit has a slight tension with the SM prediction and the best fit point is at \((S,T)= (0.096, 0.111)\). If this effect is solely due to the ALP couplings \(C_{\gamma \gamma }^\mathrm{eff}\) and \(C_{\gamma Z}^\mathrm{eff}\), the corresponding best fit points are indicated by the red dots in Fig. 12. Such sizable coefficients are however strongly constrained by LHC searches for \(pp\rightarrow \gamma a\) and \(pp\rightarrow \gamma Z\).

3.3 ALP searches at future hadron colliders

Future hadron colliders can significantly surpass the reach of the LHC in searches for ALPs. In particular, searches for ALPs produced in exotic Higgs and Z decays profit from the higher center-of-mass energies and luminosities of the proposed high-energy LHC (HE-LHC), planned to replace the LHC in the LEP tunnel with \(\sqrt{s}=27 \,\)TeV, and the ambitious plans for a new generation of hadron colliders with \(\sqrt{s}=100\,\)TeV at CERN (FCC-hh) and in China (SPPC). At hadron colliders, ALP production in association with electroweak bosons suffers from large backgrounds. Previous studies of these processes have therefore focussed on invisibly decaying (or stable) ALPs, taking advantage of the missing-energy signature [34, 37]. In contrast, here we focus our attention on resonant ALP production in gluon-fusion and photon-fusion, as well as on ALPs produced in the decays of Z and Higgs bosons.

Fig. 13

Projected reach in searches for \(pp \rightarrow a \rightarrow \gamma \gamma \) with the LHC (green), HE-LHC (light green) and a \(100\,\)TeV collider (blue). Contours of constant branching ratios \(\text {Br}(a\rightarrow \gamma \gamma )\) are shown as dotted lines. The sensitivity regions are based on 100 expected signal events

3.3.1 Resonant ALP production

At hadron colliders ALPs can be produced resonantly in gluon-gluon fusion. A gluon coupling implies the presence of di-jet final states, which are hard to distinguish from the background for masses \(m_a < 1\) TeV. A more promising strategy is the search for di-photon events. Assuming non-vanishing couplings to photons and gluons, we show in Fig. 13 the sensitivity reach for the LHC, LHC\(_{27}\) and FCC-hh in the \(C_{GG}^\mathrm{eff}-C_{\gamma \gamma }^\mathrm{eff}\) plane. This reach is obtained by a rescaling of the constraint derived in the ATLAS analysis with \(39.6\,\)fb\(^{-1}\) of data [88]. The ALP production cross section is computed with MadGraph5 [77] and rescaled to N\(^3\)LO accuracy using the K factors \(K_{gg} = 2.7\) at \(m_a=200\) GeV, \(K_{gg} = 2.45\) at \(m_a=500\) GeV and \(K_{gg} = 2.35\) at \(m_a=1\) TeV [70].

3.3.2 ALP production in exotic decays of Z or Higgs bosons

In analogy with the LHC specifications, we demand that ALPs produced at pp colliders and decaying into photons decay inside the detector and before the electromagnetic calorimeter, \(L_\mathrm{det} = 1.5\,\)m, and that ALPs decaying into leptonic final states decay before they reach the inner tracker, \(L_\mathrm{det} = 2\,\)cm. Our sensitivity reach is defined by requiring at least 100 signal events. We use the reference cross sections \(\sigma (gg \rightarrow h) = 146.6\,\)pb [101] and \(\sigma (pp \rightarrow Z) = 118.76 \,\)nb at \(\sqrt{s} = 27\,\)TeV, computed at NNLO [75, 76]. At \(\sqrt{s} = 100\,\)TeV, the relevant cross sections are \(\sigma (gg \rightarrow h) = 802\,\)pb and \( \sigma (pp \rightarrow Z) = 0.4\,\mu \)b [102].
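The requirement that the ALP decays before reaching a given detector component can be illustrated with a simple exponential decay law. The following sketch is not the procedure used for the figures: it ignores the angular distribution of the ALP and the detector geometry, treating \(L_\text {det}\) as a distance along the flight direction. The function names and the example values for the width, mass and energy are hypothetical.

```python
import math

HBAR_C = 1.973269804e-16  # GeV * m, converts an inverse width into a length


def decay_length(gamma_a, m_a, energy_a):
    """Mean laboratory-frame decay length in metres.

    gamma_a: total ALP width in GeV; m_a, energy_a: ALP mass and energy in GeV.
    """
    if gamma_a <= 0 or m_a <= 0 or energy_a < m_a:
        raise ValueError("need gamma_a > 0, m_a > 0 and energy_a >= m_a")
    beta_gamma = math.sqrt(energy_a**2 - m_a**2) / m_a
    return beta_gamma * HBAR_C / gamma_a


def fraction_decaying_before(l_det, gamma_a, m_a, energy_a):
    """Fraction of ALPs decaying within a flight distance l_det (metres)."""
    ell = decay_length(gamma_a, m_a, energy_a)
    if ell == 0.0:          # ALP at rest decays at the production point
        return 1.0
    return -math.expm1(-l_det / ell)   # 1 - exp(-l_det / ell), numerically stable


if __name__ == "__main__":
    # Hypothetical example: 1 GeV ALP with 30 GeV energy and a width of 1e-15 GeV
    for l_det in (1.5, 0.02):   # calorimeter and inner-tracker distances in metres
        frac = fraction_decaying_before(l_det, 1e-15, 1.0, 30.0)
        print(f"L_det = {l_det} m -> fraction decaying in time: {frac:.4f}")
```

For long-lived ALPs the fraction reduces to \(L_\text {det}/\ell _a\), which is why the much shorter distance to the inner tracker costs sensitivity for leptonic decays.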

Fig. 14

Parameter regions which can be probed in the decay \(Z \rightarrow \gamma a\) with \(a \rightarrow \gamma \gamma \) at hadron colliders. The projected reach is colored green (LHC), light green (HE-LHC) and turquoise (FCC-hh). We assume \(\text {Br}(a\rightarrow \gamma \gamma )=1\). The sensitivity regions are based on 100 expected signal events

In Fig. 14 we show the reach of the LHC, the HE-LHC (LHC\(_{27}\)) and the FCC-hh in searches for \(pp \rightarrow Z \rightarrow \gamma a\rightarrow 3\gamma \), assuming as before that \(C_{WW}=0\) and \(\text {Br}(a\rightarrow \gamma \gamma )=1\). The reach of the HE-LHC extends beyond the reach of the LHC at \(\sqrt{s}=14\,\)TeV by a factor of about 3.2 assuming an integrated luminosity of \(15\,\)ab\(^{-1}\). Colliders with \(\sqrt{s}= 100\,\)TeV and \(20\,\)ab\(^{-1}\) can improve this reach by a factor of about 6.7 compared with the LHC. However, a high-luminosity run of an \(e^+e^-\) collider on the Z-pole, as for example proposed for the FCC-ee, can probe the same couplings with even higher precision, as becomes clear by comparing the left upper panel of Fig. 7 with Fig. 14.
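The relative size of the two improvement factors can be checked from the Z production cross sections given above. The sketch below assumes that the reach in the coupling scales like the inverse square root of the number of produced Z bosons (fixed event threshold, signal rate proportional to the coupling squared), and it takes an integrated luminosity of \(20\,\)ab\(^{-1}\) for the \(100\,\)TeV collider; acceptance and decay-length effects are ignored and all names are hypothetical.

```python
import math

NB_TO_AB = 1e9   # 1 nb = 1e9 ab, so sigma[nb] * L[ab^-1] * 1e9 gives event counts

# Cross sections from the text; the 100 TeV luminosity of 20 ab^-1 is an assumption
COLLIDERS = {
    "HE-LHC": {"sigma_z_nb": 118.76, "lumi_ab": 15.0},
    "FCC-hh": {"sigma_z_nb": 400.0, "lumi_ab": 20.0},   # 0.4 microbarn = 400 nb
}


def number_of_z(name):
    """Number of Z bosons produced at the given collider."""
    collider = COLLIDERS[name]
    return collider["sigma_z_nb"] * NB_TO_AB * collider["lumi_ab"]


def relative_reach(better, reference):
    """Improvement in coupling reach, assuming reach ~ 1/sqrt(N_Z)."""
    return math.sqrt(number_of_z(better) / number_of_z(reference))


if __name__ == "__main__":
    for name in COLLIDERS:
        print(f"{name}: N_Z ~ {number_of_z(name):.2e}")
    ratio = relative_reach("FCC-hh", "HE-LHC")
    print(f"FCC-hh vs HE-LHC reach: {ratio:.2f} (quoted factors: 6.7/3.2 = {6.7 / 3.2:.2f})")
```

The ratio obtained in this way is close to \(6.7/3.2\), consistent with the simple \(N_Z^{-1/2}\) scaling.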

Fig. 15

Projected reach in searches for \(h \rightarrow Za \rightarrow \ell ^+\ell ^-+2\gamma \) and \(h \rightarrow aa \rightarrow 4\gamma \) decays with the LHC (green), HE-LHC (light green) and a \(100\,\)TeV collider (blue). The parameter region with the solid contours corresponds to a branching ratio of \(\text {Br}(a\rightarrow \gamma \gamma )=1\), and the contours showing the reach for smaller branching ratios are dotted. The sensitivity regions are based on 100 expected signal events

Fig. 16

Projected reach in searches for \(h \rightarrow Za \rightarrow \ell ^+\ell ^-+\ell ^+\ell ^- \) and \(h \rightarrow aa \rightarrow 4\ell \) decays with the LHC (green), HE-LHC (light green) and a \(100\,\)TeV collider (blue). The parameter region with the solid contours corresponds to a branching ratio of \(\text {Br}(a\rightarrow \ell ^+\ell ^-)=1\), and the contours showing the reach for smaller branching ratios are dotted. The sensitivity regions are based on 100 expected signal events

Fig. 17

Left: geometric setup of the MATHUSLA surface detector above the ATLAS/CMS cavern together with a sketch of the \(pp\rightarrow h\rightarrow aZ\) process with a subsequent decay of the ALP in the MATHUSLA detector volume. Right: percentage of ALPs produced in the Higgs decay \(h\rightarrow aZ\) that decay within the ATLAS or CMS detector (green), percentage of ALPs decaying in ATLAS/CMS together with a leptonically decaying Z (dashed green), and percentage of ALPs decaying within the MATHUSLA detector volume (red). The gray area shows the distance between the interaction point and the electromagnetic calorimeter

The situation is different for the case of exotic Higgs decays, because the Higgs production cross sections at hadron colliders with \(\sqrt{s}=14-100\,\)TeV are larger by orders of magnitude compared to the proposed future lepton colliders. In Fig. 15, we display the reach for observing 100 events at the LHC, HE-LHC and FCC-hh for searches for \(pp\rightarrow h \rightarrow Za\rightarrow \ell ^+\ell ^-\gamma \gamma \) (upper panels) and \(pp\rightarrow h \rightarrow aa\rightarrow 4 \gamma \) (lower panels) for \(m_a= 100\,\)MeV, \(1\,\)GeV and \(10\,\)GeV and \(\text {Br}(a\rightarrow \gamma \gamma )=1\). We further indicate the reach obtained in the case that \(\text {Br}(a\rightarrow \gamma \gamma )<1\) by the dotted lines. Even though we rely on leptonic Z decays with \(\text {Br}(Z\rightarrow \ell ^+\ell ^-)=0.0673\) to account for the more challenging environment at hadron colliders, a future \(100\,\)TeV collider significantly improves beyond the projected reach in \(C_{Zh}^\mathrm{eff}\) and \(C_{ah}^\mathrm{eff}\) of the FCC-ee shown in Fig. 10. The sensitivity to \(C_{\gamma \gamma }^\text {eff}\), however, is comparable between the FCC-ee and FCC-hh, and the projections for searches for \(e^+e^-\rightarrow ha\rightarrow b{\bar{b}} \gamma \gamma \) at the second and third stage of CLIC even surpass the FCC-hh sensitivity in \(C_{\gamma \gamma }^\mathrm{eff}\). For all considered ALP masses, the \(h\rightarrow Z a\) decay could be observed at a \(100\,\)TeV collider for \(\text {Br}(a\rightarrow \gamma \gamma )\gtrsim 10^{-6}\) and the \(h\rightarrow a a\) decay could be fully reconstructed for \(\text {Br}(a\rightarrow \gamma \gamma )\gtrsim 0.01\).

The results are similar for leptonic ALP decays. In Fig. 16 we show the reach in the \(c_{\ell \ell }^\text {eff} - C_{Zh}^\text {eff}\) plane (upper row) and \(c_{\ell \ell }^\text {eff} - C_{ah}^\text {eff}\) plane (lower row). The results are again comparable with the projections for searches at future lepton colliders shown in Fig. 11.

3.4 Searches for ALPs with macroscopic lifetime

For small couplings and light ALPs produced in Higgs or Z decays, the ALP decay vertex can be considerably displaced from the production vertex. For ALPs still decaying in the detector volume, this secondary vertex can be used to further suppress backgrounds. Very long-lived ALPs, which leave the detector before they decay, only leave a trace of missing energy. A detector further away from the interaction point can detect the decay products of these ALPs and reconstruct the ALP mass and direction. Recent proposals include the MATHUSLA large-volume surface detector [58, 103] built above the ATLAS or CMS site at CERN, the Codex-B detector [57] built in a shielded part of the LHCb cavern, and a set of detectors called FASER [56] built along the beam line, \(\sim 150\,\)m and \(\sim 400\,\)m from the interaction point of ATLAS or CMS. Since long-lived ALPs are mostly produced in Higgs and Z decays at the LHC, we will consider the reach of the surface detector MATHUSLA for ALPs produced in the decays \(Z\rightarrow \gamma a\), \(h \rightarrow Z a\) and \(h \rightarrow aa\). We present projections for the sensitivity region for ALPs decaying into photons, muons and jets (gluons). Note that the possibility to detect photons with the MATHUSLA detector is an optional feature of the current design plan [103].

For MATHUSLA, it is impossible to detect both final state particles in \(h\rightarrow Za \) and \(Z\rightarrow \gamma a\) decays and highly unlikely to see both ALPs from \(h\rightarrow aa\) decays in the decay volume. However, because of the much lower background, single ALPs can be detected irrespective of their origin. The fraction of ALPs decaying in the MATHUSLA detector is then given by

$$\begin{aligned} f^a_\text {M}=\int _{\Omega _\text {M}} d\Omega \,\bigg (\frac{1}{\sigma }\,\frac{d\sigma }{d\Omega }\bigg )\, \left[ e^{-r_\text {in}(\Omega )/L_a}-e^{-r_\text {out}(\Omega )/L_a}\right] , \end{aligned}$$
(35)

where \(\Omega _\text {M}\) describes the area in solid angle covered by the MATHUSLA detector, \(d\sigma /d\Omega \) denotes the differential cross section for ALPs produced in the decay of a Z or Higgs boson in the laboratory frame, and \(L_a=p_a/(\Gamma _a m_a)\) is the ALP decay length, where \(p_a\) is the ALP momentum in that frame. At fixed solid angle, the radii \(r_\text {in}\) and \(r_\text {out}\) denote the distances between the interaction point and the points where the ALP line of flight enters and leaves the MATHUSLA detector. The MATHUSLA detector, with a volume of \(20\,\text {m} \times 200\,\text {m} \times 200\,\text {m}\), will be placed \(100\,\)m above the beam line and shifted by \(100\,\)m from the interaction point along the beam line. It has a considerably smaller coverage in solid angle: approximately \(5\%\) at MATHUSLA compared to \(100\%\) at ATLAS and CMS. Nevertheless, as Fig. 17 shows, for long-lived ALPs, the number of ALPs decaying in the MATHUSLA volume is comparable to the number of ALPs decaying within a radius of \(1.5\,\)m from the interaction point. Moreover, for ALPs with masses \(m_a>1\,\)GeV backgrounds at MATHUSLA are negligible, whereas at the LHC, for example for \(h \rightarrow Z a \) decays, the Z boson needs to be reconstructed and more events are required to distinguish the signal from the background. As in Sect. 3.3, we therefore demand at least 100 events with a leptonically decaying Z boson to determine the LHC reach, and at least 4 reconstructed ALP decays to determine the reach of MATHUSLA. In the left panel of Fig. 17 we illustrate the geometry of the proposed MATHUSLA experiment. The right panel shows the percentage of ALPs produced via \(pp\rightarrow h\rightarrow Z a\) that decay before reaching the electromagnetic calorimeter (green), the percentage of ALPs decaying within the detector together with a leptonically decaying Z boson (dashed green), and the percentage of ALPs decaying within the MATHUSLA detector volume (red) as a function of the ALP decay length. Taking into account the additional relative factor of \(\sim 1/20\) between the number of events we require to determine the reach of the LHC and MATHUSLA, the MATHUSLA detector performs significantly better than the LHC for ALPs with a decay length exceeding \(100\,\)m.
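The integrand of (35) can be illustrated for a single line of flight with a minimal Python sketch. The function names, the conversion constant, and the example distances (\(r_\text {in}=140\,\)m, \(r_\text {out}=170\,\)m) are assumptions chosen for illustration only; they are not taken from the MATHUSLA acceptance code, and the angular integration over \(\Omega _\text {M}\) is not performed here.

```python
import math

# hbar*c in GeV*m; converts a length in natural units (1/GeV) to metres.
HBAR_C_GEV_M = 1.973269804e-16


def decay_length(p_a, m_a, gamma_a):
    """Lab-frame decay length L_a = p_a / (Gamma_a * m_a), returned in metres.

    All inputs are in GeV (momentum, mass and total width of the ALP).
    """
    return HBAR_C_GEV_M * p_a / (gamma_a * m_a)


def decay_probability(r_in, r_out, l_a):
    """Probability that the ALP decays between r_in and r_out (metres)
    along one fixed line of flight: the bracket in Eq. (35)."""
    return math.exp(-r_in / l_a) - math.exp(-r_out / l_a)


if __name__ == "__main__":
    l_a = 1000.0  # hypothetical decay length in metres
    # Hypothetical line of flight crossing a distant detector volume.
    p_far = decay_probability(140.0, 170.0, l_a)
    # Decay within 1.5 m of the interaction point.
    p_near = decay_probability(0.0, 1.5, l_a)
    print(f"far detector: {p_far:.4f}, near detector: {p_near:.4f}")
```

Under these illustrative assumptions, the per-direction decay probability in the distant volume is more than ten times larger than that within \(1.5\,\)m of the interaction point, which roughly compensates a solid-angle coverage of about \(5\%\).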

Fig. 18

Projected reach in searches for \(h \rightarrow Za \rightarrow \ell ^+\ell ^-+2\gamma \) (top) and \(h \rightarrow aa \rightarrow 4\gamma \) (bottom) decays at the LHC (green) and MATHUSLA (red) with \(\sqrt{s}=14\,\)TeV center-of-mass energy and \(3\,\)ab\(^{-1}\) integrated luminosity. The parameter region with solid contours corresponds to a branching ratio of \(\text {Br}(a\rightarrow \gamma \gamma )=1\), and contours showing the reach for smaller branching ratios are dotted. The sensitivity regions are based on 4 (MATHUSLA) and 100 (LHC) expected signal events, respectively

Using (35), we can define the corresponding effective branching ratios for ALP decays in MATHUSLA in analogy with (29),

$$\begin{aligned}&\text {Br}(h\rightarrow Za\rightarrow Z \gamma \gamma )\big \vert _\text {eff}^\text {M}\nonumber \\&\quad =\text {Br}(h\rightarrow Za)\,\text {Br}(a\rightarrow \gamma \gamma )f_\text {M}^a, \end{aligned}$$
(36)
$$\begin{aligned}&\text {Br}(h\rightarrow aa\rightarrow a \gamma \gamma )\big \vert _\text {eff}^\text {M}\nonumber \\&\quad =2\text {Br}(h\rightarrow aa)\,\text {Br}(a\rightarrow \gamma \gamma )f_\text {M}^a, \end{aligned}$$
(37)
$$\begin{aligned}&\text {Br}(Z\rightarrow \gamma a\rightarrow 3\gamma )\big \vert _\text {eff}^\text {M}\nonumber \\&\quad =\text {Br}(Z\rightarrow \gamma a)\,\text {Br}(a\rightarrow \gamma \gamma )f_\text {M}^a\, . \end{aligned}$$
(38)

The expressions for ALP decays into leptons are analogous to those for the ALP decay into photons, with \(\text {Br}(a\rightarrow \gamma \gamma )\) replaced by \(\text {Br}(a\rightarrow \ell ^+\ell ^-)\). In order to fully capture the geometric acceptance of the MATHUSLA detector, we use MadGraph5 to simulate the signal events at parton level and the code provided by the MATHUSLA working group to compute the acceptance [103].
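How the effective branching ratios (36)–(38) translate into an expected number of events can be sketched as follows. This is a minimal illustration, not the procedure used for the figures: the function names are hypothetical, the production cross section is a free input to be supplied by the reader, and the event count is simply taken as cross section times integrated luminosity times effective branching ratio.

```python
# 1 ab^-1 corresponds to 10^6 pb^-1.
PB_INV_PER_AB_INV = 1.0e6


def br_eff_h_to_za(br_h_za, br_a_vis, f_m):
    """Eq. (36): Br(h -> Z a) * Br(a -> visible) * f_M^a."""
    return br_h_za * br_a_vis * f_m


def br_eff_h_to_aa(br_h_aa, br_a_vis, f_m):
    """Eq. (37): the factor 2 counts the two ALPs per Higgs decay."""
    return 2.0 * br_h_aa * br_a_vis * f_m


def br_eff_z_to_gamma_a(br_z_gamma_a, br_a_vis, f_m):
    """Eq. (38): Br(Z -> gamma a) * Br(a -> visible) * f_M^a."""
    return br_z_gamma_a * br_a_vis * f_m


def expected_events(sigma_pb, lumi_ab, br_eff):
    """Expected events for a production cross section in pb and an
    integrated luminosity in ab^-1 (both supplied by the caller)."""
    return sigma_pb * lumi_ab * PB_INV_PER_AB_INV * br_eff


def within_reach(n_events, n_required=4.0):
    """MATHUSLA criterion used in the text: at least 4 ALP decays."""
    return n_events >= n_required


if __name__ == "__main__":
    # Placeholder inputs for illustration only (not values from the paper).
    br_eff = br_eff_h_to_aa(br_h_aa=1e-3, br_a_vis=1.0, f_m=0.01)
    n = expected_events(sigma_pb=50.0, lumi_ab=3.0, br_eff=br_eff)
    print(n, within_reach(n))
```

For the LHC projections the same counting applies with the threshold raised to 100 events, that is `within_reach(n, n_required=100.0)`.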

We illustrate the reach of the LHC and the MATHUSLA detector for discovering ALPs decaying into photons from \(h \rightarrow Za\) (upper panels) and \(h \rightarrow a a\) (lower panels) decays in Fig. 18. For the green region with solid contours, the LHC would see 100 events with a branching ratio of \(\text {Br}(a\rightarrow \gamma \gamma )=1\). For smaller branching ratios, larger couplings \(|C^\text {eff}_{Zh}|\) and \(|C_{ah}^\mathrm{eff}|\) are required to obtain the same number of events. Dotted lines show the lower limit for \(\text {Br}(a\rightarrow \gamma \gamma )=0.1\) and \(\text {Br}(a\rightarrow \gamma \gamma )=0.01\). The red region with solid contours shows the parameter space for which 4 ALP decays are expected within the MATHUSLA detector volume for \(\text {Br}(a\rightarrow \gamma \gamma )=1\). Smaller branching ratios with constant partial width for ALP decays into photons imply a larger total decay width of the ALP and therefore smaller decay lengths. For \(\text {Br}(a\rightarrow \gamma \gamma )=0.1\) and \(\text {Br}(a\rightarrow \gamma \gamma )=0.01\), MATHUSLA therefore loses sensitivity for larger values of \(|C_{\gamma \gamma }^\mathrm{eff}|/\Lambda \). In the case of \(h \rightarrow a a\) decays, MATHUSLA will be able to probe smaller branching ratios than ATLAS and CMS. This underlines the complementarity between searches for prompt decays with ATLAS/CMS and searches for displaced ALP decays with MATHUSLA. We stress that a discovery of a resonance with MATHUSLA alone cannot be used to determine the production mode of the ALP. However, one can use the reconstructed mass of the ALP and the number of observed events to guide future searches at the LHC, for example searches for invisible ALPs in the final state.
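The dependence of the decay length on the branching ratio noted above can be made explicit in a short worked step. It is a sketch that assumes only the definition of \(L_a\) given below (35) and writes \(\Gamma (a\rightarrow \gamma \gamma )\) for the partial width into photons:

$$\begin{aligned} \Gamma _a=\frac{\Gamma (a\rightarrow \gamma \gamma )}{\text {Br}(a\rightarrow \gamma \gamma )} \quad \Rightarrow \quad L_a=\frac{p_a}{\Gamma _a m_a}=\frac{p_a}{m_a}\,\frac{\text {Br}(a\rightarrow \gamma \gamma )}{\Gamma (a\rightarrow \gamma \gamma )}\,. \end{aligned}$$

At fixed \(m_a\), \(p_a\) and \(|C_{\gamma \gamma }^\mathrm{eff}|/\Lambda \), reducing \(\text {Br}(a\rightarrow \gamma \gamma )\) from 1 to 0.1 therefore shortens the decay length by a factor of 10, so that fewer ALPs reach the MATHUSLA volume before decaying.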

Fig. 19

Projected reach in searches for \(h \rightarrow Za \rightarrow \ell ^+\ell ^-+\mu ^+\mu ^- \) (left) and \(h \rightarrow aa \rightarrow \mu ^+\mu ^-+\mu ^+\mu ^- \) (right) decays with ATLAS/CMS (green) and MATHUSLA (red) with \(\sqrt{s}=14\,\)TeV center-of-mass energy and \(3\,\)ab\(^{-1}\) integrated luminosity. The parameter region with solid contours corresponds to a branching ratio of \(\text {Br}(a\rightarrow \mu ^+\mu ^-)=1\), and contours showing the reach for smaller branching ratios are dotted. The sensitivity regions are based on 4 (MATHUSLA) and 100 (LHC) expected signal events, respectively

Fig. 20

Projected reach in searches for \(Z \rightarrow \gamma a \rightarrow 3\gamma \) with MATHUSLA for \(\sqrt{s}=14\,\)TeV center-of-mass energy, \(3\,\)ab\(^{-1}\) integrated luminosity and \(\text {Br}(a\rightarrow \gamma \gamma )=1\), together with the expected sensitivity of FASER taken from [104] and SHiP [20]. The sensitivity regions are based on 4 (MATHUSLA) and 100 (LHC) expected signal events, respectively

In Fig. 19, we show the reach in \(h \rightarrow Za\) and \(h \rightarrow aa\) decays for ALPs decaying into muons. Since at least approximate lepton-flavor universality is expected for the couplings of the ALP, the muon decay mode is particularly well motivated for \(2m_\mu< m_a < 2m_\tau \). Here too, MATHUSLA can probe much smaller couplings \(|c_{\mu \mu }^\mathrm{eff}|\) than the LHC.

In the case of \(Z\rightarrow \gamma a \) decays, we show the reach of MATHUSLA in the \(m_a - |C_{\gamma \gamma }^\mathrm{eff}|/\Lambda \) plane in Fig. 20, again assuming \(C_{WW}=0\). In principle, for non-vanishing \(C_{\gamma Z}\), searches for exotic Z decays with MATHUSLA compete with the reach of future beam-dump experiments such as SHiP [20]. However, for light ALPs, the reach shown in Fig. 20 is probably overestimated. Whether the MATHUSLA detector will be able to resolve photon pairs for \(m_a< 1\,\)GeV will depend on the angular resolution of the final detector proposal. Interestingly, FASER can take advantage of the large Primakoff cross section for photons producing ALPs through interaction with the detector material (\(\gamma N \rightarrow a N\)) in the forward region to set limits on \(C_{\gamma \gamma }^\mathrm{eff}\) independently [104]. The corresponding projected sensitivity reach of FASER is slightly better than that of MATHUSLA.

Fig. 21

Projected exclusion contours for searches for \(pp\rightarrow h \rightarrow Z a \) (left) and \(pp\rightarrow h \rightarrow aa \) (right) with the subsequent ALP decay \(a \rightarrow gg\) and \(\text {Br}(a\rightarrow gg)=1\) with the MATHUSLA detector. The different contours correspond to different values of \(C_{Zh}^\text {eff}\) and \(C_{ah}^\text {eff}\). The sensitivity regions are based on 4 expected signal events

A unique strength of surface detectors is the possibility to constrain hadronic ALP decays, whereas light ALPs (\(m_a< 500\,\)GeV) decaying into jets are hard to detect at the LHC because of the large QCD background. For ALPs produced in gluon fusion or through ALP-quark couplings, a sizable production cross section corresponds to couplings too large to produce any signal in the MATHUSLA detector. ALPs produced in resonant Higgs or Z decays can be detected in MATHUSLA by reconstructing di-jet (or multi-jet) events. Particularly well motivated are ALPs with couplings only to gluons, because in models addressing the strong CP problem the ALP-gluon coupling is the only ALP coupling that cannot be avoided. We show the parameter space for which at least four \(a\rightarrow jj\) events are expected within the MATHUSLA volume in the \(m_a-C_{GG}^\text {eff}\) plane in Fig. 21 for different values of \(C_{Zh}^\text {eff}\) (left) and \(C_{ah}^\text {eff}\) (right). The expected minimal mass resolution of the MATHUSLA detector for ALPs in Higgs decays is of the order of \(m_a\approx 100\,\)MeV, assuming a spatial resolution of \(1\,\)cm. In Fig. 21 the lowest ALP mass is \(m_a= 600\,\)MeV.

4 Conclusions

Any ultraviolet completion of the SM in which an approximate global symmetry is broken gives rise to pseudo-Nambu–Goldstone bosons, which are light with respect to the symmetry breaking scale, \(m_a \ll \Lambda \). The discovery of such ALPs at the LHC or future colliders could therefore be the first sign of a whole sector of new physics, and measuring their properties could reveal important hints about the UV theory.

We consider the most general effective Lagrangian including the leading operators in the \(1/\Lambda \) expansion that couple the ALP to SM particles. Whereas couplings to SM fermions and gauge bosons can arise at mass dimension-5, the Higgs portal only arises at dimension-6. We derive projections for the most promising ALP search channels for the LHC, its potential future high-energy upgrade, as well as a variety of possible future high-energy hadron and lepton colliders.

At lepton colliders, ALP production in association with a photon, a Z boson or a Higgs boson provides the dominant production processes, provided the ALP couplings to either hypercharge, \(SU(2)_L\) gauge bosons or to the Higgs boson are present in the Lagrangian. Even if only ALP-fermion couplings are present at tree level, ALP couplings to gauge bosons are generated at one-loop order through the anomaly equation. We point out that a high-luminosity run at the Z pole would significantly increase the sensitivity to ALPs produced in \(e^+e^-\rightarrow \gamma a\) with subsequent decays \(a\rightarrow \gamma \gamma \) or \(a\rightarrow \ell ^+\ell ^-\). This favors the FCC-ee proposal over CLIC in these particular searches, whereas CLIC, operating at \(\sqrt{s}=1.5\,\)TeV or \(\sqrt{s}=3\,\)TeV, can discover significantly heavier ALPs.

At hadron colliders ALPs can be produced copiously in gluon fusion and via exotic \(Z \rightarrow a \gamma \), \(h \rightarrow a Z\) and \(h \rightarrow a a\) decays. Searches for exotic Z decays at a future \(100\,\)TeV collider are less sensitive to ALP-photon couplings than a high-luminosity run of the FCC-ee at the Z pole. For the exotic Higgs decays \(h \rightarrow Z a \) and \(h \rightarrow a a\), the LHC at \(\sqrt{s}=14\,\)TeV and \(3\,\)ab\(^{-1}\) already provides a better reach than future \(e^+e^-\) colliders in the corresponding Wilson coefficients \(C_{ah}^\mathrm{eff}\) and \(C_{Zh}^\mathrm{eff}\). The sensitivity of a future \(100\,\)TeV collider in both \(C_{Zh}^\mathrm{eff}\) and \(C_{ah}^\mathrm{eff}\) is about an order of magnitude larger than at the LHC, and about a factor of 3 larger in the coefficients \(C_{\gamma \gamma }^\mathrm{eff}\) (for \(a\rightarrow \gamma \gamma \)) and \(c_{\ell \ell }^\mathrm{eff}\) (for \(a\rightarrow \ell ^+\ell ^-\)).

A future dedicated detector searching for long-lived particles at the LHC, such as MATHUSLA, FASER or Codex-B, could provide sensitivity to even smaller ALP couplings to photons, charged leptons or jets. MATHUSLA has unique capabilities to search for long-lived ALPs with a mean decay length of \(100\,\)m and more, corresponding to couplings 2–3 orders of magnitude smaller than the ones that can be probed with ATLAS and CMS. Such ALPs cannot be produced resonantly with a significant cross section, but large numbers of ALPs with small widths can be produced in exotic decays of Higgs or Z bosons. The main backgrounds at MATHUSLA are cosmic rays, allowing for a cleaner environment for observing ALPs in the \(\mathcal {O}(1)-\mathcal {O}(10)\,\)GeV range. This is particularly powerful for hadronically decaying ALPs: MATHUSLA provides the opportunity to constrain light ALPs decaying into jets, which are otherwise difficult to detect because of the large QCD background at hadron colliders.

Long-lived ALPs or ALPs that couple to dark matter [105] can also be searched for by cutting on missing energy. The focus of this paper is on ALPs that can be reconstructed from their decay products, but projections for searches for missing-energy signatures at the LHC with \(3000\,\)fb\(^{-1}\) have been presented in [37], and for a future TLEP and ILC with center-of-mass energies of \(240\,\)GeV and \(1\,\)TeV, respectively, in [34]. Since we demand the ALPs to decay within the detector for our projections, the part of the parameter space to which missing-energy searches are sensitive is largely complementary to the parameter space for which ALPs can be discovered by the searches discussed in this paper.