1 Introduction

The Higgs discovery has set spin zero particles in the spotlight of searches for beyond the Standard Model (BSM) physics. This may have been the first incursion into new territory: scalar and pseudoscalar particles – elementary or not – as heralds of new physics.

Extra spin zero particles are in fact proposed by candidate solutions to major and pressing problems in particle physics. For instance, an option to explain the nature of dark matter (DM) is a new scalar particle within a \(Z_2\) invariant setup. A different and outstanding example is the strong CP problem of QCD, for which the paradigmatic solution relies on an anomalous global U(1) symmetry which is spontaneously broken; the associated (pseudo-) Nambu–Goldstone boson, the axion, is in addition an optimal candidate to explain the DM of the universe. The original formulation, the so-called PQWW axion [1,2,3], is now disfavoured by data, while other popular constructions that deal with the so-called invisible axion, such as the DFSZ [4, 5] and the KSVZ [6, 7] models, are still viable solutions to the strong CP problem. The magnitude of the couplings of axions to ordinary matter is inversely proportional to the scale of U(1) spontaneous symmetry breaking, which is much higher than the electroweak scale in the latter, “invisible”, constructions. Lower axion scales are considered though in other implementations of the Peccei–Quinn solution to the strong CP problem [8,9,10,11,12,13,14,15,16,17,18,19,20,21,22].

Many other extensions of the Standard Model of Particle Physics (SM) feature one or several spontaneously broken global U(1) symmetries, thus predicting the existence of massless Nambu–Goldstone excitations whose couplings need not abide by the same stringent constraints of the original QCD axion: axion-like particles (ALPs). ALPs, if they get a small mass due to non-perturbative effects or other explicit symmetry breaking mechanism, are also good DM candidates and/or may affect the thermal evolution of the universe. The impact of ALPs, at both high and low energies, depends on their nature and on the type and strength of their couplings. In practice, the relevant generic characteristic of Nambu–Goldstone bosons is that they only enjoy derivative couplings, because of the underlying shift symmetry.

The ultimate nature of the Higgs particle itself is still at stake. Is this scalar elementary or composite? Should we accept the uncomfortable fine-tuning associated to the electroweak hierarchy problem as a feature of Nature or is there some dynamic explanation for it? Many of the efforts made in this direction are based on the search for symmetries which would justify a low Higgs mass. Two major frameworks are being considered: either linear realisations of electroweak symmetry breaking (EWSB), typical of weakly coupled new physics, as for instance in many supersymmetric models, or non-linear ones such as those in the so-called “composite Higgs models” and other constructions involving new strongly interacting physics. The model-independent way of formulating the ultimate exploration of the Higgs nature in low-energy data is provided by the use of effective Lagrangians: a “linear” expansion [23, 24] (often called SMEFT), in terms of towers of gauge invariant operators built out of SM fields and ordered by their mass dimension, is used when assuming linear realisations of EWSB, while “non-linear” expansions [25,26,27,28,29,30,31] – sometimes called “chiral” or HEFT – are the optimal instrument to treat regimes which are not necessarily weakly interacting. The non-linear formulation has the disadvantage of depending on a larger number of free parameters, while it has the advantage of being more general; in particular, it reduces to the SM Lagrangian in a particular limit.Footnote 1 The non-linear expansion does not presuppose that the Higgs particle at low energies belongs to an electroweak doublet, a crucial question to be explored in the years to come.

This paper explores the physics of an extra singlet scalar which is a CP-odd (pseudo-) Nambu–Goldstone boson. We will formulate in all generality its leading CP-invariant effective couplings to SM fields, which must be purely derivative couplings when its mass is neglected. This first – theoretical – part is general by nature and holds for ALP scales larger than the electroweak one (in the EWSB non-linear case also larger than its implicit BSM scale). While the dominant ALP interactions in the linear – SMEFT – expansion have been formulated long ago [33], the analogous analysis for the non-linear regime is missing and will be developed here. We will first concentrate on determining a complete basis of CP-even bosonic operators containing one ALP insertion; nevertheless, the fermionic operators are also derived in this paper, building a complete and non-redundant chiral set. The relation and differences between the dominant operators in both expansions – linear and chiral – will be subsequently discussed. It is interesting to note that all results to be obtained below apply as well to a different case: the complete basis of CP-odd derivative couplings of an hypothetical CP-even scalar (see also Ref. [34] for a generic CP-even scalar).

Up to now, most phenomenological ALPs analyses concentrated on their couplings to photons, gluons and/or quarks, as they dominate at low energies and determine astrophysical and cosmological constraints for very light ALPs. Nevertheless, ALPs may well show up first at colliders [35,36,37] or in rare mesonic decays [38, 39], and the \(SU(2)_L\times U(1)_Y\) invariant formulation of their interactions developed here provides new beautiful channels involving the electroweak gauge bosons and the Higgs particle. In the second – phenomenological – part of this work, the foreseen impact of those couplings at colliders and in particular at LHC will be analyzed for the first time, identifying the new signals and performing a detailed analysis of experimental bounds and prospects.Footnote 2 Unlike for the theoretical results, this search for (pseudo-) Nambu–Goldstone bosons at colliders implicitly assumes an overall ALP scale not far from the electroweak one, e.g. \({\mathcal {O}}\)(TeV), for observability.

The structure of the paper can easily be inferred from the Table of Contents.

2 The ALP linear Lagrangian

In linear realisations of EWSB with only SM fields at low energies, the leading-order (LO) effective Lagrangian is simply the SM one,

(1)

where \(\tilde{\Phi }=i\sigma ^2\Phi ^*\) and \(\mathbf{Y}_D\), \(\mathbf{Y}_U\) and \(\mathbf{Y}_E\) are \(3\times 3\) matrices in flavour space which encode the Yukawa couplings for down quarks, up quarks and charged leptons, respectively. Consider now an additional particle, singlet under the SM charges, which is a (pseudo-) Nambu–Goldstone boson of a spontaneously broken symmetry at energies higher than the electroweak scale v (set by the W mass). Neglecting its mass, its couplings would be pure derivative ones because of the underlying shift symmetry. Denoting by \(f_a\) the scale associated to the physics of this ALP particle a, insertions of the latter in effective operators will be weighted down by powers of \(a/f_a\). Focusing on interactions involving only one ALP, the next-to-leading-order (NLO) effective linear ALP Lagrangian has been determined long ago [33].

In this paper we mostly focus on the bosonic operators involving a, determining a complete and non-redundant set. For linear EWSB realisations the most general linear bosonic Lagrangian, including only the NLO corrections involving a, is given by

$$\begin{aligned} \mathscr {L}_{\mathrm{{eff}}}^{\mathrm{{linear}}}=\mathscr {L}^{\text {LO}}+\, \delta \mathscr {L}_a^{\mathrm{{bosonic}}}, \end{aligned}$$
(2)

where now the leading-order Lagrangian is the SM one plus the ALP kinetic term,

$$\begin{aligned} \mathscr {L}^{\text {LO}}=\mathscr {L}_{\mathrm{{SM}}}+\frac{1}{2}(\partial _\mu a)(\partial ^\mu a), \end{aligned}$$
(3)

while the NLO bosonic corrections are given by

$$\begin{aligned} \delta \mathscr {L}_a^{\mathrm{{bosonic}}}=c_{\tilde{W}}\mathcal {A}_{\tilde{W}} +c_{\tilde{B}}\mathcal {A}_{\tilde{B}}+c_{\tilde{G}}\mathcal {A}_{\tilde{G}}+c_{a\Phi }\mathbf {O}_{a\Phi }, \end{aligned}$$
(4)

with

$$\begin{aligned}&\mathcal {A}_{\tilde{B}} =-B_{\mu \nu }\tilde{B}^{\mu \nu }\dfrac{a}{f_a},\end{aligned}$$
(5)
$$\begin{aligned}&\mathcal {A}_{\tilde{W}} =-W_{\mu \nu }^a\tilde{W}^{a\mu \nu }\dfrac{a}{f_a},\end{aligned}$$
(6)
$$\begin{aligned}&\mathcal {A}_{\tilde{G}} =-G^a_{\mu \nu }\tilde{G}^{a\mu \nu }\dfrac{a}{f_a},\end{aligned}$$
(7)
$$\begin{aligned}&\mathbf {O}_{a\Phi }= i (\Phi ^\dag \overleftrightarrow {D}_\mu \Phi )\frac{\partial ^\mu a}{f_a}, \end{aligned}$$
(8)

and \(\tilde{X}^{\mu \nu }\equiv \frac{1}{2} \epsilon ^{\mu \nu \rho \sigma }X_{\rho \sigma }\). The action of the shift symmetry on the first three operators, \(a\rightarrow a+\alpha \), with \(\alpha \) constant, yields

$$\begin{aligned} \mathrm{Tr}[X_{\mu \nu }\tilde{X}^{\mu \nu }]\dfrac{a}{f_a}\equiv & {} \partial _\mu K_X^\mu \dfrac{a}{f_a}\rightarrow \partial _\mu K_X^\mu \dfrac{a+\alpha }{f_a}\nonumber \\= & {} -K_X^\mu \partial _\mu \dfrac{a}{f_a}+\dfrac{\alpha }{f_a}\partial _\mu K_X^\mu , \end{aligned}$$
(9)

and thus the corresponding associated current is anomalous as \(\delta \mathscr {L}= \dfrac{\alpha }{f_a}\partial _\mu K_X^\mu \,\). Even if this correction is a total derivative, in the case of \(\mathcal {A}_{\tilde{G}}\) the existence of instantonic configurations in the QCD Lagrangian implies that the action is modified because the integral of \(\partial _\mu K^\mu _G\) does not vanish (although a discrete version of the shift symmetry is preserved); it is nevertheless often added to the Lagrangian given its relevance for the case of the true QCD axion and the solution of the strong CP problem.Footnote 3

After electroweak symmetry breaking, \(\mathbf {O}_{a\Phi }\) induces a two-point function contribution, tantamount to a acting as an additional contribution to the longitudinal component of the electroweak gauge fields. An easy way of determining its impact on observables is to trade it for a fermionic vertex [33], either chirality conserving or chirality flipping, or a combination of them. For instance, the Higgs field redefinition

$$\begin{aligned} \Phi \rightarrow e^{ic_\Phi \, a/f_a}\Phi \end{aligned}$$
(10)

applied to the bosonic Lagrangian Eq. (2 3), induces a correction stemming from the Higgs kinetic energy term (see Eq. (1)) which cancels exactly \(\mathbf {O}_{a\Phi }\) up to \({\mathcal {O}}(a/f_a)\), while the Yukawa terms in that equation induce a new Yukawa–axion coupling for which \(\mathbf {O}_{a\Phi }\) can be entirely traded (see Appendix D for details and a general discussion of possible field redefinitions). The overall effect is thus the replacement in Eq. (4)

$$\begin{aligned} \mathbf {O}_{a\Phi } \longrightarrow \mathbf {O}^\psi _{a\Phi }, \end{aligned}$$
(11)

where

$$\begin{aligned} \mathbf {O}^\psi _{a\Phi }\equiv & {} i\left( \bar{Q}_L\,\mathbf{Y}_U\,{\tilde{\Phi }} u_R-\bar{Q}_L\,\mathbf{Y}_D\,\Phi d_R+\bar{L}_L\,\mathbf{Y}_E\,\Phi e_R\right) \frac{a}{f_a}\nonumber \\&+\, \mathrm{h.c.}, \end{aligned}$$
(12)

which exhibits a relative minus sign between the Yukawa-ALP type of interaction for up and down fermions. This coupling can then be written in a more compact way as

$$\begin{aligned} \mathbf {O}^{\psi }_{a\Phi }= i\,\frac{a}{f_{a}}\sum _{\psi =Q,\,L} \left( {\bar{\psi }}_{L}{\mathcal {\mathbf{Y}}}_{\psi }{\varvec{\Phi }}\sigma _3 \psi _{R}\right) + \mathrm{h.c.}, \end{aligned}$$
(13)

where \(Q_R\equiv \{u_R,d_R \} \, \, ( L_R\equiv \{0,e_R \} )\) – with \(\sigma _3\) acting on weak isospin space – and where the block matrices \(\mathcal {\mathbf{Y}}_\psi \) and \({\varvec{\Phi }}\) are defined by

$$\begin{aligned}&\mathcal {\mathbf{Y}}_Q \equiv \mathrm{diag}\left( \mathbf{Y}_U, \mathbf{Y}_{D}\right) ,\quad \mathcal {\mathbf{Y}}_L \equiv \mathrm{diag}\left( 0, \mathbf{Y}_{E}\right) ,\nonumber \\&\quad {\varvec{\Phi }}=\mathrm{diag}({\tilde{\Phi }},\Phi ). \end{aligned}$$
(14)

Alternatively, using the equations of motion of \(\mathscr {L}^{\text {LO}}\), \(\mathbf {O}_{a\Phi }\) could be entirely traded by a flavour-blind and chirality-conserving fermionic operator,

$$\begin{aligned} \mathbf {O}_{a\Phi }\longrightarrow -\frac{1}{2}\,\frac{\partial _\mu a}{f_a} \sum _{ \psi =Q,\,L} \left( \bar{\psi }\gamma _\mu \gamma _5 \sigma _3 \psi \right) + \mathrm{h.c.}, \end{aligned}$$
(15)

where again terms with more than one axion insertion have been neglected. In this work we choose to use the chirality-flipping version of the fermionic couplings, though. In summary, the expression for \(\delta \mathscr {L}_a^{\mathrm{{bosonic}}}\) to be used below reads

$$\begin{aligned} \delta \mathscr {L}_a^{\mathrm{{bosonic}}}= & {} c_{\tilde{W}}\mathcal {A}_{\tilde{W}} +c_{\tilde{B}}\mathcal {A}_{\tilde{B}}+c_{\tilde{G}}\mathcal {A}_{\tilde{G}} +c_{a\Phi }\mathbf {O}^\psi _{a\Phi }, \end{aligned}$$
(16)

with \(\mathbf {O}^\psi _{a\Phi }\) as defined in Eq. (13).

For completion, it is worth mentioning that when the complete NLO Lagrangian is considered in the linear case, additional fermionic operators are present. In fact the most general NLO ALP Lagrangian is given by [33, 40, 41]

$$\begin{aligned} \delta \mathscr {L}_a^\text {total}= & {} c_{\tilde{W}}\mathcal {A}_{\tilde{W}} +c_{\tilde{B}}\mathcal {A}_{\tilde{B}}+c_{\tilde{G}}\mathcal {A}_{\tilde{G}}\nonumber \\&+\,\frac{\partial _\mu a}{f_a}\sum _{\begin{array}{c} \psi =Q_L,\,Q_R, \\ \,L_L,\,L_R \end{array}} \bar{\psi }\gamma _\mu X_\psi \psi , \end{aligned}$$
(17)

where \(X_\Psi \) are \(3\times 3\) hermitian matrices in flavour space. The chirality-conserving operator in the last term of this equation could alternatively be traded using the equations of motion (EoM) by a chirality-flipping coupling:

$$\begin{aligned}&\frac{\partial _\mu a}{f_a}\sum _{\begin{array}{c} \psi =Q_L,\,Q_R, \\ \,L_L,\,L_R \end{array}} \bar{\psi }\gamma _\mu X_\psi \psi \, \longrightarrow \frac{ia}{f_a} \sum _{\psi = Q, L}{{\bar{\psi }}}_L {\varvec{\Phi }}\nonumber \\&\quad \times \left( X_{\psi _L} \mathbf{Y}_\psi - \mathbf {Y}_\psi X_{\psi _R} \right) \psi _R \, + \mathrm{h.c.}\end{aligned}$$
(18)

In this equation, the products \(X_{\psi _L} \mathbf{Y}_\psi \) and \(\mathbf {Y}_\psi X_{\psi _R}\) are completely generic matrices and in consequence, in the complete linear basis, operators of the type \(a\,{\bar{\psi }}_L {\varvec{\Phi }} \psi _R \) are not Yukawa suppressed. Note as well that it would be redundant to consider simultaneously a bosonic coupling such as \(\mathbf {O}_{a\Phi }\) in Eq. (8) and the general fermionic couplings in Eq. (17) or (18), as the effects of the former are already included in the flavour-blind components of the \(X_\psi \) matrices; see Eq. (15). In this paper, we concentrate on the thorough exploration of observables induced by the purely bosonic ALP couplings as expressed in Eq. (16), for the case of linear EWSB realisations.

2.1 Previous phenomenological bounds

The experimental bounds on the couplings of axions – and in general ALPs – to gluons, photons and fermions have been abundantly considered in the linear EWSB scenario (see e.g. [42,43,44,45,46,47,48,49,50,51,52,53]), including as well their impact at colliders for the case \(f_a\sim {\mathcal {O}}(\text {TeV})\) [35, 36]. Additionally, constraints on the linear coupling of the ALP to \(W^\pm \) gauge bosons have recently been obtained in Refs. [38, 39].

Coupling to photons Both \(\mathcal {A}_{\tilde{B}}\) and \(\mathcal {A}_{\tilde{W}}\) – Eqs. (5) and (6) – contribute to the interaction of the ALP with two photons,

$$\begin{aligned} \delta \mathscr {L}_a^{\text {bosonic}} \supset - \frac{1}{4}g_{a\gamma \gamma } \,a\,F_{\mu \nu }\tilde{F}^{\mu \nu }, \end{aligned}$$
(19)

where \(F_{\mu \nu }\) denotes the electromagnetic field strength, and the dimensionful coupling \(g_{a\gamma \gamma }\) is given by

$$\begin{aligned} g_{a\gamma \gamma } =\frac{4}{f_a}\left( c_{\tilde{B}}c_\theta ^2+c_{\tilde{W}}s_\theta ^2\right) , \end{aligned}$$
(20)

where \(c_\theta \) (\(s_\theta \)) denotes the cosinus (sinus) of the Weinberg angle. Bounds on \(g_{a\gamma \gamma }\) can be inferred as a function of the ALP mass \(m_a\) from various astrophysical constraints and low-energy data, which rely only on an hypothetical ALP–photon coupling and not on fermion–ALP interactions, as discussed e.g. in Ref. [35]. They enforce the combination \(|c_\theta ^2c_{\tilde{B}}+ s_\theta ^2 c_{\tilde{W}}|\) to cancel to one part in \(10^{3}\) (\(10^{8}\)) for \(m_a=1\,{\mathrm{MeV}\,\mathrm{(keV)}}\). Indeed, for \(m_a \simeq 1\,\mathrm{MeV}\) the best present constraint is set by beam dump experiments, \(g_{a\gamma \gamma } \lesssim 10^{-5}\,{\mathrm{GeV}^{-1}}\) [35, 43], that is,

$$\begin{aligned}&|c_{\tilde{B}}c_\theta ^2+c_{\tilde{W}}s_\theta ^2| \lesssim 0.0025 \,\left( \frac{f_a}{1\,\mathrm{TeV}} \right) \ (90\%\,\, \text {CL})\nonumber \\&\quad \text {for } m_a \le 1\,\mathrm{MeV}. \end{aligned}$$
(21)

For substantially lower masses astrophysical constraints may apply, e.g. for \(m_a=1\,\mathrm{keV}\) the combination of helioseismology, solar neutrino data observations [44] and horizontal branch stars data [45,46,47] results in \(g_{a\gamma \gamma }\lesssim 10^{-10}\,{\mathrm{GeV}^{-1}}\), that is,

$$\begin{aligned} |c_{\tilde{B}}c_\theta ^2+c_{\tilde{W}}s_\theta ^2| \lesssim 2.5 \times 10^{-8} \,\left( \frac{f_a}{1\,\mathrm{TeV}} \right) \quad \text {for } m_a\le 1\,\mathrm{keV}.\nonumber \\ \end{aligned}$$
(22)

These strong constraints on \(g_{a\gamma \gamma }\) could suggest that each of the two coefficients involved, \(c_{\tilde{B}}\) and \(c_{\tilde{W}}\), may be individually subject to bounds of the same order of magnitude. Nevertheless, often symmetry reasons force a given theory to produce couplings to photons much suppressed with respect to Z couplings. In any case, from the point of view of effective theory they are two independent degrees of freedom: the combination orthogonal to that in Eq. (20) should be probed and bounded independently. In practice, in most of the phenomenological analysis to be developed in this work the constraint

$$\begin{aligned} c_{\tilde{B}}= -t_\theta ^2 c_{\tilde{W}}\end{aligned}$$
(23)

will be systematically enforced.

Coupling to gluons In turn, the effective ALP–gluon \(g_{agg}\) coupling is analogously defined by

$$\begin{aligned} \delta \mathscr {L}_a^{\text {bosonic}} \supset - \frac{1}{4}g_{agg} \,a\,G_{\mu \nu }^a\tilde{G}^{a\mu \nu }, \end{aligned}$$
(24)

where \(G_{\mu \nu }\) denotes the QCD field strength. It receives contributions from the NLO effective operator \(\mathcal {A}_{\tilde{G}}\) in Eq. (7), where

$$\begin{aligned} g_{agg}=\frac{4}{f_a}\,c_{\tilde{G}} \end{aligned}$$
(25)

can be directly constrained at energies above the QCD scale \(\Lambda _{QCD}\) via axion–pion mixing effects, and also via mono-jet searches at hadron colliders.

Bounds on \({\mathrm {Br}}(K^+ \rightarrow \pi ^+ + {\mathrm {nothing}})\) [54] can be used to constrain the process \(K^+ \rightarrow \pi ^+ \,\pi ^0 \,(\pi ^0 \rightarrow a)\), where the pion–axion mixing arises through the anomalous coupling of mesons and of the axion to gluons [40, 55]. These bounds have been used to constrain \(f_a\) in contexts where the coupling of the ALP to gluons is only present due to the anomaly, i.e. where \(\mathcal {L}\supset \frac{\alpha _s}{8\pi }\frac{a}{f_a}G \tilde{G}\) (see, for example, Ref. [19]). They can be reinterpreted in terms of the generic ALP–gluon coupling, Eq. (24), yielding

$$\begin{aligned} g_{agg} \lesssim 1.1\times 10^{-5}\,{\mathrm{GeV}^{-1}} \ (90\%\,\, \text {CL}) \quad \text {for } m_a \lesssim 60\,\mathrm{MeV}.\nonumber \\ \end{aligned}$$
(26)

Slightly higher ALP masses have been considered at colliders, assuming only the coupling in Eq. (24). Limits of order

$$\begin{aligned} g_{agg} \lesssim 10^{-4}\,{\mathrm{GeV}^{-1}}\ (95\%\,\, \text {CL}) \quad \text {for } m_a \lesssim 0.1\,\mathrm{GeV}, \end{aligned}$$
(27)

were obtained [35] by recasting 8 TeV LHC analyses [48, 49].

Coupling to fermions Interesting bounds on ALP–fermion interactions can be obtained from several set of experimental data. For instance, considering those stemming from the purely bosonic operator \(\mathbf {O}_{a\Phi }\) – see Eq. (13) – or, in other words, the flavour-diagonal couplings in the last operator in Eq. (17) as expressed in Eq. (18) with \(X_{L,R}^{ij}=X_{L,R}^{ii}\delta ^{ij}\) and

$$\begin{aligned} g_{a\psi }=X_{L}-X_{R}, \end{aligned}$$
(28)

their contribution to the effective Lagrangian in the fermion mass basis reads

$$\begin{aligned} \delta \mathscr {L}_a^{\text {bosonic}} \supset \frac{ia}{f_a} \sum _{\psi =Q,\,L} g_{a\psi } m^{\text {diag}}_{\psi }\, {\bar{\psi }} \gamma _5 \psi \end{aligned}$$
(29)

where \(m^{\text {diag}}_\psi \) is the fermion mass matrix resulting from diagonalizing the product \(v \mathbf {Y}_\psi /\sqrt{2}\). The severity of the constraints on \(g_{a\psi }\) depends on the ALP mass range considered. The least constrained is the high-mass region, tested through rare meson decays and in DM direct detection searches (the latter being very model-dependent) [50]. The former provide bounds on ALP–fermion couplings below 10 GeV and in particular Beam Dump experiments (CHARM) constraints read [38, 56]:

$$\begin{aligned}&g_{a\psi }/f_a < (3.4\times 10^{-8} - 2.9\times 10^{-6}){\mathrm{GeV}^{-1}}\ (90\%\,\, \text {CL})\nonumber \\&\qquad \text {for } 1\,\mathrm{MeV}\lesssim m_a \lesssim 3\,\mathrm{GeV}. \end{aligned}$$
(30)

Lighter ALPs have been tested in axion searches in Xenon100 [52] through the axio-electric effect in liquid xenon (analogue of the photo-electric process with the absorption of an axion instead of a photon), bounding ALP couplings to electrons:

$$\begin{aligned} g_{ae}/f_a<1.5\times 10^{-8}\,{\mathrm{GeV}^{-1}}\ (90\%\,\, \text {CL})\quad \text {for } m_a<1\,\mathrm{keV}.\nonumber \\ \end{aligned}$$
(31)

Finally, the strongest bounds apply to very low ALP masses. They are inferred from high-precision photometry of the red giant branch of the colour–magnitude diagram for globular clusters [53]. Measurements of axionic recombination and de-excitation, Compton scattering and axion-bremsstrahlung set very strong bounds again on the coupling to electrons:

$$\begin{aligned} g_{ae}/f_a < 8.6\times 10^{-10}\,{\mathrm{GeV}^{-1}}\ (95\%\,\, \text {CL}) \quad \text {for } m_a \lesssim \mathrm{eV}.\nonumber \\ \end{aligned}$$
(32)

The above set of fermionic bounds could suggest to infer new limits on the coefficient of the linear bosonic operator \(\mathbf {O}_{a\Phi }\) of the bosonic linear ALP basis, Eq. (8), if considered by itself, via the equivalence discussed in Eqs. (10)–(13). This bound would depend on the ALP mass, and would be conservatively summarised in

$$\begin{aligned}&|c_{a\Phi }|/f_a<(3.4\times 10^{-8} - 2.9\times 10^{-6})\,{\mathrm{GeV}^{-1}}\ (90\%\,\, \text {CL})\nonumber \\&\qquad \text {for } m_a \lesssim 3\,\mathrm{{GeV}}, \end{aligned}$$
(33)

except for ALPs with masses in the \(1\,\mathrm{keV}\)\(1\,\mathrm{MeV}\) range, where the bounds from rare meson decay and DM searches are much weaker. Nevertheless, more than one effective operator can contribute to the rare processes under discussion and, in consequence, strictly speaking a bound can only be set on the corresponding combination of operators, see further below, in the same spirit that the bounds on \(a\gamma \gamma \) decay do not nullify simultaneously the two couplings in the set \(\{a_{\tilde{W}}, a_{\tilde{B}}\}\), but only a combination of them; see Eq. (23). For the time being, the value of \(c_{a\Phi }\) will be thus left free for further exploration below.

Axion-like particles are also appealing DM candidates and further bounds apply if such a hypothesis is considered. Indeed, heavy ALPs (in the GeV–TeV mass range) have largely been searched for at colliders as weakly interacting massive particles (WIMPs). However, the phenomenological analysis in this work will focus on a low-mass region with ALP masses below the MeV range; DM candidates in this range are known as weakly interacting slim particles (WISPs) and could be produced non-thermally through the misalignment mechanism [57,58,59,60,61]. ALP DM candidates capable of generating the correct relic abundance call for a large enough initial field value. Because of their (pseudo-) Nambu–Goldstone nature, these ALPs are the phase of a complex field and thus have field values limited to \(-\pi f_a< a(x) < \pi f_a\), implying that standard ALP CDM (cold DM) producing the correct relic density would require large ALP scales [62]: \(f_a \gtrsim 3.2\cdot 10^{10}\,{\mathrm{GeV}} (m_0 / \mathrm{eV})^{1/4}\) (smaller scale values cannot explain the totality of the relic abundance). In what follows, ALPs will not be required to account for the DM of the universe.

Coupling to massive vector bosons In contrast to the present constraints discussed above, the couplings of ALPs to the heavy SM bosons have been largely disregarded although they appear at NLO of the linear expansion, that is, at the same order as the pure photonic, gluonic and fermionic ALP couplings.

The associated signals stemming from the linear \(\delta \mathscr {L}_a^{\mathrm{{bosonic}}}\) in Eq. (16) are illustrated in the column on the right hand side of the Feynman rules detailed in Appendix B; they include in particular interaction vertices of the ALP with electroweak gauge bosons such as \(a\gamma Z\), aZZ, \(aW^+W^-\), \(a\gamma W^+W^-\) and \(aZW^+W^-\). Besides the collider signatures that will be presented in the phenomenological sections of this paper, rare decays provide an additional handle on the ALP couplings to massive vector bosons.

Consider the ALP–\(W^+W^-\) interaction defined by

$$\begin{aligned} \delta \mathscr {L}_a^{\text {bosonic}} \supset - \frac{1}{4}g_{a WW} \,a\,W_{\mu \nu }\tilde{W}^{\mu \nu }, \end{aligned}$$
(34)

which may induce flavour-changing rare meson decays via W exchange at one loop, and an ALP radiated from the W boson. \(\mathcal {A}_{\tilde{W}}\) contributes to such processes, with \(g_{a W W}=4 c_{\tilde{W}}/ f_a\). Upon considering the action of \(\mathcal {A}_{\tilde{W}}\) by itself, the same coupling induces the subsequent ALP decay into two photons. NA48/2, NA62 and Beam Dump experiments have been analysed in this context in Ref. [39], which extends to higher ALP masses the bounds in Eqs. (21) and (22) of Ref. [35], indicating a constraintFootnote 4

$$\begin{aligned} f_a/c_{\tilde{W}} \gtrsim 4-8000\,{\mathrm{TeV}},\quad \text {for } m_a < 500\,{\mathrm{MeV}}. \end{aligned}$$
(35)

Other limits have been obtained from the bounds on rare meson decays into invisible products, \(B \rightarrow K + a\) and \(K \rightarrow \pi + a\) with \(a\rightarrow \text {inv.}\). This is nevertheless at the price of assuming, in addition to \(\mathcal {A}_{\tilde{W}}\), the existence of some supplementary ALP decay channel into invisible sectors that furthermore is required to be largely dominant [39].

The bounds just discussed are precious and in particular the approach of having started considering just one operator at a time is a valid one. Nevertheless, with the discussed level of accuracy for \(c_{\tilde{W}}\) when considered just by itself, it may be pertinent to take into account the possible competing action of other specific ALP–SM couplings in the EFT, for instance those where the ALP would not be attached to the W boson but to the intermediate fermion in the loop. These stem from the fermionic couplings – in particular the top quark coupling – induced by the bosonic operator \(\mathbf {O}_{a\Phi }\) in Eq. (16), or from other ALP–fermion interactions such as the generic ones in Eq. (17) for the linear case. Indeed, like the analysis that lead to Eq. (23), from the point of view of the effective field theory in the linear EWSB framework, only combinations of the couplings in the set

$$\begin{aligned} \{c_{\tilde{W}}, c_{a \Phi },c_{\psi _i}\} \end{aligned}$$
(36)

can be strictly bound by such data, where \(c_{\psi _i}\) refers to the coefficients of the fermionic couplings in the complete NLO linear ALP Lagrangian Eq. (17) which are not tantamount to \(c_{a \Phi }\) via EoM.

In this paper we will explore the complementary information that the LHC can provide in various tree-level channels, e.g. mono-W, which are insensitive to the presence of the operator coefficient \(c_{a\Phi }\) but share with the rare-decay analyses the dependence on the linear operator coefficient \(c_{\tilde{W}}\). This complementarity is also manifest as the LHC has access to a larger kinematic range. Hence the breakdown of the ALP Effective Theory, and possible discovery of new physics, may be possible at the LHC but be hidden in physics at B-factories. For these reasons, in the phenomenological sections we will obtain LHC bounds on operators involved in tree-level ALP–W couplings (among others) and without the prejudice from rare decays. The combined impact at LHC of \(c_{\tilde{W}}\) and \(c_{a\Phi }\) plus general ALP–fermion couplings, as well as the impact of non-linear operators on rare decays is a subject for future work.

3 The bosonic chiral ALP Lagrangian

This section explores the leading effective couplings between an ALP and the SM fields, in the general framework of a non-linear (often referred to as chiral or HEFT) realisation of EWSB. The complete set of LO and NLO bosonic CP-even couplings involving one ALP will be determined (again, they could also be read as the complete bosonic set of derivative CP-odd couplings involving a CP-even singlet scalar). It will be assumed that the characteristic scale \(f_a\) associated to the Nambu–Goldstone boson origin of the ALP is at least of the same order of magnitude or larger than the cut-off of the BSM electroweak theory \(\Lambda \). The ALP scale and the electroweak BSM scale \(\Lambda \) will nevertheless be treated here as independent.

The chiral effective Lagrangian HEFT [25,26,27,28,29,30,31, 34, 63,64,65,66,67,68,69], which in the context of generic non-linear realisations of EWSB describes the interactions among SM gauge degrees of freedom, SM fermions and a light Higgs resonance, consists of all operators invariant under Lorenz and SM gauge symmetries and written in terms of the SM spectrum with the only exception of the Higgs doublet, whose four degrees of freedom are distributed in two separate sets. On the one side, a unitary matrix \(\mathbf {U}(x)\) describes only the three SM would-be Nambu–Goldstone bosons [25, 70,71,72] – that become the longitudinal components of the gauge bosons after EWSB. On the other side, the physical Higgs particle h is introduced as an independent field, a generic singlet of the SM with arbitrary couplings [25, 27,28,29, 73]. For particular values of the latter parameters and correlations of the operator coefficients the usual SMEFT linear formulation would be recovered [28, 31, 32, 34, 64,65,66,67,68,69, 74,75,76].

The HEFT building blocks can be chosen to be the gauge field strengths \(G_{\mu \nu }\), \(W_{\mu \nu }\) and \(B_{\mu \nu }\) plus two \(SU(2)_L\) covariant objects:

$$\begin{aligned} \mathbf {V}_\mu (x)\equiv \left( \mathbf {D}_\mu \mathbf {U}(x)\right) \mathbf {U}(x)^\dag ,\quad \mathbf {T}(x)\equiv \mathbf {U}(x)\sigma _3\mathbf {U}(x)^\dag , \end{aligned}$$
(37)

with

$$\begin{aligned} \mathbf {U}(x)=e^{i\sigma _a \pi ^a(x)/v}, \end{aligned}$$
(38)

where \(\pi ^a(x)\) denotes the longitudinal degrees of freedom of the gauge bosons and \(\sigma _a\) the Pauli matrices. In this notation, the covariant derivative reads

$$\begin{aligned} \mathbf {D}_\mu \mathbf {U}(x) \equiv \partial _\mu \mathbf {U}(x) +igW_{\mu }(x)\mathbf {U}(x) - \dfrac{ig'}{2} B_\mu (x) \mathbf {U}(x)\sigma _3.\nonumber \\ \end{aligned}$$
(39)

Under \(SU(2)_{L,R}\) global transformations (L, R, respectively), the objects defined above transform as

$$\begin{aligned}&\mathbf {U}(x) \rightarrow L\, \mathbf {U}(x) R^\dagger , \quad \mathbf {V}_\mu (x) \rightarrow L\, \mathbf {V}_\mu (x) L^\dagger ,\nonumber \\&\quad \mathbf {T}(x) \rightarrow L\, \mathbf {T}(x) L^\dagger . \end{aligned}$$
(40)

The physical Higgs particle h is then customarily introduced as a SM isosinglet via generic polynomial functions \(\mathcal{F}_i(h)\) [73] expanded in powers of h / v,

$$\begin{aligned} \mathcal{F}_i(h)=1+ a_i h/v + b_i (h/v)^2+\cdots , \end{aligned}$$
(41)

where \(a_i,b_i\ldots \) are constant coefficients. Finally, the SM fermions are often grouped into doublets of \(SU(2)_L\) and \(SU(2)_R\), \(Q_{L,R}\equiv (u_{L,R}, d_{L,R})\), \(L_L\equiv (\nu _L, e_L)\) and \(L_R\equiv (0, e_R)\). The notation chosen allows an easy identification of terms breaking the custodial symmetry \(SU(2)_C\) to which the global group \(SU(2)_L\times SU(2)_R\) gets broken after EWSB. \(SU(2)_C\) is explicitly broken by the gauging of the hypercharge \(U(1)_Y\) and by the heterogeneity of the fermion masses; insertions of the scalar chiral field \(\mathbf {T}(x)\), which is not invariant under transformations of the full \(SU(2)_R\), account for breaking of the custodial symmetry in the effective operators.

Table 1 Couplings resulting from the bosonic axion NLO linear coupling \(\mathbf {O}_{a\Phi }\) and from its LO chiral sibling \(\mathcal {A}_{2D}\), as formulated in the Lagrangians Eqs. (16) and (55), respectively. Only fermionic vertices survive as physical impact from \(\mathbf {O}_{a\Phi }\), as in the linear expansion higher orders (\(d\ge 7\)) are required for \(aZh^n\) (\(n\ne 1\)) couplings, while the latter are present in the chiral case at LO. For the complete Feynman rules see Appendix B

The task now consists in the generalisation of the HEFT Lagrangian to include insertions of derivatives of \(a/f_a\). This could be approached via the insertion in that Lagrangian of general polynomial functions of the SM singlet scalar a, \(\mathcal{F}_i(a/f_a)\), in analogy with the treatment given to the scalar h in the HEFT Lagrangian. After all, the \(\mathcal{F}_i(h/v)\) polynomials are reminiscent of the deformed exponential Nambu–Goldstone nature of the Higgs particle in some non-linear EWSB realisations, such as “composite Higgs” models [77,78,79]. From this point of view, to restrict below to terms with a single \(a(x)/f_a\) insertion is consistent with the assumption \(f_a\ge \Lambda \). In summary, the effective Lagrangian can be written as

$$\begin{aligned} \mathscr {L}_{\mathrm{{eff}}}^\text {chiral}=\mathscr {L}^{\text {LO}}+ \delta \mathscr {L}_a^{\mathrm{{bosonic}}}, \end{aligned}$$
(42)

where now the LO Lagrangian includes the usual HEFT LO terms plus two ALP-dependent terms,

$$\begin{aligned} \mathscr {L}^{\text {LO}}= \mathscr {L}^{\text {LO}}_{\text {HEFT}} + \mathscr {L}^{\text {LO}}_a \end{aligned}$$
(43)

with

(44)

where the dependence on x, as well as that on v of \(\mathcal{F}(h/v)\), has been left implicit for brevity. The first line in Eq. (44) accounts for the h and gauge boson kinetic terms, and a general scalar potential V(h). The first term in the second line describes the W and Z masses and their interactions with h, as well as the kinetic energy of their longitudinal components; the second term in this line is a custodial-breaking term that we will disregard in what follows, being phenomenologically extremely suppressed (for this reason sometimes it is included instead among the NLO chiral terms even if it is a two-derivative coupling). The fermion kinetic energy and Yukawa-like terms written in the mass eigenstate basis come next, with

$$\begin{aligned} \mathcal {Y}_{Q,L}(h)\equiv \mathbf{Y}_{Q,L}\mathcal{F}_{Q,L}(h), \end{aligned}$$
(45)

where \(\mathbf{Y}_{Q,L}\) are the \(6\times 6\) block-diagonal matrices containing the usual Yukawa couplings as defined in Eq. (14). This notation follows the assumption that the Yukawa-type fermion–h couplings are aligned with the fermion masses. Finally, the last line contains the usual QCD \(\theta \) term associated to the strong CP problem.

\(\mathscr {L}^{\text {LO}}_a\) contains two terms which are two-derivative couplings,

$$\begin{aligned} \mathscr {L}^{\text {LO}}_a=\frac{1}{2} (\partial _\mu a)(\partial ^\mu a)+c_{2D}\mathcal {A}_{2D}(h), \end{aligned}$$
(46)

where \(\mathcal {A}_{2D}(h)\) is a custodial-breaking two-derivative operator with mass dimension three,

$$\begin{aligned} \mathcal {A}_{2D}(h)=iv^2\mathrm{Tr}[\mathbf {T}\mathbf {V}_\mu ]\partial ^\mu \frac{a}{f_a}\mathcal {F}_{2D}(h). \end{aligned}$$
(47)

This operator appears then singled out at the LO in the chiral expansion, unlike the case of the linear expansion in which the only LO ALP term was the a kinetic energy; see Eq. (2) and Table 1. In other words, if the EWSB is non-linearly realised \(\mathcal {A}_{2D}(h)\) may well provide the dominant and distinctive signals. It induces a two-point function of the form \(Z^\mu \partial ^\mu a\) which contributes to the longitudinal component of the Z boson together with the usual would-be Nambu–Goldstone boson of the SM, and thus to the Z mass. Its impact is in this respect analogous to that of the two-point function stemming from the \(d=5\) NLO linear operator \(\mathbf {O}_{a\Phi }\); see Sect. 2 and Eq. (8). Nevertheless, it will be shown in Sects. 3.2 and 4 that \(\mathcal {A}_{2D}\) has additional physical consequences, distinct from those induced by \(\mathbf {O}_{a\Phi }\), as illustrated in Table 1.

A discussion of scales The normalisation of the operators in Eqs. (44)–(47) and in the NLO chiral corrections to be discussed below follows the Naive Dimensional Analysis (NDA) master formula for the HEFT Lagrangian as discussed in Refs. [80,81,82,83]. With this convention the gauge boson kinetic terms appear canonically normalised. In addition, the strongly interacting regime would correspond to operator coefficients of \({\sim }{\mathcal {O}}(1)\).

Furthermore, the mass parameter in front of several operators in Eqs. (44) and in Eq. (47) should be a generic scale f, which in specific models is that associated to a Nambu–Goldstone ancestry for the Higgs resonance (alike to \(f_\pi \) for QCD pions), such that \(\Lambda \le 4\pi f\) [80]. Instead, v – the electroweak scale – is shown as explicit mass parameter for bosons and fermions in Eqs. (44) and (47), with \(v< f\): this inequality is the well-known fine-tuning of the chiral electroweak Lagrangian, necessary to recover the correct scale of the gauge boson masses. It reflects as well the fine-tuning problems of specific “composite Higgs” scenarios. For consistency v has been then chosen as weight in all mass-related terms in those equations; for instance a factor of \(f^2/v^2\) is thus implicitly embedded in the definition of the coefficient \(c_{2D}\) in Eq. (46).

The same fine-tuning is at the origin of the \(\mathcal{F}_i(h)\) functions being customarily written as generic polynomials in h / v instead of h / f; see Eq. (41). It can be considered that in this parametrisation factors of v / f have been reabsorbed in the free parameters \(a_i\), \(b_i\), etc. in Eq. (41). Note as well that, in principle, a function \(\mathcal{F}_i(h)\) can be attached to any of the operators in Eqs. (44) and (46). However, those attachments can be redefined away in both Higgs and fermionic kinetic terms at the price of redefining \(\mathcal{F}_{Q,L}(h)\) [84] and \(\mathcal{F}_{2D}(h)\). Moreover, \(\mathcal {F}_i(h)\) insertions in the gauge bosons kinetic terms can be avoided assuming that the transverse components of the gauge fields do not couple at tree level to the Higgs sector, as has been explicitly shown in Refs. [65, 66] for composite Higgs models [77,78,79]. A similar assumption on the ALP sector prevents from writing terms of the type \(a X_{\mu \nu }\tilde{X}^{\mu \nu }\) at LO.

3.1 The NLO ALP operators

The complete list of HEFT CP-even bosonic operators at NLO is known [28, 31, 64] and will not be further discussed. We address here the NLO bosonic chiral interactions involving one insertion of \(a/f_a\), encoded in \(\delta \mathscr {L}_a^{\mathrm{{bosonic}}}\) in Eq. (42). The additional inclusion of fermionic couplings and the construction of a complete and non-redundant CP-even basis, which will turn out to be composed of a total of 32 – bosonic and fermionic – operator structures (including the LO axionic operator \(\mathcal {A}_{2D}\) and assuming one flavour), is deferred to Appendix A. The NLO Lagrangian \(\delta \mathscr {L}_a^{\mathrm{{bosonic}}}\) consists instead of 20 independent bosonic operator structures (disregarding in the counting the different coefficients inside the \(\mathcal{F}_i(h)\) functions),

$$\begin{aligned} \delta \mathscr {L}_a^{\text {bosonic}}=\sum _{X=\tilde{B},\tilde{W}, \tilde{G}}c_{X}\mathcal {A}_{X}+\sum _{i=1}^{17}c_i\mathcal {A}_i(h), \end{aligned}$$
(48)

where

$$\begin{aligned}&\left. \begin{array}{l} \mathcal {A}_{\tilde{B}} =-B_{\mu \nu }\tilde{B}^{\mu \nu }\dfrac{a}{f_a},\\ \mathcal {A}_{\tilde{W}} =-W_{\mu \nu }^a\tilde{W}^{a\mu \nu }\dfrac{a}{f_a},\\ \mathcal {A}_{\tilde{G}} =-G^a_{\mu \nu }\tilde{G}^{a\mu \nu }\dfrac{a}{f_a},\\ \mathcal {A}_1(h) = \dfrac{i}{4\pi } \tilde{B}_{\mu \nu } \mathrm{Tr}[\mathbf {T}\mathbf {V}^\mu ] \partial ^\nu \dfrac{a}{f_a} \,\mathcal {F}_1(h),\\ \mathcal {A}_2 (h)= \dfrac{i}{4\pi } \mathrm{Tr}[\tilde{W}_{\mu \nu }\mathbf {V}^\mu ] \partial ^\nu \dfrac{a}{f_a} \,\mathcal {F}_2(h),\\ \mathcal {A}_3(h)= \dfrac{1}{4\pi } B_{\mu \nu }\partial ^\mu \dfrac{a}{f_a}\partial ^\nu \,\mathcal {F}_3(h) .\\ \end{array}\right\} \text { Custodial symmetry preserving}\nonumber \\&\mathcal {A}_4 (h) = \frac{i}{(4\pi )^2} \mathrm{Tr}[\mathbf {V}_\mu \mathbf {V}_\nu ]\mathrm{Tr}[\mathbf {T}\mathbf {V}^\mu ]\partial ^\nu \frac{a}{f_a }\,\mathcal {F}_4(h),\nonumber \\&\mathcal {A}_5 (h) = \frac{i}{(4\pi )^2} \mathrm{Tr}[\mathbf {V}_\mu \mathbf {V}^\mu ]\mathrm{Tr}[\mathbf {T}\mathbf {V}^\nu ] \partial _\nu \frac{a}{f_a}\, \mathcal {F}_5(h),\nonumber \\&\mathcal {A}_6(h) = \frac{1}{4\pi } \mathrm{Tr}[\mathbf {T}[W_{\mu \nu },\mathbf {V}^\mu ]]\partial ^\nu \frac{a}{f_a} \,\mathcal {F}_6(h) , \nonumber \\&\mathcal {A}_7(h) = \frac{i}{4\pi } \mathrm{Tr}[\mathbf {T}\tilde{W}_{\mu \nu }]\mathrm{Tr}[\mathbf {T}\mathbf {V}^\mu ]\partial ^\nu \frac{a}{f_a} \,\mathcal {F}_7(h),\\&\mathcal {A}_8(h) = \frac{i}{(4\pi )^2} \mathrm{Tr}[[\mathbf {V}_\nu ,\mathbf {T}]\mathcal {D}_\mu \mathbf {V}^\mu ] \partial ^\nu \frac{a}{f_a} \,\mathcal {F}_8(h),\nonumber \\&\mathcal {A}_9(h) = \frac{i}{(4\pi )^2} \mathrm{Tr}[\mathbf {T}\mathbf {V}_\mu ]\mathrm{Tr}[\mathbf {T}\mathbf {V}^\mu ]\mathrm{Tr}[\mathbf {T}\mathbf {V}_\nu ]\partial ^\nu \frac{a}{f_a} \,\mathcal {F}_9(h) , \nonumber \\&\mathcal {A}_{10}(h) =\frac{1}{4\pi } \mathrm{Tr}[\mathbf {T}W_{\mu \nu }]\partial ^\mu \frac{a}{f_a} \partial ^\nu \,\mathcal {F}_{10}(h) , \nonumber \\&\mathcal {A}_{11}(h) = \frac{i}{(4\pi )^2} \,\mathrm{Tr}[\mathbf {T}\mathbf {V}_\mu ]\square \frac{a}{f_a}\partial ^\mu \,\mathcal {F}_{11}(h) , \nonumber \\&\mathcal {A}_{12}(h) = \frac{i}{(4\pi )^2} \,\mathrm{Tr}[\mathbf {T}\mathbf {V}_\mu ]\partial ^\mu \partial ^\nu \frac{a}{f_a}\partial _\nu \,\mathcal {F}_{12}(h) , \nonumber \\&\mathcal {A}_{13}(h) = \frac{i}{(4\pi )^2} \,\mathrm{Tr}[\mathbf {T}\mathbf {V}_\mu ]\partial ^\mu \frac{a}{f_a} \square \,\mathcal {F}_{13}(h) , \nonumber \\&\mathcal {A}_{14}(h) = \frac{i}{(4\pi )^2} \,\mathrm{Tr}[\mathbf {T}\mathbf {V}_\mu ]\partial _\nu \frac{a}{f_a}\partial ^\mu \partial ^\nu \mathcal {F}_{14}(h),\nonumber \\&\mathcal {A}_{15}(h) = \frac{i}{(4\pi )^2} \,\mathrm{Tr}[\mathbf {T}\mathbf {V}_\mu ]\partial ^\mu \frac{a}{f_a} \partial _\nu \,\mathcal {F}_{15}(h)\partial ^\nu \, \mathcal {F}_{15}^\prime (h),\nonumber \\&\mathcal {A}_{16}(h) = \frac{i}{(4\pi )^2} \,\mathrm{Tr}[\mathbf {T}\mathbf {V}_\mu ]\partial _\nu \frac{a}{f_a} \partial ^\mu \,\mathcal {F}_{16}(h)\partial ^\nu \,\mathcal {F}_{16}^\prime (h)\nonumber ,\\&\mathcal {A}_{17} (h) = \frac{i}{(4\pi )^2} \,\mathrm{Tr}[\mathbf {T}\mathbf {V}_\mu ]\partial ^\mu \frac{\square a}{f_a}\,\mathcal {F}_{17}(h).\nonumber \end{aligned}$$
(49)

The requirement that all ALP couplings respect a (continuous or discrete) shift symmetry prevents the insertion of \(\mathcal{F}_i(h)\) functions in the three first couplings in this list. The first block of six operators are those invariant under custodial symmetry, assuming as customary no sources of custodial symmetry breaking other than those present in the SM.

The “penalisation” of the operator coefficients by inverse powers of \(4\pi \) is a most conservative choice of their possible value, which reflects the NDA normalisation of the chiral sector [80,81,82,83] in which \(~{\mathcal {O}}(1)\) operator coefficients indicate the strong regime. A particular case is that of vertices involving one Higgs leg, for which the overall amplitude will be proportional in practice to the product

$$\begin{aligned} \tilde{a}_i\equiv c_i\,a_i; \end{aligned}$$
(50)

see Eq. (41). Given the f / v factor absorbed in the definition of \(a_i\), in the strong coupling limit \(\tilde{a}_i\) is expected to be somewhat smaller than 1 for all \(i\ne 2D\). Conversely, \(\tilde{a}_{2D}\) as defined here is expected to be larger than 1 by a factor \(~{\mathcal {O}}(f/v)\) in that limit; see the discussion at the end of Sect. 3. Analogous reasoning applies to vertices with more than one Higgs leg.

3.2 Two-point functions

The last NLO operator in Eq. (49), \(\mathcal {A}_{17}(h)\), introduces a Za two-point function alike to that from the LO coupling \(\mathcal {A}_{2D}(h)\), albeit with a higher momentum dependence. That is, both operators feed derivatives of the ALP field into the longitudinal components of the Z boson, in addition to the usual derivative of the SM would-be Nambu–Goldstone neutral field:

$$\begin{aligned}&c_{2D}\mathcal {A}_{2D}(h)+c_{17}\mathcal {A}_{17}(h) \nonumber \\&\quad \supset -\frac{i}{f_a}\, \mathrm{Tr}(\mathbf {T}\,(\partial _\mu \partial ^\mu \mathbf {U})\mathbf {U}^\dag )\, \left( c_{2 D}\,v^2\,a + \frac{c_{17}}{16\pi ^2}\, \Box a\right) \nonumber \\&\qquad +\, \frac{i}{2}\,g'\,B^\mu \, \left\{ v^2\mathrm{Tr}((\partial _\mu \mathbf {U})\,\tau _3\,\mathbf {U}^\dag -\mathbf {U}\,\tau _3 (\partial _\mu \mathbf {U}^\dag ))\right. \nonumber \\&\qquad \left. -\, \dfrac{2i}{f_a}\left[ c_{2D}\,v^2\,\partial _\mu a + \frac{c_{17}}{16\pi ^2}\,\partial _\mu (\Box a)\right] \right\} \nonumber \\&\qquad +\, \frac{i}{2}\,g\,W^i_\mu \left\{ v^2\mathrm{Tr}((\partial ^\mu \mathbf {U}^\dag )\tau ^i\mathbf {U}-\mathbf {U}^\dag \tau ^i (\partial ^\mu \mathbf {U}))\right. \nonumber \\&\qquad \left. +\, \frac{i}{f_a}\left[ c_{2D}\,v^2\,\partial _\mu a + \frac{c_{17}}{16\pi ^2}\,\partial _\mu (\Box a)\right] \mathrm{Tr}(\mathbf {T}\,\tau ^i)\right\} . \end{aligned}$$
(51)

The physical impact can be illustrated best via a field redefinition which trades completely this combination of two-point functions by interaction vertices, alike to the procedure applied to the linear operator \(\mathbf {O}_{a\Phi }\) in Sect. 2,

$$\begin{aligned} \mathbf {U}(x) \rightarrow \mathbf {U}(x) \,\exp \left\{ \frac{2i}{ f_a} \left( c_{2D}\, a(x)+c_{17}\frac{1}{16\pi ^2 v^2}\,\square a(x)\right) \sigma ^3\right\} ,\nonumber \\ \end{aligned}$$
(52)

which translates also in contributions to the definition of the gauge fixing terms, the mass term for the gauge bosons and the Yukawa couplings (see also Ref. [34] for a similar discussion in the context of CP-odd effective operators within non-linearly realised EWSB). The net physical impact is:

  • The introduction of new fermionic couplings, alike to those fully equivalent in the linear case to the bosonic operator \(\mathbf {O}_{a\Phi }\); see Eqs. (10)–(16).

  • The presence in addition of aZh and other vertices of the form \((Z_\mu \partial ^\mu a) h^n,\, n\ge 1\) interactions, which are not redefined away in the non-linear case. The reason is that the functional dependence on h of \(\mathcal{F}_i(h)\) differs generically from that characteristic of the linear regime (in powers of \((v+h)^2\)).

The purely bosonic couplings cannot be thus completely traded by fermionic ones in the generic case of non-linear EWSB. This is remarkable, as it implies that aZh couplings could be then expected among the dominant signals of ALPs, at variance with linear realisations in which they are only expected at NNLO (as argued in Sect. 4 below). This comparison is illustrated in Table 1.

The fermionic couplings, stemming from \(\mathcal {A}_{2D}(h)\) and \(\mathcal {A}_{17}(h)\) after the field redefinition discussed, will be denoted by \(\mathcal {A}^\psi _{2D}\) and \(\mathcal {A}^\psi _{17}\) and defined by

$$\begin{aligned} \mathcal {A}^\psi _{2D}= & {} -i\sqrt{2}v \, \frac{a}{f_a}\,\sum _{\psi =Q,L}\left( \bar{\psi }_{L}\mathcal {Y}_\psi (h) \mathbf {U}\sigma ^3 \psi _{R}\right) +\mathrm{h.c.},\nonumber \\ \mathcal {A}^\psi _{17}= & {} -\dfrac{i\sqrt{2}v}{16\pi ^2}\, \frac{\Box a}{v^2 f_a}\,\sum _{\psi =Q,L}\left( \bar{\psi }_{L}\mathcal {Y}_\psi (h) \mathbf {U}\sigma ^3 \psi _{R}\right) +\mathrm{h.c.},\nonumber \\ \end{aligned}$$
(53)

see Eq. (45) and Appendix D for details. These expressions are the non-linear equivalent of the linear interaction \(\mathbf {O}_{a \Phi }^\psi \) in Eq. (13). Alternatively, the part of \(\mathcal {A}_{2D}\) and \(\mathcal {A}_{17}\) that can be traded by fermionic couplings could be written as chirality-conserving transitions, e.g.

$$\begin{aligned} \mathcal {A}^\psi _{2D}\rightarrow & {} \frac{\partial _\mu a}{f_a}\sum _{ \psi =Q,\,L}\left( \bar{\psi }\gamma ^\mu \gamma _5\sigma ^3 \psi \right) \mathcal {F}_\psi (h),\nonumber \\ \mathcal {A}^\psi _{17}\rightarrow & {} \frac{1}{16\pi ^2 v^2}\left( \frac{\partial _\mu \Box a}{ f_a}\right) \sum _{ \psi =Q,\,L} \left( \bar{\psi }\gamma ^\mu \gamma _5\sigma ^3 \psi \right) \mathcal {F}_\psi (h),\nonumber \\ \end{aligned}$$
(54)

which are the chiral equivalent of Eq. (15). In this work, when analyzing the non-linear EWSB scenario we will use the formulation of chirality-flipping fermionic couplings in Eq. (53).Footnote 5

3.3 The bosonic chiral ALP basis

In summary, the resulting bosonic ALP Lagrangian up to NLO couplings can be written, after the redefinition in Eq. (52), as the sum of 23 terms, besides the kinetic term:

$$\begin{aligned} \mathscr {L}_a^\text {chiral}= & {} \frac{1}{2}(\partial _\mu a)(\partial ^\mu a)+c_{2D}\mathcal {A}'_{2D}(h) +\sum _{X=\tilde{B},\tilde{W},\tilde{G}}c_X \mathcal {A}_X\nonumber \\&+\sum _{i=1}^{16}c_i\mathcal {A}_i+c_{17}\mathcal {A}'_{17}(h)+\sum _{i=2D,17}c_{i}\mathcal {A}_i^{\psi }, \end{aligned}$$
(55)

where \(\mathcal {A}'_{2D}(h)\) and \(\mathcal {A}'_{17}(h)\) are defined as the operators \(\mathcal {A}_{2D}(h)\) and \(\mathcal {A}_{17}(h)\) without their h-independent terms, which have been traded instead by the fermionic \(\mathcal {A}_i^{\psi }\) couplings as defined in Eq. (53). The rest of the operators have been defined in Eq. (49). All Feynman rules stemming from \(\mathscr {L}_a^\text {chiral}\) can be found in Appendix B, up to four-leg interactions.

\(\mathcal {A}_{\tilde{B}}\), \(\mathcal {A}_{\tilde{W}}\), \(\mathcal {A}_{\tilde{G}}\) and \(\mathcal {A}_{2D}^\psi \) are identical to the operators found in the framework of the linear EWSB Lagrangian. In consequence, the bounds on ALP–photon and ALP–gluon vertices in Eqs. (21) and (22) apply. This would also hold, restricted to the indicated mass ranges, for the \(aW^+W^-\) coupling in Eq. (35), if \(\mathcal {A}_{\tilde{W}}\) was considered just by itself. Nevertheless, the caveats to that approach discussed in the linear case are even stronger here in the sense that the \(aW^+W^-\) couplings may receive contributions in the non-linear case from the set (see FR.4)

$$\begin{aligned} \{c_{\tilde{W}}, c_2, c_6, c_8\}. \end{aligned}$$
(56)

Analogously, the ALP–fermion vertices in Eqs. (30)–(33) would constrain the magnitude of \(\mathcal {A}_{2D}\), if the latter is taken just by itself, to

$$\begin{aligned}&|c_{2D}|/f_a<(1.7\times 10^{-8} - 1.4\times 10^{-6})\,{\mathrm{GeV}^{-1}}\ (90\%\,\, \text {CL})\nonumber \\&\quad \text {for } m_a \lesssim 3\,{\mathrm{GeV}}. \end{aligned}$$
(57)

Again, in the non-linear EWSB setup many other couplings may contribute in addition to rare meson decay processes than in the linear case, see FR.17–FR.19, to wit

$$\begin{aligned} \{c_{2D}, c_{\tilde{W}}, c_2, c_6, c_8, c_{17}, \{c_{\mathcal {B}^q_i}\}\}. \end{aligned}$$
(58)

In this ensemble, the subset \(\{c_{\mathcal {B}^q_i}\}\) of operator coefficients refers to the flavour-changing operators of the general ALP–fermion couplings \(\mathcal {B}^{q}_i\) in the complete Lagrangian, see Eq. (94), which can contribute either at tree level or at one loop via W, Z or h exchange, and thus on the same footing than for instance \(c_{\tilde{W}}\) or \(c_2\), \(c_6\), \(c_8\) and \(c_{17}\). Even if the data analysis was restricted for simplicity to bosonic couplings (the focus of this work), a six-dimensional parameter space would still remain, which means that a large freedom remains for the possible value of one given coupling. In consequence, consistently with the complementarity perspective, in the second – phenomenological – part of this work we will explore the independent impact that the bosonic non-linear operator coefficients in Eq. (58) may have on LHC signals, which they impact via a different combination than in rare decays. Those couplings will be thus considered there first one at a time and occasionally in some combinations.

4 Linear vs. non-linear expansions

The results in the previous sections on bosonic ALP–SM interactions uncovered a plethora of effective couplings in the bosonic sector of the chiral expansion, in contrast with the mere four operator structures of the linear one shown in Eq. (17), when both Lagrangians are considered up to NLO. All ALP couplings are NLO ones in the linear case, while one of the chiral set (\(\mathcal {A}_{2D}\)) stands out at LO.

Three operators are exactly the same in both expansions. They are those with an “anomalous-type” structure of the form \(a X_{\mu \nu } \tilde{X}^{\mu \nu }\), where \(X_{\mu \nu }\) stands for a SM field strength: \(\mathcal {A}_{\tilde{B}}, \mathcal {A}_{\tilde{W}}\) and \(\mathcal {A}_{\tilde{G}}\). The total number of independent interactions has to be equal in both expansions when all orders are considered, though. It is thus pertinent to identify which are the effective operators of the linear expansion that lead to the same interaction vertices than the chiral (up to) NLO couplings. This is accomplished in Appendix C, which identifies the linear siblings with mass dimension:

  • \(d=5\), corresponding to \(\mathcal {A}_{\tilde{B}}, \mathcal {A}_{\tilde{W}}\) and \(\mathcal {A}_{\tilde{G}}\) and to the fermionic couplings induced by \(\mathcal {A}_{2D}\) with no attached Higgs leg (these are identical in both expansions), as well as other fermionic vertices.

  • \(d=7\), corresponding to \(\mathcal {A}_1\)\(\mathcal {A}_6\), \(\mathcal {A}_8\), \(\mathcal {A}_{10}\)\(\mathcal {A}_{12}\) and \(\mathcal {A}_{15}\)\(\mathcal {A}_{17}\).

  • \(d=9\), corresponding to \(\mathcal {A}_7\), \(\mathcal {A}_{13}\) and \(\mathcal {A}_{14}\).

  • \(d=11\), corresponding to \(\mathcal {A}_{9}\).

Furthermore, the siblings of the vertices induced by \(\mathcal {A}_{2D}\) with one or more Higgs legs are linear effective operators with dimension \(d=7\) [85] or higher, depending on their Lorentz structure.

Common/distinctive phenomenological signals Interaction vertices predicted by both expansions include the well-known ALP–photon and ALP–gluon couplings, and in addition the yet mainly unexplored \(a\gamma Z\), aZZ, \(aW^+W^-\), \(a\gamma W^+W^-\) and \(aZW^+W^-\) signals.

Distinctive signals are those only present in the chiral EWSB Lagrangian at the order considered, which are: (i) extra ALP–gauge boson vertices \(a\gamma W^+W^-\), \(aZW^+W^-\) and aZZZ; (ii) ALP–Higgs interactions stemming from \(\mathcal {A}_{2D}\), which include \(a\gamma h\), aZh, \(a\gamma Zh\), aZZh, \(aW^+W^-h\), \(a\gamma h h\) and aZhh interactions, among others. All these signals are thus putatively important pointers of non-linear realisations of EWSB.

A natural question about the bosonic ALP–Higgs interactions is how come those \((Z_\mu \partial ^\mu a) h^n\) couplings with \(n\ge 1\) appear at LO in the non-linear expansion while they are instead very suppressed in the linear one, as after all the latter is a limit of the former. The gist lies in the generality of the \(\mathcal {F}_i(h)\) functions, and more specifically in the difference between \(\mathcal{F}_C(h)\) and \(\mathcal{F}_{2D}(h)\), see Eqs. (41), (44) and (47). Would those two functions be equal, as it happens in the linear expansion, all bosonic ALP vertices involving the Higgs would also be redefined away completely in the chiral expansion at LO and NLO. Furthermore, even if the difference between the \(a_i\), \(b_i\) etc. coefficients for those two \(\mathcal{F}_i(h)\) functions was considered to be qualitatively a NLO effect, all \((Z_\mu \partial ^\mu a) h^n,\, n\ge 1\) couplings would still be phenomenologically considered NLO effects, which means in any case higher strength expected than in linear realisations of EWSB (where they start to appear only at NNLO).

The phenomenology of the ALP couplings to heavy SM bosons will be explored in Sects. 6 and 7 below.

5 Assumptions and validity of the EFT

The theoretical results in the previous sections focussed on a generic Nambu–Goldstone boson, singlet under the SM, identifying all bosonic derivative couplings at LO in the linear and chiral expansions (a complete set including fermionic ones was also derived and for the chiral case they can be found in Appendix A). They hold independently of whether the – unknown – underlying global symmetry is exact or slightly and explicitly broken, that is, of whether the ALP is indeed exactly massless or not, as far as its mass is negligible compared to the typical momenta considered. A few considerations are nevertheless in order before moving to the phenomenological analysis of ALPs signatures at colliders.

Validity of the EFT For the effective Lagrangian description to be valid, the relevant suppression scale, in this case \(f_a\), must be significantly larger than the typical energy scale of the process under study. In order to strictly ensure the validity of the EFT, one should require \(\sqrt{\hat{s}} < f_a\) for each event (\(\sqrt{\hat{s}}\) corresponding to the invariant mass of the event). However, \(\sqrt{\hat{s}}\) is not experimentally observable in processes with invisible particles in the final state. In this case, the comparison to \(f_a\) may be naively performed using either the missing transverse energy of a given event or the transverse mass \(m_T\), defined as (in events characterised by the presence of a lepton and significant )

(59)

where \(p_T^{\ell }\) is the transverse momentum of the lepton and \(\phi \) is the azimuthal angle between the lepton and the missing transverse momentum vector (note that \(m_T\) encompasses contributions from both the visible and the invisible parts of the final state). We use these two variables in the analysis below, depending on the process, and require that the maximum values allowed for those variables obey

  • \(m_T^{\text {max}}<f_a\) for mono-W analyses (see Sect. 6.3), as the ATLAS search we reinterpret uses \(m_T\) as discriminating variable. \(m_T^{\text {max}}\) corresponds to the highest \(m_T\) data bin in a given analysis, for each value of \(f_a\) considered.

  • for all other processes analyzed. is the highest data bin in a given analysis for each value of \(f_a\) considered.

The effect of imposing the strict validity criterion \(\sqrt{\hat{s}} < f_a\) can be assessed through the correlation between and \(\sqrt{\hat{s}}\) for each analyzed signal, obtained from Monte Carlo. For binned analyses, the signal event fraction for which in different bins may then be discarded. We will explicitly use this procedure for the mono-W and mono-Z analyses in Sects. 6.3 and 7.1, and discuss the impact of the strict validity criterion on the bounds/sensitivities on \(f_a/c_i\) obtained from the rest of analyses.Footnote 6

On a different note, we stress that as the chiral expansion has an implicit BSM electroweak scale \(\Lambda \le 4\pi f\), there is an underlying assumption that \(f_a\ge \Lambda \). This \(\Lambda /f_a\) hierarchy sustains the choice of restraining the analysis to vertices involving only one ALP.

ALP stability at the LHC and its mass In the LHC phenomenological exploration to follow, it will be assumed that the ALP is stable on collider scales, thus escaping the detector as missing transverse energy . This further restricts the range of values of \(m_a\), \(f_a\), appropriate for the concrete numerical analysis below, given the various interactions of a that could allow its decay – see Eqs. (FR.1)–(FR.7) and (FR.17)–(FR.19) in Appendix B. The valid \(m_a\) range should be specified for a correct interpretation of the collider results: because of the assumed stability, all phenomenological results to be obtained below hold for ALP masses \(m_a\le 1\) MeV, without any additional assumption as regards which channels may be open. The ratio between the ALP mass \(m_a\) and \(f_a\) is then safely small, \(m_a/f_a\le \,\)MeV/TeV, for characteristic \(f_a\) scales of at least a few TeV.

For ALP masses above the MeV, the signals to be studied below may also be present even if the pattern is altered, accompanied by new ones which can be used to precisely test the couplings through which the ALP may decay within the detector (e.g. leptonic couplings).Footnote 7 This would require an extended dedicated study.

In this work, an ALP mass \(m_a\sim 1\,\mathrm{MeV}\) is used in the numerical simulations, light enough to avoid altogether \(a \rightarrow \ell ^{+}\ell ^{-}\) and \(a \rightarrow \nu \bar{\nu }\ell ^{+}\ell ^{-}\) decays. The decay channels which then remain a priori available are:

  • \({\underline{a\rightarrow \nu \bar{\nu }\nu \bar{\nu }}}\) As neutrinos are undetectable at the LHC, this decay does not have any impact on our phenomenological analysis. It would simply become part of the contributions.

  • \({\underline{a\rightarrow \gamma \gamma }}\) This decay is constrained by astrophysical observations, as detailed at the end of Sect. 2. The distance d covered in the laboratory frame by an ALP before decaying can be estimated as

    $$\begin{aligned} d = \tau \beta c = \frac{\hbar }{\Gamma (a)}\frac{|\vec {p}_a|}{m_a}\, c, \end{aligned}$$
    (60)

    where \(\tau \), \(\Gamma (a)\) and \(\vec {p}_a\) are, respectively, the proper lifetime, width and three-momentum of the a particle, and c denotes the speed of light. Restricting the width to \(\Gamma (a\rightarrow \gamma \gamma )\) and using the coupling strength \(g_{a\gamma \gamma }\) as defined in Eq. (19), it follows that

    $$\begin{aligned} d = \frac{16\pi \hbar c}{m_a^4}\frac{1}{g_{a\gamma \gamma }^2}|\vec {p}_a|, \end{aligned}$$
    (61)

    which can be rewritten as

    $$\begin{aligned} d\simeq 10^8 \left( \frac{\mathrm{MeV}}{m_a}\right) ^4 \left( \frac{10^{-5}{\mathrm{GeV}^{-1}}}{g_{a\gamma \gamma }}\right) ^2 \left( \frac{|p_a|}{\mathrm{GeV}}\right) {\mathrm{m}}. \end{aligned}$$
    (62)

    For \(m_a=1\,\mathrm{MeV}\), given the experimental constraint (see Eqs. (20)–(21)), it results

    $$\begin{aligned} d> 4 \times 10^8\,{\mathrm{m}} \times \left( \frac{|\vec {p}_a|}{\mathrm{GeV}}\right) . \end{aligned}$$
    (63)

    The ALP momentum \(|\vec {p}_a|\) is typically of the order of the missing energy of the candidate signals, selected imposing a minimum cut, which for instance using ATLAS and CMS data is \(\gtrsim \mathcal{O}( 100)\) GeV. Thus, within the allowed range for \(g_{a\gamma \gamma }\) and , the ALP always covers an enormous distance – many orders of magnitude larger than the LHC detectors size (\(\sim 10\,{\mathrm{m}}\)) – before decaying into two photons. For lighter ALPs, the situation is even safer given the inverse quartic dependence of d with \(m_a\). ALP masses above the MeV range and up to hundreds of MeV could be considered without risking two-photon ALP decay in the data analyzedFootnote 8 by raising the minimum cut imposed on data, but this would open the \(e^+\,e^-\) leptonic decay channels.

  • \({\underline{a\rightarrow \gamma \nu \bar{\nu }}}\) Analogously, this process does not affect the stability of the ALP particle at the LHC. It could be mediated by the ALP–Z\(\gamma \) interaction parametrised by \(g_{aZ\gamma }\),

    $$\begin{aligned} \delta \mathscr {L}_a \supset - \frac{1}{4}g_{aZ\gamma } \,a\,F_{\mu \nu }\tilde{Z}^{\mu \nu }, \end{aligned}$$
    (64)

    where \(Z^{\mu \nu }\) denotes the Z-boson field strength. The decay width shows a very strong dependence on the mass of the ALP, due to a peculiar cancellation occurring in the phase-space integration. In the limit \(m_a\ll m_Z\) (and neglecting the Z boson width for simplicity) we find

    $$\begin{aligned} \Gamma (a\rightarrow \gamma \nu \bar{\nu })= & {} \frac{g^2\,g_{aZ\gamma }^2\,m_Z^3}{1024\, (2\pi )^3\,c_{\theta }^2}\nonumber \\&\times \left( \frac{13}{20}\frac{m_a^7}{m_Z^7} + {\mathcal {O}}(m_a^{9}/m_Z^{9})\right) . \end{aligned}$$
    (65)

For \(m_a=1\,\mathrm{MeV}\), this corresponds to a distance covered by the ALP before decaying

$$\begin{aligned} d\simeq 10^{22}\,{\mathrm{m}} \times \left( \frac{|\vec {p}_a|/g_{aZ\gamma }^2}{{\mathrm{GeV}^3}}\right) > 3.3 \cdot 10^{27}\,{\mathrm{m}} \times \left( \frac{|\vec {p}_a|}{\mathrm{GeV}}\right) ,\nonumber \\ \end{aligned}$$
(66)

where on the last inequality the constraint on \(g_{aZ\gamma }\) derived further below has been used (see Eq. (70)).

Table 2 Couplings contributing to the observables considered here, stemming from the purely bosonic operators in the linear and non-linear scenarios. The block of new constraints explores the sensitivity of LEP and present LHC data to different operators; see Sect. 6. The last block corresponds instead to the sensitivity analysis from Sect. 7, which assumes both LHC prospects with 300 fb\(^{-1}\) of data and projections to the HL-LHC phase with 3000 fb\(^{-1}\) of data. The operator coefficients to which present or expected measurements are found to be sensitive appear in bold

6 Phenomenological analysis I: new bounds

In this section we derive new constraints on the operator coefficients using LEP and LHC Run I and II data. Table 2 summarises the observables/processes which are sensitive to the various effective operator coefficients, to be considered in this and the next section.

Unless otherwise specified, we will consider the effect of one operator at a time. Note that the dependence of the signal cross section \(\sigma \) or partial width \(\Gamma \) on an operator coefficient \(c_i\) is \((c_i/f_a)^2\), hence the ratio \(c_i/f_a\) is the relevant combination of parameters throughout the analysis.

For the operator coefficients we will use the notation of the chiral expansion, as its couplings outnumber and include those of the linear expansion – see Sect. 3. Whenever pertinent, the applicability of a given bound or a sensitivity prospect to both expansions will be specified. Special attention will be paid overall to the comparison between the expectations based on the linear and non-linear effective Lagrangians.

Fig. 1
figure 1

Left Constraints on the parameters \(c_{\tilde{B}}/f_a\) and \(c_{\tilde{W}}/f_a\) derived from the tree-level bounds on the combinations \(g_{a\gamma \gamma }\) (y-axis) and \(g_{aZ\gamma }\) (x-axis) defined in Eqs. (20) and (68). The hatched (solid) region is obtained with the benchmark mass \(m_a\simeq 1\,\mathrm{MeV\, (keV)}\). The different colours show how the allowed region is shifted in the non-linear setup, depending on the parameter \(c_{127} = \frac{g}{4\pi }(2c_1+t_\theta (c_2+2c_7))\). The value \(c_{127}=0.2\) is about maximal, as it is obtained fixing \(c_1=c_2=c_7=1\), typical of the strongly interacting regime. The linear case corresponds to \(c_{127}=0\). Right The rotated figure shows only the region allowed for \(m_a=1\,\mathrm{MeV}\)

6.1 ALP coupling to Z-photon

In the non-linear expansion, the effective \(aZ\gamma \) coupling

$$\begin{aligned} \delta \mathscr {L}_a \supset - \frac{1}{4}g_{aZ\gamma } \,a\,Z_{\mu \nu }\tilde{F}^{\mu \nu }\, \end{aligned}$$
(67)

takes the form:

$$\begin{aligned} g_{aZ\gamma } =f_a^{-1}\left[ 4s_{2\theta }(c_{\tilde{W}}-c_{\tilde{B}})+\frac{g}{4\pi }\left( 2c_1+t_\theta (c_2+2c_7)\right) \right] ,\nonumber \\ \end{aligned}$$
(68)

with the custodial-preserving limit recovered for \(c_7=0\) and the linear limit at NLO recovered for \(c_1=c_2=c_7=0\). This interaction can be constrained from various sets of experimental data:

  • The uncertainty on the Z boson width [42], \(\Gamma (Z\rightarrow \text {BSM}) \lesssim 2\,\mathrm{MeV}\) at \(95\%\,\, \text {CL}\), allows one to set a conservative bound on the process \(Z\rightarrow a\gamma \). The latter would contribute to the Z width as

    $$\begin{aligned} \Gamma (Z\rightarrow a \gamma )=\frac{M_Z^3}{384\pi }\, g_{aZ\gamma }^2\left( 1-\frac{m_a^2}{M_Z^2}\right) ^3. \end{aligned}$$
    (69)

    In consequence, we use for the first time the Z boson width to obtain a bound on this coupling, constraining the combination of coefficients in Eq. (68) within the limit (with basically no dependence on \(m_a\) for \(m_a\lesssim 1\,\mathrm{GeV}\))

    $$\begin{aligned} |g_{aZ\gamma }|<1.8\,{\mathrm{TeV}^{-1}}\quad (95\%\,\, \text {CL}). \end{aligned}$$
    (70)
  • LEP limits on \(Z \rightarrow 3 \gamma \) [36] constrain the product \(g_{a\gamma \gamma } \, g_{aZ\gamma }\). However, given the bounds on \(g_{a\gamma \gamma }\) reported at the end of Sect. 2, the inferred bound on \(g_{aZ\gamma }\) is weaker than that in Eq. (70).

As illustrated in Fig. 1, the Z boson width is able to probe regions in the parameter space orthogonal to those tested by \(g_{a\gamma \gamma }\). In the linear EWSB setup, those two bounds constrain \(c_{\tilde{B}}/f_a\) and \(c_{\tilde{W}}/f_a\) to take values within a limited area: imposing Eq. (23) leads to \(|c_{\tilde{W}}/f_a|< 0.42\,{\mathrm{TeV}^{-1}}\). In the non-linear EWSB case, that region can be shifted depending on the value taken by the combination \(c_{127}\equiv \frac{g}{4\pi }(2c_1+t_\theta (c_2+2c_7))\), as shown in Fig. 1. Overall, the constraints on the quantities \(c_i/f_a\) are of order \({\mathrm{TeV}^{-1}}\) and thus correspond to a loose \({\mathcal {O}}(1)\) bound on the coefficients \(c_i\) for \(f_a=1\,\mathrm{TeV}\).

6.2 ALP coupling to Z-Higgs: non-standard Higgs decays

As shown in Sect. 3.2 and Appendix D, the presence of the coupling aZh is a characteristic feature of the non-linear effective Lagrangian, as in the linear expansion it would only be expected at NNLO. We propose here for the first time to use non-standard Higgs channels to bind couplings of the ALP to the Higgs particle. Consider a range of ALP masses such that it allows the Higgs particle to decay into Za. The presence of non-standard decay modes of the Higgs is constrained by ATLAS and CMS global fits to Higgs signal strengths. Current constraints on the Higgs non-standard branching fraction \({{\mathrm{Br}}}(h\rightarrow \mathrm{BSM})\) from LHC 7 and 8 TeV data yield [87]

$$\begin{aligned} {{\mathrm{Br}}}(h\rightarrow \mathrm{BSM}) = \frac{\Gamma _\text {BSM}}{\Gamma _\text {BSM}+\Gamma _{\mathrm{{SM}}}} \le 0.34 \quad (95\%\,\, \text {CL}), \end{aligned}$$
(71)

where the SM Higgs width is \(\Gamma _{\mathrm{{SM}}} = (4.07\pm 0.16)\,\mathrm{MeV}\) [88] and \(\Gamma _\text {BSM}\) denotes the non-standard Higgs partial width stemming in this case from the presence of the ALP,

$$\begin{aligned} \Gamma _\text {BSM}= & {} \Gamma _{h \rightarrow aZ} + \Gamma _{h \rightarrow aZ\gamma } + \Gamma _{h \rightarrow a\,\text {f}\bar{\text {f}}}. \end{aligned}$$
(72)

The interaction vertices contributing to \(\Gamma _{h \rightarrow aZ}\), \(\Gamma _{h \rightarrow a\,\text {f}\bar{\text {f}}}\) and \(\Gamma _{h \rightarrow aZ\gamma }\) are shown in FR.7, FR.20–FR.22 and FR.14 in Appendix B, respectively. The last two terms are three-body phase-space suppressed and yield negligible contributions to the Higgs total width;Footnote 9 they will be then discarded in what follows. Using then \(\Gamma _\text {BSM} \simeq \Gamma _{h \rightarrow aZ}\) in Eq. (71) yields the present bound

$$\begin{aligned} \Gamma _{h \rightarrow aZ} < 2.1\,\mathrm{MeV} \quad (95\%\,\, \text {CL}). \end{aligned}$$
(73)

\(\Gamma _{h \rightarrow aZ}\) receives contributions from the chiral LO operator \(\mathcal {A}_{2D}\), Eq. (47), and from several NLO ones in Eq. (49),

$$\begin{aligned} \Gamma _{h \rightarrow aZ}= & {} \frac{m_h^7}{1024\pi ^5 v^4 f_a^2} \left( \left( 1-\frac{m_a^2}{m_h^2}-\frac{m_Z^2}{m_h^2}\right) ^2 -\frac{4m_a^2m_Z^2}{m_h^4}\right) ^{{3}/{2}}\nonumber \\&\times \left( \kappa _h+\kappa _Z \frac{m_Z^2}{m_h^2}+\kappa _a \frac{m_a^2}{m_h^2} \right) ^2\nonumber \\\simeq & {} \frac{m_h^7}{1024\pi ^5 v^4 f_a^2}\left( 1-\frac{m_Z^2}{m_h^2}\right) ^3 \left( \kappa _h+\kappa _Z \frac{m_Z^2}{m_h^2}\right) ^2\nonumber \\&+\,{\mathcal {O}}(m_a/m_h), \end{aligned}$$
(74)

with

$$\begin{aligned} \kappa _h= & {} \tilde{a}_{13}+\frac{1}{2}(\tilde{a}_{12}-\tilde{a}_{14})-\frac{2\pi s_{2\theta }}{e} (\tilde{a}_3s_\theta -\tilde{a}_{10}c_\theta )\nonumber \\&-\,16\pi ^2\tilde{a}_{2D}\frac{v^2}{m_h^2},\nonumber \\ \kappa _Z= & {} -\frac{1}{2}(\tilde{a}_{12}-\tilde{a}_{14}),\nonumber \\ \kappa _a= & {} \tilde{a}_{17}-\tilde{a}_{11}+\frac{1}{2}(\tilde{a}_{12}-\tilde{a}_{14}) +\frac{2\pi s_{2\theta }}{e}(\tilde{a}_3s_\theta -\tilde{a}_{10}c_\theta ),\nonumber \\ \end{aligned}$$
(75)

and where the coefficients \(\tilde{a}_i\) for the couplings involving one Higgs leg have been defined in Eq. (50). The bound in Eq. (73) translates into the constraint

$$\begin{aligned}&\dfrac{1}{f_a} \left| \kappa _h+\dfrac{m_Z^2}{m_h^2} \kappa _Z +\dfrac{m_a^2}{m_h^2} \kappa _a\right| \lesssim 0.22\,{\mathrm{GeV}^{-1}} \,\nonumber \\&\quad \longrightarrow \frac{f_a}{\tilde{a}_{2D}} \gtrsim 2.78\,\mathrm{TeV}\quad \text {for } m_a \lesssim 34\,\mathrm{GeV}, \end{aligned}$$
(76)

where we use the fact that the inequality on the left is generically dominated by the \(\tilde{a}_{2D}\) contribution, as it enters weighted by a large factor. If the constraint in Eq. (57) is considered, the impact of \(\mathcal {A}_{2D}\) on \(h \rightarrow aZ\) decay is negligible for ALP masses below 3 GeV, and in consequence the bound in Eq. (73) would apply to the combination of \(\tilde{a}_3\) and \(\tilde{a}_{10}\). However, present LHC sensitivity does not allow one to constrain these operators.

The above limits are expected to improve significantly at the high-luminosity phase of LHC (HL-LHC). For example, Ref. [89] estimates that a bound

$$\begin{aligned} {{\mathrm{Br}}}(h\rightarrow \mathrm{BSM}) \le 0.1\quad \,\, (95\%\,\, \text {CL}), \end{aligned}$$
(77)

will be reached for \(3000\,{\mathrm{fb}^{-1}}\) of data at \(\sqrt{s}=14\,\mathrm{TeV}\) (neglecting here theoretical uncertainties). This would roughly translate into a sensitivity \(\Gamma _{h \rightarrow aZ}^{3\,{\mathrm{ab}^{-1}}} \lesssim 0.45\,\mathrm{MeV}\) (\(f_a/\tilde{a}_{2D} \gtrsim 6\,\mathrm{TeV}\) for the case in Eq. (76)).

An alternative approach to tackle \(\Gamma _{h \rightarrow aZ}\) is to use the constraints from direct searches for invisible Higgs decays, since \(h \rightarrow a Z\) yields an invisible Higgs decay for \(Z\rightarrow \nu \bar{\nu }\). Current experimental searches by ATLAS [90, 91] and CMS [92] constrain the branching ratio for Higgs decay into invisible states \({{\mathrm{Br}}}(h\rightarrow \text {inv})\) to [91]

$$\begin{aligned} {{\mathrm{Br}}}(h\rightarrow \text {inv}) < 0.23 \quad \,\, (95\%\,\, \text {CL}). \end{aligned}$$
(78)

Nevertheless, no constraint on \(\Gamma _{h \rightarrow aZ}\) follows from the present bound, since \({{\mathrm{Br}}}(Z \rightarrow \nu \bar{\nu }) = 0.2 \pm 0.006\) [93]. In the future, given the improvement on the sensitivity to \({{\mathrm{Br}}}(h\rightarrow \text {inv})\) foreseen at HL-LHC with \(3000\,{\mathrm{fb}^{-1}}\) of data at \(\sqrt{s}=14\,{\mathrm{TeV}}\) [94],

$$\begin{aligned} {{\mathrm{Br}}}(h\rightarrow \text {inv}) < 0.08\quad \,\, (95\%\,\, \text {CL}), \end{aligned}$$
(79)

direct searches of the invisible decays of the Higgs resonance may be sensitive to \(\Gamma _{h \rightarrow aZ}\). Indeed, in the ALP scenarios under discussion

$$\begin{aligned} {{\mathrm{Br}}}(h\rightarrow \text {inv})\simeq \frac{\Gamma _{h \rightarrow a Z} \times {{\mathrm{Br}}}(Z \rightarrow \nu \bar{\nu })}{\Gamma _{h \rightarrow a Z}+\Gamma _{\mathrm{{SM}}}}, \end{aligned}$$
(80)

and in consequence, barring a positive signal in future data, Eq. (79) may translate into \(\Gamma _{h \rightarrow aZ}^{3\,{\mathrm{ab}^{-1}}} \lesssim 2.71\,\mathrm{MeV}\), setting new limits on the operator coefficients participating in this decay. This expected sensitivity is, however, weaker than the present bound obtained from global fits to Higgs signal strengths in Eq. (75), and the latter will be used in the remainder of the paper.

6.3 Mono-W and mono-Z searches at \(\sqrt{s}=13\,\mathrm{TeV}\)

We now study the production of a in association with a W and a Z boson, as illustrated in Fig. 2. Since the ALP escapes the LHC detectors as missing transverse energy , this yields, respectively, “mono-W” [95] and “mono-Z” [96,97,98,99,100] signatures. Both channels are being currently searched for by the ATLAS and CMS experimental collaborations. In this section we use their studies from public Run II data to set limits on the presence of different ALP effective operators that contribute to these signals.

Fig. 2
figure 2

Feynman diagrams contributing to mono-W and mono-Z production

Fig. 3
figure 3

Transverse mass \(m_T\) distribution for \(a\,W^{\pm }\) (\(W^{\pm } \rightarrow \ell ^{\pm } \nu _{\ell }\)) production in the final state (left) and final state (right), generated from \(\mathcal {A}_{\tilde{W}}\) (green), \(\mathcal {A}_2\) (purple), \(\mathcal {A}_6\) (orange) and \(\mathcal {A}_8\) (yellow). Also shown are the binned experimental data and dominant backgrounds from the 13 TeV \((3.3\,{\mathrm {fb}}^{-1})\) ATLAS analysis [102]

6.3.1 Analysis tools

All signals and backgrounds to be discussed below in this and the next section will be generated using \(\mathtt{MadGraph5\_aMC}\) \(\mathtt{@NLO}\) [101]. For this section it is enough to consider a parton-level analysis as the final states considered involve only leptons in addition to the ALP.

6.3.2 Statistical tools

In order to set limits on \(c_i/f_a\) for each effective operator, a binned likelihood analysis will be performed. The likelihood function for a given lepton flavour in the final state \(\ell = e, \,\mu \), is built as a product of bin Poisson probabilities

$$\begin{aligned} L^{\ell }(\mu _i) = \prod _k \,e^{-(\mu _i\,s^i_k +\, b_k)}\, \frac{(\mu _i\,s^i_k + b_k)^{n_k}}{n_k !}, \end{aligned}$$
(81)

where

$$\begin{aligned} \mu _i \equiv (c_i/f_a)^2 \end{aligned}$$
(82)

and \(b_k\) and \(s^i_k\) are, respectively, the background prediction and the signal prediction for \(c_i = 1\) and \(f_a = 1\) TeV in a given bin k. The significance is estimated via the test statistic \(Q^{\ell }_{\mu _i}\),

$$\begin{aligned} Q^{\ell }_{\mu _i} \equiv -2\, {\mathrm {Log}} \left[ \frac{L^{\ell }(\mu _i)}{L^{\ell }(\hat{\mu _i})} \right] , \end{aligned}$$
(83)

with \(\hat{\mu _i}\) being the value of \(\mu _i\) which maximises \(L^{\ell }(\mu _i)\). Alternatively, we may include the effect of systematic uncertainties on the background prediction (which for the mono-W searches can be obtained from Refs. [102, 103] and for the mono-Z searches from Ref. [104]) by convoluting each bin Poisson probability with a Gaussian prior,Footnote 10 such that the likelihood function is given by

$$\begin{aligned} L^{\ell }_{S}(\mu _i) = \prod _k \int _{0}^{\infty } \mathrm{d}r\, \frac{e^{\frac{-(r-1)^2}{2\sigma _k^2}}}{\sqrt{2\pi }\sigma _k} \,e^{-(\mu _i\,s^i_k +\, r\, b_k)}\, \frac{(\mu _i\,s^i_k + r\,b_k)^{n_k}}{n_k !} ,\nonumber \\ \end{aligned}$$
(84)

with \(\sigma _k\) being the background systematic uncertainty in each bin k. Our test statistic accounting for background systematic uncertainties \(Q^{\ell }_{S\,\mu _i}\) is then defined as

$$\begin{aligned} Q^{\ell }_{S\,\mu _i} = -2\, \mathrm {Log} \left[ \frac{L^{\ell }_S(\mu _i)}{L^{\ell }_S(\hat{\mu _i})} \right] . \end{aligned}$$
(85)

The value of \(\mu _i\) that can be excluded at \(95\%\,\, \text {CL}\) corresponds to \(Q^{\ell }_{\,\mu _i} = 3.84\) (\(Q^{\ell }_{S\,\mu _i} = 3.84\)) if background systematic uncertainties are not (are) included.

6.3.3 Mono-W signatures: \(pp \rightarrow a \, W^{\pm }\)

We are targeting in this paper bosonic couplings of the ALP particle, and here in particular ALP couplings to electroweak gauge bosons, as illustrated in Fig. 2. Let us first concentrate on the ALP production in association with a W boson, as illustrated in Fig. 2(left). It is possible to derive limits on the coefficient of each effective operator contributing to this process from LHC Run II data at \(\sqrt{s} = 13\,\mathrm{TeV}\), by reinterpreting the ATLAS search for \(W'\) decaying to final states with \(3.3\,{\mathrm {fb}}^{-1}\) of integrated luminosity [102] (with \(\ell = e,\,\mu \)). The backgrounds will be taken from Ref. [102], considering independently the electron and muon samples and selecting events with transverse momentum \(p_T > 65\) GeV (55 GeV) as well as  GeV (55 GeV) and transverse mass \(m_T > 130\) GeV (110 GeV) in events with electrons (muons).

The couplings that may contribute to this process are the custodial-invariant \(\mathcal {A}_{\tilde{W}}\) and \(\mathcal {A}_2\) operators, and the custodial-breaking ones \(\mathcal {A}_6\) and \(\mathcal {A}_8\) in Eq. (49), as illustrated in Fig. 2 and shown in the Feynman rules FR.5. Figure 3 depicts the \(m_T\) spectrum of the SM background contributions, as well as the various signals corresponding to the \(c_2\), \(c_6\), \(c_8\), \(c_{\tilde{W}}\) Wilson coefficients, for \(f_a = 1\) TeV and \(c_i = 1\) (with \(c_{\tilde{B}}\) obeying Eq. (23)). The bins used in this figure are those for which there is experimental information on the background [102], corresponding to \(m_T < m_T^{\text {max}} = 2.6\) TeV for electrons and \(m_T < m_T^{\text {max}} =3\) TeV for muons. As discussed in Sect. 5, the strict EFT validity condition \(\sqrt{\hat{s}} < f_a\) can be imposed by computing the fraction of events in each bin for which \(\sqrt{\hat{s}}> m_T^{\text {max}}\), and discarding it. In Fig. 4 (left) we show the correlation between \(m_T\) and \(\sqrt{\hat{s}}\) for mono-W through a double-differential Monte Carlo distribution for the \(\mathcal {A}_{\tilde{W}}\) signal. We also show the normalised \(m_T\) distribution (again, for the \(\mathcal {A}_{\tilde{W}}\) signal) before/after discarding the events for which \(\sqrt{\hat{s}}> m_T^{\text {max}}\).

We note that although the \(c_6\) and \(c_8\) signatures in Fig. 3 exhibit a kinematical shape a priori much more favourable to be distinguished from background than those proportional to \(c_{\tilde{W}}\) and \(c_2\), at the end the most prominent impact on this purely LHC analysis is that of \(c_{\tilde{W}}\) (followed by that of \(c_6\)) due to suppression factors in the cross sections.Footnote 11 Mono-W signatures from the operators \(\mathcal {A}_6\), \(\mathcal {A}_8\) and \(\mathcal {A}_2\) are buried in the backgrounds of present LHC data, and they will remain out of reach with future HL-LHC data, except for \(\mathcal {A}_6\); see Sect. 7.

The loop-level bound obtained in Eq. (35) would imply (if taken at face value) that \(\mathcal {A}_{\tilde{W}}\) is out of reach of foreseen LHC prospects, for light enough ALPs; however, as previously discussed, because more than one operator contributes to those rare process – see Eq. (58) – the data only constrain a combination of operator coefficients which differs from that in LHC signals, see Eq. (56); it is thus pertinent to analyze the impact of \(\mathcal {A}_{\tilde{W}}\) on LHC independently.

The results obtained, for which the LHC sensitivity in \(f_a/c_{\tilde{W}}\) extends up to significant values, are listed in Table 3. They show an important impact of the systematic uncertainties on the background and also indicate that present LHC Run II limits on \(f_a/c_{\tilde{W}}\) from mono-W signals would a priori be sensitive to \(c_{\tilde{W}}\) only in the region of strong coupling \(c_{\tilde{W}} \gtrsim 1\) (possible in non-linear EWSB constructions), for values of \(f_a\) compatible with the validity of the EFT. These bounds have been computed in compliance with the strict validity criterion (\(\sqrt{\hat{s}}<f_a\)) by discarding the fraction of events in each bin for which \(\sqrt{\hat{s}}>m_T^{\text {max}}\) (recall the discussion in Sect. 5). We note that here the effect of considering a strict validity criterion instead of the milder \(f_a > m_T^{\text {max}}\) one is of the order of the few percent on the numbers in Table 3. The bound which suffers the most from applying the strict validity criterion is the present constraint from the \(W\rightarrow e\nu \) final state, where applying the naive validity criterion would imply overestimating the bound in \({\sim } 20\%\). However, this is not a problem since it is the muon channel with yields a more constraining result.

Fig. 4
figure 4

The left (right) top panel shows the correlation between \(m_T\) () and \(\sqrt{\hat{s}}\) for mono-W (mono-Z) through a double-differential Monte Carlo distribution for the \(\mathcal {A}_{\tilde{W}}\) signal. The centre plots show the normalised \(m_T\) () distribution before/after discarding the events for which \(\sqrt{\hat{s}}> m_T^{\text {max}}\)) (grey/black). The bottom panel displays the ratio between the two distributions above

Table 3 Present \(95\%\,\, \text {CL}\) \(f_a/c_{\tilde{W}}\) exclusion limits for the effective operator \(\mathcal {A}_{\tilde{W}}\) from mono-W (left), inferred from the search presented in Ref, [102] as detailed in Sect. 6.3.1 and mono-Z (right) inferred from the search presented in Ref. [104] as detailed in Sect. 7.1.1. Values obtained without including background systematics are labelled [No Syst.]
Fig. 5
figure 5

distribution for \(a\,Z\) (\(Z \rightarrow \ell ^{+} \ell ^{-}\)) production in the final state (left) and final state (right), generated from \(\mathcal {A}_{\tilde{W}}\) (green), \(\mathcal {A}_1\) (blue), and \(\mathcal {A}_2\) (purple). Also shown are the binned experimental data and dominant backgrounds from the 13 TeV (\(2.3\,{\mathrm {fb}}^{-1}\)) CMS analysis [104]

6.3.4 Mono-Z signatures: \(pp \rightarrow a \, Z\)

Consider now ALP production in association with a Z boson, in hadronic collisions, as illustrated in Fig. 2 (centre and right). The recent CMS search [104] with \(\sqrt{s} = 13\,\mathrm{TeV}\) and integrated luminosity \(2.3~{\mathrm {fb}}^{-1}\) will be used to estimate present sensitivities to various Wilson coefficients. Table 2 summarises the couplings which may a priori contribute to a mono-Z signal among those in the chiral basis, Eqs. (47) and (49). It will be argued next that only \(c_{\tilde{W}}\) may be expected to be seriously tested by this signal.

The distribution for signal and background will be used as kinematic discriminator, applying the same tools and procedure described at the beginning of Sect. 6.3. In order to optimise the search, the following preselection and selection cuts are applied: \(p^{\ell }_T > 20\) GeV, \(\left| \eta _\ell \right| < 2.5\), \(p^{\ell \ell }_{T} > 50\) GeV, \(m_{\ell \ell } \in [80,\,100]\) GeV,  GeV, , (rad), and furthermore third-lepton and extra high-\(p_T\) jets vetoes are implemented. The cut GeV ensures that a contamination from the gluon-fusion initiated signal leading to s-channel Higgs mediation can be safely neglected: for an on-shell Higgs the maximum is \({\sim } 30\) GeV. Furthermore, for a higher cut the fraction of the cross section contributed by this channel may be estimated as the integral of the Breit–Wigner distribution of the Higgs resonance for , which for GeV gives a suppression factor of \(5\times 10^{-6}\). Given that the on-shell Higgs production via gluon-fusion is \(\sigma (gg \rightarrow h) = 48.6\,\mathrm{pb}\) [105], the Higgs-mediated contribution is completely negligible. Similarly, contributions involving a quark in the t-channel are not relevant in the kinematic region considered. In summary, with present data the signal cross sections for \(p p \rightarrow Z a\) have a negligible dependence on the Wilson coefficients parameterizing the qqa and hZa vertices, i.e. \(c_{a\Phi }\) in the linear case and \(c_{2D},\,c_3,\,c_{10-14},c_{17}\) in the non-linear one (see Appendix B).

The remaining ALP–gauge boson interactions which may induce a mono-Z signal are the custodial-invariant operators \(\mathcal {A}_{\tilde{W}}\), \(\mathcal {A}_{\tilde{B}}\), \(\mathcal {A}_1\) and \(\mathcal {A}_2\), and the custodial-breaking coupling \(\mathcal {A}_7\); see Fig. 2 (centre and right). \(\mathcal {A}_{\tilde{B}}\) will not be considered independently all through the rest of this work, given the constraint in Eq. (23). The contribution from \(\mathcal {A}_7\) does not need to be considered separately either, as \(c_7\) enters exclusively through the combination \(c_2+ 2c_7\); see the Feynman rules FR.3 and FR.2. The analysis focuses thus on \(c_{{\tilde{W}}}\), \(c_1\) and \(c_2\).

Fig. 6
figure 6

Main diagrams contributing to the processes analysed in Sect. 7.2. Upper line \(a\gamma W\) associated production. Lower line VBF-type interaction producing ajj (iii) and \(ajj\gamma \) (iv). The proportionality of each diagram to the non-linear parameters is indicated in the figure (overall factors and relative coefficients are not displayed)

The comparison of signals and background distributions for \(\ell = e, \mu \) is shown in Fig. 5. The highest-energy bin considered is TeV. In analogy with the previous mono-W analysis, the EFT validity condition \(\sqrt{\hat{s}} < f_a\) is implemented by discarding the fraction of events in each bin for which (see Sect. 5). The correlation between \(\sqrt{\hat{s}}\) and is shown in Fig. 4 (right), as well as the normalised distributions before/after discarding the invalid event fraction in each bin.

The results obtained for \(\mathcal {A}_{{\tilde{W}}}\) are listed in Table 3: the present mono-Z search turns out to be significantly more powerful in constraining \(c_{{\tilde{W}}}/f_a\) than the ATLAS mono-W search previously analyzed. Furthermore, the impact of systematic errors is negligible in this case. An interesting fact is the different discriminating power of electrons and muons in mono-Z signals induced by ALP emission with respect to the coupling strength: while present muon data could a priori be sensitive to \(c_{\tilde{W}}\) only in the region of strong coupling \(c_{\tilde{W}}\ge 1\), a signal in electron data would be compatible as well with \(c_{\tilde{W}}\) values in the perturbative regime, \(c_{\tilde{W}}\le 1\). It is relevant to point out that these results, obtained imposing \(\sqrt{\hat{s}}< m_T^{\text {max}} < f_a\), are equal up to the permille level to the ones which are obtained if the naive validity criterion (only \(m_T^{\text {max}} < f_a\)) is used instead.

The contributions to mono-Z signals from \(\mathcal {A}_{1,2}\) are shown in Fig. 5 for illustration only, as the corresponding values for \(c_{1,2}\) would lie outside the region of validity of the EFT in present data and also if assuming the 3000 fb\(^{-1}\) integrated luminosity foreseeable at HL-LHC; see the next section. The mono-Z analysis with present and projected data is thus only sensitive to the \(\mathcal {A}_{{\tilde{W}}}\) operator, which is common to the NLO of the linear and of the non-linear expansion. It follows that mono-Z searches alone are not sensitive to a possible non-linear component in the nature of EWSB, unlike mono-W future searches at HL-LHC.

7 Phenomenological analysis II: \(\sqrt{s}=13\,\mathrm{TeV}\) LHC prospects

This section explores the sensitivity prospects for constraining the effective ALP couplings to SM bosons at the HL-LHC, as well as the analysis strategy sensitive to the linear/non-linear character of the underlying EWSB mechanism. Assuming thus proton–proton collisions at c.o.m. energy \(\sqrt{s}=13\,\mathrm{TeV}\) and successive integrated luminosities of 300 and 3000 fb\(^{-1}\), the following channels will be analyzed:

  • Mono-W and mono-Z signatures, see Fig. 2, projecting the analysis in Sect. 6 onto future data. A qualitative discussion of their ratio as a probe of non-linear character will be added.

  • \(Wa\gamma \) associated production; see Fig. 6.

  • Mono-Higgs signatures; see Fig. 10.

Table 2 summarises the set of operator coefficients that could contribute to these signals and be tested in LHC prospects, among those defined in the Lagrangians Eqs. (5)–(8) and (47)–(48). The corresponding Feynman rules are shown in Appendix B.

Mono-W and \(Wa\gamma \) associated production, together with \(ajj(\gamma )\) production through vector-boson fusion (VBF) – also shown in Fig. 6 – are intimately related processes, as they probe the same limited set of effective operator coefficientsFootnote 12 \(c_{\tilde{W}}\), \(c_{\tilde{B}}\), \(c_1\), \(c_2\), \(c_6\), \(c_7\) and \(c_8\).

A special role is played by the Higgs-ALP couplings associated to \(c_{2D}\), which, barring extreme fine-tunings, may be only expected among the leading signals if the underlying EWSB enjoys a non-linear character, and within a certain ALP mass range, as previously discussed. This coupling will be shown to be a priori testable through mono-Higgs searches at HL-LHC, which exhibit a sensitivity reach well beyond the bounds obtained in Sect. 6.2 from the limits on the non-standard Higgs decay width.

Relevant information about the structure of the ALP couplings can be inferred both by analyzing the different signatures independently and by studying their interplay. As some effective operators contribute to several processes, a combined analysis may be necessary in order to access the individual Wilson coefficients. Furthermore, the study of (de)correlations between the various putative signals serves as a good probe of the degree of EWSB non-linearity.

Table 4 Projected \(95\%\,\, \text {CL}\) \(f_a /c_i\) reach at LHC, with \(\mathcal {L} = 300\,{\mathrm {fb}}^{-1}\) and \(\mathcal {L} = 3000\,\,{\mathrm {fb}}^{-1}\) for \(\mu _{\tilde{W}}=(c_{\tilde{W}}/f_a)^2\) from mono-Z production, as detailed in Sect. 6.3.2. Top row: Assuming future systematic uncertainties on the background scale as present ones. Middle row: Assuming systematic uncertainties are reduced by a factor 2 w.r.t. present ones. Bottom row: Assuming no background systematic uncertainties
Table 5 Projected \(95\%\,\, \text {CL}\) \(f_a/c_i\) LHC reach for \(\ell = e\) final states, with \(\mathcal {L} = 300\,\,{\mathrm {fb}}^{-1}\) and \(\mathcal {L} = 3000\,\,{\mathrm {fb}}^{-1}\) for the effective operators relevant to mono-W production, as detailed in Sect. 6.3.1. Top row: Assuming future systematic uncertainties on the background scale as present ones. Middle row: Assuming systematic uncertainties are reduced by a factor 2 w.r.t. present ones. Bottom row: Assuming no background systematic uncertainties

7.1 Mono-W and mono-Z signatures

The result of extending the analysis in Sect. 6.3.1 to the projected sensitivity in \((c_i/f_a)^2\) for LHC 13 TeV with 300 and 3000 \({\mathrm {fb}}^{-1}\) is summarised in Table 4 (for mono-Z) and Table 5 (for mono-W), considering electrons and/or muons in the final state.

They show that mono-Z searches will be stronger than mono-W ones in probing at LHC the effective operator \(\mathcal {A}_{\tilde{W}}\). Both electron and muon channels will access the perturbative regime \(c_{\tilde{W}}<1\). Mono-Z searches would reach ALP scales up to \(f_a \sim 20\) TeV (for \(c_{\tilde{W}} = 1\)) with 3000 \({\mathrm {fb}}^{-1}\) disregarding background systematics – see Table 4. Assuming instead future background systematics as (1 / 2 of) the present ones, the mono-Z reach is somewhat milder, up to \(f_a \sim 15\) TeV (\({\sim }18\) TeV). Table 5 shows that instead the limits on \(c_{\tilde{W}}/f_a\) from LHC mono-W searches are systematics dominated.Footnote 13

Future mono-W searches appear instead of special interest in order to uncover the \(\mathcal {A}_{6}\) coupling, which is a signal of non-linearity up to NLO. Table 5 shows that with 300 and 3000 \({\mathrm {fb}}^{-1}\) it is possible to either discover it or derive a consistent projected limit. The sensitivity to \(c_6\) turns out to be mainly limited by statistical uncertainties, being less dependent than \(\mathcal {A}_{\tilde{W}}\) on SM background systematics. Nevertheless a significant reduction of the latter is shown to have a significant impact also on tackling \(\mathcal {A}_6\), particularly with 3000 \({\mathrm {fb}}^{-1}\): scales up to \(f_a/c_6 \le 3.44\,\mathrm{TeV}\) (\(4.68\,\mathrm{TeV}\)) would be then attainable if systematic errors were reduced by 1 / 2 (completely) with respect to their present value (see Table 5), leading to \(c_6\) being testable within the perturbative region.

Finally, mono-W and mono-Z signals may turn out to be especially prominent as phenomenological signals of the complete NLO ALP basis, in particular of ALP–fermion couplings in the chiral EWSB case. For instance, the \(aZ{\bar{\psi }}\psi \) couplings \(\mathcal {B}^q_3\), \(\mathcal {B}^q_4\), \(\mathcal {B}^q_6\), \(\mathcal {B}^q_7\), \(\mathcal {B}^q_8\) and \(\mathcal {B}^q_{10}\) in Eq. (94) may have a large impact on the very sensitive mono-Z channel, while the \(aW{\bar{\psi }}\psi \) vertices in \(\mathcal {B}^q_3\), \(\mathcal {B}^q_5\), \(\mathcal {B}^q_6\), \(\mathcal {B}^q_7\), \(\mathcal {B}^q_9\) and \(\mathcal {B}^q_{10}\) may induce mono-W signals; these couplings are not Yukawa suppressed and will be explored in a future study.

7.1.1 Strategy for a combined analysis

As is apparent from the discussion above, the interplay between mono-Z and mono-W signatures may be relevant as a way of disentangling the presence of non-linearity in the Higgs sector. Up to NLO in both expansions and barring extreme fine-tunings of operator coefficients, the cross sections for those two processes are:

  • Strongly correlated in the linear case, being both controlled by the coefficient \(c_{\tilde{W}}\) (\(c_{\tilde{B}}\) is not independent; see Eq. (23)).

  • Less correlated in the non-linear case, as operators other than \(\mathcal {A}_{\tilde{W}}\) and \(\mathcal {A}_{\tilde{B}}\) are expected to contribute to those mono-signals. For instance the purely chiral \(\mathcal {A}_6\) operator may contribute visibly to mono-W production within the projected HL-LHC prospects, as shown above.

A combined analysis of mono-Z and mono-W appears thus to be a valid method to shed light on the nature of the EWSB dynamics, once a positive detection occurs. Here we illustrate the (de)correlations of those signals in a purely qualitative way. The cross sections for \(pp\rightarrow Za\) and \(pp\rightarrow W^\pm a\) at a c.o.m. energy \(\sqrt{s}=13\,\mathrm{TeV}\) are computed using \(\mathtt{MadGraph5\_aMC@NLO}\), and subject to no other constraint than Eqs. (21) and (70) and a kinematical cut GeV. A random scan of Wilson coefficients \(c_i \in [-1,1]\) has been performed, along three scenarios: (i) the linear setup, which in practice reduces to the custodial-preserving \(\mathcal {A}_{{\tilde{W}}}\) and \(\mathcal {A}_{{\tilde{B}}}\) operators, see Eqs. (5) and (6); (ii) the non-linear custodial case, involving operators \(\mathcal {A}_{{\tilde{W}}}\), \(\mathcal {A}_{{\tilde{B}}}\), \(\mathcal {A}_{1}\) and \(\mathcal {A}_{2}\); (iii) the non-linear case including both custodial-preserving and non-custodially invariant couplings, here denominated non-linear for short, which adds to the previous set \(\mathcal {A}_{6}\) and \(\mathcal {A}_{7}\); see Eq. (48).

Fig. 7
figure 7

Top Cross sections for \(pp\rightarrow Z a\) and \(pp\rightarrow W^\pm a\) at \(\sqrt{s}=13\,\mathrm{TeV}\) with , computed with Madgraph5_aMC@NLO for a random scan of Wilson coefficients \(c_i \in [-1,1]\) (see text for details) within the region allowed by Eqs. (21) and (70), and using \(m_a=1\,\mathrm{MeV}\) and \(f_a=1\,\mathrm{TeV}\) for illustration. Bottom Distribution of the ratio \(R_{ZW}\) defined in Eq. (86), with the area of the histograms normalised to 1. In both cases, orange, cyan and dark blue correspond, respectively, to linear, non-linear custodial and non-linear non-custodial setups

The results for the cross sections are summarised in Fig. 7 (top) in the \(\sigma (p p \rightarrow W^{\pm }a)\), \(\sigma (p p \rightarrow Z a\)) plane, for linear (orange), cyan (non-linear custodial) and dark blue (non-linear), for \(f_a = 1\) TeV. The strong correlation characteristic of the EWSB linear scenario is clearly seen. In contrast, in the non-linear setup deviations from the sharp linear pattern emerge as expected as they stem from the non-linear operators \(\mathcal {A}_{1,2,6,7}\). Those deviations are necessarily small, though, as the contribution from any of the coefficients \(c_{1,2,6,7}\) is suppressed by a factor \(g/(16\pi )\) – see the Feynman rules in Appendix B – compared to that of \(c_{\tilde{W}}\), \(c_{\tilde{B}}\) (this conclusion may, however, be somewhat modified if a harder cut is imposed on the signal). In any case, in the event of a mono-W and/or mono-Z excess in future data, the ratio

$$\begin{aligned} R_{ZW} = \frac{\sigma (pp\rightarrow Za)}{\sigma (pp\rightarrow W^\pm a)} \end{aligned}$$
(86)

may be used to discern possible physical explanations. This observable has the advantage of being in principle independent of the scale \(f_a\) as well as of the ALP mass \(m_a\) (provided that \(m_a\ll m_Z\), as is the case assumed in this analysis for ALP stability reasons). Figure 7 (bottom) shows, for the three sets of operators considered, the \(R_{ZW}\) distributions obtained letting the coefficients of each operator considered assume random values in the interval \([-1,1].\) Note that neither the slope in the upper plot in Fig. 7, nor the numerical values in the other plots in this figure, are meaningful per se, but rather strongly dependent on the specifics of the analysis (e.g. on the kinematical cuts applied, here  GeV). Therefore, the strategy to follow in a realistic experimental analysis would be to look for a coincidence/tension between the expected value of \(R_{ZW}\) in the linear scenario and the measured one which, if detected, could indicate the presence of non-linearity in the Higgs sector.

We stress that the considerations in this subsection are aimed at discussing the expected relative strength of the mono-W and mono-Z observables in a purely qualitative way, having in mind future hadronic machines in general. Indeed, besides the strong dependence of the results on the kinematical cuts chosen, no consideration of backgrounds has been taken into account here. This is in contrast to the detailed phenomenological analysis at the beginning of the subsection, where it was shown that only the deviations stemming from \(\mathcal {A}_{6}\) have a chance of being visible within the foreseen HL-LHC prospects.

7.2 Associated production: \(pp\rightarrow a W^\pm \gamma \)

Consider next ALP production in association with both a W boson and a photon, as illustrated in Fig. 6(i) and (ii). Examining the interactions in the chiral effective Lagrangian Eq. (49), it is easy to see that those couplings exhibit a particularly interesting combined potential for disentangling the presence of different effective operators:

$$\begin{aligned}&a\, W^{+}\, W^{-} \rightarrow \frac{g}{4\pi f_a}\left[ c_6\,g^{\mu \nu }(p_+^2-p_-^2)\right. \nonumber \\&\quad +\left. \left( \frac{g}{4\pi }c_8-c_6\right) \left( p_+^\mu p_+^\nu -p_-^\nu p_-^\nu \right) \right] \nonumber \\&\quad -\frac{4i}{f_a}\left( c_{\tilde{W}}+\frac{g}{16\pi }c_2\right) p_{+\alpha }p_{-\beta }\varepsilon ^{\mu \nu \alpha \beta },\\&a\, W^{+}\, W^{-}\, \gamma \rightarrow \frac{ge}{4\pi f_a}\left[ \left( \frac{g}{4\pi }c_8-c_6\right) (g^{\mu \rho }p_a^\nu +g^{\nu \rho }p_a^\mu )\right. \nonumber \\&\quad +\left. 2c_6g^{\mu \nu }p_a^\rho \right] -\frac{4ig}{f_a}\left( c_{\tilde{W}}+\frac{g}{16\pi }c_2\right) \varepsilon ^{\mu \nu \rho \alpha }p_{a\alpha }, \nonumber \end{aligned}$$
(87)

as illustrated, respectively, in FR.4 and FR.11 of Appendix B and summarised in Table 2. Both processes are thus a priori sensitiveFootnote 14 to \(\mathcal {A}_6\), \(\mathcal {A}_8\), and to a fixed combination of \(\mathcal {A}_{\tilde{W}}\) and \(\mathcal {A}_2\), which therefore singles out a flat direction. In contrast, in the linear scenario only the Wilson coefficient \(c_{\tilde{W}}\) contributes significantly to both interaction vertices.

The first process in Eq. (87) leads to the striking mono-W signal being already searched by LHC collaborations and whose physics impact has been explored in Sects. 6.3.1 and 7.1. The second process leads to \(a W^\pm \gamma \) associated production, a search not being yet performed by the ATLAS and CMS collaborations. We will explore its prospects next, focusing on final states characterised by leptonic W decays. It is necessary to take into account, though, that the \(p p \rightarrow a W^\pm \gamma \) channel may be induced also by \(a\,Z\,\gamma \)-mediated contributions, to which the set \(\{\mathcal {A}_{\tilde{W}}, \mathcal {A}_{\tilde{B}},\mathcal {A}_1,\, \mathcal {A}_2,\,\mathcal {A}_7\}\) may contribute as illustrated in Fig. 6(ii),Footnote 15

$$\begin{aligned} a\,Z\,\gamma\rightarrow & {} \frac{i}{f_a}p_{Z\alpha }p_{A\beta }\varepsilon ^{\mu \nu \alpha \beta }\nonumber \\&\times \left( -2t_\theta c_{\tilde{W}}-\frac{g}{8\pi }(2c_1+t_\theta (c_2+2c_7))\right) , \end{aligned}$$
(88)

where the constraint in Eq. (23) has been applied. The contribution of \(\mathcal {A}_7\) is equivalent to that of \(\mathcal {A}_1\) and it is not necessary to consider it independently. In summary, the analysis is done on five distinct operators: \(\{ \mathcal {A}_1,\,\mathcal {A}_2,\, \mathcal {A}_6, \mathcal {A}_8\}\) and the combination of \(\{\mathcal {A}_{\tilde{W}},\,\mathcal {A}_{\tilde{B}}\}\) orthogonal to the \(a\gamma \gamma \) coupling. They are studied next, one at a time and keeping our analysis at parton level.Footnote 16

Fig. 8
figure 8

Missing transverse energy distributions for \(pp\rightarrow aW^{\pm }\gamma \) (\(W^{\pm }\rightarrow \ell ^{\pm }\nu \)) at \(\sqrt{s}=13\,\mathrm{TeV}\) LHC normalised to unity, for signals generated one by one for operators \(\mathcal {A}_1\) (blue), \(\mathcal {A}_2\) (violet), \(\mathcal {A}_6\) (orange), \(\mathcal {A}_8\) (yellow), \(\mathcal {A}_{\tilde{W}}\) (green) and \(\mathcal {A}_{\tilde{B}}\) (red)

The main irreducible SM background is \(pp\rightarrow W^{\pm }\gamma \) (with \(W^{\pm }\rightarrow \ell ^{\pm }\nu \)), a process which has been measured by ATLAS [106] and CMS [107] during the LHC \(\sqrt{s} = 7\) TeV Run. The reducible backgrounds are subdominant with respect to the direct \(W^\pm \gamma \) production and consist of (i) \(W^{\pm }\)+jets (with a jet misidentified as a photon), (ii) \(Z\,\ell ^+ \ell ^-\) (with one of the leptons misidentified as a photon or unidentified and the Z decaying into neutrinos), (iii) \(\gamma \)+jets (with a lepton originating from a heavy quark decay) and (iv) \(t\bar{t}\) (with a semileptonic decay of the top pair and a misidentification of a field as a photon). Their combined effect is to approximately increase the size of the \(W^{\pm }\gamma \) background by 15–25% depending on kinematics and the flavour of the lepton [106, 107]. For the present analysis, we simply account for this by scaling up our dominant SM \(W^{\pm }\gamma \) background by 20%.

The event selection requirements for photons and leptons for both signal and background are \(p_T^\gamma > 20\,\mathrm{GeV}\), \(p_T^{\ell } > 20\,\mathrm{GeV}\), \(|\eta ^\gamma | < 2.5\) and \(|\eta ^{\ell }| < 2.5\). will be employed as kinematic variable for distinguishing signal from background, as we find that this variable has significantly more signal discrimination power than the \(p_T\) of the lepton, because it receives contributions directly from the ALP in the signal set. The distributions (normalised to unity) for the various effective operators and the SM background are shown in Fig. 8. The harder momentum dependence of the effective couplings explored compared to the SM contribution are illustrated. In practice, we simulate events only up to TeV, as we find that signal cross sections for TeV are negligible.

The significance \(~\!\!{\shortmid }\!\!{\sigma }_{i}\) of a signal associated to one given operator \(\mathcal {A}_i\) is defined here as [108]

$$\begin{aligned} {{\shortmid }\!{\sigma }}_i=\sqrt{2\left[ (\mu _i s_i+b) \ln \left( 1+\frac{\mu _i s_i}{b}\right) -\mu _i s_i\right] }, \end{aligned}$$
(89)

where \(\mu _i\) was defined in Eq. (82), and \(\mu _i s_i\) and b denote, respectively, the number of events in the signal and the background, alike to the definitions used in Eq. (81) with, in this case,

(90)

where \(\mathcal {L}\) is the integrated luminosity, \(\sigma _i\) stands for the cross section induced by \(\mathcal {A}_i\) and \(\sigma _{\text {SM}}\) for the SM one. The kinematical cuts are taken as follows:

  • , as it optimises the sensitivity by removing most of the background; see Fig. 8. Higher values do not improve the signal-to-background ratio.

  • for a given \(f_a\) value, as required by the EFT validity considerations; see Sect. 5.

A very slight improvement in sensitivity to \(c_6\) is found for , as illustrated in Table 6 where the optimal cuts in and the corresponding sensitivity reach are shown. Nevertheless, for comparison purposes it is more appropriate to use one single cut for all operators, and the value indicated above will be used in the \(aW\gamma \) analysis for all operators.

Fig. 9
figure 9

Contours for \(~\!\!{\shortmid }\!\!{\sigma }=2\) (dashed) and \(~\!\!{\shortmid }\!\!{\sigma }=5\) (solid) sensitivity to \(p p \rightarrow aW^{\pm }\gamma \) (\(W^{\pm }\rightarrow \ell ^{\pm }\nu \)) signal at the LHC with \(\sqrt{s}=13\,\mathrm{TeV}\) and for an integrated luminosity of \(300\,{\mathrm{fb}^{-1}}\) (dark blue) and \(3000\,{\mathrm{fb}^{-1}}\) (light blue), as a function of \(\{f_a,c_i\}\). The left (right) panel shows the results obtained assuming that only the operator \(\mathcal {A}_{6}\) (the combination of operators \(\left( \mathcal {A}_{\tilde{W}}-t_\theta ^2 \mathcal {A}_{\tilde{B}}\right) \)) is contributing. The hatched region corresponds to , and is excluded by the EFT validity. The yellow region is excluded by the bound on \(g_{aZ\gamma }\) reported in Eq. (70). The mono-Z exclusion region from \(\sqrt{s}=13\,\mathrm{TeV}\) LHC with \(2.3\,{\mathrm{fb}^{-1}}\) of data is depicted by the red region. The grey reference lines correspond to constant values of \(f_a/c_i\). The region explored for \(c_{\tilde{W}}\) would be superseded by the bound from rare decays in Eq. (35), within their range of applicability, if the correlation between operators contributing simultaneously was disregarded

Table 6 Optimal missing transverse energy cut , and \((f_a/c_i)_{\mathrm {max}}\) \(2~\!\!{\shortmid }\!\!{\sigma }\) projected sensitivity reach for \(aW\gamma \) production, for \(\sqrt{s}=13{\,TeV}\) and integrated luminosities \(300\,{\mathrm{fb}^{-1}}\) and \(3000\,{\mathrm{fb}^{-1}}\)

Figure 9 shows the \(2~\!\!{\shortmid }\!\!{\sigma }\) and \(5~\!\!{\shortmid }\!\!{\sigma }\) sensitivity to the Lagrangian terms \(c_{\tilde{W}}\left( \mathcal {A}_{\tilde{W}}-t_\theta ^2 \mathcal {A}_{\tilde{B}}\right) \) (right) and \(c_6 \mathcal {A}_6\) (left), depicted in the \(\{f_a,\,c_i\}\) plane and for 300 and \(3000\,\mathrm{fb^{-1}}\). The hatched area is excluded as it would correspond to (corresponding to all signal events being outside the range of validity of the EFT). The \(2~\!\!{\shortmid }\!\!{\sigma }\) exclusion sensitivity reaches \(f_a / c_{\tilde{W}}\lesssim 3.8\,\mathrm{TeV}\) (\( 6.8\,\mathrm{TeV}\)) and \(f_a / c_6 \lesssim 0.4\,\mathrm{TeV}\) \((0.8\,\mathrm{TeV})\) for an integrated luminosity of \(300\,{\mathrm{fb}^{-1}}\) (\(3000\,{\mathrm{fb}^{-1}}\)) of data, assuming the naive EFT validity criterion .Footnote 17 We also note that when \(f_a\) drops below \(2\,\mathrm{TeV}\) (twice the energy of the highest bin in the distribution in Fig. 8) the reach in \(f_a/c_i\) is diminished: this can be seen from Figure 9 comparing the sensitivity curves with the grey reference lines which correspond to constant \(f_a/c_i\). The rightmost parts of the sensitivity curves is “parallel” to the latter lines, signalling that the reach in \(f_a/c_i\) is constant in this region. For \(f_a\le 2\,\mathrm{TeV}\) instead, the sensitivity lines drift upwards compared to the reference lines, meaning that in that region the analysis is sensitive only to smaller values of \(f_a/c_i\) than for the regions to the right. This effect is due to the fact that, as \(f_a\) is diminished, the EFT validity gradually excludes the high-energy bins from the analysis, thus losing discrimination power (note also that considering the strict EFT validity criterion \(\sqrt{\hat{s}} < f_a\) would amplify this effect).

Alike to the conclusions in Sect. 7.1 based on mono-W and mono-Z searches, associated \(aW\gamma \) production at the LHC exhibits thus some (weaker but complementary) sensitivity to \(c_{\tilde{W}}\) and to \(c_6\) (for large values of the latter), and may potentially reach stronger constraints on \(c_{\tilde{W}}\) than those obtained from LEP data, see Sect. 6.1, but weaker than the limits from rare decays – see the discussion around Eq. (35). It should be possible to further increase the reach of the analysis by using a sophisticated version of the transverse mass instead of , the so-called \(m_{T2}\) variable [109, 110].

7.2.1 Decorrelating power

The vertices in Figs. 2 (left) and 6 contribute simultaneously to the production of a and \(a\gamma \) in association with \(W^{\pm }\), as well as to VBF processes. Equations (88) and (87) show a dependence of these processes on certain combinations of coefficients and thus correlation effects are a priori expected. A combined analysis of \(a/a\gamma \) production in association with \(W^{\pm }\) and through VBF would enlarge the amount of kinematical information available, helping to disentangle their respective contributions.Footnote 18

We consider for illustration the simultaneous action of \(c_2\) and \(c_{\tilde{W}}\) on mono-W signals and on \(aW\gamma \) production. Note that it is precisely the contribution to the latter process of the \(aZ\gamma \) vertex in Eq. (88) which would allow one to separate the contributions of those two operator coefficients if data were sensitive to both, while from Eq. (87) alone the two coefficients would have been tied in a blind direction. Similar considerations apply to other combinations of coefficients.

Nevertheless, our results indicate that, within the foreseen experimental prospects, no sensitivity is expected via mono-W and \(aW\gamma \) production to couplings other than \(c_{\tilde{W}}\) and \(c_6\), that is, to \(\{ \mathcal {A}_1,\,\mathcal {A}_2,\, \mathcal {A}_7, \mathcal {A}_8\}\) as these yield very suppressed contributions. The combination of mono-W data and \(aW\gamma \) production data will therefore allow one to disentangle the measurement/constraint of \(c_{\tilde{W}}\) (a custodial-invariant signal common to linear and non-linear EWSB) from that of \(c_6\) (only expected if the EWSB mechanism enjoys non-linear aspects and violates custodial symmetry).

Fig. 10
figure 10

Main diagrams contributing to mono-Higgs (left) and di-Higgs (right) production in association with an ALP. The non-linear parameters entering each vertex are reported in the figure. Note that for the case of mono-Higgs, the contributions of \(\tilde{a}_{2D}\) and \(\tilde{a}_3\) are phenomenologically dominant, as the other coefficients enter the coupling with an extra suppression. The compact notation \(\tilde{a}_i\equiv c_i a_i\), \(\tilde{b}_i\equiv c_i b_i\) has been adopted

7.3 Higgs signatures

Bosonic ALP–Higgs couplings are an interesting class of new signals which may be observable only within non-linear realisations of EWSB. Indeed, in the latter case \(aZh^{n}\) vertices with \(n \ge 1\) are expected at LO, while they do not appear in the linear expansion below NNLO, as discussed in Sects. 3.2 and 4. They could induce especially interesting ALP signals: non-standard Higgs decay (\(h\rightarrow a Z\)) including invisible Higgs decay (\(h\rightarrow a \nu \bar{\nu }\)), associated ALP–Higgs production yielding an “mono-Higgs” signature at the LHC, or even a di-Higgs signature, see Fig. 10. The possibility that aZh couplings of heavy pseudoscalars with masses in the 0.5–\(2\,\mathrm{TeV}\) range may yield observable signals in \(pp \rightarrow a \rightarrow Zh \,(h\rightarrow b\bar{b})\) was recently considered in the context of the linear expansion [85] (while the ALP signatures in Higgs and Z decays are presented in this paper for the first time), stemming from one loop corrections to the NLO linear Lagrangian and from \(d=7\) operators.

The set of operators in the Lagrangian Eq. (48) contributing a priori to those signals are \(\{\mathcal {A}_{2D},\,\mathcal {A}_{3},\,\mathcal {A}_{10},\,\mathcal {A}_{11},\,\mathcal {A}_{12},\mathcal {A}_{13},\,\mathcal {A}_{14},\,\mathcal {A}_{17}\}\), see Table 2. Nevertheless, only the first three will be phenomenologically relevant within the LHC prospects, as the contributions from the rest are comparatively much suppressed by extra powers of \(1/(4\pi )\) and/or \(m_a^2/v^2\) in the case of \(\mathcal {A}_{17}\); see Feynman rules FR.6 and FR.7. This section focusses thus on the prospects for detecting \(\mathcal {A}_{2D}\), \(\mathcal {A}_3\) and \(\mathcal {A}_{10}\), both taken one by one and in a combined analysis. The vertices relevant to the mono-h signal and the non-standard Higgs decays are

$$\begin{aligned} a Z h\rightarrow & {} -\frac{4e v}{ s_{2\theta }f_a} \tilde{a}_{2D}\, p_a^\mu \nonumber \\&+ \frac{1}{2\pi v f_a}(\tilde{a}_3s_\theta -\tilde{a}_{10}c_\theta )(p_Z^\mu p_h\cdot p_Z-p_h^\mu p_Z^2),\nonumber \\ a \gamma h\rightarrow & {} \frac{1}{2\pi v f_a}(\tilde{a}_3c_\theta +\tilde{a}_{10}s_\theta )\left( p_A^\mu p_a\cdot p_A-p_A^2 p_a^\mu \right) ,\nonumber \\ \end{aligned}$$
(91)

showing that \(a_3\) and \(a_{10}\) enter in two different combinations into the processes considered:

  • the mono-h (and di-Higgs) signatures depend on the combination \(\tilde{a}_3s_\theta -\tilde{a}_{10}c_\theta \) via Z exchange, and also on the orthogonal one \(\tilde{a}_3c_\theta +\tilde{a}_{10}s_\theta \) via \(\gamma \) exchange – see Fig. 10;

  • in contrast, the non-standard Higgs decays depends only on \(\tilde{a}_3s_\theta -\tilde{a}_{10}c_\theta \).

We are thus contemplating three coefficients and two distinct processes. For the range of ALP masses used in the present numerical simulations (\(m_a\le 1\) MeV), the fermionic-induced bound on \(c_{2D}\) in Eq. (57) would lead to disregard the impact of \(\mathcal {A}_{2D}\) on LHC data if that coupling were considered by itself. Nevertheless, given that a different combination of couplings is at work in rare decays and in LHC signals, for consistency with the perspective of exploring complementary approaches, and given that for larger ALP masses the LHC signals would still be present in a refined analysis, the contributions of \(\mathcal {A}_{2D}\) must be retained in the analysis to follow. With this strategy, the impact of \(\mathcal {A}_3\) and \(\mathcal {A}_{10}\) on the non-standard Higgs decay width is subdominant with respect to that of \(\mathcal {A}_{2D}\), given the different v dependence; see Eq. (91). On the contrary, LHC data are instead quite sensitive to \(c_3\) and \(c_{10}\), in addition to \(c_{2D}\), given the stronger momentum dependence of \(\mathcal {A}_3\) and \(\mathcal {A}_{10}\). This suggests that, in order to disentangle the contributions from \(\mathcal {A}_3\) and \(\mathcal {A}_{10}\), a detailed study of the kinematic distributions of the mono-Higgs channel would be necessary, together with the combination of these results with those stemming from bounds on \(h\rightarrow \mathrm{BSM}\) from Higgs signal strength measurements. On the other side, \(a_3\) and \(a_{10}\) have a similar overall impact on the total mono-h cross section. For the sake of simplicity, we will then consider here only the impact of \(a_{2D}\) and \(a_3\), separately and combined, deferring the detailed study of \(\mathcal {A}_{10}\) to a future work.

A remark on the range of values of the operator coefficients is pertinent. Generally speaking, large values correspond to strongly interacting regimes, and NDA suggests \(c_{i}\le 1\), with the bound saturated in the strong regime. Nevertheless, as discussed in Sect. 3, a factor (f / v) has been implicitly absorbed in the definition of the parameter \(\tilde{a}_{2D} = c_{2D}a_{2D}\), where \(a_{2D}\) is the coefficient of the one-Higgs contribution in the polynomial \(\mathcal {F}_{2D}(h)\). The ratio \(\xi \equiv v^2/f^2\) is not a parameter from the effective theory point of view, but it is currently bounded to be \(\lesssim 0.2\) in concrete models [105] such as composite Higgs scenarios. Numerically, this would translate into an enhancement of a factor \(f/v\gtrsim 2.3\), which implies that the absolute value of the parameter \(\tilde{a}_{2D}\) can naturally exceed by at least 2-3 units the bare NDA constraint \(\tilde{a}_{2D}\le 1\). In this section we will assume a maximum absolute value \(\tilde{a}_\mathrm {max}= 3\) for both \(\tilde{a}_{2D}\) and \(\tilde{a}_3\), along the same lines as the analysis presented in Sect. 7.2.

Fig. 11
figure 11

distributions for signal and background for \(\sqrt{s}=13\,\mathrm{TeV}\) and \(3000\,{\mathrm{fb}^{-1}}\) of integrated luminosity, after applying the selection cuts from [111]. SM background distributions are obtained directly from [111], and the signal \(pp \rightarrow a h\) (\(h \rightarrow 4 \ell \)) distribution is shown for \(\mathcal {A}_{2D}\) (red) and \(\mathcal {A}_{3}\) (orange)

Fig. 12
figure 12

Contours for \(~\!\!{\shortmid }\!\!{\sigma }=2\) (dashed) and \(~\!\!{\shortmid }\!\!{\sigma }=5\) (solid) sensitivity to mono-H signal at the LHC with \(\sqrt{s}=13\,\mathrm{TeV}\) and for an integrated luminosity of \(300\,{\mathrm{fb}^{-1}}\) (dark blue) and \(3000\,{\mathrm{fb}^{-1}}\) (light blue), as a function of \(\{f_a,c_i\}\). The left (right) panel shows the results obtained assuming that only the operator \(\mathcal {A}_3\) (\(\mathcal {A}_{2D}\)) is contributing. The hatched region corresponds to , and is excluded by the EFT validity, while the green region is excluded by the bound on \({{\mathrm{Br}}}(h\rightarrow \mathrm{BSM})\) reported in Eq. (76) (for the left panel, this bound is not visible). The grey reference lines correspond to constant values of \(f_a/c_i\)

7.3.1 Mono-Higgs: \(p p \rightarrow a\, h\)

The process \( p p \rightarrow a h \,(h \rightarrow 4\ell )\) is considered next at 13 TeV LHC, and it follows the mono-Higgs analysis from the “Les Houches 2015” report [111], considering both 300 fb\(^{-1}\) and 3000 fb\(^{-1}\) integrated luminosity. Our signal sample is produced with MadGraph5_aMC@NLO [101], passed on to Pythia 8 [112] for showering and hadronisation and then to FastJet [113] for jet reconstruction. The reconstructed events are finally filtered imposing the selection cuts from Ref. [111], for a consistent comparison with SM backgrounds which are taken precisely from that reference.

The spectrum can be used to disentangle the new interactions from the SM background. This applies in particular to \(\mathcal {A}_3\), which induces a strong momentum dependence through both the aZh and the \(a\gamma h\) contributions to the mono-h signal. This is illustrated in Fig. 11 for an integrated luminosity of \(3000\,{\mathrm{fb}^{-1}}\). As expected, the spectrum produced by \(\mathcal {A}_3\) (orange line) is harder compared to that produced by \(\mathcal {A}_{2D}\) (red line) while, at the same time, the total (no cuts) integrated cross section for the signal generated with \(\mathcal {A}_3\) is manifestly lower than the one induced by \(\mathcal {A}_{2D}\).

In order to quantify the potential for observing in future LHC data a mono-Higgs signal generated by either of the two operators \(\mathcal {A}_{2D}\) and \(\mathcal {A}_3\), the analysis is done in two different stages.Footnote 19

7.3.2 One operator at a time

In a first stage, each of the two relevant operators, \(\mathcal {A}_{2D}\) and \(\mathcal {A}_3\), is considered individually, i.e. assuming that only one of the coefficients \(c_{2D}\) and \(c_3\) has a non-zero value. With this choice, the procedure already described in Sect. 7.2, Eqs. (89) and (90), is applied. The significance is computed as a function of \(f_a/c_i\), integrating the distributions in Fig. 11 from a chosen (which removes most of the background contribution) up to , according to the naive validity criterion (recall the discussion in Sect. 5).

Figure 12 shows the \(~\!\!{\shortmid }\!\!{\sigma }=2\) and \(~\!\!{\shortmid }\!\!{\sigma }=5\) sensitivity regions obtained for the two coefficients \(\tilde{a}_{2D}\) and \(\tilde{a}_3\) individually (see Eq. (89)), and integrated luminosities of \(300\,{\mathrm{fb}^{-1}}\) and \(3000\,{\mathrm{fb}^{-1}}\). As shown in Fig. 12, only a very restricted region of the parameter space for \(\mathcal {A}_3\) is accessible within \(3000\,{\mathrm{fb}^{-1}}\) at the LHC, due to its very small cross section: it results in a \(2~\!\!{\shortmid }\!\!{\sigma }\) sensitivity to \(f_a / \tilde{a}_3 \lesssim 470\,\mathrm{GeV}\), which is expected to further degrade if the strict EFT validity criterion \(\sqrt{\hat{s}} < f_a\) would be considered.

In contrast, Fig. 12 illustrates that mono-Higgs signatures in the \(h \rightarrow 4 \ell \) final state at HL-LHC have the potential to explore some region of parameter space for \(\mathcal {A}_{2D}\) within the range of EFT validity. The \(2~\!\!{\shortmid }\!\!{\sigma }\) exclusion sensitivity reaches \(f_a / \tilde{a}_{2D} \lesssim 340\,\mathrm{GeV}\) (\( 780\,\mathrm{GeV}\)) for an integrated luminosity of \(300\,{\mathrm{fb}^{-1}}\) (\(3000\,{\mathrm{fb}^{-1}}\)) of data. While considering the strict EFT validity criterion would somewhat degrade these limits, we also stress that considering other final states, e.g. \(h \rightarrow \gamma \gamma \), \(h \rightarrow b \bar{b}\), would significantly increase the sensitivity of this search, and we leave such a study for the future.

These results can be contrasted with the bounds on \(f_a/\tilde{a}_i,\,(i=2D,\,3)\) inferred from the current upper limit on \({{\mathrm{Br}}}(h\rightarrow \mathrm{BSM})\) in Sect. 6.2, which is depicted as a green region in Fig. 12 (right). If only \(\mathcal {A}_{2D}\) is considered, the area of parameter space which is to be probed by LHC with \(3000\,{\mathrm{fb}^{-1}}\) is already ruled out by that limit. This is not the case when only \(\mathcal {A}_3\) is considered, since its contribution to \(h\rightarrow \mathrm{BSM}\) is very suppressed. Nevertheless, cancellations might exist amongst the contributions of those two operators to non-standard Higgs decays, in regions of the parameter space where a mono-Higgs signal could be expected at a testable level. This is the motivation for the second stage in the analysis: a combined study where both operators are considered simultaneously.

Fig. 13
figure 13

Contours for \(2\sigma \) and \(5\sigma \) sensitivity to the mono-H signal at the LHC with \(\sqrt{s}=13\,\mathrm{TeV}\) and for an integrated luminosity of \(3000\,{\mathrm{fb}^{-1}}\), for different values of the parameters \((f_a/\tilde{a}_{2D})\) and \((f_a/\tilde{a}_3)\). The left (right) panel shows the result obtained for opposite-sign (same-sign) scaling factors. The grey shaded region is excluded by the bound on \({{\mathrm{Br}}}(h\rightarrow \mathrm{BSM})\) reported in Eq. (76)

7.3.3 Combination of the two operators \(\mathcal {A}_{2D}\) and \(\mathcal {A}_3\)

In this case of simultaneous consideration, the shape of the distribution after applying the analysis cuts can be estimated, for an arbitrary choice of \(f_a\), \(\tilde{a}_{2D}\) and \(\tilde{a}_3\), as

$$\begin{aligned} (f_a/{\mathrm{TeV}})^{-1}\left[ \tilde{a}_{2D}^2 \,x_k + \tilde{a}_3^2 \,y_k + \tilde{a}_{2D} \tilde{a}_3 \,(z_k - x_k - y_k)\right] ,\nonumber \\ \end{aligned}$$
(92)

where the index k runs over the distribution bins, and \(x_k,\, y_k\), and \(z_k\) represent the prediction in the kth bin obtained with \(f_a=1\,\mathrm{TeV}\) and for the configurations \((\tilde{a}_{2D} = 1, \tilde{a}_3=0)\), \((\tilde{a}_{2D}=0, \tilde{a}_3=1)\) and \((\tilde{a}_{2D}=1, \tilde{a}_3=1)\), respectively. With this estimate of the distribution one can easily compute the maximal projected sensitivity to mono-Higgs signals, varying the lower cut in missing transverse energy, , in order to maximise the sensitivity \(~\!\!{\shortmid }\!\!{\sigma }\) at each \(\{ f_a / \tilde{a}_{2D},\,f_a/\tilde{a}_3 \}\) point.

The results are shown in the scatter plot in Fig. 13. The yellow (orange) points are those for which there exists a lower cut within the EFT validity region, which allows one to observe a mono-Higgs signature with a significance of least 2 (5) \({\shortmid }\!\!{\sigma }\) at the 13 TeV LHC with \(3000\,{\mathrm{fb}^{-1}}\). The left and right panels distinguish the two cases in which \(\tilde{a}_{2D}\) and \(\tilde{a}_3\) have either opposite or same sign. In both cases, in the limit \(f_a/\tilde{a}_{2D}\rightarrow \infty \), the \(2~\!\!{\shortmid }\!\!{\sigma }\) and \(5~\!\!{\shortmid }\!\!{\sigma }\) sensitivity curves for \(f_a/\tilde{a}_3\) converge towards values close to the optimal ones found in the \(\mathcal {A}_3\)-only analysis (the discrepancy is due to the different treatment of the cut). An analogous behaviour is also observed in the orthogonal direction.

More interesting is the region where \(\tilde{a}_{2D}\) and \(\tilde{a}_3\) are close in absolute value. In particular, it shows that the contributions to the mono-Higgs process stemming from the two operators produce destructive interference when \(\tilde{a}_{2D}\) and \(\tilde{a}_3\) have the same sign: for \(\tilde{a}_{2D}\simeq \tilde{a}_3\) (right panel) the signal is reduced compared to the case in which one of the two operators dominates, and the sensitivity is therefore lower in this region of the parameter space. On the other hand, for \(\tilde{a}_{2D}\simeq - \tilde{a}_3\) (left panel) constructive interference effects enhance mono-Higgs production, so that the LHC would be sensitive to larger values of \(f_a/\tilde{a}_i\) than in the one-operator case.

As with the previous study, the results obtained for projected mono-Higgs searches in the \((\tilde{a}_{2D}, \tilde{a}_3)\) plane can easily be contrasted with the bound inferred in Sect. 6.2 from the current upper limits on \({{\mathrm{Br}}}(h\rightarrow \mathrm{BSM})\). This is depicted as a grey-shaded region in Fig. 13 and seen to be more stringent for same-sign \(\tilde{a}_{2D}\) and \(\tilde{a}_3\), as no cancellation can then take place in the dominant expression in \({{\mathrm{Br}}}(h\rightarrow \mathrm{BSM})\); see Eq. (75).

As a result of the combination of the existing bound with the projected reach, it appears that mono-Higgs searches may be useful for probing a relevant region of the parameter space, namely that with \(300\,\mathrm{GeV}\lesssim |f_a/\tilde{a}_3| \lesssim 700\,\mathrm{GeV}\), where the lower bound is a direct consequence of requiring the EFT validity. In this region, \(|f_a/\tilde{a}_{2D}|\) may be no smaller than 2–\(3\,\mathrm{TeV}\), as lower values are already excluded by the \(h\rightarrow \mathrm{BSM}\) constraint that we derived from present data in Sect. 6.2. Overall, we find that although mono-Higgs searches at the LHC are sensitive to the presence of both operators \(\mathcal {A}_{2D}\) and \(\mathcal {A}_3\), they are not competitive in constraining \(f_a/\tilde{a}_{2D}\) with the Br(\(h\rightarrow \mathrm{BSM}\)) bound, neither with the fermionic-induced bound in Eq. (57) when that coupling is considered just by itself. On the other hand, they are more sensitive to the presence of \(\tilde{a}_3\) and therefore they may provide valuable, complementary, information in the study of the ALP’s coupling to the Higgs.

In conclusion, if interpreted in terms of the presence of a light pseudo-Goldstone boson, and barring fine-tunings, the observation of a mono-Higgs signature at the LHC represents a smoking gun of non-linearity in the EWSB sector, as couplings such as \(aZ(\gamma )h\) are not to be found in the NLO Lagrangian of linear EWSB setups (see FR.1, FR.7). Within the effective Lagrangian in Eq. (42), the observation of this signal at foreseen LHC data can only be attributed to the presence of \(\mathcal {A}_3\) (or eventually \(\mathcal {A}_{10}\)), as \(\tilde{a}_{2D}\) is out of reach in that data set given the range of values allowed by the current bounds.

7.3.4 A comment on di-Higgs production

The aZhh interaction allows for di-Higgs final state, due to a quark-initiated production via Drell–Yan (see Fig. 10 (right)). This is in contrast to di-Higgs production in the SM, which is exclusively gluon-fusion initiated. Moreover, the presence of in the final state could serve as an additional handle to suppress SM backgrounds to the di-Higgs process. This discussion highlights that ah interactions could constitute a very promising avenue for non-linear ALP phenomenology at the LHC, which we intend to explore in the future.

7.4 Coupling to fermions

In this paper we have focussed on the relation of the ALP with the EWSB sector via bosonic operators, and explored the impact of couplings of the ALP to SM bosons. However, in Sects. 2 and 3 we noticed that bosonic operators would lead to ALP–fermion couplings via a field redefinition.

Although the bounds we obtained in Eqs. (33) and (57) when considering operators one at a time are very strong, it is worth exploring complementary searches at the LHC. The structure of these fermionic couplings is very specific, proportional to the Yukawa matrices; see the Feynman rules in Appendix B. One would then expect the ALP to couple more strongly to third generation quarks, provided the matrices \(X_\psi \) in Eq. (18) are generic. We then consider the characteristics of the leading ALP production in association with a \(t\bar{t}\) pair at LHC.

For ALPs stable on LHC scales, this final state is similar to searches for supersymmetric scenarios, where two stops are strongly produced and produce a signature of \(t\bar{t}\) in association with two neutralinos (dark matter candidates). For example, via the LO coupling \(c_{2D}\) the production cross section of the final state \(t {\bar{t}}\)+ALP, where the ALP is emitted as final state radiation – see FR.17, is given by

$$\begin{aligned} \sigma (p \, p \rightarrow t \, \bar{t} \, a) [\sqrt{s} = 13 \text {TeV}] = c^2_{2 D} \, \left( \frac{1 \text { TeV}}{f_a}\right) ^2 \, (50~\text {fb}).\nonumber \\ \end{aligned}$$
(93)

In these searches, final states are selected by requiring a number of jets, b-jets with characteristics matching those of top decays. More importantly, a substantial cut on missing energy is required. For example, a recent study with 13 TeV data by ATLAS [114], the cut on missing energy for the channel of interest ( TT) (topology of two tops) is 400 GeV. In our scenario, with single-production of a light pseudoscalar via strong production of two tops, the distribution of missing energy is not as hard as in scenarios where heavy stops are pair produced and inject a large boost into the neutralino. This is shown in Fig. 14, where we compare our results for \(f_a= 1\) TeV with the ATLAS data and Monte Carlo simulations for a supersymmetric scenario with 800 GeV stops and a light neutralino.

Fig. 14
figure 14

Missing energy distribution for the production of an light ALP in association with \(t \bar{t}\) for 13.3 fb\(^{-1}\) of 13 TeV data. The normalisation has been chosen with \(f_a= 1\) TeV and then multiplied by a factor 10. We show the corresponding simulation of supersymmetric scenarios by ATLAS, as well as their event count

This type of analysis opens the way to further phenomenological explorations of the fermionic signals associated to ALP production. This is most relevant and promising in order to tackle the ALP–fermionic couplings identified in Appendix A, which are part of the complete NLO basis of operators – bosonic and fermionic – involving one ALP and established in this work. See also the phenomenological signals discussed just before Sect. 7.1.1.

8 Summary and outlook

In this paper we have developed a systematic approach to describe interactions of an axion or an axion-like particle (ALP) with special attention to the sector responsible for electroweak symmetry breaking (EWSB), obtaining the complete – bosonic and fermionic – NLO Lagrangian in the case that the latter is non-linearly realised. With this theoretical framework in place, we have then studied new collider phenomenology associated with ALPs, as well as explored the sensitivity of the LHC in the high-luminosity phase (HL-LHC). Both the approach and the phenomenological results in this paper are novel, and they will hopefully guide new searches at the LHC and the study of complementarity with other experiments at lower energies.

Fig. 15
figure 15

Summary of the most significant constraints stemming from the studies on tree-level ALP couplings presented in this work, upon the assumption \(g_{a\gamma \gamma }=0\) or equivalently \(c_{\tilde{B}}=-t_\theta ^2 c_{\tilde{W}}\). The upper bars down to \(\text {mono}-Z\) prospects included correspond to \(95\%\,\, \text {CL}\) existing constraints and expected reaches, inferred in Sects. 6 and 7. The lower bars, instead, indicate the \(2~\!\!{\shortmid }\!\!{\sigma }\) projected reach of given searches at the LHC with \(\sqrt{s}=13\,\mathrm{TeV}\); see Sect. 7. Systematic uncertainties are taken into account for the present constraints but are neglected in the projected ones

8.1 Theoretical developments

Neglecting ALP masses, we have developed a complete list of bosonic operators under two scenarios, with EWSB linearly and non-linearly realised, valid in all generality for any value of the axionic scale larger than the electroweak scale (and in the non-linear case also larger than its implicit electroweak BSM scale). In the linear case, in which the couplings involving an ALP first appear at \(d=5\), special attention has been paid to recalling the subtle effect of the operator \((\Phi ^\dag \overleftrightarrow {D}_\mu \Phi )\frac{\partial ^\mu a}{f_a}\), which induces a contribution to the two-point function involving longitudinal gauge bosons, and can be removed via a Higgs field redefinition. This redefinition generates new couplings of the ALP to fermions, with the distinctive feature of being proportional to the Yukawa couplings.

In the non-linear realisation, we have employed a systematic approach to classify the new operators order-by-order, and much care has been paid to define the expansion in both its non-linear and ALP sectors. A complete and non-redundant basis of operators involving an ALP has been determined, even though the impact analysis has focussed on the bosonic couplings. Several interesting features arise when considering ALPs coupled to a non-linear realisation of EWSB, in particular the existence – already at the leading order in the derivative expansion – of interaction vertices involving the Higgs and gauge bosons with the ALP. This is due to the fact that the two-point function stemming from the operator \(\mathcal {A}_{2D}\) cannot be entirely traded by fermionic couplings (in contrast to the linear case above). Additionally, we find that the non-linear effects induce new Lorentz structures beyond those in the traditional (linear) ALPs couplings.

Furthermore, a detailed comparison of the differences and correspondences between the operators in the linear and non-linear setups has been developed, as well as a prospective study on how to disentangle a priori both expansions if a signal is found.

For the most part, phenomenological studies on the ALP effective Lagrangians have focussed on couplings to photons, gluons and fermions. However, if the ALP couples to photons SM gauge invariance also implies the existence of similar couplings to the massive gauge bosons, irrespective of whether the mechanism behind EWSB gives rise to a linear or a non-linear expansion.

In this paper we have obtained new constraints on ALP couplings to SM particles, as well as provided a guide for future searches of ALPs and the sensitivity HL-LHC could reach, for ALP scales of \(\mathcal {O}(\)TeV) or somewhat above. Special attention has been paid to the consistency of the kinematic regions used for each search with the assumption of validity of the ALP expansion in powers of \(1/f_a\).

8.2 Current constraints

We started by looking at new constraints on (linear) ALP couplings to hypercharge and left interactions. In particular, we looked for observables sensitive to the linear combination of \(SU(2)_L\times U(1)_Y\) operators \((c_{\tilde{B}}, \, c_{\tilde{W}})\) orthogonal to the coupling to photons, i.e. orthogonal to \(g_{a\gamma \gamma } \sim c_{\tilde{W}}s_\theta ^2 + c_{\tilde{B}}c_\theta ^2\).

To account for the strong constraints on the value of \(g_{a\gamma \gamma }\) we then imposed \( c_{\tilde{B}}\simeq - t_\theta ^2 c_{\tilde{W}}\) in our analyses, effectively reducing the number of parameter by one. In Fig. 15 one can see that LEP constraints on the invisible width of the Z boson and LHC searches for final states with one massive boson and missing energy (mono-Z and mono-W channels) provide handles to probe the Wilson coefficient \(c_{\tilde{W}}\). We find that mono-Z limits impose at present a constraint \(f_a/c_{\tilde{W}} \gtrsim 4\,\mathrm{TeV} \).

We also discussed the impact on bounds from rare decays to mesons and missing energy, and how they provide a complementary approach to accelerator searches. Besides the stringent constraints existing in the literature on \(f_a/c_{\tilde{W}}\) from the former searches – which strictly speaking only apply when all other operators are set to zero – a similar new bound on the strength of the linear operator \(\mathbf {O}_{a\Phi }\) has been obtained here.

In non-linear realisations of EWSB in particular, many other operators affect LHC physics. For ALP masses under 3 GeV, data on rare meson decays allowed one to strongly bound \(c_{2D}\) if considered by itself. Furthermore, of particular interest are operators which induce new type of couplings, specifically new couplings of the ALP to Higgs particles, e.g. ALP–Zh or ALP–Zhh, which are dominantly generated by the non-linear operators \(\mathcal {A}_{2D}\), \(\mathcal {A}_3\) and \(\mathcal {A}_{10}\). In the LHC RunI the coupling ALP–Zh can be probed via non-standard Higgs decays; if the impact of the different operators contributing is considered one at a time, a bound on \(f_a/\tilde{a}_{2D}\) of the order of 3 TeV follows for ALP masses in the range 3–34 GeV.

8.2.1 Future sensitivity

We then moved on to examine the capability of the HL-LHC to search for ALPs. Apart from improvements on current channels (non-standard Higgs decays, mono-W and mono-Z), we proposed and evaluated possible new channels at 13 TeV which could dramatically change our understanding of ALPs both in the linear and non-linear realisations.

Future improvements of mono-Z searches with 3 ab\(^{-1}\) of data could bring the collider sensitivity to the linear operator coefficient \(f_a/c_{\tilde{W}}\) to above 20 TeV. But the most striking signatures stemming from bosonic operators, i.e. mono-Higgs and associated \(W^\pm \gamma \) production plus missing energy, would access the non-linear operators mentioned before, \(\mathcal {A}_3\) and \(\mathcal {A}_{2D}\), and a new one, \(\mathcal {A}_6\). We also propose the study of different channels, like mono-W in combination with \(aW\gamma \) production, to disentangle the presence of two different operators. On the other hand, the very sensitive mono-Z and mono-W signals may play a specially important role in probing fermion–ALP interactions; this will be tackled in a future study.

Besides these signals, we proposed to use the searches on stops in on-shell top final states to look for ALPs, whose couplings to quarks are derived from couplings to the bosonic sector and are proportional to the fermion mass.

This study motivates further work on ALP physics beyond the usual framework of couplings to photons and gluons, and more emphasis was placed on the effects in the sector responsible for electroweak symmetry. Additionally, we propose to perform dedicated experimental analyses in channels like mono-Higgs and new channels involving the ALP and two bosons in the final state, such as \(W^\pm \gamma \) and missing energy.

Although in this paper we presented a rather comprehensive analysis of the effective theory for ALPs as well as their phenomenology, there are a number of open issues that deserve further study. To name a few: the extension of the collider analysis to higher ALP mass regions (including signals from ALP decays), the study of vector-boson fusion channels, the analysis of ALP–fermion signals to probe the complete NLO basis of operators – bosonic and fermionic – established in this work, the combination of collider constraints with lower-energy experiments (particularly rare decays of mesons), and the evaluation of modifications to the history of the axion in the early universe due to the non-linear effects.