Probing the Interactions of Axion-Like Particles with Electroweak Bosons and the Higgs Boson in the High Energy Regime at LHC

We study the interactions of Axion-Like Particles (ALPs) with the Standard Model particles, aiming to probe their phenomenology via non-resonant searches at the LHC. These interactions are mediated by higher dimensional effective operators within two possible frameworks of linearly and non-linearly realised electroweak symmetry breaking. We consider the ALPs to be light enough to be produced on-shell and exploit their derivative couplings with the SM Higgs boson and the gauge bosons. We will use the high momentum transfer processes, namely $hZ, Z\gamma, WW$ and $WW\gamma$ production from $pp$ collisions. We derive upper limits on the gauge-invariant interactions of ALPs with the electroweak bosons and/or Higgs boson that contribute to these processes, from the re-interpretation of the latest Run 2 available LHC data. The constraints we obtain are strong for ALP masses below 100 GeV. These allowed effective interactions in the ALP parameter space yield better significance at HL-LHC and thus, offer promising avenues for subsequent studies. Furthermore, we augment our cut-based analysis with gradient-boosted decision trees, which improve the statistical significance distinctly across these interaction channels. We briefly compare the results with the complementary probe of these couplings via direct production of ALPs in association with the Higgs boson or a vector boson.


I. INTRODUCTION
Originally motivated by the efforts to solve the strong CP problem [1][2][3][4][5], pseudo-Nambu-Goldstone bosons (pNGBs) generically arise in a variety of new physics (NP) scenarios. Their implications are many, including the dynamic generation of small neutrino masses (Majorons) [6], attempts to solve the flavor problem (Flavons) [7], and contributions to composite Higgs models and extra-dimensional theories [8]. The pNGBs also play a role in addressing the long-standing anomaly of the muon magnetic moment [9], the hierarchy problem [10] and electroweak baryogenesis [11]. In addition, they can serve as potential dark matter candidates or provide a portal connecting the Standard Model (SM) particles to the dark sector [12][13][14]. Typically, pNGBs exhibit symmetry under a continuous or, in some cases, a discrete shift of the field. These pNGBs, which enjoy a variety of origins, interactions and masses, are often grouped together in the much broader class of axion-like particles (ALPs).
Owing to their diverse origins, ALPs connect different sectors in high-energy physics. Studies aiming to detect ALPs in a range of masses and interactions have guided the current and future direction of experiments for beyond the Standard Model (BSM) physics, as discussed in recent reviews [15][16][17]. ALPs can manifest through a variety of traces in experiments running at different energy scales. At the LHC, ALP interactions are probed through signatures including new resonances or missing energy [18][19][20][21], interactions with top quarks [22] and Higgs decays [23,24]. Meson decay experiments typically provide the conventional constraints on ALP-QCD couplings. In the context of flavor experiments, ALPs with masses below a few GeV can be resonantly produced via meson [25][26][27][28] and lepton decays [29,30] or directly in $e^+e^-$ interactions [31,32]. Fixed-target settings [33,34] further enable the search for sub-GeV ALPs. In cosmological and astrophysical probes, still lighter ALPs manifest through observable phenomena [35][36][37].
In this work, we aim to probe the effects of non-resonant ALP-mediated production processes involving SM final states only, at the LHC. The ALP serves as an off-shell propagator in these s-channel scattering processes. We analyze the behavior of differential cross-sections at high energies for scatterings that produce electroweak gauge bosons and the Higgs boson from $pp$ collisions. As we will see, the enhanced high-energy sensitivity of the LHC enables us to impose significant constraints on ALP interactions from such processes, as their rates grow with energy. This deviates from the SM scenarios, in which the production rates decrease with the collisional center-of-mass energy ($\sqrt{\hat{s}}$) as $1/\hat{s}$. The explicit dependence of the derivative interactions of the ALPs with SM particles leads to
• We also briefly compare the limits on the ALP couplings obtained from the aforementioned non-resonant production processes with those from the direct probe of "mono-X" signatures ($X = Z, W^\pm, h$) through the production of an ALP in association with a Higgs or a vector boson.
• To enhance the distinction between signal and background in the processes under study, we employ a multivariate analysis using the Boosted Decision Tree (BDT) technique. This approach, going beyond the conventional cut-based method, exhibits a marked improvement in signal significance, as will be explicitly demonstrated in the subsequent sections.
In this work, we adopt a model-independent effective field theory (EFT) approach. If the Higgs boson observed at the LHC is part of an $SU(2)_L$ doublet, as in the SM, then any electroweak (EW) physics that extends beyond the SM can be systematically examined using a linear EFT expansion [53,54]. The setup of a linear EFT includes the SM and an ALP [18,20,55] and contrasts with the framework of a chiral EFT when considering interactions involving the ALP and the Higgs boson [20,21]. The current experimental results do not preclude the existence of a Higgs component that deviates from this doublet structure, at least within a 10% uncertainty margin [56], thus making the non-linear EFT methodology equally pertinent for exploration [57][58][59][60][61][62][63]. We will mainly focus on the linear EFT framework in this paper, while also considering the chiral EFT context to assess the ALP-Higgs interactions.
In future LHC runs, the non-resonant ALP searches are set to become increasingly competitive. This improvement is expected not just because of the significant growth of available data on the high luminosity frontier but also due to progress inspired by the SMEFT studies, which encourage a generalised, systematic approach to probing new physics [64]. While the SMEFT presumes that new physics manifests through particles that are too heavy to be produced on-shell [65,66], non-resonant ALP searches aim to look for ALPs too light to undergo resonant decays. This distinct approach enables non-resonant ALP searches to explore complementary areas of the parameter space, depending on minimal assumptions about the ALP decay width.
The plan of the paper is the following. In section II, we describe the ALP effective theory and set the framework for our analysis. This is followed in section III by a discussion of the general features of the non-resonant ALP EW processes considered in this study. In section IV, we undertake a detailed collider analysis studying the kinematical features of the signal and background processes. We present the constraints derived on the parameters of the ALP Lagrangian using measurements from the latest available Run 2 LHC data. We discuss the validity range of our analysis. Thereafter, we define some benchmark scenarios for ALP signals and discuss the projected sensitivities to the effective couplings in the upcoming HL-LHC run. We also discuss the constraints arising from the direct probe of these couplings through the production of an ALP in association with a Higgs boson or a vector boson. In section V, numerical results and their interpretations, along with detailed discussions on cross-section parameter dependencies, are covered. In section VI, the use of boosted decision trees to improve the cut-based results is explored. In section VII, we summarise the existing constraints from other experiments on the ALP mass and couplings. Finally, we draw our conclusions in section VIII.

II. ALP EFFECTIVE LAGRANGIAN
We consider an ALP, denoted by $a$, which is a pseudo-scalar state. Its interactions are constructed to respect the invariance under shifts $a(x) \to a(x) + \alpha$, where $\alpha$ is a constant (so that its couplings are of the form $J^\mu \partial_\mu a$, consistent with its Goldstone nature). Within the EFT framework, we express all ALP interactions with suppression factors inversely proportional to the characteristic scale $f_a \gg m_a$ (the mass of the ALP), which is unknown and naturally close to the mass scale of the heavy sector the ALP originates from. It is also implicitly assumed that $f_a \gg v$, where $v$ denotes the EW scale. We require all ALP interactions to be invariant under the full SM gauge group. For the linear EWSB realization, the most general linear bosonic Lagrangian, incorporating next-to-leading order (NLO) effects related to $a$, is given by
where the leading order Lagrangian now comprises the SM Lagrangian along with the ALP kinetic term, while the NLO bosonic corrections due to the ALP interactions with the SM fields are included in the effective Lagrangian:
Eqn. (3) contains a complete and non-redundant set of dimension-5 bosonic operators which are given by:
Here, $G_{\mu\nu}$, $W_{\mu\nu}$ and $B_{\mu\nu}$ are the generic field strength tensors corresponding to the SM gauge groups $SU(3)_c$, $SU(2)_L$ and $U(1)_Y$ respectively. The dual field strength tensors $\tilde{X}^{\mu\nu}$ are defined by $\tilde{X}^{\mu\nu} \equiv \frac{1}{2}\epsilon^{\mu\nu\rho\sigma} X_{\rho\sigma}$, with $\epsilon^{0123} = 1$. The associated operator coefficients $c_i$ in Eqn. (3) are real constants. $\Phi$ is the SM Higgs doublet, with
The first three operators in Eqn. (3) induce ALP couplings to the gluon, the photon and the $Z$ and $W$ bosons as given by:
The coupling strengths are defined as:
with $s_w$ and $c_w$ denoting the sine and cosine of the Weinberg angle, respectively. After electroweak symmetry breaking, the last operator in Eqn. (3), $O_{a\Phi}$, induces a contribution to a two-point function involving longitudinal gauge fields and can be removed via a Higgs field redefinition. To assess its effect on observables, one approach is to substitute it with a fermionic vertex [18]. This substitution can involve a vertex that either conserves or flips chirality, or a combination of both. For illustration, the Higgs field redefinition
$\Phi \to e^{i c_{a\Phi} a/f_a}\, \Phi$ (7)
when applied to the bosonic Lagrangian in Eqn. (1), leads to a modification originating from the Higgs kinetic energy term in the SM. This modification precisely negates $O_{a\Phi}$ up to $O(a/f_a)$. Meanwhile, the Yukawa terms in the SM generate a new Yukawa-axion coupling, allowing for a complete substitution of $O_{a\Phi}$. The overall effect is the replacement in Eqn. (3) by:
where $Y_{u,d,\ell}$ are the SM Yukawa matrices. In this work, we focus on experimental signatures that involve ALPs and SM bosons ($W$, $Z$, $\gamma$ and $h$). We do not consider the CP-violating terms and direct ALP-fermion interactions (stemming from the $O_{a\Phi}$ operator), since such interactions are markedly suppressed at tree-level due to their proportionality to the involved fermion Yukawa couplings.
Within the framework of non-linear (chiral) electroweak theory, the interactions of the ALP with SM fields at leading order are captured by the following expression:
Here, $\mathcal{L}^{\rm LO}_{\rm HEFT}$ denotes the chiral Lagrangian within the Higgs Effective Field Theory (HEFT) [57][67][68][69] framework. In this model, the Higgs boson is treated as a singlet field, while the Goldstone bosons $\pi^a$ are introduced in a non-linear representation, through the exponential parametrization by means of a unitary matrix $U$ given by:
where $\tau^a$, $a = 1, 2, 3$, are the Pauli matrices. The matrix $U$ transforms as a bi-fundamental under $SU(2)_L \times SU(2)_R$:
The series expansion of $U$ is as follows:
where $G^\pm$ and $G^0$ are defined as $G^\pm = (\pi_2 \pm i\pi_1)/\sqrt{2}$ and $G^0 = -\pi_3$, respectively. This peculiarity implies that there are multiple Goldstone boson interactions possible in the HEFT formalism, not just among themselves but also with the other fields. We work under this framework to study novel ALP-Higgs interactions that can probe the unique singlet nature of the Higgs boson as described by the HEFT Lagrangian. The leading order Lagrangian for ALP interactions is expressed as:
where the fields $V_\mu(x)$ and $T(x)$ are defined by the relations:
In this framework, as stated, the Higgs boson is introduced as a gauge-singlet scalar field. Symmetry arguments place no limitations on the implementation of this field and its interactions with itself and with the other fields. Its interactions are incorporated by polynomial functions such as:
where the coefficients $a_{2D}$ and $b_{2D}$ are independent constants. The term $A_{2D}$ serves as the chiral analogue of the linear operator $O_{a\Phi}$, with a distinct feature: it facilitates not only ALP-fermion interactions comparable to those in Eqn. (8), but it also induces new interactions at leading order between the ALP, the electroweak gauge bosons and the Higgs, such as the trilinear $aZh$ and $a\gamma h$ couplings. Exploring the phenomenology of these interactions yields an understanding of the process of electroweak symmetry breaking, distinct from the linear approach, and of its interplay with axion-like states. Also, the other induced interactions in the polynomial function (14) can be important compared to the effects from other possible operators involving interactions of the Higgs and the gauge bosons at the same order. Within the linear paradigm, such interactions emerge at next-to-next-to-leading order (NNLO), corresponding to mass dimension seven, and thus their effects are expected to be relatively subdominant. Furthermore, within the chiral framework, the operators $O_G$, $O_W$ and $O_B$ (in Eqn. (4)) also become relevant at NLO.

III. ALP MEDIATED PROCESSES
We focus exclusively on processes in which an off-shell ALP is produced and goes into SM final states only. These processes are the production of $Zh$, $Z\gamma$, $W^\pm W^\mp$ and $W^\pm W^\mp\gamma$ from $pp$ collisions. They all probe different operator combinations within the ALP EFT parameter space. To facilitate our discussion, we present in Fig. 1 the Feynman diagrams which, by virtue of higher dimensional operators, contribute to the aforementioned processes. The blobs on the vertices of diagrams (a)-(i) stand for the possible inclusion of one of the higher dimensional operators listed in Eqns. (5) and (13). ALP production in these processes is dominated by gluon-gluon fusion, as the $q\bar{q}$ induced process for these final states is proportional to the quark masses from the operator $O_{a\Phi}$ (see Eqn. (8)) and thus highly suppressed. These channels have been studied for heavy resonant searches in the differential measurements of the invariant mass of the final state system by the CMS and ATLAS collaborations. No excess of events has been found and we shall reinterpret these measurements for the ALP mediated processes. We particularly aim to probe the boosted regime with at least one of the weak bosons or the Higgs boson decaying hadronically. This ensures a large fraction of events and reduced uncertainties, while maintaining a clean environment through the use of jet substructure techniques for tagging the heavy bosons. Such boosted regimes with improved techniques are useful for identifying lighter ALPs that get rejected by the selection criteria of the cross-section measurements. The $WW\gamma$ channel is an exception, for which we will study the fully leptonic final state. All four processes receive contributions from an s-channel mediated non-resonant ALP. The $WW\gamma$ process receives an additional contribution from initial quark states. We included these diagrams in our calculation for consistency. We have, however, checked that their contribution is significantly lower compared to those initiated by gluons. We investigate the non-resonant triboson production mediated through the ALP. It is known that resonant triboson production puts stringent constraints on ALP couplings for $m_a > 100$ GeV [49]. The non-resonant ALP mediated $WW\gamma$ process can be induced by the couplings $\{g_{agg}, g_{aZ\gamma}, g_{aWW}, g_{aWW\gamma}, g_{a\gamma\gamma}\}$. The couplings $g_{aWW}$ and $g_{aWW\gamma}$ depend on one parameter, $c_W$. However, the four-point $aWW\gamma$ interaction, with a different Lorentz structure, can leave kinematic effects in the process that are distinct from those of the $aWW$ interaction. Both the couplings $g_{aWW}$ and $g_{aWW\gamma}$ lead to an amplitude growing with energy. In the case of $g_{aWW}$, the energy growth arises because of the extra powers of momenta in the $aWW$ vertex, whereas for the contact interaction, $g_{aWW\gamma}$, the energy growth is also due to the absence of one propagator in the diagram involving this vertex (Fig. 1 (d)).
As the ALP is always off-shell, its propagator acts as a suppression in the hadronic scattering amplitudes. However, due to the explicit momentum dependence of the ALP interactions under discussion, the ALP couplings lead to a stronger energy growth with the invariant mass of the event final states than that in the corresponding SM backgrounds.
All the diagrams in Fig. 1 must arise with double insertions of ALP operators. As a result, the amplitude scales as $f_a^{-2}$ and the cross-sections as $f_a^{-4}$. In all generality, the contributions from bosonic ALP couplings in Eqn. (5) interfere with the SM amplitudes. Thus, a generic cross-section, when expressed as a polynomial function of the Wilson coefficients $c_i/f_a$, including both SM and EW ALP contributions, has the structure
where
FIG. 1: Representative Feynman diagrams depicting the production of (a) $gg \to Zh$, (b) $gg \to Z\gamma$, (c) $gg \to W^+W^-$ and (d)-(i) $gg(q\bar{q}) \to W^+W^-\gamma$ mediated by an off-shell ALP. Each of the diagrams consistently involves a double insertion of ALP operators.
When the ALP couplings are relatively small, their interference with the SM background may become comparable with the pure ALP signal and thus must be considered in the evaluation of the process. The coupling value at which this interference becomes significant varies with the specific final state being analyzed. In processes with electroweak diboson final states, the ALP signal interferes with the SM ones occurring at one-loop. The interference can be constructive or destructive, depending on the relative sign of the couplings $g_{agg}$ and $g_{aV_1V_2}$ (the new vertices in the diagrams of Fig. 1). Currently, however, the magnitudes of ALP-gluon couplings accessible at the LHC are only loosely constrained and the interference effect is suppressed in the total cross-section estimation [50]. The quartic dependence from the pure ALP interactions dominates and results in a large-$\hat{s}$ enhancement in the cross-section, $\sigma_{\rm ALP} \sim \hat{s}/f_a^4$. Such energy scaling is valid only as long as the energies involved in the scattering process remain below the cutoff scale of the EFT, $\sqrt{\hat{s}} < f_a$. On the other hand, the SM backgrounds usually scale as $1/\hat{s}$ well above the resonance of the s-channel. In hadronic collisions, the calculation of any cross-section involves a convolution of this partonic cross-section with the parton distribution functions (PDFs). These PDFs decline with increasing energy. Taking this effect into account, the ALP mediated rates decrease more slowly with the invariant mass of the system than the SM background. This allows one to distinguish ALP-mediated processes from the SM background, as discussed in the following sections.
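The partonic scaling argument above can be illustrated with a toy comparison. The sketch below is not part of the analysis: the function names and the unit normalisations are hypothetical, PDFs are ignored, and only the quoted scalings $\sigma_{\rm ALP} \sim \hat{s}/f_a^4$ and $\sigma_{\rm SM} \sim 1/\hat{s}$ are used, so that the signal-to-background ratio grows as $\hat{s}^2/f_a^4$ within the EFT validity range.

```python
def alp_partonic_xsec(sqrt_s_hat, f_a, norm=1.0):
    """Toy scaling sigma_ALP ~ s_hat / f_a^4 (arbitrary normalisation, energies in GeV)."""
    if sqrt_s_hat >= f_a:
        # The scaling is only assumed below the EFT cutoff, sqrt(s_hat) < f_a.
        raise ValueError("sqrt(s_hat) must stay below f_a for the EFT scaling")
    return norm * sqrt_s_hat**2 / f_a**4


def sm_partonic_xsec(sqrt_s_hat, norm=1.0):
    """Toy scaling sigma_SM ~ 1 / s_hat (arbitrary normalisation)."""
    return norm / sqrt_s_hat**2


def signal_over_background(sqrt_s_hat, f_a):
    """Ratio of the two toy scalings; grows as s_hat^2 / f_a^4."""
    return alp_partonic_xsec(sqrt_s_hat, f_a) / sm_partonic_xsec(sqrt_s_hat)


if __name__ == "__main__":
    f_a = 5000.0  # GeV, illustrative value only
    for e in (500.0, 1000.0, 2000.0):
        print(e, signal_over_background(e, f_a))
```

Doubling $\sqrt{\hat{s}}$ raises the toy ratio by a factor of 16, which is the qualitative reason the high invariant-mass bins carry most of the sensitivity.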

IV. COLLIDER ANALYSIS
The effective Lagrangian has been implemented in FeynRules [70] to generate the UFO model file [20] for the event generator MadGraph5_aMC@NLO [71]. MadGraph was employed to produce all signal and background event samples. These events are generated at leading order (LO) and subsequently processed by Pythia (v8) [72] for parton showering and hadronization. For event generation, NNPDFNLO parton distribution functions [73] are used, with both factorization and renormalization scales set dynamically to half the sum of all final state transverse energies in the scattering processes. The matching parameter, QCUT, was determined separately for the different processes, as discussed in Ref. [74]. Detector effects are incorporated by passing the events through Delphes-v3.4.1 [75]. Jets are reconstructed using Fastjet-v3.3.2 [76]. We impose a set of cuts at the generator level on the final state particles, namely,
all processes, except for the $WW\gamma$ channel where we require $m_{\ell\ell'} > 10$ GeV. The angular separation between two particles is defined as $\Delta R = \sqrt{\Delta\eta^2 + \Delta\phi^2}$, where $\eta$ is the pseudorapidity and $\phi$ the azimuthal angle of each particle.
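For concreteness, the angular separation used in the cuts can be computed as in the following minimal sketch. The helper names are hypothetical; the only ingredient beyond the definition in the text is the wrapping of the azimuthal difference into $[-\pi, \pi)$, which is needed because $\phi$ is periodic.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation Delta R = sqrt(Delta eta^2 + Delta phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


if __name__ == "__main__":
    # Two objects on either side of the phi = +-pi boundary are close in Delta R.
    print(delta_r(1.0, 3.0, 1.0, -3.0))
```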
The ALP signal events are generated fixing m a = 1 MeV, treating the ALP as effectively massless at LHC energies.
The ALP width $\Gamma_a$ is assumed to be considerably smaller than $m_a$. The specific choices of the ALP mass and its decay width have negligible impact in the non-resonant regime. We generate signal samples with pure ALP-mediated production and the interference between the ALP and the SM processes. However, we have checked that the estimation of the total rate of the process is numerically dominated by $\sigma_{\rm NP}$ (Eqn. (15)).
A. 13 TeV LHC probes
In this section, we present the details of the process analyses. All of these processes are sensitive to the product of the ALP-gluon coupling $g_{agg}$ and the relevant ALP-bosonic couplings. We derive constraints on these ALP interactions via the non-resonant ALP-mediated signals mentioned above, using publicly available data from the ATLAS and CMS collaborations at the Run II 13 TeV LHC, as listed in Table I.

pp → Zh
This process yields a powerful probe of the ALP-Higgs coupling through the operator $A_{2D}$ in Eqn. (13) and also assumes the additional presence of $g_{agg}$. It may be expected to be among the leading signals of ALP-Higgs interactions and would constitute conclusive evidence if the underlying EWSB enjoys a non-linear character. There can be further probes of this operator contribution in double Higgs production. In fact, this operator also induces an $ah\gamma$ interaction, and one may therefore ask about a $pp \to h\gamma$ signal mediated by the ALP. However, since the ALP forces the interaction to be derivative and the photon is transverse and on-shell in $pp \to h\gamma$, the cross-section vanishes.
In order to study the current reach of the LHC in constraining this coupling through $pp \to Zh$, we optimize a hadron-level analysis to obtain the sensitivity to the BSM signal, which is well-pronounced in the high energy bins. To achieve this, we consider $Z(\ell^+\ell^-)h$ production and scrutinize the $h \to b\bar{b}$ decay channel. The dominant backgrounds consist of $Zb\bar{b}$ and the irreducible SM production of $Zh$. Reducible contributions arise from $Z$+jets production (c-quarks included but not explicitly tagged), where the light jets can be misidentified as b-jets, and from $t\bar{t}$ production in the fully leptonic decay mode. Rather than performing a resolved analysis with two distinct b-tagged jets, our method focuses on a single fat-jet with a cone radius $R = 1.0$. We apply the BDRS method [81] with some minor modifications to enhance sensitivity. This technique clusters jets using the Cambridge/Aachen (CA) algorithm with a cone radius large enough to encapsulate all decay products of a resonance (like the Higgs boson). The process involves breaking the primary jet $J$ into two subjets, $j_1$ and $j_2$, with $m_{j_1} > m_{j_2}$. We impose a mass drop condition, $m_{j_1} < \mu\, m_J$ with $\mu = 0.66$ ($m_J$ is the mass of the fat-jet), along with a symmetry criterion between the subjets requiring $\min(p_{T,j_1}^2, p_{T,j_2}^2)\, \Delta R_{j_1,j_2}^2 / m_J^2 > 0.09$. If the condition fails, the lighter subjet, $j_2$, is removed and the process repeats with $j_1$. This iteration continues until a final jet $J$ is obtained that satisfies the mass drop condition. This selection is fairly efficient in filtering out QCD jets but can still be impacted by the underlying event at the high energies and luminosities of the LHC. To further eliminate rare QCD events and effects from hard gluon emissions or the underlying event, we refine the Higgs vicinity by recombining the components of $j_1$ and $j_2$ using the CA algorithm with a reduced radius $R_{\rm filt} = \min(0.2, R_{b\bar{b}}/2)$. We keep only the three hardest filtered subjets for the resonance (Higgs boson) reconstruction. Overall, this approach effectively distinguishes boosted electroweak-scale resonances from significant QCD backgrounds.
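A single declustering step of the mass-drop procedure described above can be sketched as follows. This is an illustration only, not the code used in the analysis: the function name and the dictionary layout are hypothetical, the symmetry variable is assumed to take the standard BDRS form $y = \min(p_{T,j_1}^2, p_{T,j_2}^2)\,\Delta R_{j_1 j_2}^2/m_J^2$, and pseudorapidity is used as a stand-in for rapidity in $\Delta R$.

```python
import math


def passes_mass_drop(m_fat, j1, j2, mu=0.66, y_cut=0.09):
    """Check one declustering step.

    m_fat is the mass of the parent jet J (GeV); j1 and j2 are dicts with
    keys "pt", "eta", "phi", "m" describing the two subjets.
    """
    # Order the subjets so that j1 is the heavier one, as in the text.
    if j2["m"] > j1["m"]:
        j1, j2 = j2, j1
    # Mass-drop condition: m_j1 < mu * m_J.
    mass_drop = j1["m"] < mu * m_fat
    # Symmetry variable (assumed standard BDRS form, see lead-in).
    dphi = (j1["phi"] - j2["phi"] + math.pi) % (2.0 * math.pi) - math.pi
    dr2 = (j1["eta"] - j2["eta"]) ** 2 + dphi ** 2
    y = min(j1["pt"], j2["pt"]) ** 2 * dr2 / m_fat ** 2
    return mass_drop and y > y_cut


if __name__ == "__main__":
    hard = {"pt": 200.0, "eta": 0.0, "phi": 0.0, "m": 20.0}
    soft = {"pt": 150.0, "eta": 0.3, "phi": 0.4, "m": 10.0}
    print(passes_mass_drop(125.0, hard, soft))
```

In a full implementation a failed step would discard the lighter subjet and repeat on the heavier one, followed by the filtering stage; both are omitted here to keep the sketch short.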
The event selection criteria are based on Ref. [77]. We construct fat-jets with a minimum transverse momentum $p_T > 100$ GeV and a rapidity cut of $|y| < 2.5$. Leptons are isolated within a radius $R = 0.3$, with $p_T > 25$ GeV and $|\eta| < 2.5$. We select events with exactly two isolated, opposite-charge, same-flavor leptons that are consistent with the Z peak, with an invariant mass between $\max[40, 87 - 0.030\, m_{Zh}]$ GeV and $[97 + 0.013\, m_{Zh}]$ GeV (as a function of $m_{Zh}$), and with a leptonic separation of $\Delta R > 0.2$. For the Higgs reconstruction, we require at least one fat-jet with a minimum of two B-meson tracks ($p_T > 15$ GeV) and a fat-jet $p_T > 250$ GeV. After the mass-drop and filtering criteria, events with exactly two b-tagged subjets, well-separated from the isolated leptons, are selected. The Higgs invariant mass is required to be between 75 and 145 GeV. To minimize the backgrounds, both the reconstructed Z and Higgs are required to have $p_T > 200$ GeV, and the $t\bar{t}$ background is significantly reduced by requiring $E_T^{\rm miss}/\sqrt{H_T} < 1.15 + 8 \times 10^{-3}\, m_{Zh}/(1~{\rm GeV})$. The $p_T^{\ell\ell}$ is also required to be greater than $20 + 9\sqrt{m_{Vh}/(1~{\rm GeV}) - 320}$ GeV, where all events are required to have a minimum invariant mass of the Z and Higgs system of 320 GeV. ATLAS provides a measurement of the invariant mass of the $Zh$ system in the 2 leptons + 2 b-jets final state [77]. The bins extend in varying steps from 320 GeV to 2.8 TeV. These cuts are relaxed for the higher-energy tails to account for resolution effects and smaller backgrounds, and lead to a higher signal acceptance up to energies of multiple TeV. The corresponding signal and background distributions with the ATLAS data are shown in Fig. 2 (a). The SM background and the experimental data have been obtained from [83].
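The mass-dependent dilepton window quoted above can be written as a small helper. This is a sketch under stated assumptions: the function names are hypothetical, all masses are in GeV, and only the $m_{\ell\ell}$ window and the fixed Higgs mass window from the text are encoded.

```python
def dilepton_mass_window(m_zh):
    """Return (low, high) edges in GeV of the m_ll window for a given m_Zh in GeV."""
    low = max(40.0, 87.0 - 0.030 * m_zh)
    high = 97.0 + 0.013 * m_zh
    return low, high


def passes_mass_windows(m_ll, m_higgs, m_zh):
    """Apply the m_ll window and the 75-145 GeV Higgs candidate mass window."""
    low, high = dilepton_mass_window(m_zh)
    return (low < m_ll < high) and (75.0 < m_higgs < 145.0)


if __name__ == "__main__":
    for m_zh in (320.0, 1000.0, 2000.0):
        print(m_zh, dilepton_mass_window(m_zh))
```

The window widens with $m_{Zh}$, which is one way the selection is relaxed in the high-energy tail, where the dilepton mass resolution degrades.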
Cross-sections for each of the simulated background processes are summarised in Table II. All the aforementioned background processes are generated at LO and multiplied by appropriate K-factors to obtain the higher-order QCD cross-sections.
TABLE II: The cross-sections for the background processes used in this analysis are shown with the order of QCD corrections provided in brackets. The $\sigma_{bc}$'s and $\sigma_{ac}$'s are the cross-sections before and after the cuts discussed in the text are applied. The last column presents the K-factors for the higher order corrections of the processes with respect to the leading order cross-sections.
In Ref. [82], the CMS collaboration has performed a search for the non-resonant ALP-mediated production of $Zh$ in the semileptonic channel. The analysis requires the leading (sub-leading) lepton in the event to have $p_T > 40\ (30)$ GeV and $|\eta| < 2.1\ (2.4)$. The dilepton pair is required to have an invariant mass in the range 70 GeV $< m_{\ell\ell} <$ 110 GeV and $p_T^{\ell\ell} > 200$ GeV. In addition, the events contain an anti-$k_T$ jet with radius $R = 0.8$ and $p_T^J > 200$ GeV. The merged jet mass is required to be in the range 65 GeV $< m_J <$ 105 GeV. The analysis also makes use of the N-subjettiness variable and requires events with $\tau_{21} < 0.4$ for the fat-jet. This study spans $m_{Zh}$ bins from 450 GeV to 2 TeV. Overall, the CMS analysis translates into an average ALP signal selection efficiency of $\sim 7\%$ (Ref. [82]).

pp → Zγ
We then consider the signal of $Z\gamma$ production mediated by an off-shell ALP, where the Z decays hadronically. This process receives contributions from the bosonic operator coefficients $c_B$ and $c_W$, apart from the ALP-gluon coupling $c_G$. These coefficients also affect the $a\gamma\gamma$ and $aZZ$ vertices. Hence, to fully understand the $aZ\gamma$ vertex modification, assumptions on $g_{a\gamma\gamma}$ and $g_{aZZ}$ are necessary, as we elaborate later. In this process, we consider the regime where both the Z boson and the photon are significantly boosted, leading to all hadronic decay products of the Z being contained within a large radius jet. Consequently, the final state features a fat-jet recoiling against a hard photon. We employ jet substructure techniques to reconstruct the Z jet from its invariant mass, with the fat-jet radius estimated by the relation . The following SM processes can mimic the $Z\gamma$ signal. The continuum $\gamma j$ process emerges as the most dominant background. The $Z/W\gamma$+jets process, while having a similar topology to the signal, is less prevalent due to its lower cross-section. Production of $t\bar{t}\gamma$ with hadronic decays of the top quarks also contributes to the background. However, demanding a high-$p_T$ photon and Z tagging can suppress these backgrounds. Similarly, single top production processes like $tj\gamma$ and $tb\gamma$ also contribute to the background. The $pp \to h(\to b\bar{b})\gamma$ associated production in the SM has a nominal rate, either due to the very small couplings of the Higgs to the initial state quarks or because the process predominantly receives contributions at one-loop.
The ATLAS Collaboration [78] has searched for a resonance decaying into a Z boson and a photon. No significant excess over the SM expectation has been reported. In the signal region 800 GeV $< m_{J\gamma} <$ 2 TeV, ATLAS has collected 55 events with $\int L\,dt = 36.1$ fb$^{-1}$. We reinterpret this analysis to derive constraints on ALP interactions. We compare our signal with the SM background expectation of Fig. 5 (a) of Ref. [78], as shown in Fig. 2 (b). The selection criteria based on Ref. [78] and the corresponding cut efficiency are presented in Table III.

pp → W W
The ALP mediated production of $WW$ via gluon-gluon fusion depends on only one bosonic operator, $O_W$, and the ALP-gluonic operator $O_G$. We consider final states where one W decays leptonically ($e\nu$ or $\mu\nu$) and the other W decays hadronically. The fully leptonic decay channel has been recently studied in Ref. [51]. Although the hadronic decay channel of a vector boson is overwhelmed by background processes with significantly large cross-sections, it has a larger branching fraction than the leptonic decay channel. It also allows a full kinematic reconstruction of the diboson system ($W_{\rm lep} + W_{\rm had}$), using the W mass to constrain the combined four-momentum of the lepton and neutrino. The semileptonic final state, therefore, offers a good balance between efficiency and purity.
Since the effects of the ALPs are most dramatic at high vector boson momenta, we consider highly Lorentz-boosted vector bosons where the hadronization products of the two final state quarks overlap in the detector to form a single, large-radius jet. Dominant backgrounds to this signal come from SM processes: $W$+jets (with the W decaying leptonically), $t\bar{t}$ (semi-leptonic mode), single top quark production ($t(\bar{t})j$, $tW$), $W^+W^-$+jets ($W \to \ell\nu$, $W \to jj$), $t\bar{t}W$+jets (when both top quarks decay hadronically and $W \to \ell\nu$) and $WZ$ (with $W \to \ell\nu$, $Z \to jj$). The event reconstruction and event selection criteria are based on Ref. [79]. To reject other subdominant backgrounds from Drell-Yan and fully leptonic $t\bar{t}$ events, we reject events that contain more than one lepton. Jets are clustered by the anti-$k_T$ algorithm with radius parameter $R = 0.8$ and required to have a hard $p_T > 200$ GeV. The $\vec{p}_T^{\rm\ miss}$ is required to be larger than 110 GeV to reject QCD multijet background events.
The leptonic W boson candidate is reconstructed from the lepton and the $\vec{p}_T^{\rm\ miss}$. The longitudinal momentum of the neutrino can be solved for by applying the W boson mass constraint, assuming that the neutrino is the sole contributor to $p_T^{\rm miss}$. Here, we follow the CMS analysis method [79]. The transverse component of the neutrino momentum comes directly from the $\vec{p}_T^{\rm\ miss}$. Fixing the mass of the W boson candidate to its pole mass value, one can relate the four-momentum of the W boson to those of the lepton and neutrino via a quadratic equation, which can have two real or two complex solutions. In the case of two real solutions, the solution with the smaller absolute value is assigned as the neutrino longitudinal momentum, whereas in the case of two complex solutions, the real part common to both is assigned instead. The leptonic and hadronic boson candidates are combined into a diboson system by adding their four-momenta. Because the signal events are expected to have a back-to-back topology in the detector, we require events in the signal region to satisfy the following criteria: $\Delta R(J, {\rm lepton}) > \pi/2$, $\Delta\phi(J, \vec{p}_T^{\rm\ miss}) > 2$ and $\Delta\phi(J, W_{\rm lep}) > 2$, where $W_{\rm lep}$ denotes the reconstructed leptonic W boson candidate. Additionally, we require $m_{WW} > 900$ GeV to isolate the signal events. The CMS collaboration presents a measurement of the $m_{WW}$ distribution in the 1 lepton + 1 fat-jet + missing energy channel, employing a dataset of 35.9 fb$^{-1}$ integrated luminosity from the Run II LHC [79]. This analysis spans $m_{WW}$ bins up to 4 TeV. The invariant mass of the reconstructed diboson system, $m_{WW}$, is the chosen event variable for the signal extraction. The comparison of the ALP signal with CMS data is illustrated in Fig. 2 (c).
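The neutrino longitudinal momentum reconstruction described above can be sketched as follows. This is a minimal illustration, not the code of the CMS analysis [79]: the function name is hypothetical, the lepton is treated as massless, and the W pole mass value of 80.4 GeV is an assumed input.

```python
import math

M_W = 80.4  # GeV; assumed value of the W pole mass


def neutrino_pz(lep_px, lep_py, lep_pz, met_px, met_py, m_w=M_W):
    """Solve the W mass constraint for the neutrino longitudinal momentum.

    The lepton is treated as massless and the missing transverse momentum
    is attributed entirely to the neutrino.
    """
    lep_pt2 = lep_px**2 + lep_py**2
    lep_e2 = lep_pt2 + lep_pz**2
    met2 = met_px**2 + met_py**2
    mu = 0.5 * m_w**2 + lep_px * met_px + lep_py * met_py
    # Quadratic: pT_l^2 pz^2 - 2 mu pz_l pz + (E_l^2 MET^2 - mu^2) = 0
    a = mu * lep_pz / lep_pt2
    disc = a**2 - (lep_e2 * met2 - mu**2) / lep_pt2
    if disc < 0.0:
        # Two complex solutions: keep the real part common to both.
        return a
    root = math.sqrt(disc)
    sol_plus, sol_minus = a + root, a - root
    # Two real solutions: keep the one with the smaller absolute value.
    return sol_plus if abs(sol_plus) < abs(sol_minus) else sol_minus


if __name__ == "__main__":
    print(neutrino_pz(30.0, 10.0, 20.0, -20.0, 25.0))
```

When the discriminant is non-negative, the lepton-neutrino invariant mass built from the returned solution reproduces the input W mass by construction.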

pp → W W γ
We now consider the non-resonant ALP-mediated production of the triboson state W + W − γ from pp collisions, with both W bosons decaying leptonically. We find that even for an elusive ALP mass of m a < 100 GeV, the process W + W − γ deviates from the SM case, as it is modified by the ALP-gluon coupling and the ALP-bosonic couplings {g agg , g aZγ , g aW W , g aW W γ , g aγγ }. Both the couplings g aW W and g aW W γ depend on the single parameter c W , while the couplings g aZγ and g aγγ depend on c B as well as c W . The event reconstruction and event selection criteria are based on Ref. [80]. We look into final states with two different flavour, opposite sign (DFOS) leptons and one photon along with / E T . Amongst the existing analyses of the same final state carried out by the experimental collaborations, the CMS analysis has recently reported the first observation of SM W + W − γ production in the leptonic decay channel [80]; hence, we reinterpret this measurement to constrain the new physics parameter space. Although the cross-section for the ALP signal in the 2 → 3 process is small (∼ O(1) fb for f a ∼ O(1) TeV and m a < 100 GeV), the SM backgrounds for this channel are also small. The main SM backgrounds arise from W W γ, W Zγ, Zγ and t t̄γ and from processes with non-prompt leptons and photons. The final state events comprise a photon with transverse momentum p γ T > 20 GeV and |η γ | < 2.5. There should be exactly one pair of DFOS leptons with |η l | < 2.5 and p l T > 20 GeV. We also require p miss T > 20 GeV. To minimise backgrounds from W Zγ and relevant top quark processes, events that contain an additional lepton with p T > 10 GeV or at least one b-jet are rejected. The photon and the leptons must be well separated, such that ∆R(l, γ) > 0.5. To further suppress background contributions, we impose specific criteria: the dilepton invariant mass (m ll ) > 10 GeV, the dilepton transverse momentum (p ll T ) > 15 GeV and the 
transverse mass, in the bins of the invariant mass of dilepton-photon system (m llγ ) are compared with the ALP signal (as shown in Fig. 2 (d) for one such benchmark case of ALP scenario) to derive constraints on its couplings.

B. Fits to EFT coefficients
We take the experimental measurements in Table I as input, together with our theoretical expectations for the observables in the ALP model. For the Zh and Zγ channels, we quantify the effects of the Wilson coefficients in the ALP EFT with a simplified binned likelihood ratio analysis. The likelihood function, constructed as a product of binned Poisson probabilities, can be expressed as
$$L(\mu) = \prod_k \frac{(\mu s_k + b_k)^{n_k}}{n_k!}\, e^{-(\mu s_k + b_k)},$$
where s k , b k and n k denote respectively the number of ALP signal, SM background and observed data events in a given bin k. The signal strength modifier µ involves the ALP signal couplings (c i /f a ) and is the only variable parameter in the likelihood function, with no systematic uncertainties considered for simplicity (for details see Ref. [20]). L(µ) is maximised for no ALP signal events, which corresponds to the background-only hypothesis. It is tested against the combined background and signal hypothesis. No significant excess over the SM expectations is observed in the experimental data. ALP couplings c i /f a are considered excluded at 95% C.L. when the negative log-likelihood (NLL), − log L, of the combined signal and background hypothesis exceeds the NLL of the background-only hypothesis by 3.84/2 units.
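As an illustration of this exclusion criterion, the following Python sketch evaluates the binned Poisson NLL and compares the signal-plus-background hypothesis with the background-only one. The function names and the toy event counts are hypothetical and are not taken from the analysis; the signal template is assumed to be already evaluated at the coupling value under test, so that µ = 1.

```python
import math


def nll(mu, signal, background, observed):
    """Negative log-likelihood for a product of binned Poisson probabilities."""
    total = 0.0
    for s, b, n in zip(signal, background, observed):
        lam = mu * s + b  # expected events in this bin
        total += lam - n * math.log(lam) + math.lgamma(n + 1.0)
    return total


def is_excluded(signal, background, observed, threshold=3.84 / 2.0):
    """True if NLL(s+b) exceeds NLL(b-only) by more than 3.84/2 (95% C.L.)."""
    delta = nll(1.0, signal, background, observed) - nll(0.0, signal, background, observed)
    return delta > threshold


# Toy example (illustrative numbers only): data agree with the background.
background = [100.0, 40.0, 10.0]
observed = [100, 40, 10]
weak_signal = [5.0, 5.0, 5.0]
strong_signal = [10.0, 10.0, 10.0]
print(is_excluded(weak_signal, background, observed))
print(is_excluded(strong_signal, background, observed))
```

Since the factorial term is common to both hypotheses, it cancels in the difference; it is kept here only so that `nll` matches the full likelihood.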
For the W W and W W γ channels, we perform a χ 2 fit to the data, including systematic errors but no correlations between the bins. The χ 2 function of the Wilson coefficients is minimised to find the best fit value of c i /f a and the 95% C.L. intervals are obtained by requiring ∆χ 2 = χ 2 − χ 2 min ≤ 3.84. The bounds extracted from the analyses of these four processes constrain the products g agg g aV1V2 and g agg g aZh . For the Zh process, we obtain g agg a 2D < 0.075 TeV −2 at 95% C.L. The limits on the coupling product g agg g aW W at 95% C.L. are determined to be g agg g aW W < 0.59 TeV −2 from the W W analysis and g agg g aW W < 0.27 TeV −2 from the W W γ analysis. In addition, the W W γ process induces a four-point aW W γ interaction and the analysis constrains it to g agg g aW W γ < 0.18 TeV −2 . The Zγ process analysis yields a 95% C.L. exclusion limit of g agg g aZγ < 0.24 TeV −2 . These limits can be interpreted as constraints on g aV1V2 , assuming a constant g agg value of 1 TeV −1 . A smaller g agg would result in more stringent limits on g aV1V2 . It is noteworthy that these bounds on the operator coefficients are driven mainly by the higher energy data bins.
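The ∆χ 2 ≤ 3.84 criterion can be illustrated with a simple one-parameter scan in Python. This is a sketch under stated assumptions, not the fit used in the analysis: the signal yield is assumed to scale as the square of the coupling product (no interference with the SM), bins are uncorrelated, and the function names and toy inputs are hypothetical.

```python
def chi2(g, data, background, signal_ref, sigma, g_ref=1.0):
    """Uncorrelated chi-square for a coupling product g.

    signal_ref is the signal template at g = g_ref; the yield is ASSUMED
    to scale as (g / g_ref)**2.
    """
    scale = (g / g_ref) ** 2
    return sum(((n - b - scale * s) / err) ** 2
               for n, b, s, err in zip(data, background, signal_ref, sigma))


def upper_limit(data, background, signal_ref, sigma,
                g_max=2.0, steps=2000, delta=3.84):
    """Largest scanned g with chi2(g) - chi2_min <= delta (95% C.L. for 1 d.o.f.)."""
    grid = [g_max * i / steps for i in range(steps + 1)]
    values = [chi2(g, data, background, signal_ref, sigma) for g in grid]
    chi2_min = min(values)
    allowed = [g for g, v in zip(grid, values) if v - chi2_min <= delta]
    return max(allowed)


# Toy inputs (illustrative only): data equal to background in three bins.
limit = upper_limit(data=[50.0, 20.0, 5.0], background=[50.0, 20.0, 5.0],
                    signal_ref=[10.0, 10.0, 10.0], sigma=[8.0, 5.0, 3.0])
print(limit)
```

In this toy setup the bin with the smallest uncertainty relative to the signal dominates the χ 2, in the same way that the high-energy bins drive the limits quoted above.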

C. Validity of EFT
In this subsection, we discuss the validity of our theoretical expectations. The non-resonant s-channel ALP signatures that we explore have several interesting characteristics that could benefit their detection and the sensitivity to the ALP couplings, and that call for further study. When the momentum transfer through the ALP propagator (p a ) obeys |⃗ p a | 2 ≫ m a , Γ a where Γ a is its decay width, the cross-section and differential distributions of the ALP signal remain largely independent of the actual value of m a . This implies that our search strategy retains its validity over a wide range of ALP masses, particularly those significantly below the energy scale of the experiment. For the LHC searches we investigated, this translates into a consistent detection capability for ALP masses below 100 GeV. Fig. 3 (left panel) verifies the off-shell approximation for the processes. It shows the ALP signal cross-section at √ s = 13 TeV, applying the cuts defined in Eqn. (16), plotted against m a for fixed values of a 2D , c W , c B and f a . Here, Γ a is implicitly computed at each point; it depends on m a and the ALP couplings, following the relation Γ a ∝ (c i /f a ) 2 m 3 a . The lines running almost parallel to the m a axis in Fig. 3 (left panel) confirm that our simulations are relevant for small values of m a , up to about 100 GeV. We perform the analyses on the assumption that the ALP contributes only off-shell in all the processes we considered, setting the ALP mass and decay width in our simulations to m a = 1 MeV, Γ a = 0.
As the mass m a increases, the cross-sections for the Zh, Zγ, W ± W ∓ and W ± W ∓ γ processes show a resonance effect once the propagator becomes predominantly influenced by the ALP mass. This is noticeable for all the processes. The chosen values of c i and f a facilitate resonant ALP exchange in Zγ, W ± W ∓ and W ± W ∓ γ at masses above 150 GeV, close to 250 GeV and around 400 GeV, respectively. The slight shifts in the W W γ and Zγ processes can be attributed to the photon p T preselection cut. We evaluated the Zγ channel at a point (c W = 1, c B = −0.305) to ensure a "photophobic" interaction (where g aγγ = 0) and to explore the resonant effect induced by the g aZγ coupling.
In the Zh process, the resonance effect is apparent near 300 GeV.These observations serve as a validation that our results hold for ALP masses up to approximately 100 GeV.At this mass, the cross-sections for all four mentioned processes deviate by less than 5% from their asymptotic values when m a approaches zero.
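A rough feeling for this mass independence can be obtained from the propagator alone. The Python sketch below assumes that the m a dependence of the partonic cross-section enters only through a Breit-Wigner propagator 1/(ŝ − m a 2 + i m a Γ a ), and compares its squared modulus with the massless limit at a fixed partonic energy. The function name and the chosen energies are illustrative and are not taken from the simulation.

```python
def propagator_ratio(sqrt_s_hat, m_a, gamma_a=0.0):
    """|1/(s - m^2 + i m Gamma)|^2 divided by its m -> 0 limit, 1/s^2.

    All inputs in GeV. A value close to 1 means the off-shell signal is
    insensitive to the ALP mass at this partonic energy.
    """
    s_hat = sqrt_s_hat ** 2
    return s_hat ** 2 / ((s_hat - m_a ** 2) ** 2 + (m_a * gamma_a) ** 2)


# Illustrative scan at sqrt(s_hat) = 900 GeV for a few ALP masses.
for m_a in (0.001, 10.0, 100.0, 400.0):
    print(m_a, propagator_ratio(900.0, m_a))
```

For m a = 100 GeV and a partonic energy of 900 GeV the ratio differs from unity by a few percent, while for m a of a few hundred GeV the deviation becomes large, in line with the onset of the resonant behaviour described above.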
Furthermore, an important feature of the non-resonant process is its lack of dependence on specific assumptions about extra couplings not directly contributing to the process and any other model specific parameters.This is in contrast to on-shell analyses, which are usually limited to particular mass and width ranges and where the impact of extra ALP couplings becomes evident in their partial decay widths.Conventionally, studies on ALP limits from resonant processes have focused on a single independent g aV1V2 coupling, as outlined in Eqn.(5) [19,33,38,94,95].However, recent studies have started to explore scenarios incorporating two or three independent couplings simultaneously [39,40,96].Thus, the model-independence of non-resonant searches is evident, making them more effective in detecting new physics phenomena.
Estimating the validity of the EFT expansion is crucial for collider bounds, especially given the broad range of energies encountered at hadron collider experiments.We now consider the range of Wilson coefficients constrained and check whether they allow for a valid EFT interpretation of cross-sections.Theoretically, the g aV1V2 couplings depend only on the ratio c i /f a (as detailed in Eqns.(6a)-(6c)).However, the value of f a is important in assessing the validity of EFT, which in turn restricts the energy range feasible for LHC searches, such as energy bins where √ ŝ < f a .If the underlying BSM theory operates in a weak coupling regime, leading to the operators in Eqn.(4) at one-loop, the coefficients might be attenuated by an additional 16π 2 factor.This would considerably restrict the valid energy bins for LHC searches.
For illustration, in the Zγ production process where Z decays into two bottom quarks and is detected as a fat-jet, the energy scale of the collision is determined by the invariant mass of the jet-photon system, m Jγ .The Zγ EFT expansion validity can be maintained by ensuring m Jγ stays below the cut-off scale, f a .However, precisely defining the EFT cut-off scale in a model-independent manner is difficult without taking into account the specific details of the underlying UV-complete theory.To have an idea of the cut-off scale, we adopt a methodology based on Refs.[97,98].If m Jγ is consistently smaller than f a in most collisions, the ratio R M V 1 V 2 (where 'V 1 V 2 ' refers to 'final state bosons'), defined below would tend to unity.
In Fig. 3 (right panel), we see this effect in the process at √ s =13 TeV, involving non-zero EFT couplings, whose values are set at the limits obtained at 95% C.L. As for the Zγ process, ratio of R M V 1 V 2 close to 1 suggests that the energy exchange in the process remains considerably below m max Jγ , the maximum value allowed for m Jγ .Identifying such peak m Jγ values provides a practical reference for the EFT cutoff scale, f a .Additionally, we examine similar variations for the Zh and W W processes, using their respective invariant mass measurements.When m max V1V2 (V 1 V 2 denoting 'final state bosons') is 1 TeV, for example, 20% of signal events are lost and this implies that final limits are weakened.In cases involving both the Higgs chiral operator and linear bosonic operators, the ratio R M V 1 V 2 is approaching unity when m max V1V2 > 2.0 TeV and thus, more than 95% of the collision events respect the EFT validity considerations.

D. Collider Analysis with HL-LHC probes
We now discuss the results of our cut-based analysis for a few benchmark points (BPs) to accentuate the distinguishability of the ALP signal from the backgrounds. The BPs are chosen such that they obey the experimental constraints obtained from the 13 TeV data. The selected benchmark points are listed in Table IV. As some of the operator coefficients probe more than one process at a time, we choose these points to highlight specific regions of parameter space, so that each probes one effective coupling at a time for a specific process, as detailed below.
It is to be noted that all four of these processes depend on the ALP-gluon coupling g agg and the relevant ALP-bosonic or ALP-Zh coupling. In the simulation for all the BPs, we choose c G = 1 and f a = 5 TeV. For Zh production, we have an ALP-Higgs operator that contributes at LO and we choose the corresponding operator coefficient value a 2D = 0.2. The ALP mediated Zγ production is induced by the g aZγ coupling, which in turn receives contributions from c W and c B . We choose BP2 such that c W = −c B (g aZγ ≠ 0) and BP3 such that c W = −c B t 2 θ , i.e., g aZZ = 0. The W ± W ∓ production receives a bosonic contribution from the g aW W coupling only and thus depends on c W . The W ± W ∓ γ production receives contributions from the g aγγ , g aZγ , g aW W and g aW W γ couplings. BP5 corresponds to g aZγ = 0 while BP6 corresponds to c W = −c B /t 2 θ , i.e., g aγγ = 0. The couplings g aW W and g aW W γ are proportional to c W only.
Equipped with these benchmark points, we now discuss some kinematic differences between the ALP signal and the SM backgrounds for each of the above mentioned processes. We first consider the Higgs-strahlung process, which has a radius R = 1 fat-jet and two leptons in the final state. Fig. 4 (a) shows the mass of the leading fat-jet for the signal BP1 and the dominant backgrounds. It is evident from the distributions that the peak around 115 − 140 GeV reflects the Higgs peak for the signal process, whereas for the backgrounds the peaks lie either below 50 GeV (the fat-jet mimicking a single-prong hard QCD jet), around 90 GeV (reflecting the Z boson) or around 165 − 185 GeV (from a top quark). Numerically, the m J ∈ [115,140] GeV selection suppresses the Z+jets backgrounds by a factor of 20% at the price of keeping ∼ 60% of the signal events.
The variable p T J (Fig. 4(b)) is quite efficient in distinguishing the new interactions from most of the SM backgrounds. The availability of larger parton center-of-mass energy in these derivative interactions pushes the transverse momentum of the fat-jet (p T J ) to higher values. We thus put slightly tighter cuts on these variables compared to the 13 TeV analysis, namely, p T J > 250 GeV and 115 < m J < 140 GeV. We also select events satisfying m Zh > 500 GeV.
For Zγ production, the photon p T is a strong discriminator. The distributions for the ALP signals corresponding to BP2 and BP3 and for various SM backgrounds are shown in Fig. 5 (a). The photons in the signal events exhibit a hard p T . Requiring an energetic photon puts a high p T threshold on the recoiling jet, above which the Z boson becomes sufficiently boosted. The E γ distributions extend to 1 TeV. In the signal process, despite it being an s-channel process, we see a significant enhancement over the SM backgrounds in the high energy tails of the distribution, due to the contribution of the bosonic dimension-5 operators. In the presence of the effective operators, the cross-section grows faster at higher energies compared to the SM backgrounds, whose effect diminishes with increasing energy.
[Table IV: selected benchmark points; columns: Signal, Coupling parameter, Process.]

The fat-jet resulting from the Z → b b̄ decay can potentially retain information about its two-pronged structure. This characteristic feature is captured by the jet-shape variable known as N-subjettiness [99,100], which is computed as follows:
$$\tau_N^{(\beta)} = \frac{1}{N_0} \sum_i p_{i,T}\, \min\left\{ \Delta R_{i1}^{\beta},\, \Delta R_{i2}^{\beta},\, \ldots,\, \Delta R_{iN}^{\beta} \right\},$$
where N refers to the number of subjet axes taken within the fat-jet. The index i runs over the individual jet constituents and p i,T represents their transverse momenta. $\Delta R_{ij} = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$ measures the separation in the η − ϕ plane between a candidate subjet j and a constituent particle i. The normalization factor, N 0 , is computed as $\sum_i p_{i,T} R_0$, where R 0 denotes the fat-jet radius. The angular exponent β is taken to be 1 here. Essentially, the ratio τ N /τ N −1 serves to differentiate between jets that likely contain N internal energy clusters and those with N − 1 clusters. Specifically, in our analysis the jet coming from the Z boson is observed to exhibit smaller values of τ 21 = τ 2 /τ 1 in comparison to typical QCD jets, a pattern evident in Fig. 5(b). Thus, a cut of τ 21 < 0.45 can remove a significant amount of background while translating into a signal selection efficiency of ∼ 12%.
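To make the N-subjettiness variable described above concrete, here is a minimal Python sketch that evaluates τ N for a given set of candidate subjet axes. It assumes the axes are supplied by the caller (in practice they are obtained from a clustering or minimisation step that is not shown), and the function names and toy constituents are hypothetical.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Separation in the eta-phi plane, with the phi difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def tau_n(constituents, axes, r0, beta=1.0):
    """N-subjettiness for constituents [(pt, eta, phi), ...] and axes [(eta, phi), ...].

    The normalisation is sum_i pT_i * R0**beta, which equals the N_0 of the
    text for beta = 1.
    """
    n0 = sum(pt for pt, _, _ in constituents) * r0 ** beta
    total = 0.0
    for pt, eta, phi in constituents:
        # Each constituent is assigned to its nearest candidate axis.
        total += pt * min(delta_r(eta, phi, a_eta, a_phi) ** beta
                          for a_eta, a_phi in axes)
    return total / n0


# Toy two-prong jet: two equally hard constituents separated by 0.4 in eta.
jet = [(100.0, 0.0, 0.0), (100.0, 0.4, 0.0)]
tau1 = tau_n(jet, [(0.2, 0.0)], r0=0.8)
tau2 = tau_n(jet, [(0.0, 0.0), (0.4, 0.0)], r0=0.8)
print(tau1, tau2)
```

In this idealised configuration τ 2 vanishes while τ 1 does not, so τ 21 is small, which is the behaviour expected for a genuinely two-pronged jet.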
We analyze the pp → W ± W ∓ process, in which one W boson undergoes a leptonic decay and the other a hadronic decay. Here, we consider the effective mass variable, m ef f , which is an important variable for BSM searches. It is defined as follows:
$$m_{eff} = \sum_i p_T^{\,i} + \slashed{E}_T .$$
Here, i runs over all visible objects in the event, including the reconstructed jets, p T refers to their transverse momenta and / E T is the total missing transverse energy in the event. This global variable, which does not rely on a specific event topology, proves to be highly useful, especially given that signal events receive a high parton-level center-of-mass energy compared to most SM background processes. In Fig. 6 (a), we present the effective mass distribution of the ALP process in this channel for BP4. It is evident that for most SM backgrounds, the distributions tend to peak at lower values than in the ALP scenario. It is to be noted that these are normalized distributions, which provide qualitative rather than quantitative insights into potential additional cuts on these variables.
In Fig. 6 (b), we plot the ∆ϕ(jet, / E T ) distribution for both the signal and background processes. In case of the ALP signal, the / E T is most likely to recoil against the leading jet in the azimuthal plane. Therefore, the distribution peaks around π for the signal, and similarly for the SM W W and W +jets backgrounds. Moreover, the veto on additional hard jets largely reduces the W Z, single top and t t̄ (semileptonic) backgrounds.
Next, we examine the pp → W ± W ∓ γ process in the fully leptonic channel, characterized by two DFOS leptons, a photon and missing energy in the final state. In Fig. 7 (a), the invariant mass distribution of the dilepton-photon system is shown. In case of the signal, the leptons originating from the W bosons are boosted due to the influence of the ALP coupling. As a result, the signal distribution shows a prominent enhancement towards higher values of the invariant mass, in contrast to the SM background processes. We also show the distribution of / E T in Fig. 7 (b) for both the SM backgrounds and the ALP signal events. It is evident that for all the benchmark points considered, the event distribution in the presence of an ALP is shifted towards increased missing transverse energy, distinguishing it from the typical SM scenarios. Thus, these variables play a significant role in isolating the ALP interaction effects in the events. It is relevant to mention here that the kinematic distributions for BP5 and BP6 look quite similar. Since the ALP mass is the same in both cases, this indicates that the process receives its dominant contribution from g aW W ; we reiterate that the benchmark points have been chosen such that BP5 leads to g aZγ = 0 and BP6 leads to g aγγ = 0.
We now turn to an interesting feature of this 2 → 3 process. We explore the relationship between two variables in the W W γ final state: the invariant mass of the dilepton-photon system, m llγ , and the ∆R separation between the two leptons. Fig. 8 highlights how the populated regions of phase space shift with the inclusion of new physics effects from higher dimensional operators. The following observations emerge from this figure:
• In the background scenarios, such as W Zγ and events with non-prompt leptons or photons, the m llγ distribution typically decreases smoothly and rapidly. However, in scenarios involving new physics, this distribution falls more slowly. We observe that background dileptons are more likely to appear in the same hemisphere, in contrast with the signal events. In case of the ALP signal, most of the events are produced with all three bosons being equally energetic. There is a notable increase in event density as ∆R ll approaches π, indicating that the leptons from the W bosons have greater separation (indicated by red color for a higher number of events). Implementing a cut on the invariant mass of the dilepton-photon system at 200 GeV would distinctly highlight these new phase space regions. Additionally, an angular separation cut of ∆R ll ≥ 2.5 could effectively filter out a significant portion of the background events, which tend to cluster at lower angular separations.
• In case of backgrounds such as Z(→ τ + τ − )γ (with the tau leptons decaying leptonically) and non-prompt photons, the photon is significantly energetic and recoils against the heavy boson. Thus, the decay leptons appear boosted, with less separation between them. Overall, this implies that ALP interactions, which result in both the dilepton and the photon gaining higher energy, also result in a larger angular separation between the leptons than in the SM backgrounds. This correlation is especially evident in Fig. 8(a), where the most populated event regions are around ∆R ll ∼ π, especially in high m llγ regions (around 200 GeV).
We assess the sensitivity reach for the various benchmark points at the 14 TeV LHC. To quantify the signal significance, we use the following definition: Here, S and B are the numbers of signal and background events, respectively, corresponding to the residual signal and background cross-sections after applying the selection criteria that isolate the signal events from the backgrounds. The calculated signal significance for each benchmark point across the four different processes is presented in Table V, for integrated luminosities of L = 300, 1000 and 3000 fb −1 . We can see from Table V that BP1 for the ALP mediated Higgs-strahlung signal will have substantial significance at 3000 fb −1 luminosity. The main reason is the large production cross-section of the ALP signal. Detecting signatures of the aZh interaction in this process, a phenomenon not expected in linear expansions up to NNLO, would essentially serve as smoking gun evidence for non-linearity. The Zγ process via BP2 and BP3 shows the most prominent separation between the signal and the background. The benchmark point BP4 uniquely probes the g aW W coupling, reaching a 3σ level sensitivity at 1000 fb −1 . The W W γ benchmark points BP5 and BP6 are only slightly less sensitive in probing the g aW W coupling.

E. Direct probes of ALP coupling
In this subsection, we focus on another ALP production mechanism, in which the ALP is produced in association with a Higgs or vector boson (the 'ALP-strahlung' process), and study the constraints it puts on ALP-Higgs and ALP-vector boson interactions. We assume the ALP to be stable within the collider, meaning that it has a sufficiently long lifetime to leave the detector without decaying. This assumption depends on the decay modes available to the ALP, which in turn depend on its mass and couplings. For an ALP with a mass around 1 MeV, decaying to fermions or heavier particles is not kinematically possible. The possible decay channels include a → ν ν̄ν ν̄ (indistinguishable from a pure missing energy signature), a → γγ and a → γν ν̄. Both of the latter decays would typically allow the ALP to traverse distances much greater than the detector's dimensions before decaying. When the ALP mass exceeds 1 MeV, new decay channels to fermions become feasible once the ALP mass becomes greater than twice the mass of the fermions in the final state. Also, m a ≥ 3m π (∼ 0.5 GeV) would enable hadronic decay channels. However, this introduces a dependence on complex model-specific factors, which we do not delve into in this study. One aim of this subsection is to compare the constraints derived from direct ALP searches with those obtained from non-resonant ALP-mediated processes. Direct ALP probes involve additional model-based assumptions, limiting the generality of fit results. Since we ignore the ALP couplings to SM fermions, the associated production at colliders is dominated by the s-channel diagram through a vector boson propagator. Here, the production rates drop faster as m a increases, due to the power suppression of energy from the s-channel propagator. For our simulations in the MG5_aMC@NLO framework, we assume an ALP mass of 1 MeV, consistent with our non-resonant ALP analysis, and treat the ALPs as stable within the collider for the purposes of detector simulation.

Table V: Signal statistical significance at various benchmark points for the four distinct processes of our study at the 14 TeV LHC. The significance levels are evaluated for integrated luminosities of L = 300, 1000, and 3000 fb −1 . We also estimate the integrated luminosity required to attain a 3σ and 5σ excess over the background for each benchmark point at the LHC running at √ s = 14 TeV.
a. ATLAS measurement of Higgs boson production in association with missing energy and h decays to b-quarks: We will study the ALP signal pp → h(→ b b) + a and reinterpret the ATLAS search for dark matter via missing energy in association with a SM Higgs boson channel [101] in the context of the ALP signal.ATLAS has recently provided measurements of the missing transverse energy (E miss T ) distribution in events with a large-radius jet with two b-tags and missing energy, using the Run II data from the LHC at √ s = 13 TeV with an integrated luminosity of 139 fb −1 , along with an estimation of the SM background.The analysis is confined to a fiducial region, which is closely replicated by the phase space cuts outlined in Table VI.
We consider a 5 bin-data set, with the bin widths increasing with higher values of E miss T .The boundaries of these bins are set at (150, 200, 350, 500, 750) GeV.For the ALP signal simulation, we consider the process with the reconstructed Higgs jet having a radius parameter R = 1.The comparative results between the ALP signal and the ATLAS data are depicted in Fig. 9 (a), showcasing a slight increase in energy across the E miss T bins.We perform a χ 2 fit to obtain a limit of : By studying the indirect probe in the non-resonant ALP mediated m Zh bins, we have an enhanced sensitivity to the ALP-Higgs coupling.
b. CMS search for new physics events with Z production and large missing energy: We consider now ALP production in association with a Z boson, in hadronic collisions.We will study the impact of the ALP signal on the CMS measurement of Z + / E T search [102] with √ s = 13 TeV and integrated luminosity 35.9 fb −1 .This time, we will be considering a measurement in the leptonic channel to assess the sensitivity to the effective ALP interaction.
We will use the p miss T distribution as a key kinematic discriminator between signal and background.Data within the fiducial region, as opposed to the full phase space, will be used to refine the search.The selection cuts from the  bin being set at 600 GeV.To ensure the EFT applicability, we remove events in each bin where √ ŝ exceeds 2p miss T ,max .The ALP-photon-Z and ALP-Z-Z couplings could potentially lead to a mono-Z final state.This indicates contributions from both Wilson coefficients c W and c B to this process.We establish constraints on c W , assuming c B = −t 2 θ c W . Similar to previous processes, a χ 2 fit, as outlined in Eqn.(??), will be used to derive constraints on c W , giving : The mono-Z search proves to be useful in constraining the effect of c W /f a .Notably, in the higher-energy regime of p miss T > 250 GeV, the ALP contribution becomes considerable, especially in the tail of the p miss T distribution, which is where the most significant constraints originate.Nonetheless, this results in a constraint on c W that is less stringent than what is derived from the CMS m W W and m W W T distributions.
c. ATLAS measurement of charged lepton with missing energy: Let us now concentrate on ALP production in association with a W boson. We reinterpret the ATLAS search for W ′ decaying to ℓ + / E T final states with 139 fb −1 integrated luminosity [103]. We employ the transverse mass distribution of the leptonically decaying W for our analysis, as depicted in Fig. 9(c). To study the influence of the ALP signal on the m T distribution, we apply the selection criteria outlined in the final two columns of Table VI. The figure also includes the m T spectrum of the SM background. The ALP coupling involved in this signal is c W , with the high-m T bins playing a significant role in shaping the constraints on c W .
Background data for the electron and muon samples are taken from Ref. [103] and the depicted bins correspond to those with available experimental background information, following m T < m max T = 2.6 TeV for electrons and m T < m max T = 3 TeV for muons.From this analysis, we derive a constraint of: from the ATLAS m T data.Thus, the mono-W analysis yields stronger constraints than those obtained from the non-resonant pp → W W process.On the other hand, a dedicated search in the channel W γ+MET has not yet been performed at the LHC.This channel has several advantages over the W + M ET channel search.First, the high efficiency of reconstruction of high energy photons will lead to better sensitivity to the new physics effect and the SM background is also expected to be lower.Second, this channel like the non-resonant ALP mediated W W γ production will be able to probe couplings such as the four-point interaction of aW W γ and also can help disentangle more than one direction in non-linear ALP EFT parameter space.Thus, a combination of such probes will lead to better refining of the observables.
It is important to note that most direct bounds usually depend on specific model assumptions, which often involve setting all other coefficients to zero, unlike indirect bounds.As such, the indirect limits presented in this study act as a good complementary probe, proving useful even in instances where direct probes might provide more stringent constraints.

V. PROJECTED SENSITIVITIES ON ALP EFT COUPLINGS
a. Sensitivity to ALP-Higgs coupling: The results presented in Table V provide the sensitivities for different benchmark points.This section outlines the sensitivity projections within the parameter space of ALP couplings using the relevant ALP-mediated non-resonant Zh, Zγ, W W, W W γ production processes.These processes are sensitive to the product of the ALP coupling to gluons with the respective ALP coupling to bosons.The ALP-gluon coupling, in principle, is an independent free parameter.We present the results for the ALP-boson couplings in this section assuming g agg = 1 TeV −1 .
In Fig. 10 (a), we present the variation of the significance of the ALP mediated hZ signal with the operator coefficient a 2D /f a for an integrated luminosity of 139 fb −1 at √ s = 13 TeV (red curve) and 3000 fb −1 at √ s = 14 TeV (yellow curve). The signal stands above the background at the 3σ level for a 2D /f a ≃ 0.095 TeV −1 (0.058 TeV −1 ) at 13 TeV (14 TeV). Fig. 10 (b) shows the sensitivity levels at 2σ (red), 3σ (yellow) and 5σ (green) for the pp → Zh signal with √ s = 14 TeV and for an integrated luminosity of 3000 fb −1 in the parameter space of f a -a 2D . The green shaded region represents f a ≤ √ ŝmin and is thus excluded, since all signal events there break the validity criterion of the EFT. The 5σ sensitivity level is achieved for f a /a 2D ≃ 15 TeV for an integrated luminosity of 3000 fb −1 of data. Thus, this region is accessible to observation at the HL-LHC.
The dash-dotted reference lines correspond to constant values of f a /a 2D .In the region with higher values of f a , we find that the sensitivity curves run almost parallel to the lines of constant f a /a 2D indicating a stable detection range for f a /a 2D here, despite a loose constraint on a 2D .When f a decreases below 1 TeV, the sensitivity curves fall slowly compared to the reference lines.This indicates that the analysis in this lower f a region is limited to smaller f a /a 2D ratios compared to the higher f a regions.This change in sensitivity is attributed to the reason that as f a decreases, more and more events from the higher energy bins are excluded to ensure the applicability of the EFT in the region.This leads to loss in discerning power of the signal.
The interaction between the ALP and the Higgs boson also induces non-standard decays of the Higgs. These decay modes put constraints on ALP interactions through the unobserved Higgs branching fraction (h → BSM). Considering that these exotic decays are the only LO modifications to Higgs properties, the global signal strength measurements can be used to constrain a 2D /f a . This is because the unobserved Higgs branching fraction is given by BR(h → BSM) = 1 − BR(h → SM). The latest combined CMS global signal strength measurement restricts BR(h → BSM) < 0.11 [104]. Assuming Γ BSM ≃ Γ h→aZ , we obtain a limit of Γ h→aZ < 0.5 MeV at 95% C.L. This limit translates into a constraint of f a /a 2D ≥ 5.95 TeV for m a ≤ 34 GeV. However, this constraint is less stringent than the current limit derived from the pp → a * → Zh process, as depicted in the blue shaded region of Fig. 10 (b).
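The step from the branching-fraction bound to the width bound is simple arithmetic, sketched below in Python. The SM Higgs total width of about 4.07 MeV is an assumed input that is not quoted in the text, and the function name is hypothetical.

```python
GAMMA_H_SM_MEV = 4.07  # assumed SM Higgs total width in MeV (not quoted in the text)


def max_bsm_width(br_bsm_max, gamma_sm=GAMMA_H_SM_MEV):
    """Upper bound on Gamma_BSM implied by BR(h -> BSM) < br_bsm_max.

    Uses BR_BSM = Gamma_BSM / (Gamma_SM + Gamma_BSM), solved for Gamma_BSM.
    """
    return br_bsm_max * gamma_sm / (1.0 - br_bsm_max)


print(max_bsm_width(0.11))  # in MeV
```

With BR(h → BSM) < 0.11 this gives a bound of about 0.5 MeV, consistent with the limit on Γ h→aZ quoted above.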
b. Sensitivity to ALP-electroweak gauge bosons coupling: Fig. 11 presents the upper bounds on the coefficients $c_W/f_a$ and $c_B/f_a$ (in TeV$^{-1}$), derived from the $Z\gamma$, $WW$ and $WW\gamma$ analyses. These limits can also be interpreted as products of ALP couplings in the plane of { } as all of these processes involve the ALP-gluon coupling. They are calculated for each individual experimental channel, based on the differential measurements of the relevant energy-dependent variables (refer to Sec. IV A). We present these limits assuming $g_{agg} = 1$ TeV$^{-1}$. The $Z\gamma$ process, which is modified by both the $c_W$ and $c_B$ coefficients, constrains the difference $|c_B - c_W| < 0.074$ TeV$^{-1}$, as derived from the 13 TeV $m_{Z\gamma}$ differential measurement. The $WW$ analysis imposes the stricter limit on $c_W$. The limit from $WW\gamma$, based on $m_T^{WW}$, is $|c_W| < 0.147$ TeV$^{-1}$, roughly a factor of two weaker than the $|c_W| < 0.068$ TeV$^{-1}$ obtained from the $WW$ analysis based on $m_{WW}$ (see Sec. VIII). The $WW$ process is not affected by $c_B$, whereas $WW\gamma$ has a slight dependence on it, as seen in Fig. 11. Combining the results from $WW$ and $WW\gamma$ with other diboson channels like $ZZ$ and $W\gamma$ and triboson channels such as $ZZ\gamma$ could potentially yield more sensitive limits, a prospect for a global analysis in future work. The non-linear framework of the ALP EFT generates other operators that could modify the interactions of the charged weak bosons with the ALP. Exploring the $WW\gamma$ process further could help disentangle more than one direction in the ALP parameter space, an endeavour to be taken up in follow-up work. When all constraints are considered together, only a narrow overlapping region near zero remains viable, with $|c_W| < 0.06$ and $|c_B| < 0.072$. The limits from the $Z\gamma$ measurement provide the most stringent constraints along the $c_B$ axis. These constraints can also be interpreted in the plane of effective couplings like $g_{a\gamma\gamma}$, $g_{aZ\gamma}$ and $g_{aZZ}$ (using Eqn. (6)), which are depicted by the dashed, dotted and dot-dashed lines in Fig. 11.

FIG. 10: Left: Signal significance as a function of $a_{2D}/f_a$ (TeV$^{-1}$) at $\sqrt{s} = 13$ TeV, 139 fb$^{-1}$ (red) and $\sqrt{s} = 14$ TeV, 3000 fb$^{-1}$ (yellow) for $pp \to Zh$ with $g_{agg} = 1$ TeV$^{-1}$. Right: Sensitivity contours at the $2\sigma$ (red), $3\sigma$ (yellow) and $5\sigma$ (green) levels for the ALP mediated $pp \to Zh$ signal at the $\sqrt{s} = 14$ TeV LHC for an integrated luminosity of 3000 fb$^{-1}$, in the $a_{2D}$-$f_a$ plane assuming $g_{agg} = 1$ TeV$^{-1}$. The green shaded region, depicting $f_a < \sqrt{\hat{s}_{\rm min}}$, is excluded by the criterion of EFT validity. The blue region is excluded by the limits from Br($h \to$ BSM) [104]. The dash-dotted lines represent constant values of $f_a/a_{2D}$.
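The translation from the $(c_W, c_B)$ plane to the effective couplings can be sketched as follows. The overall prefactors (4 and 8) follow one commonly used convention and are an assumption here, since Eqn. (6) is not reproduced in this section; the function name and the value of $\sin^2\theta_W$ are likewise illustrative. Only the directions along which a coupling vanishes are independent of that normalisation.

```python
import math

SIN2_THETA_W = 0.2312  # assumed value of sin^2(theta_W)


def ew_couplings(c_W, c_B, f_a, sw2=SIN2_THETA_W):
    """Map (c_W, c_B, f_a [TeV]) to effective ALP couplings in TeV^-1.

    The prefactors 4 and 8 are an assumed normalisation; other conventions
    rescale them without changing the vanishing directions.
    """
    cw2 = 1.0 - sw2
    return {
        "g_agamgam": 4.0 / f_a * (sw2 * c_W + cw2 * c_B),
        "g_aZZ": 4.0 / f_a * (cw2 * c_W + sw2 * c_B),
        "g_aZgam": 8.0 / f_a * math.sqrt(sw2 * cw2) * (c_W - c_B),
        "g_aWW": 4.0 / f_a * c_W,
    }


tan2 = SIN2_THETA_W / (1.0 - SIN2_THETA_W)
photophobic = ew_couplings(c_W=0.5, c_B=-0.5 * tan2, f_a=5.0)  # g_agamgam -> 0
z_phobic = ew_couplings(c_W=0.5, c_B=-0.5 / tan2, f_a=5.0)     # g_aZZ -> 0
print(photophobic["g_agamgam"], z_phobic["g_aZZ"])
```

In this parametrisation the $Z\gamma$ coupling depends only on $c_W - c_B$, which is why the $Z\gamma$ channel constrains that difference, and a point with $c_B \approx -0.3\, c_W$ lies close to the direction of vanishing photon coupling.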
The $Z$ boson can decay into a light ALP and a photon. The upper limit on the width of the $Z$ boson to exotic channels is $\Gamma(Z \to {\rm BSM}) \lesssim 2$ MeV at 95% C.L. [105]. This puts a strong limit on the tree-level decay $Z \to a\gamma$. This contribution $\Gamma(Z \to a\gamma)$ is given by:

Using the $Z$ boson width data, the coefficient $g_{aZ\gamma}$ can be constrained, largely independently of $m_a$ for $m_a \lesssim m_Z$:

Searches at LEP for the $Z \to 3\gamma$ decay process [94] constrain a combination of $g_{a\gamma\gamma}$ and $g_{aZ\gamma}$. However, given the already strong limits on $g_{a\gamma\gamma}$, the resulting bound on $g_{aZ\gamma}$ turns out to be less stringent than the one derived in Eqn. (27).
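As a rough cross-check of how a width limit maps onto $g_{aZ\gamma}$, the sketch below assumes the commonly quoted tree-level form $\Gamma(Z \to a\gamma) = m_Z^3\, g_{aZ\gamma}^2 (1 - m_a^2/m_Z^2)^3 / (384\pi)$. This normalisation is an assumption and may differ from the paper's convention by a constant factor, so the printed numbers are illustrative and need not coincide with the bound of Eqn. (27).

```python
import math

M_Z = 91.1876  # Z boson mass in GeV


def width_Z_to_a_gamma(g_aZgam_TeV, m_a=0.0):
    """Gamma(Z -> a gamma) in GeV for a coupling given in TeV^-1.

    Assumed form: m_Z^3 g^2 / (384 pi) * (1 - m_a^2 / m_Z^2)^3.
    """
    g = g_aZgam_TeV * 1.0e-3  # TeV^-1 -> GeV^-1
    phase_space = (1.0 - (m_a / M_Z) ** 2) ** 3
    return M_Z ** 3 * g ** 2 / (384.0 * math.pi) * phase_space


def max_coupling(width_limit_GeV, m_a=0.0):
    """Largest g_aZgamma (TeV^-1) compatible with a given width limit (GeV)."""
    return math.sqrt(width_limit_GeV / width_Z_to_a_gamma(1.0, m_a))


for m_a in (0.0, 1.0, 10.0):
    print(f"m_a = {m_a:4.1f} GeV: g_aZgamma < {max_coupling(2.0e-3, m_a):.2f} TeV^-1")
```

The phase-space factor stays close to one for ALP masses well below $m_Z$, which is the sense in which the bound is largely independent of $m_a$.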
We now discuss the projected constraints on the ALP couplings to EW gauge bosons via these ALP mediated processes at the HL-LHC. Fig. 11 (b) shows the sensitivity regions at the $2\sigma$ (darker shaded region) and $5\sigma$ (lighter shaded region) significance levels in the $c_W/f_a$-$c_B/f_a$ plane for the 14 TeV LHC with 3000 fb$^{-1}$ of data. At the $2\sigma$ level, a more stringent region is obtained for each channel, with the $Z\gamma$ channel exhibiting the most significant individual improvement ($|c_B - c_W| = 0.05$ TeV$^{-1}$). The combined limits are mostly driven by the $WW\gamma$ and $Z\gamma$ channels. Additionally, Fig. 11 (b) highlights the expected discovery threshold (lighter shaded region) at 14 TeV, where the SM point would be excluded by 5 standard deviations if the measurements align with the predicted ALP signal. This region lies within the area still allowed by the current 13 TeV LHC data, which suggests that the absence of a signal in the current LHC data does not necessarily rule out a discovery at the HL-LHC.

VI. MULTIVARIATE ANALYSIS
After performing a cut-based analysis for each of the signals in the four distinct non-resonant ALP processes at the LHC in Sec. IV, we now investigate the potential improvement from advanced techniques like Gradient Boosted Decision Trees [106]. The usefulness of these methods has been extensively examined in recent studies [107,108], particularly in the Higgs sector [109,110], where they have demonstrated better efficacy in differentiating between signal and background characteristics than conventional rectangular cut-based analyses. Their application in ALP searches at colliders is yet to be thoroughly explored. In our study, we assess the possibility of maximizing the signal significance in the specific signal processes under consideration. To achieve this, we use the AdaBoost classifier from the scikit-learn library in Python.
At first, we discuss the details of our analysis for the Higgs-strahlung process, considering the BP1 benchmark scenario for the ALP mediated signal. We take into account all relevant SM backgrounds in the process; here, $Z$+jets, which includes $Z + b\bar{b}$, is the most dominant background for $Zh$ production. To optimize the classifier's performance in identifying the signal region, we impose slightly looser cuts compared to the cut-based analysis, thereby ensuring better training. The selection criteria we employ are as follows: 75 GeV $< m_{ll} <$ 105 GeV, $p_T^{ll} > 160$ GeV, $\Delta R_{ll} > 0.2$, $p_T^J > 60$ GeV, 95 GeV $< m_J <$ 155 GeV, $\Delta R_{b_i,b_j} > 0.4$ and $\not{E}_T < 70$ GeV. After these pre-selections, we train the classifier on the signal and background samples with the following set of variables:

• Transverse momenta ($p_T$) of the two isolated leptons
• Reconstructed $Z$ boson and its $p_T$
• $\Delta R$ separation between the two b-tagged subjets ($\Delta R_{b_i,b_j}$), subjet $i$ and lepton $j$ ($\Delta R_{b_i,l_j}$) and the two leptons ($\Delta R_{l_i,l_j}$)
• Scattering angle of the reconstructed $Z$ boson
• N-subjettiness of the leading fat-jet ($\tau_{21}$)
• $\Delta\phi$ separation between the leading fat-jet and the reconstructed $Z$ boson
• Mass of the reconstructed Higgs jet and its $p_T$

For the gradient boosted decision tree method of separation, we have taken 1000 estimators and a maximum depth of 4 with a learning rate of 0.1. We have used 75% of the total dataset for training and 25% for validation. After implementing the BDT algorithm, we obtain the distribution of the BDT classifier response for the signal and total background events for the Higgs-strahlung process, as shown in Fig. 12 (top-left panel). We can see a clear distinction between the signal and the background distributions. We have checked that in this process the $p_T$ distribution of the leading lepton is the most important input variable. The $\Delta R$ separation between the two b-tagged subjets and the $p_T$ of the reconstructed Higgs jet are the second and third best discriminators, respectively. Thus, larger transverse momenta of the leading fat-jet and the lepton favour a correct classification. We have plotted the Receiver Operating Characteristic (ROC) curve (which estimates the degree of background rejection with respect to the signal) for the benchmark signal process BP1 in Fig. 12 (right panel).

One possible demerit of these techniques is over-training on the data sample. In case of over-training, the training sample gives extremely good accuracy but the test sample fails to achieve it. We have explicitly checked that, with our choice of parameters, the algorithm does not over-train: the ROC curve remains almost the same for the training and testing samples. The area under the ROC curve is 0.90 for BP1. At the $\sqrt{s} = 14$ TeV LHC with 3000 fb$^{-1}$ of integrated luminosity, we expect to observe 833 signal events and 3542 background events for an optimal cut of 0.1982 on the BDT output. The signal significance computed using the formula in Eqn. (21) is 13.192. Upon assuming a systematic uncertainty, $\sigma_{\rm sys\_un}$, the signal significance formula is modified to the following form:

$$\mathcal{S}_{\rm sys} = \sqrt{2\left[(S+B)\ln\left(\frac{(S+B)(B+\sigma_B^2)}{B^2+(S+B)\sigma_B^2}\right) - \frac{B^2}{\sigma_B^2}\ln\left(1+\frac{\sigma_B^2\, S}{B\,(B+\sigma_B^2)}\right)\right]}$$

where $\sigma_B = \sigma_{\rm sys\_un} \times B$. The performance of the multivariate analyses was optimized to maximize the signal significance while also maintaining a reasonably good value of $S/B$. Adding a 5% systematic uncertainty translates to a significance of 4.164. We present our results for $\sqrt{s} = 14$ TeV to make it easier to translate to the case of Run-3 ($\sqrt{s} = 13.6$ TeV) and the HL-LHC ($\sqrt{s} = 14$ TeV), as the cross-sections are not expected to change much.

For the $Z\gamma$ process, we study the BP2 and BP3 ALP mediated signals. We consider all the backgrounds listed in Table III in the background class. For the MVA, we adopt cuts that are slightly less stringent than those used in the cut-based approach (detailed in Table III). Along with the preliminary selection cuts, we require the leading fat-jet and the photon to have a minimum transverse momentum of 175 GeV, and we set a minimum threshold of 60 GeV on the reconstructed fat-jet mass. These criteria effectively suppress the dominant background while retaining most of the signal events. This is important because the MVA tends to be less effective with only pre-selection cuts, given the small signal size relative to the large background. It is also worth noting that the more stringent cuts from Table III do not necessarily lead to better results in the MVA context. Therefore, the cuts chosen for the MVA are calibrated to be neither too strict nor too relaxed compared to the cut-based analysis.
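The training setup described above (AdaBoost over shallow decision trees, a 75%/25% split, and a comparison of train and test ROC curves as an over-training check) can be sketched with scikit-learn as follows. This is a minimal illustration and not the analysis code: the toy Gaussian features stand in for the kinematic variables, event weights are omitted, and the helper name `train_bdt` is hypothetical.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def train_bdt(X, y, n_estimators=1000, max_depth=4, learning_rate=0.1, seed=0):
    """Train an AdaBoost BDT; return it with the train and test ROC AUC."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)
    bdt = AdaBoostClassifier(
        DecisionTreeClassifier(max_depth=max_depth),  # weak learner
        n_estimators=n_estimators,
        learning_rate=learning_rate,
        random_state=seed,
    )
    bdt.fit(X_tr, y_tr)
    auc_train = roc_auc_score(y_tr, bdt.decision_function(X_tr))
    auc_test = roc_auc_score(y_te, bdt.decision_function(X_te))
    return bdt, auc_train, auc_test


# Toy stand-in for the kinematic inputs: three Gaussian features, shifted by
# one unit for "signal" (label 1) relative to "background" (label 0).
rng = np.random.default_rng(1)
n = 2000
X = np.vstack([rng.normal(0.0, 1.0, (n, 3)), rng.normal(1.0, 1.0, (n, 3))])
y = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])

# Fewer estimators than the 1000 quoted in the text, only to keep the demo fast.
bdt, auc_train, auc_test = train_bdt(X, y, n_estimators=50)
print(f"AUC (train) = {auc_train:.3f}, AUC (test) = {auc_test:.3f}")
```

A large gap between the two AUC values would signal over-training; comparable values correspond to the statement that the ROC curve is almost the same for the training and testing samples.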
The BDT classifier is configured with the following hyperparameters: 'n_estimators': 800, 'learning_rate': 0.1, 'trees': 10, 'max_depth': 4. For the training, we have selected a range of observables that are effective at distinguishing between the signal and the background. These observables are chosen as input variables for the BDT to optimize its discriminating power.
where the symbols have their usual meaning. $p_T^{b_i}$ denotes the transverse momentum of the $i$-th b-tagged sub-jet and $\cos\theta^*_\gamma$ is the cosine of the scattering angle of the photon in the $Z\gamma$ rest frame. Among these, the four most important variables for distinguishing the ALP signal from the backgrounds are $m_J$, $\tau_{21}$, $p_T^J$ and $E_\gamma$. The classifier, after being trained with these kinematic variables, is used to discriminate the signal benchmark from the background class, and the significance of observing the signal over the background events is computed from its output. We find that the signal significances for the benchmark scenarios BP2 (shown in Fig. 13 (bottom-left)) and BP3 are 15.056 (4.653) and 19.89 (6.102), respectively, assuming zero (5%) systematic uncertainty at the 14 TeV HL-LHC. There is no significant difference in the spread of the background BDT score between the two BPs as the effective coupling $g_{aZ\gamma}$ changes, but the signal distribution moves away from the background as $g_{aZ\gamma}$ increases. This is also reflected in the signal significance, since the discrimination between signal and background becomes more pronounced with increasing $g_{aZ\gamma}$. The ROC curve is shown in Fig. 13 (right).
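The drop in significance once a background systematic is included can be reproduced numerically. The sketch below assumes the standard Asimov-type expression with a background uncertainty $\sigma_B = \sigma_{\rm sys\_un} \times B$; with the BP1 yields quoted earlier ($S = 833$, $B = 3542$) and a 5% uncertainty it gives a value close to the quoted 4.164. The zero-systematics branch is the usual Asimov limit and need not coincide exactly with Eqn. (21), which is not reproduced in this section.

```python
import math


def asimov_significance(S, B, rel_sys=0.0):
    """Median discovery significance for S signal and B background events.

    rel_sys is the relative background uncertainty, sigma_B = rel_sys * B.
    """
    if rel_sys == 0.0:
        return math.sqrt(2.0 * ((S + B) * math.log(1.0 + S / B) - S))
    var = (rel_sys * B) ** 2
    term1 = (S + B) * math.log((S + B) * (B + var) / (B * B + (S + B) * var))
    term2 = (B * B / var) * math.log(1.0 + var * S / (B * (B + var)))
    return math.sqrt(2.0 * (term1 - term2))


print(f"5% systematics: {asimov_significance(833, 3542, rel_sys=0.05):.2f}")
```

Because $\sigma_B$ grows linearly with $B$ while the statistical fluctuation grows only as $\sqrt{B}$, channels with large backgrounds lose the most significance when systematics are switched on.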
For the MVA of the semileptonic channel of $WW$ production, we consider the BP4 ALP mediated signal category and, in the background class, all the relevant background processes which mimic the $1J + 1\ell + \not{E}_T$ final state. The different backgrounds are mixed according to their proper weights to obtain the kinematical distributions for the combined background class. We apply weaker kinematical cuts than those discussed in Sec. IV A 3, e.g. $\not{E}_T > 100$ GeV, $m_W^{\rm lep} > 65$ GeV, $p_{T,W^{\rm lep}} > 120$ GeV and $p_T^J > 100$ GeV, on signal and background events in addition to the pre-selection criteria mentioned in Sec. IV. Upon inspecting various kinematic distributions, we choose the following 12 variables for our multivariate analysis:

systematic uncertainties. In Fig. 14, the ROC curve for the benchmark BP4 is shown; the area under the ROC curve is $\sim 89\%$ for BP4.

Before concluding this subsection, we assess the potential of the leptonic final state for the $WW\gamma$ channel. We study the benchmark scenarios BP5 and BP6 separately for the signal. Before performing the multivariate analysis, we apply the same set of cuts as in the cut-based analysis of this channel, as these cuts are neither too strong nor too loose. For this case, we find the following variables to have the best discriminatory properties.
where $p_{T,ll}$ and $\Delta\eta_{ll}$ refer to the $p_T$ of the dilepton system and the rapidity separation between the leptons, respectively. The four best variables among these are $\Delta R_{ll}$, $m_T^{WW}$, $\Delta\eta_{ll}$ and $p_T^\gamma$. Hence, in analogy with the $WW$ case, we train the classifier with the signal and background samples, with proper weight factors for the backgrounds. We find similar significances of 14.422 and 14.614 for the benchmark scenarios BP5 and BP6, respectively, assuming zero systematic uncertainties. The results are summarised in Table VII. The response of the classifier and the ROC curve for BP5 are shown in Fig. 15.
The signal significance computed for all the benchmark points with the adaptive BDT algorithm is presented in Table VII. One can compare these results with those presented in Table V. In all cases there is a significant improvement over the rectangular cut-based analysis, most notably for BP2 and BP3 in $Z\gamma$ production. The BDT algorithm finds the best possible combination of feature variables to separate the signal and background by choosing the best possible set of cuts on the most relevant observables. We remark that the data sample used for training may in principle be subjected to additional pre-assigned cuts, such as demanding specific invariant masses for opposite-sign dileptons in $WW\gamma$, or may use variables that are directly proportional to the energy scale of the process. For instance, the invariant mass of the final state system in the $2 \to 2$ scattering processes is one of the most important distinguishing features between signal and background. However, to minimize bias, we do not use it as an input variable to the BDT. Thus, the analysis retains scope for improvement through a better choice of variables and cuts. Nevertheless, the variables that we have used are good discriminators, as demonstrated in the following.

TABLE VII: The number of signal and background ($N^{bc}_{\rm Bkg}$) events before and after applying the optimal BDT cut (BDT$_{\rm opt}$), along with the signal ($\epsilon_S$) and background ($\epsilon_B$) acceptance efficiencies at the BDT$_{\rm opt}$ cut value. The statistical significance ($\mathcal{S}$ with no systematic uncertainty) for each benchmark point is presented. The last column presents the signal significance for a 5% systematic uncertainty.
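The choice of an optimal cut on the classifier output, and the efficiencies $\epsilon_S$ and $\epsilon_B$ at that cut, can be illustrated with a simple threshold scan. This is a sketch under stated assumptions: the figure of merit $S/\sqrt{S+B}$ is a stand-in for the significance definition used in the analysis, the scores and per-event weights are hypothetical, and `best_cut` is an illustrative helper.

```python
import math


def best_cut(sig_scores, bkg_scores, sig_weight, bkg_weight, n_steps=200):
    """Scan thresholds on the classifier score and return
    (cut, significance, eff_S, eff_B) maximising S / sqrt(S + B).

    sig_weight and bkg_weight convert simulated events to expected yields.
    """
    lo = min(min(sig_scores), min(bkg_scores))
    hi = max(max(sig_scores), max(bkg_scores))
    best = None
    for i in range(n_steps):
        cut = lo + (hi - lo) * i / n_steps
        n_s = sum(1 for s in sig_scores if s > cut)
        n_b = sum(1 for b in bkg_scores if b > cut)
        S, B = n_s * sig_weight, n_b * bkg_weight
        if S + B == 0.0:
            continue
        z = S / math.sqrt(S + B)
        if best is None or z > best[1]:
            best = (cut, z, n_s / len(sig_scores), n_b / len(bkg_scores))
    return best


# Hypothetical scores: signal peaks at high values, background at low values.
sig = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
bkg = [0.5, 0.3, 0.2, 0.1, 0.0, -0.1, -0.2, -0.3]
cut, z, eff_s, eff_b = best_cut(sig, bkg, sig_weight=10.0, bkg_weight=50.0)
print(f"cut = {cut:.3f}, Z = {z:.2f}, eff_S = {eff_s:.2f}, eff_B = {eff_b:.2f}")
```

In a realistic setting the scan would run on the held-out test sample, so that the quoted efficiencies are not biased by the training.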

VII. ALP COUPLINGS AND MASSES
Fig. 16 illustrates the constraints obtained in our study at 13 TeV, plotted in the subspace of the EW couplings $g_{aV_1V_2}$, as defined in Eqns. (6b) and (6c), and the ALP mass $m_a$. We compare the constraints on $g_{aWW}$ and $g_{aZ\gamma}$ with those from various other experiments (see, for instance, Refs. [39,111]). A comment is in order: most measurements depend on several ALP couplings. To depict these constraints on a two-dimensional plane of $(m_a, g_{aV_1V_2})$, it is necessary to adopt specific theoretical assumptions, which can differ widely among the various constraints. In collider searches, the interplay between specific EW couplings $g_{aXY}$ and the gluon coupling $g_{agg}$ is important. This relationship is often modelled as , motivated by the pseudo Nambu-Goldstone bosons with anomalous couplings generated by the triangle diagram with $O(1)$ group theory factors (Ref. [39]). For $m_a > 3m_\pi$, with these assumptions and for LHC searches with resonant processes, it is equivalent to consider $g_{agg} \gg g_{aV_1V_2}$. Also, for loop induced contributions, bounds on fermionic or photonic couplings can be translated to EW gauge boson couplings; they involve a logarithmic dependence on the cut-off scale $f_a$, related to $g_{aV_1V_2}$ by f . To compare with the constraints from other experiments, some of these assumptions for LHC searches and loop-induced couplings are incorporated. The constraints derived from the allowed region in the $c_W/f_a$-$c_B/f_a$ plane inherently incorporate gauge invariance relations. The constraints depicted by the brown-hatched region in Fig. 16 (left and right panels) are obtained from the non-resonant $gg \to a^* \to V_1V_2$ processes. They scale with $1/g_{agg}$ and are lifted completely for $c_G \to 0$. For visualization purposes, these figures are normalized to $g_{agg} = 1$ TeV$^{-1}$. Bounds on $g_{aZ\gamma}$ and $g_{aWW}$ derived from the analysis of non-resonant VBS processes in Ref. [52] are shown in magenta.

We now discuss some constraints that involve more complex assumptions about the ALP parameter space. The majority of these constraints, particularly those relating to the interactions of the ALP with massive gauge bosons, assume that the ALP is stable and focus on the mass range $m_a < 1$ GeV. These constraints are derived from mono-$W$ and mono-$Z$ searches at the LHC and, for $g_{a\gamma Z}$, from the hitherto unobserved exotic $Z \to \gamma$ + inv. decays at LEP [49] and the LHC [112]. Note that the resonant triboson constraints on $g_{aWW}$ and $g_{a\gamma Z}$ are based on a photophobic ALP model [49] and provide the dominant bounds for ALP masses above 100 GeV.
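The $1/g_{agg}$ scaling of the gluon-initiated non-resonant limits can be made explicit with a small helper. It assumes the ALP-mediated rate scales as $(g_{agg}\, g_{aV_1V_2})^2$, so that a limit quoted at a reference $g_{agg}$ rescales inversely with $g_{agg}$; the input limit of 0.3 TeV$^{-1}$ is a hypothetical placeholder and `rescale_limit` is an illustrative name.

```python
def rescale_limit(limit_at_ref, g_agg, g_agg_ref=1.0):
    """Rescale an upper limit on g_aVV (TeV^-1), quoted at g_agg_ref, to another g_agg.

    Assumes the signal rate scales as (g_agg * g_aVV)^2, i.e. the limit on
    g_aVV scales as 1 / g_agg.
    """
    if g_agg <= 0.0:
        return float("inf")  # no gluon coupling: the constraint is lifted
    return limit_at_ref * g_agg_ref / g_agg


for g_agg in (1.0, 0.5, 0.1, 0.0):
    print(f"g_agg = {g_agg} TeV^-1 -> g_aVV < {rescale_limit(0.3, g_agg)} TeV^-1")
```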
All these searches for a stable ALP (including mono-$W$, mono-$Z$ and $Z \to \gamma$ + inv.) implicitly assume a sufficiently small ALP decay width, which, in the relevant mass range, implies certain assumptions about its coupling to photons, electrons and muons. If we move away from the stable ALP assumption, a more conservative constraint arises from the total $Z$ decay width measurements at LEP, applicable up to $m_a \lesssim m_Z$ [20,49]. The LEP constraints are predicated on negligible decay rates into leptons. It is also important to note that this bound is restricted to $m_a \le 3m_\pi \simeq 0.5$ GeV, as beyond this point hadronic decay channels for the ALP become kinematically accessible. This leads to potential $Z \to \gamma$ + hadrons decays [113], introducing an additional dependence on the ALP-gluon coupling that would require a detailed analysis [39] and could weaken the LEP constraints.
Various precise SM measurements would be modified by the presence of a light state, the ALP, coupled to the SM through the electroweak gauge bosons. This has been extensively discussed in Ref. [111], where the impact of the ALP on precision observables is explored. The EWPO set an upper limit on the coupling $g_{aWW}$ at 95% C.L., illustrated by the blue line. The EWPT results are consistent with the SM expectation of $g_{aWW} = 0$ at 95% C.L. For ALPs with a mass greater than 500 GeV, the EWPT emerge as the most sensitive probe of their effects. The model becomes disfavoured for values of $g_{aWW} \gtrsim 4-6$ TeV$^{-1}$. Precise limits on rare kaon and $B$-meson decays can also be used to set bounds on the ALP. In particular, for an invisible axion, the relevant searches are the transitions $K \to \pi$ + invisible and $B \to K$ + invisible.
The recent NA62 measurement of $K \to \pi\nu\bar{\nu}$ [114] has established new constraints on a new particle $X$ produced in $K$ decays. Specifically, it has reported limits on the branching ratio BR($K \to \pi + X$) $\lesssim (3-6) \times 10^{-11}$ at 90% C.L. for $m_a < 110$ MeV, and BR($K \to \pi + X$) $\lesssim 10^{-11}$ at 90% C.L. for $m_a \in [160, 260]$ MeV. Among searches in $B$ decays, the most stringent limit currently comes from BaBar [115], setting BR($B \to K$ + inv.) $< 3.2 \times 10^{-5}$ at 90% C.L. for $m_a \lesssim 5$ GeV. However, Belle II has already achieved comparable results with a limit of $4.1 \times 10^{-5}$ [116], and is expected to reach approximately $10^{-6}$ [117] with 1 ab$^{-1}$ of data. These decays would be mediated by a loop containing a virtual $W$, which radiates the ALP. We can compare with the current NA62 and BaBar limits to obtain mass-dependent limits on $g_{aWW}$, which uniquely contributes to rare meson decays at the 1-loop level [38,40] (blue shaded region). In the case of $g_{aZ\gamma}$ as well, much of the mass range addressed by this analysis was already covered by LEP studies. However, our analysis extends the reach to lower couplings by nearly an order of magnitude.
While the resonant triboson production yields stringent constraints in the mass range above 100 GeV, the nonresonant W W γ process provides constraints valid over a mass window from 1 MeV to 100 GeV.
The constraints labelled as "Photons (1-loop)" are derived from a combination of beam dump experiments, observations of supernova SN1987a and LHC studies. For ALP masses below the GeV scale, beam dump searches (blue region) [118][119][120], as compiled in Ref. [33], and energy-loss considerations related to supernova SN1987a [121,122] set limits on $g_{aZ\gamma}$. These limits follow primarily from the absence of additional cooling and the lack of photon bursts from decaying axions. Since axion-boson couplings feed into axion-photon couplings through radiative corrections, these results can be translated into bounds on $g_{aWW}$ and $g_{aZ\gamma}$, assuming minimal dependence on $f_a$ [39].
The use of MVA techniques and improved search strategies is also likely to significantly refine these constraints. In summary, the primary advantage of non-resonant searches lies in their ability to directly probe ALP interactions with EW bosons at tree level, across a broad range of ALP masses, with minimal dependence on specific model assumptions. This work included processes initiated by gluons, which are influenced by the value of $g_{agg}$. For this analysis, we set $g_{agg} = 1$ TeV$^{-1}$. In Ref. [51], the ALP-mediated $WW$ and $Z\gamma$ production processes have been studied in the fully leptonic decays of the massive gauge bosons. The 95% C.L. exclusion limits, valid for $m_a \le 100$ GeV and assuming $g_{agg} = 1$ TeV$^{-1}$, are $g_{aWW} < 0.62$ TeV$^{-1}$ and $g_{aZ\gamma} < 0.37$ TeV$^{-1}$. In cases where $g_{agg}$ falls below a certain level, non-resonant constraints from EW processes, such as those from vector boson scattering, could become more prominent, depending on the specific EW coupling being probed. These constraints have been studied in Ref. [52], and the 95% C.L. limits derived on the aforementioned two couplings are $g_{aWW} < 2.98$ TeV$^{-1}$ and $g_{aZ\gamma} < 5.54$ TeV$^{-1}$.

VIII. SUMMARY AND CONCLUSIONS
Exploring the phenomenology of new, light, propagating particles such as axion-like particles is pivotal to beyond-the-SM endeavours pursued at, e.g., the LHC. The LHC allows for a plethora of processes that are sensitive to ALPs and has recently expanded the range over which ALP interactions can be probed, particularly with electroweak bosons and the top quark. Similarly, the Higgs particle, whose global understanding still remains elusive, presents a vital area for potential discoveries in new physics. As experiments at the LHC gain sensitivity to rare phenomena, they may unveil evidence of new physics linked to the Higgs. Our study focuses on the interactions of the ALP with the SM Higgs boson and the electroweak gauge bosons through non-resonant searches at the LHC. In particular, we have studied the potential impact of ALP couplings, in the effective theory framework, on the production of $Zh$, $Z\gamma$, $WW$ and $WW\gamma$ at the LHC. Here, the ALP serves as an off-shell mediator in these scattering processes. The key strategy exploits the derivative nature of the ALP interactions with the SM bosons. As a consequence, these scattering processes grow with energy in a way that deviates significantly from the SM. This has been exhibited in the regime where $\sqrt{\hat{s}} \gg v$ and the ALP mass respects $m_a \ll \sqrt{\hat{s}}$. Additionally, we ensure the consistency of the ALP EFT expansion by requiring $\sqrt{\hat{s}} \ll f_a$. By reinterpreting the public data from the ATLAS and CMS collaborations at 13 TeV for the measurement of the aforementioned SM processes, we obtained constraints on ALP couplings to SM gauge bosons in the set $\{g_{aZh}, g_{aZ\gamma}, g_{aWW}, g_{aWW\gamma}\}$. We underline the importance of using information from differential distributions in the high energy tails of the final system mass spectrum. The limits we obtain are robust across a broad ALP mass window from 1 MeV to 100 GeV, assuming an ALP-gluon coupling exists. For the $Zh$ and $Z\gamma$ production processes, depending on the value of the scale $f_a$ and for $g_{agg} = 1$ TeV$^{-1}$, upper limits on the ALP couplings to $Zh$ and $Z\gamma$ of $a_{2D} = 0.078$ TeV$^{-1}$ and $|c_B - c_W| = 0.073$ TeV$^{-1}$ have been extracted at 95% C.L. We also carried out the analyses for the $WW$ and $WW\gamma$ processes, which provide a handle to probe the coupling $c_W$. We find that these processes impose constraints of $c_W < 0.068$ TeV$^{-1}$ and 0.147 TeV$^{-1}$, respectively. Combining these channels yields an additional constraint of $c_B < 0.075$ TeV$^{-1}$. Among the multi-boson final states, the $Z\gamma$ channel enjoys the highest sensitivity.
We have chosen a few representative benchmark points which give signatures distinct from the SM backgrounds in the boosted regime. The potential of the HL-LHC to probe these ALP interactions via non-resonant searches with the chosen BPs is examined, and projections for integrated luminosities up to 3000 fb$^{-1}$ at the 14 TeV LHC are presented. The upcoming HL-LHC program will allow for improved sensitivity to ALPs through their relevant electroweak boson couplings at the discovery level. Detection of statistically significant $Zh$ signal events mediated by the ALP at the LHC would essentially indicate evidence of non-linear EWSB.
To explore the potential improvement in sensitivity for the non-resonant signals at the LHC, we employed a multivariate analysis. This method differs from the rectangular cut-based analysis by considering all the input kinematic variables simultaneously and providing an optimal separation between the signal and the background. We used a boosted decision tree algorithm and trained it with a variety of kinematic variables specific to each of the relevant processes to enhance the signal discrimination. The results show a clear improvement in the LHC sensitivity to new interactions using this method, especially for the benchmark points we considered.
The associated production of the ALP is another complementary probe. We found that if the ALP is collider stable and escapes detection, the $W$ + MET (mono-$W$) signature, a direct search for ALP production with a $W$ boson, is more sensitive than the off-shell mediated processes involving the ALP-$W$ interactions, whereas direct probes such as mono-Higgs and mono-$Z$ are less sensitive than the corresponding non-resonant ALP signal analyses carried out here. Nevertheless, a comprehensive global analysis of both the direct and indirect ALP searches would yield more information on the constraints on the various ALP operators, in both the linear and non-linear frameworks, with emphasis on effects responsible for electroweak symmetry breaking.
The non-resonant searches offer a complementary probe for very light ALP masses. Their main advantage lies in their independence of specific assumptions on the ALP characteristics. Exploring the phenomenology of additional processes like di-Higgs production, vector boson fusion channels, $WWZ$ and $ZZ\gamma$ processes and other multi-particle productions could further refine our understanding of the ALP parameter space and help disentangle the various operators in both the linear and non-linear mechanisms. While the EFT usually serves as a useful model-independent theoretical framework for experimental searches, extending this work in the direction of UV completions could predict the sensitivity to (model-dependent) degrees of freedom and signals. With the LHC entering a new phase of higher energy and luminosity, it becomes increasingly important to focus on possible ALP-mediated processes and on dedicated observables and analyses which offer significant sensitivity to phenomena beyond the standard paradigm.
TABLE III: The selection criteria applied for $Z(\to b\bar{b})$+photon production at $\sqrt{s} = 13$ TeV. The signal corresponds to an ALP mediated process of $Z(\to b\bar{b})$+photon production, with $c_G = 1.25$, $c_W = -c_B = 1$ and $f_a = 5$ TeV.

FIG. 2: (a) The differential distribution of events at the 13 TeV LHC with an integrated luminosity of 139 fb$^{-1}$ with respect to the reconstructed $m_{Zh}$ for the SM+ALP signal (red line), together with the total SM background and the data (black dots with error bars) from the ATLAS measurement in Ref. [77]. The signal corresponds to coefficients $a_{2D} = 0.1$ and $f_a = 10$ TeV with $g_{agg} = 1$ TeV$^{-1}$. (b) Invariant mass distribution of $J\gamma$ with 36.1 fb$^{-1}$ of data from the 13 TeV run of the LHC. The total SM prediction (blue line) and the data are taken from the analysis by the ATLAS collaboration in Ref. [78]. The signal (red line) corresponds to coefficients $c_W = -c_B = 1$ and $f_a = 5$ TeV with $g_{agg} = 1$ TeV$^{-1}$. (c) $m_{WW}$ distribution in the $1\ell + J + \not{E}_T$ channel, incorporating data points and the total SM background from the CMS measurement in Ref. [79] at 13 TeV with an integrated luminosity of 35.9 fb$^{-1}$. The solid red line represents the ALP signal for $c_W = 1$ and $f_a = 5$ TeV with $g_{agg} = 1$ TeV$^{-1}$. (d) Comparison of ALP signal events ($c_W = -c_B = 1$, $f_a = 5$ TeV and $g_{agg} = 1$ TeV$^{-1}$) and the total SM expectation, along with the CMS data points, for the transverse mass distribution of the $WW$ system from $WW\gamma$ production in the $2\ell + \not{E}_T + \gamma$ channel [80] at 13 TeV and 138 fb$^{-1}$ integrated luminosity.

FIG.3: Left: Total cross-sections at √ s = 13 TeV for the ALP contributions to the different scattering processes as a function of the ALP mass.The value of fa in each of these processes is taken to be 4 TeV and c G = 1.The 'Zh' curve is evaluated at a2D = 1 and 'W W ' curve at c W = 1.For the 'Zγ' and 'W W γ' cases, they are evaluated at c W = 1, c B = −0.305.At each point in the plot, the ALP decay width was re-estimated as a function of ma and the Wilson coefficients.Right: The ratio RM V 1 V 2 variation as a function of maximum invariant mass of the final state system in Zh (red), Zγ (yellow) and W W (blue) production processes.

FIG. 4: Normalized distributions of (a) the mass mJ and (b) the transverse momentum pT J of the jet, both for the pp → hZ ALP mediated signal and SM backgrounds at √ s = 14 TeV.For the ALP mediated signal, we have chosen BP1 with a2D = 0.2, c G = 1.0 and fa = 5 TeV (blue).

FIG. 5: Normalized distributions of (a) the transverse energy of the photon and (b) the N-subjettiness of the jet, both for the $pp \to Z\gamma$ ALP mediated signal and SM backgrounds at $\sqrt{s} = 14$ TeV. For the ALP mediated signal, we have chosen BP2 with $c_W = 0.5$, $c_B = -0.5$, $c_G = 1.0$, $f_a = 5$ TeV (blue) and BP3 with $c_W = 0.5$, $c_B = -1.639$, $c_G = 1.0$, $f_a = 5$ TeV (red).

FIG. 6: Normalized distributions of (a) the effective mass m ef f variable and (b) ∆ϕ(jet, / E T ), for the pp → W W ALP mediated signal and SM backgrounds at √ s = 14 TeV.For the ALP mediated signal, we have chosen BP4 with c W = 0.5, c G = 1.0, fa = 5 TeV (blue).

FIG. 7: Normalized distributions of (a) the invariant mass of the dilepton and photon system, $m_{ll\gamma}$, and (b) the missing transverse energy $\not{E}_T$, for the $pp \to WW\gamma$ ALP mediated signal and SM backgrounds in the fully leptonic decay channel at $\sqrt{s} = 14$ TeV. For the ALP mediated signal, we have chosen BP5 with $c_W = 0.5$, $c_B = -0.5$, $c_G = 1.0$, $f_a = 5$ TeV (blue) and BP6 with $c_W = 0.5$, $c_B = -0.152$, $c_G = 1.0$, $f_a = 5$ TeV (red).

FIG.8: Two dimensional histograms showing the correlation between invariant mass of the dilepton and photon system, m llγ and the separation between the two leptons, ∆R ll .The z-axis indicates the normalized frequency of events, in arbitrary units.Fig (a) represents BP6 ALP scenario with c W = 0.5, c B = −0.152,c G = 1.0, fa = 5 TeV and (b) represents the SM backgrounds, comprising, SM W W γ, W Zγ, V γ, t tγ and backgrounds from non-prompt leptons and non-prompt photon at √ s = 14 TeV.

FIG. 9: (a) The differential distribution of $E_T^{\rm miss}$ for the $h(\to b\bar{b}) + E_T^{\rm miss}$ signal and background at $\sqrt{s} = 13$ TeV and 139 fb$^{-1}$ of integrated luminosity, following the selection cuts from Table VI (a). The total SM $E_T^{\rm miss}$ background distribution (blue) is obtained from [101]; for the signal $pp \to ah$ ($h \to b\bar{b}$), the $E_T^{\rm miss}$ distribution includes the contribution from the coefficients $a_{2D} = 1$ and $f_a = 4$ TeV along with the SM contribution (red). (b) Distribution of $p_T^{\rm miss}$ for $aZ$ ($Z \to \ell^+\ell^-$) production, with coefficients $c_W = 1$, $c_B = -1$, $f_a = 5$ TeV (red), and the experimental data and SM backgrounds from the CMS analysis [102] at 13 TeV and 35.9 fb$^{-1}$. (c) Distribution of the transverse mass $m_T$ for $aW^\pm$ ($W^\pm \to \ell^\pm \nu_\ell$) production in the $\mu + \slashed{E}_T$ final state, obtained with $c_W = 1$, $f_a = 5$ TeV (red), compared with experimental data and SM backgrounds from the ATLAS analysis [103] at 13 TeV and 139 fb$^{-1}$ of integrated luminosity.
second and third columns of Table VI are employed. The comparison of the signal and background $p_T^{\rm miss}$ distributions for $\ell = \mu$ can be seen in Fig. 9 (b), with the maximum $p_T^{\rm miss}$

FIG. 11: Left: 95% C.L. allowed region in the $c_W/f_a$-$c_B/f_a$ plane from the 13 TeV analysis of differential distribution measurements for the $Z\gamma$ (dark pink), $WW$ (blue) and $WW\gamma$ (yellow) production processes with $g_{agg} = 1$ TeV$^{-1}$. Right: Projected 2σ limit (darker-shaded) and 5σ discovery level (light-shaded) regions in the $c_W/f_a$-$c_B/f_a$ plane for the respective ALP-mediated signals at $\sqrt{s} = 14$ TeV and an integrated luminosity of 3000 fb$^{-1}$ with $g_{agg} = 1$ TeV$^{-1}$. The thin dashed, dotted and dot-dashed lines represent the directions of vanishing couplings for $g_{a\gamma\gamma}$, $g_{aZ\gamma}$ and $g_{aZZ}$, respectively. The vertical axis at $c_W = 0$ represents $g_{aWW} = 0$.
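The directions of vanishing couplings mentioned in the caption of Fig. 11 can be illustrated with a short numerical sketch. It assumes the relations commonly used in the ALP effective-theory literature, $g_{a\gamma\gamma} = \frac{4}{f_a}(s_w^2 c_W + c_w^2 c_B)$, $g_{aZZ} = \frac{4}{f_a}(c_w^2 c_W + s_w^2 c_B)$, $g_{aZ\gamma} = \frac{8}{f_a} s_w c_w (c_W - c_B)$ and $g_{aWW} = \frac{4}{f_a} c_W$; the overall normalisation is an assumption, since it is not stated in this excerpt, although the vanishing directions do not depend on it. The function name `alp_ew_couplings` and the default value $s_w^2 = 0.2337$ are hypothetical choices, the latter picked so that the ratio $c_B/c_W = -0.305$ quoted in the captions lies on the $g_{a\gamma\gamma} = 0$ line.

```python
import math


def alp_ew_couplings(c_W, c_B, f_a, sw2=0.2337):
    """Return ALP-electroweak couplings in TeV^-1 for f_a given in TeV.

    Assumed convention (not taken from this excerpt): the standard
    linear combinations of c_W and c_B weighted by the weak mixing angle.
    sw2 is sin^2(theta_w); its default value is an illustrative assumption.
    """
    cw2 = 1.0 - sw2
    sw_cw = math.sqrt(sw2 * cw2)
    return {
        "g_agamgam": 4.0 / f_a * (sw2 * c_W + cw2 * c_B),
        "g_aZZ": 4.0 / f_a * (cw2 * c_W + sw2 * c_B),
        "g_aZgam": 8.0 / f_a * sw_cw * (c_W - c_B),
        "g_aWW": 4.0 / f_a * c_W,
    }


if __name__ == "__main__":
    # Coefficient choices quoted in the figure captions.
    points = {
        "Fig. 3 (c_W=1, c_B=-0.305, f_a=4 TeV)": (1.0, -0.305, 4.0),
        "BP3 (c_W=0.5, c_B=-1.639, f_a=5 TeV)": (0.5, -1.639, 5.0),
        "BP6 (c_W=0.5, c_B=-0.152, f_a=5 TeV)": (0.5, -0.152, 5.0),
    }
    for label, (c_W, c_B, f_a) in points.items():
        couplings = alp_ew_couplings(c_W, c_B, f_a)
        print(label, {k: round(v, 4) for k, v in couplings.items()})
```

Under these assumptions, the Fig. 3 choice and BP6 sit close to the $g_{a\gamma\gamma} = 0$ direction, BP3 sits close to the $g_{aZZ} = 0$ direction, and any point with $c_W = c_B$ has $g_{aZ\gamma} = 0$.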

FIG. 16: Summary of current constraints as a function of the ALP mass and the couplings $g_{aZ\gamma}$ (left) and $g_{aWW}$ (right). Limits derived in this work are labeled "Non-resonant via ggF at LHC" and shown in brown. These constraints are normalised with $g_{agg} = 1$ TeV$^{-1}$. Bounds from "Non-resonant VBS" are shown in magenta. The orange region assumes gluon dominance, $g_{agg} \gg g_{aV_1 V_2}$, for constraints from the γ+hadrons search. The green region (constraints from LHC searches such as mono-W, mono-Z and resonant triboson production) relies on more complex assumptions on the ALP EW couplings. Bounds with minimal assumptions on the ALP model are in blue. See the main text for more details.

TABLE IV: Summary of selected benchmark points for the study.

TABLE V:
Assuming 5% systematic uncertainties, we obtain a significance of 3.678 and 3.726 for BP5 and BP6

FIG. 15: $pp \to WW\gamma$ in the fully leptonic channel. Left: The normalized BDT score distributions for the signal and the background. Significance as a function of the BDT cut value for BP5 at $\sqrt{s} = 14$ TeV, $L_{\rm int} = 3000$ fb$^{-1}$. Right: ROC curve for BP5.
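The interplay between a BDT score cut and a fractional systematic uncertainty can be sketched as follows. The significance definition used in the paper is not given in this excerpt, so the snippet assumes the common approximation $Z = S/\sqrt{S + B + (\kappa B)^2}$, with $\kappa$ the fractional background uncertainty. The function names and the event yields in the scan are hypothetical and serve only to illustrate the procedure.

```python
import math


def significance(n_sig, n_bkg, syst_frac=0.05):
    """Approximate significance Z = S / sqrt(S + B + (kappa * B)^2).

    This definition is an assumption; kappa (syst_frac) is the
    fractional systematic uncertainty on the background yield.
    """
    denom = math.sqrt(n_sig + n_bkg + (syst_frac * n_bkg) ** 2)
    return n_sig / denom if denom > 0 else 0.0


def best_cut(scan, syst_frac=0.05):
    """Return the (cut, S, B) entry of the scan with the largest Z."""
    return max(scan, key=lambda row: significance(row[1], row[2], syst_frac))


if __name__ == "__main__":
    # Hypothetical yields after each BDT score cut: (cut value, S, B).
    scan = [(0.0, 100.0, 400.0), (0.2, 80.0, 100.0), (0.4, 30.0, 10.0)]
    for cut, s, b in scan:
        print(f"cut > {cut:.1f}: Z = {significance(s, b):.3f}")
    print("best cut:", best_cut(scan)[0])
```

A tighter cut lowers both yields; with a non-zero $\kappa$ the term $(\kappa B)^2$ penalises large backgrounds, which tends to push the optimal cut towards higher BDT scores.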

TABLE VII: Evaluation of signal and background events at the 14 TeV LHC for an integrated luminosity of 3000 fb$^{-1}$. The table includes the number of signal (N bc S