Probing axion-like particles coupling to gluons at the LHC

Assuming ALPs couple to gluons only, they can be produced via the $p p \to a j$ process and decay into two jets at the LHC. When the coupling parameter, $C_{\tilde{G}} / f_a$, is small, the lifetime of ALPs can be long enough leading to displaced final state jets. In this paper, we consider the signal including both the prompt and long-lived cases of ALPs by employing a specialized Delphes module to handle displaced jets. Relevant background processes are generated and simulated at the detector level, and multivariate analyses based on machine-learning are performed to discriminate signal and background events and achieve the best sensitivities. Based on the data accumulated for this study, we forecast the expected upper limits on $C_{\tilde{G}}/f_a$ for ALP mass $m_a$ in the range 5$-$2300 GeV at 2-, 3- and 5-$\sigma$ significances at the High Luminosity-LHC with 14 TeV center-of-mass energy and 3 ab$^{-1}$ integrated luminosity. Vast previously unprobed regions in the parameter space spanned by $C_{\tilde{G}}/f_a$ and $m_a$ are probed and the best upper limits on $C_{\tilde{G}}/f_a$ at 2-$\sigma$ significance is found to be around $1.0 \times 10^{-2} \,\, {\rm TeV^{-1}}$ for $m_a \sim 500$ GeV. The ALP mass is reconstructed from the kinematics of final state jets and we find that it is measurable in this method when $m_a$ is below about 1 TeV at the HL-LHC. The effects of systematic uncertainties and validation of the EFT framework are also checked and discussed.


Introduction
Granting the fact that the Standard Model (SM) has demonstrated huge successes in providing experimental predictions, and is generally believed to be a self-consistent effective theory, it still leaves some phenomena unexplained and is deprived of being a complete theory of everything. One of such challenges is the strong CP problem in QCD [1][2][3]. The resolution to the strong CP problem came through the works of Peccei and Quinn [4,5] by incorporating a spontaneously broken global U(1) PQ symmetry in the SM, and axions are the pseudo-Nambu-Goldstone bosons of U(1) PQ . Over the decades, several aspects of the original axion model have been investigated and more models have been developed [6][7][8][9][10][11][12][13][14].
In the original model, the mass of QCD axion is strictly related to the energy scale f a of the global symmetry breaking. Setting the stringent relation between the axion mass and f a aside, the idea of Axion-like Particles (ALPs) with wide-ranging masses and couplings arises. Unlike QCD axions, the couplings and masses of ALPs are considered to be independent parameters. Generally, these pseudo-Nambu-Goldstone bosons can arise naturally from the spontaneous breaking of a new U(1) global symmetry at some large energy scale f a in many beyond standard model theories. They are gauge-singlets, odd under CP transformations, and the strength of their couplings to SM particles is proportional to the inverse of the global symmetry breaking scale f a .
Because their masses and couplings to SM particles can range over many orders of magnitude, ALPs have been searched extensively according to different production processes and decay products at colliders and constraints are set in different regions of parameter space spanned by the ALP mass and various couplings. Collider studies on ALPs can be found in the reviews [15][16][17][18][19][20] and references therein. Recent collider experimental searches for ALPs by the ATLAS [21][22][23], CMS [24][25][26][27][28], and Belle II [29] collaborations are also reviewed in Ref. [30].
Current studies mainly focus on probing ALPs' couplings to electroweak gauge bosons (photons, Z-and W -bosons), SM Higgs bosons, and leptons, while the coupling to gluons needs more investigations, especially for heavy ALPs. Ref. [31] reviews some current constraints on the ALP-gluon coupling from various collider experiments. These include GlueX constraints through the photoproduction of ALPs [32], and reinterpretations from KOTO search for K L → π 0 + invisible [33], NA62 search for K + → π + + invisible [34], NA62 search for K + → π + a(γγ) [35], proton beam-dump searches at the νCAL [36,37] and CHARM [38], and constraints on rare pion decay π + → aeν at the PIENU [39] and PIBETA [40]. All of the above studies set limits on ALP-gluon coupling for ALP masses below about 1 GeV. Moreover, Ref. [41] reviews existing constraints on the ALP-gluon coupling in the mass range m π < m a < 3 GeV, and more current limits on the ALP-gluon coupling can be found in review [42] and references therein.
Concerning collider searches for constraining the ALP-gluon-gluon coupling at the LHC, the ATLAS collaboration has performed one study through the decay of the Higgs boson into the Z-boson and an ALP (h → Z a), in which the ALP decays into a gluon pair (a → gg) [43]. The ATLAS collaboration has also probed ALPs' coupling to gluons through the mono-jet final-state topology [44], i.e. an energetic jet plus large missing transverse momentum, using proton-proton collision data with a center-of-mass energy √ s = 13 TeV and an integrated luminosity L int = 139 fb −1 . The study has explored ALP masses up to m a = 1 GeV, ALP-gluon couplings up to CG = 1 and effective scales f a in the range 1-10 TeV. The result has excluded the values of CG/f a above 8 × 10 −3 TeV −1 at 95% confidence level, and the exclusion does not significantly depend on the ALP masses up to about 1 GeV.
Another search constrains the ALP-gluon-gluon coupling through the signal process gg → a → ZZ, Zh at the LHC by the CMS collaboration [45]. Furthermore, the CMS collaboration has carried out a search for a vector resonance produced in proton-proton collision and decaying into a quark-antiquark pair, accompanied by a high p T jet from initial state radiation (ISR) [46]. The minimum p T of the vector resonance is high enough, so that its decay products are merged into a single, large-radius jet. They analyze the data with √ s = 13 TeV and L int = 35.9 fb −1 , and set upper limits on the production cross section for the vector boson mass in the range of 50-300 GeV. The results are reinterpreted to constrain the ALP-gluon coupling in Ref. [47], by recasting the limits on the production cross section of a qq-initiated resonance σ CMS qq given in Ref. [46] to that of a gg-initiated resonance σ gg through the equation σ gg = σ CMS qq · qq H T / gg H T . Here, qq H T and gg H T are cut efficiencies when applying hadronic activity H T > 650 GeV. Converting into our notation convention of ALPgluon coupling CG/f a in following sections, Ref. [47] shows that CG/f a 3 × 10 −2 TeV −1 can be excluded for ALP masses m a in the range 50-125 GeV.
Additionally, a plethora of phenomenological researches [48][49][50][51][52][53][54][55][56][57] have also been amassed on the sensitivities of the ALP-gluon-gluon coupling, albeit mostly focusing on the ALP mass range below a few GeVs. Recently, we have presented a strategy to detect the long-lived ALPs and explore the discovery potential of various FAr Detectors at the Electron Positron Collider (FADEPC) [30], where we estimate their sensitivities on the model parameters in terms of the effective ALP-photon-photon coupling, the effective ALP-photon-Z coupling, and ALP mass m a via the signal process e − e + → γ a, a → γγ at future e − e + colliders running at center-of-mass energy of √ s = 91.2 GeV and integrated luminosities of 16, 150, and 750 ab −1 .
In this work, we probe the ALP's coupling to gluons at the High-Luminosity Large Hadron Collider (HL-LHC) via the signal process pp → aj, a → jj. Since this process can be independent of ALP's couplings to other SM particles, it is a unique signal channel if ALP couples to gluons only. The signal including both the prompt and long-lived cases of ALPs are considered and multivariate analyses based on machine-learning are performed to reject background events and achieve the best sensitivities. The expected upper limits on CG/f a for ALP mass in the range 5−2300 GeV are derived at the HL-LHC with 14 TeV center-of-mass energy and 3 ab −1 integrated luminosity.
This article is structured as follows: Sec. 2 describes the theoretical foundation and production cross section for the signal process. In Sec. 3, we present the search strategy including the relevant SM background processes, the methodology employed for the event simulation, and the data analyses thoroughly. Sec. 4 shows the sensitivities on the model parameters achieved in this study. The method to reconstruct the ALP mass, the effects of systematic uncertainties and the validation of the effective Lagrangian are also discussed in this section. We summarize and conclude in Sec. 5. Some details of our study which complement the main body of the paper are listed in the appendices.

The Signal Production
Assuming ALPs couple to gluons only, the effective linear ALP Lagrangian is given by [58] where G A µν (A = 1,...,8) represents the gluon field strength tensor corresponding to SU c (3) group andG µν,A ≡ 1 2 ε µναβ G A αβ is the dual field strength tensor. m a and f a represent the ALP mass and the characteristic energy scale of the global symmetry breaking, correspondingly, and are assumed to be independent parameters throughout this article. In this study, we consider the production of ALPs in association with a jet at the HL-LHC with a center-of-mass energy √ s = 14 TeV and integrated luminosity L int = 3 ab −1 . The corresponding signal process is, therefore, p p → a j and representative processes are represented in Fig. 1. We implement the ALP model file with the linear Lagrangian [58] in the Universal FeynRules Output (UFO) format [59] into the MadGraph5 program [60] to simulate the proton-proton collisions and generate the p p → a j events. Details of data simulation are shown in Sec. 3.2. To maintain consistency throughout our study, the production cross sections calculated by MadGraph5 are used to estimate the number of signal events. Fig. 2 shows the dependence of the production cross section of the signal process, σ(p p → a j), on the model parameters m a and CG/f a . As we can see from the figure, the production cross section declines with the ALP mass m a for fixed CG/f a , while it is proportional to the square of CG/f a . In this paper, we have focused on the interactions of the ALPs with gluons. As such, we assume that the ALP decays completely into a pair of gluons with a decay width of at leading order [15,41] and a branching ratio Br(a → gg) = 1 1 .

The SM Background
The dominant background processes that are relevant to our signal are as follows.
(i) multi-jet production; (ii) A pair of taus in association with a jet, τ + τ − j; (iii) Production of massive gauge bosons in association with a jet, W ± j and Zj.
(iv) Production of two massive gauge bosons, W + W − and ZZ.
At proton-proton colliders, multi-jet background has very large cross section and , hence, is more significant than other SM background processes. Therefore, it requires large event samples during simulation. In this study, because the signal final state has at least three jets and the computational resource is limited, we have generated totally 1.244 × 10 8 events of pp → jj and jjj processes and combine them as multi-jet samples. The remaining background processes have relatively smaller cross sections, and 10 6 τ + τ − j, 3 × 10 5 W ± j, 3 × 10 5 Zj, 10 5 W + W − , 10 5 ZZ and 10 5 tt events have been generated.

Event Simulation
First, events are generated using madgraph5_amc@NLO version 2.6.7 [60] using the nnpdf23 [61] as the parton distribution function (PDF) of the proton at 14 TeV center-ofmass energy. For the signal, a scan of the ALP mass, m a , is done in the following fashion: 15 GeV increments in the mass range 5−50 GeV, 25 GeV increments in the range 50−100 GeV, 50 GeV increments in the range 100−500 GeV and 200 GeV increments in the range 500−2300 GeV. For each ALP mass, the second model parameter, CG/f a , has been scanned between 1.0 × 10 −2 − 5.5 × 10 −2 TeV −1 with 5.0 × 10 −3 TeV −1 increments for masses up to 1300 GeV and between 1.0 × 10 −2 − 2.5 × 10 −1 TeV −1 with coarse steps for all the masses above 1300 GeV. For each iteration of ALP mass m a and coupling CG/f a , 10 6 events have been generated to ensure that statistical uncertainties are minimized as much as possible within the bounds of computational resources.
At the parton level, all signal and background events are required to satisfy the following relatively lenient cuts: (a) The minimum transverse momentum of jets and b-jets is set to 5 GeV, i.e. p T (j/b) > 5 GeV, and for photons and charged leptons it is set to 1 GeV, i.e. p T (γ/l ± ) > 1 GeV; (b) The maximum pseudorapidity of all particles is set to 8, i.e. |η(j/b/γ/l ± )| < 8; (c) the minimum angular separation between every permutation of jets, b-jets, photons and charged leptons is set to 0.01, i.e. ∆R(j/b/γ/l ± , j/b/γ/l ± ) > 0.01.
All events are then passed through pythia version 8.244 [62] for parton showering, hadronization and decay of unstable particles. For detector effects, we use the fast detector simulation tool delphes [63] with an ATLAS detector configuration. The FastJet version 3.1.3 [64] package is used to perform jet reconstruction employing the anti-k t algorithm [65] with a jet cone size of 0.3 and a minimum transverse momentum of 5 GeV. Identifying displaced vertices in delphes has been very tricky and an official workable module has not yet been released. However, Ref. [66] introduces a customized version, delphes version 3.3.2_JD_TVS [67], including additional modules such as TrackVertexSmearing and JetDisplacement modules that allow the definition of a displaced jet. More specifically, the transverse displacement of a jet V T (j) = d 2 where d x and d y are displacements of the vertex along the respective axes, is defined by taking into consideration the displacement of the origin of all tracks with transverse momentum larger than a certain threshold and selecting the minimum of this displacements. For our analysis, we employed the use of these displaced jet modules and set ∆R(track, j) < 0.3 and p T (track) > 20 GeV.

Data Analyses
At the detector level, the following pre-selection cuts have been employed before a multivariate analysis is done.
(3) The minimum transverse momentum of the first three hardest jets is set to 30 GeV, i.e. p T (j 1 ), p T (j 2 ), p T (j 3 ) > 30 GeV.
The expected number of events N exp = σ prod × L int × pre after applying pre-selection cuts (1)-(4) sequentially are displayed in Table 1. Here, σ prod is the production cross section of the signal/background process, L int is the integrated luminosity, which we have taken to be 3 ab −1 for the LHC, and pre is the pre-selection cut efficiency. One sees that the multi-jet background is still very large after all pre-selection cuts and is a factor of ∼ 10 3 larger than the other background processes. Therefore, the background is dominated by the multi-jet process. After pre-selection, the jets are sorted according to their transverse displacement V T , therefore j 1 beyond this point refers to the most displaced jet in the event, j 2 the second most displaced jet and so forth. initial cut (1) cut (2) cut (3) cut (4) Table 1: The expected number of events for the signal (with benchmark mass m a = 500 GeV and coupling CG/f a = 10 −2 TeV −1 ) and relevant SM background processes after applying pre-selection cuts (1)-(4) sequentially at the HL-LHC with center-of-mass energy 14 TeV and integrated luminosity 3 ab −1 .
Next, the Toolkit for Multivariate Analysis (TMVA) package [68] is used to perform the Multivariate analyses (MVA) and better discriminate signal and background events.
The events which pass pre-selection cuts are analyzed through the Boosted Decision Trees (BDT) algorithm in the TMVA package where the discrimination is done using the following thirty observables. (c) Transverse momentum, pseudorapidity and azimuth angle of the first three jets: (d) Transverse and longitudinal displacements of the first three jets: (e) Pseudorapidity difference, azimuth angle difference and angular separation between the first and second jets: ∆η(j 1 , j 2 ), ∆φ(j 1 , j 2 ), ∆R(j 1 , j 2 ).
(f) Invariant mass, transverse momenta, pseudorapidity and azimuth angle of the system of the first and second jets: (g) Transverse momentum, pseudorapidity and invariant mass of the system of the first three jets: (h) Pseudorapidity difference, azimuth angle difference and angular separation between the system of (j 1 + j 2 ) and j 3 : ∆η(j 1 + j 2 , j 3 ), ∆φ(j 1 + j 2 , j 3 ), ∆R(j 1 + j 2 , j 3 ).  SM background and the signal with ALP mass m a = 500 GeV and three different ALPgluon couplings, CG/f a = 10 −1 , 10 −2 and 10 −3 TeV −1 at the HL-LHC with center-of-mass energy of 14 TeV. It can be seen that there is no significant difference in the spread of the background BDT score with the change in CG/f a but the signal distribution spreads away from the background as CG/f a increases, or in other words, the signal and background discrimination becomes more obvious with the increase in CG/f a for a fixed m a .
Given the specified method for the multivariate analysis, the TMVA package ranks the input observables according to their importance when discriminating between the signal and background. As the ALP mass varies, the signal kinematics changes, and therefore, the rank also changes. In Fig. 10 of Appendix C, we show distributions of observables for the signal with benchmark coupling CG/f a = 10 −2 TeV −1 and representative ALP masses, and for the total SM background. For m a 900 GeV, H T , p T (j 1 ) and p T (j 2 ) are the highest ranked observables. This is because as ALP masses get higher, the jets from the decay of the ALPs are harder, and therefore, these observables related to the jet momentum become more distinct between the signal and background, as demonstrated through the m a = 1500 GeV case in Fig. 10. For m a 100 GeV, since the jets from ALP decay are soft, the distributions of jet momentum-related kinematic variables are similar between the signal and background, as demonstrated through the m a = 50 GeV case in Fig. 10. Therefore, these observables are not as discriminating as in the case of heavier ALPs, while angular observables of final state jets play a more important role. φ(j 1 ) and φ(j 2 ) are the highest ranked observables in this mass range. For 100 GeV m a 900 GeV, both momentumrelated and angular variables can have similar discriminating powers. Therefore, H T , p T (j i ) and φ(j i ) are alternatively ranked highest in this mass range.

Sensitivities on Model Parameters
As has been mentioned, multivariate analysis has been used to further discriminate the signal and background processes, and this analysis allots a BDT score for each of the processes. In the next part of our analysis, we use the following formula to calculate the statistical significance, where N s and N b are the expected numbers of signal and total background events, respectively, after all the selection cuts. The method we employ for finding the discovery limits in the parameter space spanned by the parameter CG/f a and the ALP mass m a is as follows. Fig. 9 in Appendix B depicts BDT response distributions for the signal (black, filled) with fixed parameter CG/f a = 10 −2 TeV −1 and dominant SM background processes at the HL-LHC assuming different m a cases. As we can see from the distributions, as the ALP mass increases the BDT responses for the signal and background become more distinct. This has the advantage of rejecting the background while retaining the signal events when applying a BDT cut. However, as shown in Fig. 2, the signal production cross section decreases rapidly as the ALP mass increases, which limits the discovery potential for heavy ALP. For a case with fixed ALP mass m a and parameter CG/f a , a BDT cut is then chosen to further reject the background. Table 3 in Appendix D shows selection efficiencies for the signal processes with representative ALP masses and fixed CG/f a = 10 −2 TeV −1 and the background processes at the HL-LHC. The final upper limits corresponding to 2-, 3-and 5-σ significances at the HL-LHC with center-of-mass energy √ s = 14 TeV and an integrated luminosity of 3 ab −1 are shown in Fig. 4.

Reconstruction of Invariant Mass of the ALPs
As has been mentioned previously, in this study we have assumed that ALPs couple only to gluons and as such we have considered that they decay into gluons and appear as jets at the detector level. To reconstruct the invariant mass of the ALPs from the kinematics of the final state jets, therefore, we have employed the following method. We determine the invariant mass of each pair of jets taking into account all possible combinations and then select those that have an invariant mass closest to the assumed ALP mass. Some representative jet pair invariant mass M jj distributions after BDT cuts determined in this way are displayed in Fig. 5. Figure 5: Reconstruction of the invariant mass of ALPs for various representative ALP mass assumptions. The histograms represent distributions of invariant mass constructed from a jet pair in an event whose combined invariant mass is closest to the assumed ALP mass. The parameter CG/f a is fixed at 10 −2 TeV −1 for all signal distributions.
As we can see from Fig. 5, for ALP masses up to about 1 TeV a sharp peak for the M jj distribution can be observed around the respective ALP mass. However, for masses above 1 TeV the distribution becomes flatter and the peak appears slightly below the ALP mass. We generated the signal sample with higher center-of-mass energies, such as the 100 TeV, and found that the M jj can once again be reconstructed with the peak around the ALP mass. This suggests that the validation of this method depends on the ALP mass and center-of-mass energy, and the ALP mass can be reconstructed in this method when m a 1 TeV at the HL-LHC.

Effects of Systematic Uncertainties
Systematic uncertainties arise during experiments due to a number of factors, including but not limited to jet energy scale and resolution, renormalization and factorization scales, parton shower and hadronization models, PDF variation, jet flavor tagging, etc. As shown and calculated from Table 1 and 3, when m a = 500 GeV, the number of total background events N b and that of signal events N s after all cuts are 7.6×10 12 and 5.0×10 5 , respectively. Because N b N s and N b has a very huge value, the systematic uncertainty on total background can affect the final limits greatly. In this section, we estimate and comment on the effects of systematic uncertainties on the limits. We consider 0.05%, 0.1%, 0.5%, 1%, 5%, 10% and 15% relative systematic uncertainties on the expected number of events for the total background and discuss their implications on the upper limits for the benchmark case with m a = 500 GeV. Systematic uncertainties are introduced as nuisance parameters in estimating the expected median signal significance σ sig , and therefore we employ the following formula to approximate its value since N b N s .
where N s and N b are expected numbers of signal and total background events, respectively, after all selection cuts and δ b = r b ·N b denotes uncertainty in the number of total background event when considering a relative systematic uncertainty r b . Table 2 summarizes the upper bounds achieved with the afore-mentioned representative relative systematic uncertainties at the 2-, 3-and 5-σ confidence levels for the benchmark case with m a = 500 GeV.
1.8 2.2 2.9 Table 2: Upper limits on the parameter CG/f a (TeV −1 ) achieved by considering 0.05%, 0.1%, 0.5%, 1%, 5%, 10% and 15% relative systematic uncertainties on the number of total background events for the benchmark case with m a = 500 GeV. The first row represents the limits achieved without any systematic uncertainties on the background.
Since the expected number of multi-jet events is much higher than that of other background processes combined, the effects of systematic uncertainties on the total number of background events is highly governed by the multi-jet process. Hence, we have checked systematic uncertainties on the multi-jet final state from current LHC experimental studies [69][70][71][72], and also the projections at the HL-LHC [73]. The ATLAS studies at centerof-mass energies of 7 TeV [69,70], 8 TeV [71] and 13 TeV [72] show that the combined systematic uncertainty of calibrated jets could lie below ∼ 2%. Ref. [73] recommends values of relative systematic uncertainties at the HL-LHC for different final state objects. The suggested systematic uncertainties on jets include 0-2% due to the pile-up effect, and 0.1-0.2%, 0.1-0.5%, 0.75% due to measurements of absolute Jet Energy Scale (JES), relative JES, jet flavour, respectively. Therefore, at future HL-LHC experiments, with upgraded detector design and technology, it is possible to achieve 1% systematic uncertainty on the multi-jet process.
To estimate the effects of different systematic uncertainties on the final upper limits in the considered mass range, we consider 0.1% and 1% relative systematic uncertainties on the expected number of the total background events, and repeat the analyses for representative masses. The 2-σ limits on the model parameter CG/f a with 0% (solid, black), 0.1% (dashed, blue), and 1% (dotted, red) relative systematic uncertainties at the HL-LHC with a centerof-mass energy of 14 TeV and an integrated luminosity of 3 ab −1 are shown in Fig. 6. The results show that the limits are very sensitive to the systematic uncertainty. Even a small relative systematic uncertainty on the total background can weaken the limits. This is mainly because, as we have mentioned above, the number of background events N b after all cuts is very huge and much larger than the number of signal events N s . Since the background is dominated by the multi-jet process, to reduce the effects of systematic uncertainty and get strong limits, it is important to reduce the multi-jet background for future experimental analyses.
One method is to apply more stringent BDT cuts. However, as shown in Fig. 3, for example, stringent BDT cut is related to the tail distribution of the background. To reduce the statistical fluctuation and obtain smooth tail distribution, huge number of multi-jet background events need to be generated. Limited by our computational resources, results in Table 2 have been achieved by generating ∼ 1.26×10 8 total background events and 10 6 signal events for each iteration of ALP mass m a and the parameter CG/f a which has amounted to ∼ 40 TB of data. Due to lack of sufficient multi-jet background sample, the tail distribution of the background has very big fluctuations. To get reliable results, our analysis has been done by preserving sizable multi-jet background events and, hence, we are not able to reject this background further. Therefore, the results listed in Table 2 and Fig. 6 merely represent how systematic uncertainties affect the upper limits for the data we have accumulated for this study. Future experimental analyses with ample computational resources could achieve better results by obtaining more background, especially the multi-jet event samples, and simulating the tail part of the total background much better. One can also input more observables to the multivariate analysis to reject the multi-jet background, which needs to check more kinematic observables and find those having clear differences between the signal and background, and we leave it for future studies. The effective operators are weighted down by the energy scale f a in the effective Lagrangian provided in Eq. (2.1). This suppression energy scale must be considerably larger than the typical energy scale of the process under investigation in order for the description of the Effective Field Theory (EFT) to be valid. That is, it is required that for every generated signal event, the center-of-mass energy of the signal process should be less than the energy scale, i.e.

Validation of the Effective Lagrangian
√ŝ < f a [58]. Since √ŝ cannot be measured directly in experiment and the signal final state consists of all jets, we, approximately, consider the invariant mass of all final state jets M all j as √ŝ . Fig. 7 shows M all j distributions of the signal with several representative ALP masses after pre-selection cuts. Considering the validation of the EFT framework, the constraint M all j < f a needs to be imposed. When f a is 2 TeV, for example, a fraction of signal events are invalid for m a > 500 GeV, and thus final limits are weakened for heavy masses. However, such effects on limits are small when f a is very large (for example, 5 TeV) even for heavy masses. Therefore, the results depend strongly on the symmetry breaking scale f a , especially for the heavy masses.

Conclusion
A wide area of the parameter space spanned by the ALP mass m a and its couplings to gluons CG/f a is probed at the HL-LHC with a center-of-mass energy √ s = 14 TeV and integrated luminosity L int = 3 ab −1 . Assuming ALPs couple to gluons only, they are produced in association with one jet, p p → a j, and decay into gluon pairs. Thus, the signal final state has at least three jets. Seven dominant background processes relevant to our signal are considered and simulated at the detector level. Signal processes are scanned for model parameters of m a and CG/f a and events are generated for each specific combination. To consider the long-lived properties of ALPs, a specialized Delphes module [67] has been employed to deal with ALPs decaying into displaced jets, leading to corresponding observables such as the transverse and longitudinal displacements of one jet V T (j) and V z (j).
For data analyses, we first apply pre-selection cuts to select the signal final state. Owing to large expected number of background events, thirty observables including V T (j) and V z (j) are input to the TMVA package to perform multivariate analyses via the BDT algorithm. We present the distributions of representative observables and the corresponding BDT responses for the signal process with benchmark mass m a = 500 GeV and the background process in Fig. 8 and Fig. 3, respectively, and BDT response distributions for various m a assumptions are depicted in Fig. 9. The highest ranked observables are given in different mass ranges, according to their importance when discriminating between the signal and background, which are demonstrated in Fig. 10.
The BDT responses are exploited to discriminate signal and background events and achieve the best sensitivities. We list selection efficiencies for the signal processes with representative ALP masses and fixed CG/f a = 10 −2 TeV −1 and the background processes in Table 3, and upper limits on the model parameter CG/f a for ALP mass m a in the range 5−2300 GeV at 2-, 3-and 5-σ significances at the HL-LHC with 3 ab −1 integrated luminosity are shown in Fig. 4. Our results show that the best upper limits at 2-, 3-and 5-σ significances are 1.09 × 10 −2 , 1.33 × 10 −2 and 1.71 × 10 −2 TeV −1 for m a ∼ 500 GeV, respectively.
To reconstruct the invariant mass of the ALPs from the kinematics of the final state jets, we plot distributions of invariant mass constructed from a jet pair in an event whose combined invariant mass is closest to the assumed ALP mass and find that the ALP mass can be reconstructed in this method when m a 1 TeV at the HL-LHC.
To estimate the effects of systematic uncertainties, we consider 0.05%, 0.1%, 0.5%, 1%, 5%, 10% and 15% relative systematic uncertainties on the expected number of events for the total background and check their implications on the upper limits for the benchmark case with m a = 500 GeV. We also consider 0.1% and 1% relative systematic uncertainties, and repeat the analyses for representative masses. The 2-σ limits on the model parameter CG/f a in the considered mass range are shown in Fig. 6. Results show that upper limits are very sensitive to the systematic uncertainty for the data we have accumulated for this study. This is mainly because the number of background events after all cuts is still very huge and much larger than the number of signal events. Since the background is dominated by the multi-jet process, to reduce the effects of systematic uncertainty, it is important to reduce the multi-jet background for future experimental analyses. The possible methods are discussed.
To check the validation of the EFT framework, we, approximately, consider the invariant mass of all final state jets M all j as the center-of-mass energy of the signal process √ŝ and plot its distributions with several representative ALP masses after pre-selection cuts. We find that the limits depend strongly on the symmetry breaking scale f a , especially for the heavy masses.
Granted that there is room for improvement, the analyses done in this study and the sensitivities achieved shed light on vast previously unprobed regions of heavy ALP physics through the ALP coupling to gluons at the HL-LHC and give an insight into the sensitivities that can be achieved at future upgraded proton-proton colliders.    Table 3: Selection efficiencies for the signal processes with a range of representative ALP masses and fixed CG/f a = 10 −2 TeV −1 and the background processes multi-jet, W ± j, Zj, τ + τ − j, tt, W + W − and ZZ at the HL-LHC with center-of-mass energy √ s = 14 TeV.