Double Higgs production at the HL-LHC: probing a loop-enhanced model with kinematical distributions

.


Introduction
One of the most crucial measurements that can be accomplished in the high luminosity era at the LHC (HL-LHC) is the production of a Higgs (h) pair from the dominant gluon fusion production channel [1][2][3][4][5][6].In particular, this is a predicted process from the Standard Model (SM) of particle physics in which the quartic Higgs coupling λ from the electroweak symmetry breaking (EWSB) potential enters explicitly [7][8][9][10][11].Thus this measurement, along with the more difficult to asses triple h production, may help in elucidating the mechanism of EWSB in the SM.It is well-known however that the gluon initiated double h production in the SM suffers from a partial cancellation between the top quark dominated box and triangle diagrams, which becomes exact at threshold [3,5,7].This leads to a cross section for double h production that at leading order (LO) is ∼ 17 fb −1 and that at next-to-next-to-leading-order (NNLO) increases to ∼ 39 fb −1 [12].Current measurements from ATLAS and CMS at a luminosity L ∼ 140 fb −1 have little sensitivity, putting constraints to σ hh ≲ (4 ∼ 8) × σ hh,SM for the pair of Higgs bosons decaying to b bγγ [13,14].Considering the most promising Higgs decay combinations b bγγ, b bb b, b bτ τ , b bZZ * (4l) and b bW W * (lνlν) , it is claimed that a 4.5σ discovery significance in the SM is possible for L ∼ 3 ab −1 , when combining all channels and both experiments [15], neglecting systematic uncertainties.
In this context, it becomes interesting to consider possible beyond the SM (BSM) contributions that could affect double h production at the LHC 1 , providing larger cross sections that may be probed at luminosities less than 3 ab −1 .Modifications to the quartic λ as predicted by the SM have been explored, where usually large luminosities and center of mass energies ( √ s ≥ 14 TeV) are required for discovery (see for example [16][17][18][19][20]).New BSM particles running in the loops of the double h gluon fusion diagrams are also an interesting alternative to enhance the production (loop-enhanced models), see for example the seminal Refs.[3,21] for supersymmetric theories, [22][23][24] for new fermions, [25][26][27] for LQs as well as other possibilities [28][29][30][31][32].In Ref. [33] we considered a leptoquark (LQ) model with a minimal content of LQs and showed that at LO σ hh,LO ∼ 2.3×σ hh,SM −LO could be obtained for a light enough LQ, m LQ ∼ 400 GeV, via large cubic interactions in the LQ-Higgs potential, while at the same time avoiding all current experimental constraints, in particular: single Higgs measurements, LQ direct searches and electroweak precision measurements.
In this work we do a full collider study for gluon-fused double h production at the HL-LHC for four benchmark (BM) points of what we dubbed the "scenario with a light LQ" in our previous paper, focusing on the b bγγ channel [10].We consider the main SM backgrounds for our signal and take our benchmarks to satisfy the latest measurements from single Higgs and LQ direct searches.We validate our Monte Carlo (MC) results against our previous numerical studies, finding excellent agreement.Studying the differential cross-section distributions, we find that the presence of the light LQ in the loop can provide a resonant enhancement that manifests itself clearly in the invariant di-Higgs mass m hh distribution but also in the separation between the two photons, ∆R γγ , providing important handles that allow a strong background suppression.This is a unique property of the loop-enhanced models, which could serve as a clear distinction from BSM models that modify Higgs quartic interaction or via modifications from effective operators (EFTs) [34][35][36].
The paper is organized as follows: in section 2 we briefly recapitulate on the LQ model which enhances di-Higgs production via gluon fusion, its experimental constraints, and calculate the total and differential cross sections for the four BM points.In section 3 we discuss how we simulate the BSM signal and SM backgrounds.In section 4 we present our results, showing the power of considering the differential cross-section distributions.Finally, in section 5 we present our conclusions and briefly discuss the possibility of implementing machine learning tools for the analysis.

LQ model for the enhancement of di-Higgs production
We consider a model with the following scalar LQs: R2 ∼ (3, 2) 1/6 , S 1 ∼ ( 3, 1) 1/3 and S1 ∼ ( 3, 1) −2/3 , allowing cubic interactions with the Higgs, that can lead to sizable corrections of dihiggs production at the LHC.The full Lagrangian of the model, as well as its phenomenology, have been discussed in Ref. [33].Below we describe the ingredients relevant for the study in this work.The most general renormalizable potential is where where the sum is over all the scalar fields: ϕ = H, R2 , S 1 , S1, the dots stand for quartic terms that do not involve the Higgs and µ 1,2 are the dimension-full cubic couplings.
The Higgs vacuum expectation value (vev), as well as the cubic coupling µ 2 mix the LQs of equal electric charge.The mass matrices can be diagonalized by unitary transformations [33], such that in the mass basis, the potential reads: with C (q) and Q (q) the cubic and quartic couplings, respectively, i, j = 1, 2 and q = u, d, indicating the electromagnetic charge of the states, u for charge 2/3 and d for −1/3.In Ref. [33] one can find the mass matrices, as well as the matrices of the cubic and quartic interactions, in the gauge basis of Eq. ( 1).
There are two sources for cubic couplings: the coefficients µ i , that although in the gauge basis only couple different LQs, in the mass basis lead also to diagonal components proportional to the mixing angles, and the quartic coefficients λ i with one Higgs vev, that in the mass basis lead to diagonal as well as off-diagonal interactions.The quartic couplings are proportional to λ i , and in the mass basis lead to diagonal and off-diagonal interactions.
For the study of double Higgs production, we have made a random scan of several hundred points and selected four benchmark points (BM) that span a range of masses and couplings of the LQs.The corresponding parameters of the potential of Eq. ( 1) are in the following ranges: m LQ ∈ (0.6, 3.4) TeV, µ 2 ∈ (0.9, 5.7) TeV, λ i ∼ O(1) and λ ′ 1 = 0. 2 With µ 1 = λ ′ 1 = 0 there is no mixing between down-type LQs.Since the di-Higgs production cross-section is mostly driven by the mass and couplings of the lightest LQ, we choose BM points in which the lightest mass ranges from ∼ 400 to 800 GeV and that have sizable couplings.The masses and couplings of the BMs in the mass basis are shown in table 1. 3 The parameters of the potential of Eq. (1) that generate these BMs can be computed analytically by performing the inverse unitary rotations that lead to Eq. (5).
It is worth stressing that none of the selected BM points is excluded by theoretical or experimental constraints related to electroweak precision tests, flavor, or collider physics.In particular, the impact of LQs on oblique corrections, the Zb b coupling, and flavor-changing transitions has been studied in Ref. [33].Among them, it was shown there that the T -parameter imposes restrictive constraints on the model and, as commented above, specifically on the coupling λ ′ 1 .Without overlooking the fact that the effect of this coupling on double Higgs production is mild, the simple choice λ ′ 1 = 0 in all the BM points alleviates these constraints.The LQs, being colored and electrically charged fields, correct to one-loop order the Higgs interactions with gluons and photons.This affects the single Higgs production via gluon fusion and the Higgs decay into a pair of photons.The corrections to these couplings can be studied in terms of the κ formalism [37].We have checked that for the BM points δκ g and δκ γ are compatible at 2σ level with a combined fit performed by ATLAS including analyses with luminosities ranging from 24.5 fb −1 to 139 fb −1 [38], with BM1, BM2 and BM3 being between 1σ and 2σ, whereas BM4 has a very small correction compared with 2 The T parameter requires a moderate value of λ ′ 1 [33], for simplicity we take it vanishing in this work.1: Masses, cubic C (q) and quartic Q (q) couplings of the LQs corresponding to the BM points, in the mass basis Eq. ( 5).
the SM (see also Ref. [39] by CMS for less restrictive limits). 4In our analysis we will also take into account this effect in the di-photon decay of the Higgs.
The lightest LQ in each BM point decays dominantly to dijets.Important constraints arise in this case from the following searches.Reinterpreting the experimental bounds on the mass of stops decaying promptly through R-parity-violating couplings as limits on LQ masses, direct searches of nonresonant pair production of resonances lead to relevant restrictions on the allowed LQ spectrum.In an analysis performed by ATLAS at 36.7 fb −1 [42] stop masses are excluded from 100 GeV to 410 GeV for stop decays into two light quarks and in the ranges (100 − 470) GeV and (480 − 610) GeV for decays into b and a light quark.Using 35.9 fb −1 CMS [43] excludes masses in the range (80 − 520) GeV for final states with two light quarks, and reports the exclusion of values (80 − 270) GeV, (285 − 340) GeV and (400 − 525) GeV for a b and a light quark in the final state.In a search carried out by CMS at 138 fb −1 [44] stop masses are excluded in the range (580 − 770) GeV for stops decaying into a pair of light quarks.In light of these results, the mass of the lightest LQ in BM points 1, 2, and 3, as shown in table 1, seems to be excluded.However, it is worthwhile to remark that in all the referenced searches it is assumed that the stop decays into the selected final state with a branching ratio (BR) equal to 1 and, therefore, any relaxation of this assumption might lead to less restrictive constraints.For the purpose of this study it is important to point out that the LQs interactions with fermions enter as higher order corrections to Higgs productions.In this sense, it is implicitly understood that in the four BM points the LQ-fermion couplings result in a proportion of branching ratios that fulfill the experimental limits.This is not difficult to achieve by assuming, for instance, a BR equal to 0.7 for decays into light quarks and a BR equal to 0.3 for decays into b and a light quark.The next to lightest LQ in each BM point decays dominantly into quarks and leptons.There are tight constraints coming from direct searches of LQs in this case as well.Bounds on LQs decaying predominantly into third generation fermions can be found in Refs.[45][46][47][48], with a bound of ∼ 0.8−1.5 TeV, depending on the BRs.For decays into quarks of the third generation and light leptons, the bounds are ∼ 1 − 1.4 TeV [49], according to different values of the BRs.The most stringent bounds for decays into light fermions are 0.8 − 1.8 TeV considering BRs as low as 0.1 [50].In the case of decays into light quarks and τ Ref. [51] has considered single production that depends on the Yukawa couplings obtaining bounds ∼ 0.6 − 2 TeV.Since these constraints depend on the BRs, we follow the same assumption made above for the lightest LQ decaying into dijets: LQs couplings are such that the branching ratios lead to decays compatible with the experimental limits.To accomplish this restriction is easier than in the case of the lightest LQ since now LQ couplings beyond those to quarks and leptons are present and then more decay channels are available.Finally, as mentioned, these analyses from ATLAS and CMS assume prompt decays.We have analyzed in Ref. [33] the experimental limits on long-lived LQs.We have now proceeded as in that work and assumed that the LQ-fermion couplings are small enough to avoid flavor issues but also sufficiently large to evade long-lived particle searches.
An interesting observation may be the following.At least in recent experimental searches of colored particles, as Ref. [44], the minimum explored mass corresponds to 500 GeV.The reason behind this would lie perhaps in some degree of confidence that lower masses are already excluded.As we discussed above, the exclusion limits are compatible only with the assumption that a given final state has a branching ratio equal to 1. Thus, if future searches focus on large values of masses, it could be possible that the first signal of light LQs appears in single Higgs processes through corrections to κ g and κ γ since LQ contributions to these quantities are potentially large.For instance, we can easily see this in BM1 and BM2 where (δκ g , δκ γ ) is close to the 2σ limit with values (−0.121, 0.045) and (−0.116, 0.043), respectively [38].Although to a lesser extent, BM3 is still above the 1σ level with (δκ g , δκ γ ) = (−0.093,0.029).
As thoroughly discussed in Ref. [33], in the present model there are correlations between single and double Higgs production at hadron colliders, since the same colored particles are involved in the virtual processes.However both production channels are complementary, at least in the following way: on the one hand, single Higgs is sensitive to corrections to the Wilson coefficient of effective operators as hG 2 , but it is difficult to identify the source of those corrections5 ; on the other hand, di-Higgs production is sensitive to the properties of the virtual particles via the kinematical distributions of the di-Higgs system, as we will show in the next sections.Though precise measurements of single Higgs production alone could discard the BM points of the model, the discovery of new physics in single Higgs would require more information to decipher its nature, part of which could be provided by di-Higgs production.Furthermore, there are couplings that mix different LQs that enter in di-Higgs and are absent in single Higgs production.

Di-Higgs production
The di-Higgs production cross-section in the presence of LQs has been computed in Refs.[25,33], in this article we follow the calculation of Ref. [33].The di-Higgs cross section via gluon fusion σ hh is driven by the cubic and quartic couplings of the potential that involve the Higgs field.The cubic couplings enter in the Feynman diagrams of di-higgs production in several ways, see Fig. 1: the triangle is linear in the diagonal components ii , whereas the box is quadratic in C (q) , with C (q) ii entering in diagrams with one LQ running in the loop and C (q) i̸ =j in diagrams with two different LQs in the loop.For quartic couplings only the diagonal components play a role, Q ii , entering in the diagrams with two Higgs fields attached to the same vertex of the loop.We wrote our own numerical program to calculate the differential di-Higgs production crosssection at LO.The software implements LHAPDF6 [53] to evaluate the gluon PDF of the proton at the center-of-mass energy scale of the scattering process.We chose the set PDF4LHC15 nnlo mc [54] and used LoopTools [55] to numerically compute the scalar one-loop integrals involved in the SM and new physics form factors.We show the results in table 2, normalized to the SM one.
In Figs. 2 and 3 we show the LO prediction of the differential cross sections for di-Higgs production as a function of the invariant di-Higgs mass m hh and transverse momentum of one of the Higgses p T h , at the √ s = 14 TeV LHC, for the four chosen BM points in solid black and the SM prediction in dashed orange.Both the BM and the SM predictions were calculated using the same numerical code from our previous work [33].The influence of the light LQ on the differential distribution which shows up as a resonant peak can clearly be appreciated in both distributions (m hh and p T h ) for BM1 and BM2 in Fig. 2, whereas the peak is barely perceptible for BM3 and totally imperceptible for BM4 as can be appreciated in Fig. 3, where the LQ mass is heavier.This resonant peak is a feature of the loop-enhanced models and in the invariant mass distribution it peaks at m hh ∼ 2m LQ , a region in which, for the masses considered, one may expect small SM backgrounds.Notice that the light LQ not only produces the resonance but also affects the peak associated with the top quark running in the loop at m hh ∼ 2m t (similarly for p T h ).It is useful to compare with the SM curve in which the top quark dominates the loop contribution.For BM1, BM2, and to a lesser degree BM3, we see a clear enhancement on the first peak of the distributions when comparing the solid black with the dashed orange curve (reaching roughly two times the SM peak value for BM1 and BM2).This is a consequence of the interference between the top quark (SM) and the LQ (BSM) in the di-Higgs loop production.Another interesting feature related to this is the observation that more Higgs pairs are expected with larger p T than in the pure SM case.Thus these Higgs bosons will tend to be boosted, implying that their decay products are collimated.This important observation will be exploited in the coming sections and shows to be a powerful handle in suppressing SM background.Note that for BM3 only a slight enhancement in the top quark peak is observed while for BM4 we obtain roughly the SM prediction (with a tiny suppression on the peak).

Simulations
To study the di-Higgs generation at the level of final states and reconstructed events after detection, we have simulated the signal and backgrounds with MadGraph5 aMC@NLO (MG5) [56,57] with the PDF set PDF4LHC15 nnlo mc [54], at √ s = 14 TeV.We have considered up to one extra jet with the parton-jet MLM matching scheme [58].For the simulation of the Higgs decays, we have used MadSpin [59].Parton showering and hadronization have been implemented with Pythia 8 [60][61][62], and a fast detector simulation has been carried out with Delphes 3 [63], using the default ATLAS card for the high luminosity LHC.

Signal
To be able to simulate signal events, that require one-loop calculations, we have implemented the model of Eq. ( 1), with LQs, in FeynRules [64,65] at tree level.By making use of FeynArts [66] and NLOCT [67] we have generated the new Lagrangian renormalized at one loop level and exported the corresponding model in UFO format [68] ready for simulations with MG5.We have checked that the simulations with this model correctly reproduce the SM cross-section and distributions by comparison with the implementation of the SM at one loop already included in MG5, namely the loop sm.We have also checked that in the presence of LQs the simulations reproduce the cross-sections and ratios shown in table 2 as well as the distributions described in the previous section.
The UFO of the model is available in https://github.com/manuepele/SM_LQs.git.The external parameters of the model are the physical masses of the LQs and the quartic couplings in the gauge basis, the couplings in the mass basis are internally computed.
Armed with these tools we have simulated the di-Higgs production adding an extra jet with the aim of obtaining more realistic differential distributions of the cross section due to the large  hadronic activity.The parton-jet matching has been performed by using the MLM scheme.In order to normalize the production cross section we consider the SM value of Ref. [12] for m h = 125 GeV, which has been computed at NNLL: σ NNLL = 39.59 fb, and multiply by the values shown in table 2, assuming that most higher order corrections will cancel in the ratio. 6he analysis of angular variables requires keeping spin correlations in the production as well as in the particle decays.The later ones were included in our simulations by making use of MadSpin for the decays of the Higgs to b b and γγ.We have also included the corrections to the BR(h → γγ) by the presence of the LQs, as described in sec. 2 in the context of the κ formalism.The BR(h → b b) has been taken from Ref. [12].

Backgrounds
The backgrounds are given by final states with photons and jets and at detector level one cares about events with two reconstructed photons and two reconstructed b-jets.At parton level, the irreducible background without a Higgs is b bγγ, but b bγj and b bjj with light jets misidentified as photons can also be sizable given the large QCD production cross-section, as well as ccγγ and ccγj with c-jets misidentified as b-jets.Corrections to these backgrounds with one extra final jet are also taken into account.There are in addition backgrounds with a single Higgs, as Zh and t th, for which the two photons reconstruct the Higgs; although these processes have a partonic cross-sections much smaller than other backgrounds, their selection acceptance is much higher [16].
Following the analysis of ATLAS [16], we have generated the following backgrounds, that give the leading contributions7 : the irreducible b bγγ + b bγγj, the reducible processes b bγj + b bγjj and ccγγ + ccγγj, the associated production of the Higgs with a top pair, t th, and with a Z boson, Zh.For b bγj + b bγjj it is required that one of the light jets produced either at parton level or in the showering and hadronization is misidentified as a photon.The selection of such misidentified jet was carried out by considering a j → γ fake rate of 0.5 × 10 −3 and sampling over all the reconstructed jets.For ccγγ + ccγγj, where one of the c-jets must fake a photon, we used the misidentification rate of the default ATLAS HL-LHC card from Delphes, which is roughly of order 10%, although it depends on its transverse momentum and pseudorapidity.For t th, we took into account both the hadronic and semileptonic decay modes for the tops.Finally, for Zh, we included the irreducible contribution from Z → b b and the reducible given by Z → cc.Since the misidentification rate of light jets into b-jets is significantly smaller than that of c-jets , we have not considered this reducible contribution. 8 Analysis and results Following the ATLAS analysis in Ref. [16], we apply the following selection criteria to the events simulated at detector level: • The number of isolated photons with p T > 30 GeV and |η| < 1.37 or 1.52 < |η| < 2.37 is required to be ≥ 2.
• The number of b-tagged jets must be ≥ 2, with the leading and subleading b-jets satisfying |η| < 2.4 and p T > 40 GeV and 30 GeV, respectively.
• Events with more than five jets with |p T | > 30 GeV and |η| < 2.5 are discarded.For this requirement, the number of jets in the event includes those tagged as b-jets with |η| < 2.4.
• The photon pair candidate to reconstruct one of the Higgs bosons must contain photons that are separated from other photons by ∆R γγ > 0.4 and from the jets in the event by ∆R γj > 0.4.In addition, ∆R γγ < 2.0.
• The b-jets in the pair taken as a candidate to reconstruct one of the Higgs bosons must fulfill 0.4 < ∆R bb < 2.0.
• From all the possible candidate pairs of photons and b-jets, those with the closest invariant mass to the Higgs mass are selected.The transverse momenta of the resulting γγ and b b systems are required to satisfy p γγ T , p bb T > 80 GeV.
From now on the above criteria will be denoted collectively as selection cuts.In addition to the cuts listed above, the ATLAS analysis includes the requirements 122 GeV < m γγ < 128 GeV and 100 GeV < m bb < 150 GeV.In order to validate our simulation setup we have computed the significance for the SM obtained with the full set of cuts proposed by ATLAS, obtaining 1σ for an integrated luminosity of 3 ab −1 which is consistent with the value reported in [16].
In Fig. 4 we show the invariant mass distributions m hh , m bb and m γγ obtained after applying the selection cuts.The four signal benchmarks exhibit peaks around the Higgs mass in the m γγ distribution, while in the case of the m bb the peaks are shifted to smaller invariant masses due to the jet energy correction applied to account for differences between the generator and reconstruction levels.The distinctive feature of the LQ model consisting on a second peak around 2 m χ u 2 in the m hh distribution is also found at detector level and after applying the selection cuts.Moreover, by comparison with Fig. 2, we see that the relative proportion of events populating the second peak increases with respect to those in the first peak after imposing the selection cuts due to their bigger impact on events in the region of smaller m hh values.
In Fig. 5 we display the ∆R γγ distribution for the four signal benchmarks.While in the case of BM1 and BM2 the distributions are peaked at small ∆R γγ values, for BM3 and BM4 the peaks are shifted to larger ∆R γγ values.This different behavior can be understood if we separate the contribution of events arising from the two mutually exclusive regions m hh < 780 GeV and > 780 GeV.The corresponding ∆R γγ distributions are shown in Fig. 6, where we see that for all benchmarks the events with m hh > 780 GeV tend to accumulate at small values of ∆R γγ which is expected since for these events the Higgs bosons are more likely to be boosted leading to collimated decay products.Therefore, the shape of the combined distributions in Fig. 5 is driven by the proportion of events in which the Higgs bosons tend to be boosted.In the case of BM1 and BM2 almost 50% of the events lie in the region m hh > 780 GeV, while for BM3 and BM4 only 10% of the events correspond to this region.The small event yield of these BMs at large m hh is a consequence of the light LQ being heavier (m χ u 2 = 621 and 800 GeV, respectively) since in this case the second peak of the distribution is moved to a region where the PDFs are very suppressed.
Instead of carrying out a global analysis based on cuts, in this work, we take advantage of the discriminating power of the ∆R γγ distribution by performing a statistical analysis based on the binned log-likelihood ratio 9 .However, for this analysis to be sensitive it is necessary to improve the signal-to-background ratio first by applying further cuts.In particular, we add the same cut on m γγ as in the ATLAS analysis but modify the cut on m bb by shifting the mass window to smaller values; specifically, we take 80 GeV < m bb < 130 GeV.The fact that a mass window located at smaller m bb values leads to an improvement in the sensitivity was already pointed out in Ref. [19], even though the cut applied in our case is slightly different.The corresponding cut flow is shown in Table 3 for the NP benchmarks and also for the SM along with the main backgrounds.We also include the expected discovery significance computed as [97] where s and b are the number of signal and background events, respectively.As expected, the adding of the invariant mass cuts improves the signal-to-background ratio substantially, even to the point of leading to significance values above the evidence level for BM1 and BM2.Moreover, after applying these cuts the ∆R γγ distribution keeps most of the features found in Fig. 5 for the signal, while for the dominant backgrounds the maximum is now reached at larger ∆R γγ values, which increases the discrimination power of this variable (see Fig. 7).We make use of the whole distribution by applying a statistical analysis based on the test statistics q 0 and q µ [97], which    allows us to determine the expected discovery sensitivity and exclusion limits.The analysis is implemented through the package pyhf [98,99] and the statistical uncertainties are incorporated by modifiers that are paired with constraint terms that limit the rate modification.Specifically, in our analysis, we consider multiplicative modifiers regulated by Poissonian constraint terms.The results for the discovery sensitivity corresponding to the four benchmarks are summarized in Table 4.It is clear that the statistical analysis of the ∆R γγ distribution leads to an improvement of the significance reached by only applying the cuts listed in Table 3.As expected, the increase in the significance is stronger for BM1 and BM2 (∼ 43%) than for BM3 and BM4 (∼ 21% and ∼ 16%, respectively).Finally, for the most promising benchmarks, BM1 and BM2, the evidence level (3σ) is reached for 905 and 1180 fb −1 , respectively.
In principle, this analysis could also be used to set 95 % C.L. exclusion limits; in particular, the luminosities required to exclude the model benchmarks 1 and 2 are 600 fb −1 and 750 fb −1 , respectively, while the benchmarks 3 and 4 would require 3 ab −1 and more.However, even if light LQs scenarios such as BM1 an BM2 are not explored anymore by future direct LQs searches, one would still expect these benchmarks to be excluded earlier by single Higgs channels.
We conclude this section providing an estimation of the impact of systematic uncertainties on the background cross section and the reported discovery significances.By enabling the systematics studies in MadGraph [57], we have estimated the uncertainties arising from variations of the factorization and renormalization scales, the scale for the first emission in MLM matching, the merging scale used by Pythia 8, and the PDF set, as well as from using different dynamical schemes to set the reference scales.For the three main backgrounds driving the results in Table 4, the largest variations in the cross sections are due to the uncertainty in the factorization and renormalization scales and the chosen dynamical scheme, being the impact of the other sources of uncertainty considerably smaller.These uncertainties in the cross section of the backgrounds lead to variations in the expected discovery significances in Table 4: for BM1 the central value 5.31 varies within the range 4.79-6.03,while for BM2, with central value 4.68, the range is 4.21-5.33.We see that in both cases the variations are smaller than 14% and do not change the conclusions regarding the discovery prospects of the most promising benchmarks.BM1 BM2 BM3 BM4 5.31 4.68 2.33 1.62 Table 4: Expected discovery significances obtained with the statistical analysis of the ∆R γγ distribution assuming a luminosity of 3 ab −1 .

Discussions and conclusions
We have studied the di-Higgs production initiated by gluon fusion and its decay to b bγγ at the HL-LHC, under the presence of new physics with color charge, the main motivation being the possibility of enhancing it due to the presence of new physics.As for the new physics states we considered a minimal set of scalar leptoquarks with the same representations under the SM gauge group as one generation of quarks, such that there are cubic and quartic interactions with the Higgs boson.We selected four benchmark points of the parameter space of the model in which the lightest leptoquark has an electric charge 2/3 and a mass between 400 and 800 GeV, decaying to dijets, leading to benchmarks that are allowed by all current experimental constraints.Studying several differential distributions of σ hh , first using the numerical code of [33] by analytic calculation of the scattering amplitude, as well as by Monte Carlo simulations via the implementation of the model at one loop level in MadGraph5, we find that the presence of the light LQ manifests itself as a resonance in some of the differential distributions.We have also simulated the main SM backgrounds for the final states under consideration.In particular, we looked at differential kinematical distributions like p T,h , m hh , m pp and ∆R pp , with p = γ or b.It was found that, after the application of a suitable set of cuts, the distribution ∆R γγ is one of the most sensitive observables for discriminating signal from background, as it is correlated with the presence of the resonant peak associated with the light LQ in the differential distributions.
After applying a set of selection cuts that already improve the signal-to-background ratio, we performed a statistical analysis on the ∆R γγ for the four benchmarks, obtaining significances for L = 3 ab −1 above the discovery level and close to it for BM1 and BM2 (5.31 and 4.68, respectively).Moreover, we found that the evidence level could be achieved for luminosities of 905 and 1180 fb −1 , respectively.Regarding the exclusion prospects, luminosities of 600 and 750 fb −1 would be enough.The benchmarks 3 and 4 are more challenging since in this case there are very few events with boosted Higgs, and these are precisely the events that make the distribution more efficient to discriminate the signal from the backgrounds.The significances reached for BM3 and BM4 are in fact below the evidence level: 2.33 and 1.62, respectively.For these benchmarks, one may wonder if a strategy incorporating machine-learning tools could improve the discovery prospects.
Deep neural networks (DNNs) are a well-known example of these powerful algorithms.DNNs can be trained to perform non-linear global cuts that make use of complex and non-intuitive patterns to improve human performance in classification tasks.As part of an ongoing work, we carried out a preliminary study in which we optimized DNN models from statistically balanced samples of signal and background events tagged by the same five kinematical features considered in the analysis of Section 4: m bb , m γγ , m hh , ∆R bb and ∆R γγ .We reached discovery significances of 2.65 for BM3 and 1.69 for BM4, which means an improvement compared to Table 3.As an example of the impact of potential background uncertainties, we found that the significances decrease to 2.25 and 1.31 when a 15% of systematic uncertainty is considered.These results are below those shown in Table 4 for the binned log-likelihood analysis.Further tests, including additional kinematical variables in the training phase and different DNNs architectures will be presented in a future article.
To conclude, a similar analysis based on kinematical distributions could be applied to other models with loop-enhanced di-Higgs production at the LHC.In particular, it would be very interesting to establish the reach with new colored fermions running in the loop.We leave this study for a future work.

Figure 1 :
Figure 1: Diagrams mediated by scalar leptoquarks that contribute to di-Higgs production.

Figure 4 :
Figure 4: Invariant mass distributions after applying the selection cuts.Upper panels: m hh distribution for BM1 and BM2 (left panel) and BM3 and BM4 (right panel).Lower panels: m bb (left panel) and m γγ (right panel) distributions.In all the cases the dominant backgrounds b bγγ and b bγj are included for reference.

Figure 5 :
Figure 5: ∆R γγ distribution for BM1 and BM2 (left panel) and BM3 and BM4 (right panel) after applying the selection cuts.The dominant backgrounds b bγγ and b bγj are included for reference.

Figure 6 :
Figure 6: ∆R γγ distributions after applying the selection cuts and the cut M hh < 780 GeV (upper panels) and M hh > 780 GeV (lower panels) for BM1 and BM2 (left panels) and BM3 and BM4 (right panels).The dominant backgrounds b bγγ and b bγj are included for reference.

Figure 7 :
Figure 7: ∆R γγ distribution for BM1 and BM2 (left panel) and BM3 and BM4 (right panel) after applying the selection cuts along with the invariant mass cuts 122 GeV < m γγ < 128 GeV and 80 GeV < m bb < 130 GeV .

Table 2 :
1661.96 1.37 0.98 Di-Higgs production cross-section for the BMs, normalized with respect to the SM one at LO.

Table 3 :
Cut-flow table listing all the cuts applied prior to the statistical analysis of the ∆R γγ distribution.The displayed event rates correspond to a luminosity of 3 ab −1 .Significance values computed with Eq. 6 are also shown.