Revisiting the non-resonant Higgs pair production at the HL-LHC

We study the prospects of observing the non-resonant di-Higgs pair production in the Standard Model (SM) at the high luminosity run of the 14 TeV LHC (HL-LHC), upon combining multiple final states chosen on the basis of their yield and cleanliness. In particular, we consider the $b\bar{b}\gamma \gamma, b\bar{b} \tau^+ \tau^-, b\bar{b} WW^*, WW^*\gamma \gamma$ and $4W$ channels mostly focusing on final states with photons and/or leptons and study 11 final states. We employ multivariate analyses to optimise the discrimination between signal and backgrounds and find it performing better than simple cut-based analyses. The various differential distributions for the Higgs pair production have non-trivial dependencies on the Higgs self-coupling ($\lambda_{hhh}$). We thus explore the implications of varying $\lambda_{hhh}$ for the most sensitive search channel for the double Higgs production, \textit{viz.}, $b\bar{b}\gamma\gamma$. The number of signal events originating from SM di-Higgs production in each final state is small and for this reason measurement of differential distributions may not be possible. Furthermore, we consider various physics beyond the standard model scenarios to quantify the effects of contamination while trying to measure the SM di-Higgs signals in detail. In particular, we study generic resonant heavy Higgs decays to a pair of SM-like Higgs bosons or to a pair of top quarks, heavy pseudoscalar decaying to an SM-like Higgs and a $Z$-boson, charged Higgs production in association with a top and a bottom quark and also various well-motivated supersymmetric channels. We set limits on the cross-sections for the aforementioned new physics scenarios, above which these can be seen as excesses over the SM background and affect the measurement of Higgs quartic coupling. We also discuss the correlations among various channels which can be useful to identify the new physics model.


Introduction
The existence of a scalar boson with a mass around 125 GeV has been unambiguously confirmed by both the ATLAS and CMS collaborations at the Large Hadron Collider (LHC). It is, however, still early to conclude whether this discovered scalar is the Higgs boson as conjectured in the Standard Model of particle physics (SM) 1 . Therefore, it is of paramount importance to precisely measure its couplings to the various SM particles, its width, spin and CP properties. As already seen from the Run I data and gradually being reiterated by the Run II data, the Higgs couplings to the SM electroweak gauge bosons are in excellent agreement with the SM expectations [1][2][3][4][5][6][7][8]. The Yukawa couplings to the first two generation fermions are extremely difficult to measure owing to their smallness [9]. However, the couplings to the third generation quarks and lepton are gradually gaining in significance [10][11][12][13][14][15][16]. The only other measurable coupling (the first generation Yukawa couplings being extremely small, is considerably challenging to getting measured in the near future) which also describes the scalar potential of the theory is the elusive Higgs self-coupling (λ hhh ). The focus of the present work is to study in considerable details, various possible final states of the Higgs pair production and to study the effects of contamination due to the presence of several new physics effects. The only direct probe to this coupling is via a pair production of Higgs bosons which further decay to various SM final states. However, it has been shown in Refs. [17][18][19][20][21][22][23] that an indirect measurement of the Higgs trilinear coupling is possible through radiative corrections of single Higgs processes both at the HL-LHC and at future e + e − colliders. Ref. [22] has shown that this coupling can be constrained in the range of [0.1,2.3] times that of its SM expectation at 68% confidence level. It has also been shown in Ref. [24] that it is possible to constrain λ hhh from the electroweak oblique parameters. The triumph of the experiments in having already probed most of the standard Higgs couplings, urges the community to constrain the self-coupling in a plethora of channels. Such measurements have received considerable attention in recent times both from theoretical and experimental communities [21,. However, a precise direct measurement of the self-coupling is extremely challenging at the LHC because the SM production cross-section is small even at √ s = 14 TeV. The dominant di-Higgs production process proceeds through top quark loop diagrams in the gluon fusion channel. An interesting aspect of this process lies in the fact that there is a fine cancellation owing to a destructive interference between the box and the triangle diagrams. This results in an extremely small cross section, viz., 39.56 +7.32% −8.38% fb at the NNLO+NNLL level [68][69][70] (with full top mass effects at NLO [71]) for the 14 TeV run of the LHC. However, various decay channels of the Higgs provide phenomenologically rich final states and appropriate combinations might help in improving the discovery potential at the high luminosity run of the LHC (HL-LHC), provided we identify optimised sets of selection cuts to reduce backgrounds (B) and improve the signal (S) over background ratio (S/B) and the statistical significance (S/ √ B). Searches for both resonant and non-resonant Higgs pair production have been performed in various channels by both the ATLAS and CMS experiments [72][73][74][75][76][77][78][79][80][81][82]. At present, one of the strongest bounds on the non-resonant Higgs pair production comes from the 4b search performed by ATLAS [73] with an integrated luminosity of 13.3 fb −1 , putting an upper bound of 29 times that of the SM expectation. Very recently, the bbτ τ search by CMS [79,83] has put a strong observed limit at 30 times the SM number, with an integrated luminosity of 35.9 fb −1 . The strongest (second strongest) constraint, at 13 (19.2) times that of the SM expectation, comes from the bbbb (bbγγ) search by ATLAS [84] (CMS [85]) with an integrated luminosity of 36.1 (35.9) fb −1 . As for the resonant searches, at present, the strongest limits are obtained from the hh → bbγγ [85], hh → bbbb [84,86] and bbτ + τ − [83] modes, competing in the mass range [∼ 250 GeV; 3 TeV]. However, the bbW W * channel is also predicted to be a competitive probe in the future runs of the LHC [87,88].
The di-Higgs production rate can be enhanced in various beyond the Standard Model (BSM) scenarios. Some such new physics scenarios involve new heavy coloured states propagating in both the box and triangle loops, e.g., supersymmetric and extra-dimensional theories, theories with heavy resonance(s) decaying into a pair of SM-like Higgs, viz., a multitude of models with an extended Higgs sector, strongly interacting theories, composite Higgs models and also various effective field theories (EFTs) modifying the tth coupling [21,. Since the Higgs discovery, many of the models exhibiting new coloured states, have been severely constrained owing to the near-precise measurements in the single Higgs channels. Many of these extensions are responsible not only for an enhancement in the di-Higgs production cross-section, but also for certain distinct kinematic distributions, often having minimal overlap with their SM counterparts. We must, however, remember that even the enhanced cross-sections might not be entirely sufficient to obtain an adequate significance because large SM backgrounds, primarily ensuing from tt, ZZ, ZH, pure QCD and also fakes, may swamp the signal completely. In this regard, modified kinematics, especially the presence of resonances might be somewhat helpful. In the quest to reduce backgrounds to the best of one's abilities, one has to envision a combination of optimal final states. In addition, for each such final state, one has to identify the most suited set of selection cuts in order to enhance signal-to-background ratio. A thorough literature survey points us to studies which show that the trilinear coupling can be best probed when studied in multiple channels with a combination of the numerous final states of the Higgses. These final states are chosen owing to the largeness of the Higgs branching ratios and their cleanliness with respect to the backgrounds. A more inclusive search procedure takes a closer look into various kinematic regions of di-Higgs processes. In particular, studies utilising variables reconstructed from boosted objects, jet substructure techniques, stransverse mass (m T 2 ) and other novel variables, are also shown to have potential importance in the future runs of the LHC [89,90]. Multivariate analyses also turn out to be very efficient in segregating the signal from the backgrounds, thus offering encouraging results [91][92][93]. Nevertheless, an exhaustive study in the di-Higgs sector, involving detector simulations and also alongside an inclusion of the effects of new physics effects (as we shall discuss below) on such measurements, is by and large missing from the literature, since some of the aforementioned studies claiming very optimistic results have been performed at the parton level or with minimal detector effects. Hence, one of the primary goals in this work is to optimise the di-Higgs search strategy by systematically studying a number of final states taking into account detector effects and conservative systematic uncertainties.
In the first part of our study, we focus on the non-resonant di-Higgs production in the familiar bbγγ, bbτ + τ − and bbW W * channels and try to estimate the statistical significances at the HL-LHC. Being mostly agnostic to the previous studies, we try to identify the sets of optimised cuts which show the greatest sensitivities in these channels. The bbγγ and bbW W * have been shown to be the most promising channels in this regard [88,94,95]. The bbτ + τ − channel, however, suffers from large tt backgrounds. The reconstruction of τ s, which is always accompanied by missing transverse energy ( / E T ), is a complicated process at the colliders and involves identifying optimal τ -tagging and mistagging efficiencies. However, improvements in the reconstruction of invariant mass of the di-tau system using the missing mass algorithm [96], dynamical likelihood techniques [97] or the modified m T 2 algorithm [90,98] may provide encouraging results in this channel. Before performing these studies, we stress that the analyses involving these channels are not novel and hence we will be more cautious in our claims. CMS predicts final significances of 1.6σ, 0.39σ, 0.45σ and 0.39σ respectively in the bbγγ, bbτ + τ − , bbV V * and bbbb channels for the non-resonant di-Higgs production, at the end of HL-LHC run with an integrated luminosity of 3 ab −1 [99]. ATLAS on the other hand predicts their best-case significance at 1.05σ for the bbγγ non-resonant channel at the HL-LHC [100]. Moreover, for the bbW W * channel, we study both the semi-leptonic and di-leptonic modes. Besides, we look into the γγW W * channel with both the semi-leptonic and di-leptonic final states. Finally, we also look for the 4W channel in the same-sign di-lepton (SS2 ), tri-lepton (3 ) and four lepton (4 ) final states. We compare the numbers obtained from the experimental projections with our study by including detailed detector effects and conservative background systematics.
In this work, we will not concern ourselves with dedicated analyses for resonant di-Higgs searches. Neither will we focus on scenarios where the rescaling or the modification in the tth Yukawa coupling (y t )may alter the nature of interference between the triangle and box diagrams. However, we will briefly discuss the case where one can have λ hhh different from the SM expectations. These, in principle, can have drastic ramifications in the production cross-sections as well as the kinematics of the di-Higgs system. New physics contributions may also show up in the BR(h → XX), modifying the total rate. These will be considered as a separate future study. In the present work, we will however consider various BSM signatures which have the potential to contaminate the non-resonant SM di-Higgs production and affect the measurement of λ hhh . Observing any significant difference in the number of events for a particular channel, with respect to its SM expectation, may be interpreted as a modification in the value of λ hhh . This is one of the main aims of this present work. We want to quantify the degree to which we can discard such contamination after having established a robust set of cuts which optimises the SM signal. We will be using multivariate analyses for this purpose. We classify these contaminating scenarios into three broad categories, viz., hh(+X), h + X and X, where X denotes an object or a group of objects not coming from an SM Higgs decay. The hh(+X) mode is one of the most studied scenarios. Di-Higgs production from the decays of heavy scalar particles is the classic case considered in the literature [58,[101][102][103]. A heavy scalar particle arises naturally in many extensions of the SM, for instance, in the minimal supersymmetric standard model (MSSM) or in further extended scenarios [49,60,104], general two-Higgs doublet models (2HDMs) [29,30], extra-dimensional models [61], models with an extra U (1) gauge group [62][63][64], to name a few. In the present work, we do not focus on any particu-lar model and consider a generic heavy resonance decaying to a pair of SM Higgses which further decay to various final states. We vary the mass of the heavy resonance but do not optimise the selection cuts for each benchmark and keep them fixed at the optimisation obtained for the corresponding SM non-resonant Higgs pair production channel. Delving a bit more into well-motivated models, we consider certain different channels in the MSSM from which we can obtain a pair of SM-like Higgs bosons. For generic supersymmetric (SUSY) scenarios, we will encounter high effective masses (m eff ) and high missing transverse momentum ( / E T ). This will lead to a minimal or no overlap of kinematic variables with their SM di-Higgs counterparts. For a degenerate SUSY spectrum, however, we will obtain low m eff and low / E T and this may potentially contaminate several di-Higgs final states. The hh(+X) state may come from a squark pair production, i.e., pp →q iqj → q i q j + hh + χ 0 1 χ 0 1 , whereq i refers to squarks (anti-squarks), q i refers to quarks (anti-quarks) with i being the flavour index and χ 0 1 to the lightest supersymmetric particle (LSP), here the lightest neutralino. Thus, we obtain a hh + jets + / E T state which has the potential to contaminate the SM di-Higgs signal unless specific cuts are designed to subdue its effect. For the second category, we consider a mono-Higgs production in association with other objects and this can specifically mimic some of the Higgs pair production final states. We consider few such scenarios, viz., A → Zh, i.e., a pseudoscalar decaying to the Z boson along with the SM-like Higgs: this scenario is particularly interesting in the MSSM and also in classes of generic 2HDMs. We will encounter the bbγγ, bbW W * , bbτ + τ − final states from this channel. Besides, we will even have some contamination to the SS2 , 3 and 4 final states. Furthermore, an electroweakino pair production may also exhibit a mono-Higgs final state with a significant rate. Processes like pp → χ 0 2 χ ± 1 → hW ± + χ 0 1 χ 0 1 , where the lightest chargino and the second-lightest neutralino are wino-like can contribute significantly. For such a scenario, BR(χ 0 2 → hχ 0 1 ) can be dominant and BR(χ ± 1 → W ± + χ 0 1 ) is close to unity. From such channels, we can have possible contaminations to the semi-leptonic bbW + W − , γγW + W − and bbτ + τ − channels and also to the SS2 and 3 modes. The final category of BSM scenarios having potential contaminating effects to the SM di-Higgs production are processes with no SM-like Higgs bosons. In this paper, we study three such examples. We may have the production of a pair of top quarks emanating from a heavy (pseudo-)scalar resonance, displaying prowess for resonant masses above the tt threshold. Besides, in various classes of models we have an associated production of a charged Higgs boson with a top and a bottom quark (H ± tb). For m H ± > m t , we have the tbtb production. Another potential contamination can come from the stop-anti-stop (t it * i , where i = 1, 2 ) pair production which can lead to the tt + / E T or the bW bW + / E T final states. All the above three channels can mimic the hh → bbW W * and bbτ + τ − modes. In the following, we make an attempt to study these contamination effects as functions of the neutral/charged heavy Higgs masses for certain well-chosen benchmark points. In the following sections, we will see the importance of multivariate analyses in discriminating the SM di-Higgs signal from the SM backgrounds and later also from possible new physics contaminations. Hence, the backbone of the analyses techniques used in this work, are the boosted-decision tree (BDT) algorithms.
Having described the various aspects studied in this work, we dissect our paper into the following sections. In section 2, we study the SM non-resonant di-Higgs final states in considerable details and present the reach of the HL-LHC in observing various channels. We discuss the variation of the Higgs self-coupling and the effects one obtains on the signal sensitivity, in section 3. In section 4, we consider the contamination effects ensuing from the aforementioned three categories with the help of benchmark points. Finally, in section 5, we summarise our results, conclude and present a future outlook for the vast field of di-Higgs searches.

Non-resonant di-Higgs production
As discussed in the introduction, the objective of this present work is two-fold, viz., estimating the observability of SM di-Higgs production in multifarious channels at the HL-LHC and also to decipher the contamination to such SM processes from various new physics scenarios as we will discuss at length in section 4. In this section, we will focus on several possible final states of the SM Higgs pair production. Our guiding principles in choosing these final states are cleanliness and substantial production rates. Hence, we choose states containing either photons or leptons (e, µ and τ ) or both. Thus, we consider the bbγγ, bbτ + τ − , bbW W * , W W * γγ and 4W channels for the present work. We do not consider the 4τ , W W * τ + τ − , ZZ * τ + τ − , 4γ, ZZ * γγ and 4Z states on account of their negligible rates. We must mention however that some of these neglected channels at the 14 TeV study may have important ramifications for 100 TeV collider studies [105]. At this point, it is important to mention that we closely follow the ATLAS and CMS analyses whenever available. For channels where we are unable to find such studies, we optimise the cuts to maximise the significance.
As we have emphasised in the introduction, the gluon fusion mode prevails as the dominant contribution to the SM di-Higgs production when compared with the remaining modes, such as vector boson fusion, associated production with a vector boson [106], or double Higgs production in association with a pair of top quarks [94]. Hence, for the present study, we concern ourselves only with the former production mode. On the simulation front, we generate the di-Higgs signal samples at leading order (LO) upon using MG5 aMC@NLO [107]. To attain the final states discussed above, we decay these samples with Pythia-6 [108,109]. We generate the background event samples also at LO using MG5 aMC@NLO 2 . Unless the decays are done at the MG5 aMC@NLO level, we decay these with Pythia-6. The generation level cuts for the various processes are listed in Appendix A. For all our simulations, the NN23LO parton distribution function (PDF) [112] has been employed. Also for all our sample generations, we use the default factorisation and renormalisation scales as defined in MG5 aMC@NLO [113]. Next, we shower and hadronise the signal and background samples with Pythia-6. Following this, the final state jets are reconstructed with the anti-kT [114] algorithm with a minimum p T of 20 GeV and a jet parameter of R = 0.4 in the FastJet [115] framework. In order to simulate detector effects, we use Delphes-3.4.1 [116]. Unless otherwise stated, we demand the electrons, muons and photons to be isolated as follows: the total energy activity within a cone of ∆R = 0.5 around each such object, is required to be smaller than 12%, 25% and 12% respectively of its p T . Besides, we consider the default identification efficiencies of the electrons, muons and photons as specified in the ATLAS detector card in Delphes-3.4.1. For channels with b-jets as final state objects, we consider a flat b-tagging efficiency of 70% [100]. We also consider flat j → b and c → b mistag rates of 1% and 30% respectively. Here we would also like to clarify that whenever in the following sub-sections, we mention a lepton ( ) as a final state, we always refer to an electron or a muon.
In almost all the channels which follow, we perform a cut-based analysis (whenever an equivalent analysis has been performed by CMS or ATLAS) for the signal optimisation. For these channels and also for the rest where we do not perform a cut-based analysis, we perform a multivariate analysis in order to capture the full machinery of an optimised search. For such studies, we choose numerous discriminatory variables, depending on the analysis and use the TMVA framework [117] to discriminate between the signal and background samples. For the following analyses, we use the decorrelated boosted decision tree (BDTD) algorithm. We must admit here that, it is possible to have a further improved algorithm but here we stick to a standard discriminator. In all cases, we train the signal and background samples, carefully avoiding overtraining of the samples at each step. For this purpose, we demand that the Kolmogorov-Smirnov test results are always greater than 0.1. It is, however, mentioned in Ref. [118] that a non-oscillatory critical test value of 0.01 may also suffice as a test for overtraining. We systematically modulate the BDT optimisation procedure with sufficiently large number of signal and background samples and always ensure a KS test value greater than 0.1 for both signal and background.
With this machinery in hand, we outline and detail the prospects of the nonresonant di-Higgs process in various final states in the following sections. We also note that all our generated samples are at a centre of mass energy of 14 TeV and the final analyses are performed for an integrated luminosity of 3 ab −1 .

The bbγγ channel
Having set the stage, we begin by studying one of the most promising non-resonant di-Higgs search channels at the HL-LHC, viz., the bbγγ final state. Even though this channel is somewhat at a disadvantage from the point of view of the total rate, because of the extremely small branching ratio of h → γγ, the cleanliness of this channel makes way for an adequate compensation, as we will gather at the end of this section. Numerous studies in the literature [43,91,94,100,119,120] have attempted to constrain the Higgs self-coupling (λ) by focusing on this particular final state. In performing this study, we closely follow the analysis presented in Ref. [100].
The most dominant background stems from the QCD-QED bbγγ process. We generate this background upon merging with an additional jet by employing the MLM merging scheme [121]. We must also mention here that the pure QED contribution (not involving the Higgs) to bbγγ is O(1%) that of its QCD-QED counterpart. Other significant backgrounds arise from the associated production of the Higgs with a pair of bottom (top) quarks, bbh (tth) and the associated production of Higgs with a Z-boson (Zh). In addition to these backgrounds, contributions also arise from numerous fakes, having event yields comparable to the QCD-QED bbγγ process. Although, the list of such relevant fake backgrounds is exhaustive, viz., ccγγ, jjγγ, bbjj, bbjγ and ccjγ, it is considerably difficult to simulate them. Thus, for the ccγγ and jjγγ channels, which bear a similar topology to the QCD-QED bbγγ process, we estimate the fake event yields upon employing a simple scaling: N ccγγ (jjγγ) = (N ccγγ (jjγγ) ATLAS /N bbγγ ATLAS ) · N bbγγ , where the subscript ATLAS denotes the event yields as listed in Ref. [100], while, N bbγγ is our simulated estimation. In an analogous manner, we simulate the bbjγ and bbjj backgrounds and scale N ccjγ = (N ccjγ ATLAS /N bbjγ ATLAS ) · N bbjγ . Following Ref. [100], we consider a j → γ fake probability of ∼ 0.1%. Also, at this point, we would like to mention that the fake rates are p T /η dependent functions and for precise analyses, these must be dealt with more care.
Upon generating the samples, for every event we require exactly two b-tagged jets and two photons in the final state. The leading (sub-leading) b-jet is required to have p T,b 1 (b 2 ) > 40 (30) GeV and must lie within a pseudo-rapidity range of |η b 1 ,b 2 | < 2.4. The two photons are required to have p T,γ > 30 GeV and are required to lie within |η γ | < 1.37 (barrel) or 1.52 < |η γ | < 2.37 (endcap). Additionally, we also veto events having one or more isolated leptons with p T > 25 GeV and within |η| < 2.5. The following selection cuts are implemented and are also tabulated in Table 1. We demand that the jet multiplicity, N j must be less than 6 in order to reduce the large tth background when either or both the top-quarks decay hadronically via the decays of the W -bosons. We also find that the ∆R cuts are highly effective in tackling the QCD-QED bbγγ background. Here, ∆R ab refers to the distance between the final state particles a and b in the η-φ plane. In addition, we also impose an upper and lower limits on the invariant masses of the two b-jets (100 GeV < m bb < 150 GeV) and the two photons (122 GeV < m γγ < 128 GeV), which impressively reduces the QCD-QED bbγγ background and sufficiently affects all the other backgrounds as well. Lastly, we also impose a lower bound on the transverse momenta of the b-jet pair (p T,bb > 80 GeV) and the transverse momenta of the di-photon pair (p T,γγ > 80 GeV).
We tabulate the signal and background yields for each selection cut in Table 2. We also quote the statistical significance S/ √ B, where S represents the signal yield and B refers to the sum of all relevant backgrounds. Upon applying all the aforementioned cuts, we obtain a final significance of 1.46, assuming zero systematic uncertainty. Because this first part of our paper somewhat serves as a validation of the studies performed by the ATLAS and CMS collaborations, we would like to confirm that our statistical significance is consistent with the results obtained by ATLAS [100].
Before moving on to discussing the multivariate analyses, we slightly digress in discussing the effects of certain possible cuts in improving the significance when compared to the one we derived just above. One of the largest background yields even after imposing all the aforementioned cuts is tth. However, it is interesting to note that this channel is associated with missing transverse energy even at the parton level when at least one of the W -bosons decays leptonically. Our signal, on the other hand, other than / E T emanating from experimental noise, does not have Selection cuts N j < 6 0.4 < ∆R γγ < 2.0, 0.4 < ∆R bb < 2.0, ∆R γb > 0. 4 100 GeV < m bb < 150 GeV 122 GeV < m γγ < 128 GeV p T,bb > 80 GeV, p T,γγ > 80 GeV  any missing energy. Hence, we demand an upper limit of / E T < 50 GeV and show in Table 3 (a) that the tth background reduces to almost half its previous value. The bbγγ and Fake 1 backgrounds also incur modest reductions. The signal on the other hand reduces marginally. This improves the S/B from 0.17 to 0.19. Accordingly, the signal significance with zero systematics, acquires a slight increase at 1.51.
On a slightly different note, the ATLAS analysis [100] that we follow has considered jet energy corrections, to account for the parton radiation sourced from outside the jet cone. This results in the invariant mass distribution of the bb pair coming from the Higgs boson to peak at a value less than that of the Higgs mass. In the present study, we have however, only implemented the default jet energy correction considered in Delphes. As a result, we attempt to study the consequence of modifying the range of the selection cut on m bb to 90 GeV < m bb < 130 GeV. We present the new results in Table. 3 (b). This modified selection cut results in an increase in the Zh background but the signal also receives a relatively large increase, resulting in an S/B of 0.19 and a significance of 1.64. We left these last two modified cuts at the discussion level as issues concerning both / E T and jet-energy correction are primarily experimental and it is non-trivial to predict if our modified cuts can be incorporated seamlessly in an experimental setup. (  In the last leg of this subsection, we perform a multivariate analysis of the bbγγ final state by utilising the BDT algorithm in an attempt to isolate the signal and backgrounds more efficiently and improve upon the signal significance. The BDT optimisation procedure is performed upon using the following kinematic variables: where the numerical subscripts signify the p T ordering of an object with the subscript 1 corresponding to the hardest object. In the course of training the BDT, the kinematic variables m bb , p T,γγ , ∆R b 1 γ 1 and ∆R bb showed the maximal prowess in discriminating the signal from the background. We present the normalised distributions of these variables for the signal and the dominant backgrounds in Fig. 1 after the basic selection cuts. The corresponding signal and background yields along with the final significance are tabulated in Table 4. We observe that the multivariate analysis features a ∼ 20% improvement in the significance (S/ √ B = 1.76) over its cut-based counterpart.   Table 4: Signal and background yields after the BDT analysis along with the significance.

The bbτ τ channel
Having studied the cleanest di-Higgs channel, we now turn our focus towards the channel which at present imposes one of the stronger limits on the di-Higgs crosssection. The bbτ + τ − channel has a considerably larger rate compared to bbγγ and has the advantage of three different final states as we shall discuss in details below. The τ -lepton can decay either leptonically with a ∼ 34% branching ratio or hadronically.
This yields us rich final states, viz., bb , bb j and bbjj, all accompanied with / E T . The jets are formed from the hadronic τ -decays and we will tag them in order to discriminate more from the backgrounds.
The major backgrounds for these channels stem from the fully hadronic, semileptonic and fully leptonic decays of pair produced tt. The QCD-QED background, gg → bbZ ( * ) /γ * → bbτ + τ − is also substantial. As we will see, demanding a large invariant mass in the τ + τ − system, eradicates the γ * contribution almost completely. Other backgrounds include bbh, Zh, ttW , ttZ and tth. Besides, we also have the bbjj background, with jets faking hadronic τ s. In context of the Zh channel, we once decay the Z-boson to a pair of bottom quarks while forcing the Higgs to decay to a pair of τ -leptons and then interchange these decay modes in order to have all possible bbτ + τ − final states. The cross-sections of the backgrounds are large and hence in order to improve statistics in our final analyses, we generate the samples with hard generation level cuts (see Appendix A). We neglect W (→ τ ν) + jets, W h, W Z, h → ZZ * and single top production owing to their very small production rate.
On the one hand the tt backgrounds are significantly large when compared to the small signal rate. However, boosted techniques and several kinematic variables do provide us some handle over the situation [89]. On the other hand, reconstruction of invariant mass of the τ -pair is a delicate issue at the LHC since it is always accompanied by missing transverse energy. Several m τ τ reconstruction techniques have been discussed in the literature [96,122,123] and extensively used in various previous analyses. In this work, we will considerably focus on the collinear mass approximation technique [96]. This approximation is based on two important assumptions, viz., the visible decay products of a τ lepton along with the neutrinos coming from it are all nearly collinear (i.e., θ vis = θ ν and φ vis = φ ν ) and the total missing energy in the event is solely due to these neutrinos. Upon utilising these two assumptions, the x-and the y-components of / E T can be easily expressed in terms of the momenta of the neutrinos. Solving this, one obtains the individual momentum of each neutrino. The above method has a drawback because only in the cases where the τ τ system is boosted against a hard object (examples being energetic jet, boosted objects), do we recover a reasonable mass. In our present scenario, the τ τ system (h → τ τ ) is boosted against the other Higgs which decays to a pair of b-quarks. The reason for this drawback is that this technique is extremely sensitive to the / E T resolution and may overestimate the reconstructed mass, M τ τ . Another drawback of this assumption is that, the solutions of the / E T equation diverge when the visible τ decay products are produced back to back in the transverse plane. We discuss another τ τ reconstruction technique, viz., the Higgs-bound technique [124,125], in Appendix B.
We are aware of the fact that the ATLAS [96] 6 and CMS [97] collaborations use different algorithms to reconstruct a resonance decaying into a pair of τ -leptons.
In the following sub-subsections, we present the analyses with sets of optimised cuts aimed for the HL-LHC. For the major part, we closely follow the predicted performance of an upgraded ATLAS detector [126] to model the detector effects and tagging efficiencies. For this part of the study, we use a different isolation criteria for the leptons (e, µ) upon following this ATLAS reference [127]. We demand the total energy activity around the lepton and within a cone of radius ∆R = 0.2 must be less than 10 GeV. Following Ref. [126], we fix the medium-level τ selection efficiencies for candidates with p T > 20 GeV and |η| < 2.3 at 55% and 50% respectively for the one-pronged and three-pronged τ candidates. We also allow for QCD-jets faking τ -jets with mistag rates of 5% and 2% respectively for one and three tracks passing the medium level τ identification.
We dissect the analysis into three independent parts corresponding to the decay mode of the τ -lepton, viz., the bbτ h τ h , bbτ h τ and bbτ τ final states, where the subscript h( ) denotes the hadronic (leptonic) decay mode of the τ . For the following three sub-analyses, we demand some common sets of cuts. We select events with exactly two reconstructed b-tagged jets with a minimum p T requirement of 40 (30) GeV for the leading (subleading) jet. We also require these b-tagged jets to be within a pseudorapidity range of |η| < 2.5. We require m bb > 50 GeV in order to bring the signal and backgrounds on the same footing because the backgrounds have been generated with this cut at the generation level. In case of the Higgs decaying to τ pair, we take ∆R bτ > 0.4, ∆R τ τ > 0.4 and m vis τ τ > 30 GeV, which signifies the minimum invariant mass on the visible products from the τ -pair. We also apply a common set of selection cuts as follows: In addition to the aforementioned common cuts, we require exactly two τ -tagged jets having a minimum p T of 30 GeV and a maximal pseudorapidity range of |η| < 2.5. In each of these sub-analyses, we first consider the variable m vis τ τ , constructed out of the visible τ objects and afterwards we consider the collinear mass variable, M τ τ .
For the first case, we further optimise p T,bb , m T 2 and m vis τ τ in order to have the best possible signal over background ratio.
Upon performing the optimised cut-based analysis, we obtain a final significance of 0.44 for the HL-LHC. The cut-flow and the final significance are tabulated in Table 5. In contrast to the bbγγ channel, the S/B ratio here is ∼ 0.67% and hence one needs data-driven background techniques and a drastic reduction in systematic uncertainties in order for this channel to be relevant in the future.  Next, we use the collinear approximation technique, discussed above, to reconstruct the invariant mass of the Higgs decaying to a pair of τ leptons. To overcome the limitations as discussed above, we select events by putting an additional cut, ∆φ τ τ < 3.0 radian. For the BDT analysis, we impose an upper cut on the collinear mass, M τ τ < 200 GeV. The cut-flow and the statistical significance are tabulated in Table 6 with the following optimised cuts on top of the other variables. We obtain a significance of 0.65, which shows a small improvement over the previous analysis with the m vis τ τ variable.   In order to be certain if our optimised cuts can be improved further, we employ a multivariate analysis using the BDT algorithm after the basic selection cuts. We train our signal and background samples with the following 12 kinematic variables 7 "Others" include tth, ttW and ttZ. 8 "Others" include tth, ttW and ttZ.
for the case with the m vis τ h τ h variable: For the other case, with the M τ τ variable, we train our signal and background samples with the following 9 kinematic variables: where the symbols have their usual meaning. ∆φ ab is the azimuthal angle separation for the ab system. M τ h τ h is the collinear mass of Higgs from hadronic τ decays. The signal and background yields after this multivariate analysis are shown in Table 7.
The normalised distributions of the four best discriminating kinematic variables, viz., M τ h τ h , m T 2 , m bb and p T,bb are shown in Fig. 2. We find that the S/B ratio increases slightly and we also have a non-negligible increase in the significance at 0.74, assuming zero systematic uncertainty.  Table 7: Signal, background yields and final significance for the bbτ h τ h channel after the BDT analysis with (a) m vis τ τ (b) M τ τ variable.

The bbτ h τ channel
In the present instalment, we choose events containing exactly one isolated lepton and one reconstructed τ -tagged jet over and above the common requirements. We also require the isolated lepton to have a p T > 20 GeV and |η| < 2.5. The additional optimised selection cuts for this present mode, involving the m vis τ τ , are: After imposing the various cuts, we obtain a signal significance of 0.26 for the HL-LHC. The event yields along with the significance are shown in Table 8.  Table 8: Same as in Table 5 for the bbτ h τ mode. The various orders of the signal and backgrounds are same as in Table 5.
We get the following optimised cuts upon the other variables with M τ τ variable. The event yields at HL-LHC are shown in Table 9 with a significance of 0.44 Here also we perform a BDT analysis to see its potential. We choose the following 13 kinematic variables to train our signal and background event samples with the m vis τ h τ l variable:  Table 9: Same as in Table 5 for the bbτ h τ mode with collinear mass variable. The various orders of the signal and backgrounds are same as in Table 5.
Furthermore, we consider the following 9 kinematic variables to train our signal and background event samples while having the M τ h τ l variable: We ensure a proper training of the event samples. In Table 10, the signal, background yields and the significance after the multivariate analysis, are presented. The normalised distribution of the four maximal discriminating kinematic variables, viz., M τ h τ l , m T 2 , m bb and p T,bb are shown in Fig. 3. Upon imposing a suitable cut on the BDT variable, we find that the zero-systematics significance is 0. 49 Table 10: Same as in Table 7 for the bbτ h τ mode with (a) m vis τ τ (b) M τ τ variable.

The bbτ τ channel
The last segment of the bbτ + τ − channel consists of two leptonically decaying τ s. We demand events containing exactly two oppositely charged isolated leptons with p T > 20 GeV, over and above the requirements stated above. We impose the following optimised cuts on top of the other variables for the scenario where we consider the invariant mass from the visible products of the τ -leptons.
A final signal significance, S/ √ B, of 0.044 is obtained, upon assuming zero systematic uncertainties. We show the event yields and the significance in Table 11 Table 11: Same as in Table 5 for the bbτ τ mode. The various orders of the signal and backgrounds are same as in Table 5.
For the second category involving the collinear mass variable, we choose the following optimised cuts on top of the other variables. The results are tabulated in Table 12.  Table 12: Same as in Table 5 for the bbτ τ mode with collinear mass variable. The various orders of the signal and backgrounds are same as in Table 5.
In an analogous manner to the previous two cases, we perform a multivariate analysis with the following 11 kinematic variables for the first case: p T,bb , m bb , ∆R bb , m vis τ τ , ∆φ τ τ , ∆φ τ 1 / E T , ∆φ τ 2 / E T , m vis hh , p vis T,hh , ∆R b 1 τ 2 , m T 2 Figure 4: Normalised distributions of M τ l τ l , m T 2 , ∆φ τ 1 / E T and p T,bb for the signal and dominant backgrounds in bbτ τ channel before applying basic selection cuts.
Following this, we perform another multivariate analysis with the following 8 kinematic variables for the case involving the collinear mass: In Table 13, the signal, background yields and the significance after the BDT analysis are presented. We also show the normalised distributions of the four kinematic variables viz., M τ l τ l , m T 2 , ∆φ τ 1 / E T and p T,bb in Fig. 4. The BDT optimisation yields a statistical significance of 0.077 for the latter scenario where we use the collinear mass observable.

The bbW W * channel
A channel often neglected in terms of rigour and clarity is the bbW W * final state, having three markedly different sub-states, viz., the fully leptonic (bb + / E T ), the  Table 13: Same as in Table 7 for the bbτ τ mode with (a) m vis τ τ (b) M τ τ variable.
semi-leptonic (bb + jets + / E T ) and the fully hadronic (bb + jets), where denotes an electron, muon or a tau lepton. Out of these three possible final states, the fully leptonic one (which has an overlapping final state from bbτ τ ; see section 2.2.3) is the cleanest owing to lesser backgrounds. The semi-leptonic channel has a larger background as compared to the former. The fully hadronic final state, on the other hand, will be swamped, mostly by QCD backgrounds and hence is omitted from any further discussion in this study. For both the leptonic and semi-leptonic channels, the major background comes in the form of tt. The fully leptonic tt scenario contributes to being the dominant background for the leptonic signal and both the fully leptonic and semi-leptonic decays of tt act as the dominant backgrounds to the semi-leptonic signal. For the semi-leptonic channel, the second-most dominant background arises in the form of W bb + jets. The much less dominant backgrounds are comprised of bbh, tth, ttV , V h, V bb and V V V , where V denotes a W or a Z boson. For both the analyses, we implement a common set of trigger cuts, viz., p T,b/j > 30 GeV, p T,e (µ) > 25 (20) GeV, |η b, | < 2.5 and |η j | < 4.7. Furthermore, in order to deal with the large tt backgrounds, we apply, at the generator level a hard cut of m bb > 50 GeV. We apply the same for the bb background. Hence, in order to be consistent, we implement this same cut for all the samples at the analysis level. In the following two sub-subsections, we focus only on multivariate analyses. We pass the signal and background samples to the BDTD algorithm upon implementing the aforementioned cuts.

The 2b2 + / E T channel
Inspired by the CMS HL-LHC studies [131], we focus on the dileptonic mode of the bbW W * channel in this part. Differing slightly from CMS, we do not impose cuts on m , ∆R and ∆φ bb . Moreover, instead of using their neural network discriminator, we consider the BDTD algorithm. Besides, in addition to their analysis, we include various subdominant backgrounds on top of the dominant tt backgrounds, as has been listed above. For this study, we select events with exactly two b-tagged jets and two isolated leptons with opposite charges. Upon inspecting various kinematic distributions, we choose the following ten for our multivariate analysis: where the last term implies the azimuthal angle separation between the reconstructed di b-tagged jet and di-lepton systems. Having tt as the dominant background by far, i.e., the weight of this background being several orders of magnitude larger than the rest, we train our BDTD algorithm with the signal sample along with this background only. We analyse the other backgrounds upon using this training. The final number of signal and background events along with the significance are listed in Table 14.
The distributions of the four best discriminatory variables, viz., m bb , m , p T,bb and p T, , after the basic cuts as listed above, are shown in Fig. 5  Finally, with a judicious cut on the BDTD observable, we find ∼ 35 signal and ∼ 3197 background events, yielding a significance of ∼ 0.62 upon neglecting systematic uncertainties. The numbers are in excellent agreement to the ones obtained by CMS [131]. This channel can thus act as an important combining channel to enhance the total SM di-Higgs significance at the HL-LHC and also serves as an important search for a resonant di-Higgs scenario [88].

The 1 2j2b + / E T channel
Before concluding this subsection, we make an attempt to decipher the potential of the semi-leptonic final state for the bbW W * channel. On the analysis front, we choose events with exactly two b-tagged jets, one isolated lepton and at least two light jets meeting the trigger criteria as discussed above. We consider the same set of cuts as for the dileptonic channel before performing the multivariate analysis. For this case, we find the following variables to have the best discriminatory properties.
where p T, jj , ∆φ bb jj and ∆R jj refer to the visible p T of the jj system (for the signal, ensuing from the h → W W * → νjj decay), the azimuthal angle separation between the di-b-tagged jet system and the jj system and the ∆R separation between the lepton and the di-jet system respectively. Here the dominant backgrounds are the semi-leptonic and the leptonic decays of tt. Hence, in an analogous way to the dileptonic case, we train the BDTD with the signal and the tt samples, albeit with proper weight factors for the leptonic and semi-leptonic backgrounds. We then utilise this training for the rest of the backgrounds as well, which are clearly subdominant with respect to the tt backgrounds. We find a significance of 0.13, however, with a much smaller S/B ratio. The results are summarised in Table 15. The distributions of the four best observables, viz., m bb , p T, 1 , p T,bb and / E T are shown in Fig. 6. We do not find a promising significance for this scenario. We obtain a negligible S/B and a significance of 0.13 assuming zero systematic uncertainties. A somewhat promising result has been obtained in Ref. [87] using jet substructure techniques.  Table 15: Signal, background yields and final significance for the 1 2j2b + / E T channel after the BDT analysis.The various orders of the signal and backgrounds are same as in Table 14.
Following this, we then discuss some of the most significant kinematic variables which distinguish the signal and backgrounds most efficiently. Finally, we present the results from multivariate analysis.

Pure leptonic decay
The signal yield in this current scenario is much smaller in comparison to the moststudied di-Higgs search channels like bbγγ and bbτ + τ − . However, as we will see below, this channel has a significantly lower background yield.
We require each event to have exactly two isolated photons and two isolated leptons having opposite electric charge. Sizeable backgrounds to this final state arise from the tth associated production, the Higgs-strahlung Zh process (merged up to three jets), and from the γγ (where = e, µ, τ for this case) final state. The irreducible background to this search channel comes from ν νγγ (mostly from V V γγ), which has a relatively smaller cross-section as compared to the aforementioned backgrounds, and hence has not been considered in the current analysis. While generating the γγ background, we merge the samples up to one extra jet and we also impose a generation-level cut on the invariant mass of the γγ pair, viz., 120 GeV < m γγ < 130 GeV.
Before listing down the variables we use for the multivariate analysis, we also impose a b-jet veto to the events. This reduces the tth background substantially. For this analysis as well for the semi-leptonic analysis that follows, we require the invariant mass of the di-photon system to be 122 GeV < m γγ < 128 GeV. As an optimised cut-based analysis for this channel is not available in the literature, we implement a BDT optimisation approach. The following are the variables used to train the signal and background samples.
where the last term denotes the azimuthal angle separation between the di-lepton and the di-photon systems. In Fig. 7, we show the kinematic distributions of the four variables, viz., m , / E T , p T,γγ and m γγ . These variables help distinguish the signal from the weighted background samples, most efficiently.
We find that upon imposing a cut on the BDT variable, the S/B improves from 4.4×10 −3 (after the basic selection) to 0.40. This is a significant improvement and perhaps has of the best signal over background ratios amongst all the channels studied so far. Unfortunately for us, this channel is plagued by very small branching ratios rendering a signal yield of less than unity. Given the dearth of signal events, we can not define a statistical significance. We must, however, note that this channel can be one of the most important channels for a 28 TeV/ 33 TeV collider. The signal and background yields are listed in Table 16. Hence we conclude that in order for this channel to have a significant contribution in the combination of the various final states, one requires either a large luminosity or higher energies.

Semi leptonic decay
This channel has been studied by ATLAS [74] with an integrated luminosity of 13.3 fb −1 . However, given the extremely small branching ratio of h → γγ, this channel is yet not sensitive and imposes a very weak observed upper limit on the non-resonant di-Higgs cross-section at 25.0 pb (95% confidence-level). Here, we concern ourselves with the γγ + jets + / E T final state. This process, however, has an additional complexity since the kinematics of the final state depends on whether the ν (jj) comes from the on-shell or the off-shell W -boson decay. Even though the event rate of the semi-leptonic scenario is larger than its purely leptonic counterpart, the presence of additional jets lead to considerably larger backgrounds.
For the event selection, we do not follow the analysis sketched in Ref. [74] as it is designed to maximise the signal events given the dearth in the integrated luminosity for such a process. We perform a multivariate analysis with looser basic selection cuts. We demand exactly one isolated lepton, two isolated photons and at least one light jet, with the p T and |η| ranges mentioned above. The irreducible background to this process comes from νγγ, merged up to one hard jet and has a tree level cross-section of ∼ 3.28 fb. In addition, γγ ( = e, µ, τ for both cases), merged up to one hard jet and having a generation level cross-section of 1.05 fb, also contributes to the background when one of the leptons goes missing. These two backgrounds have been generated with a hard cut at the generation level as has been discussed for the di-leptonic scenario. Similar to the previous analysis for the full leptonic case, tth and Zh+jets also contribute significantly to the background. In addition, we consider the W h process, merged up to 3 jets, as an important background.
We perform our standard multivariate analysis upon employing these nine kinematic variables.
where ∆φ j γγ is the azimuthal angle separation between the j and the reconstructed di-photon systems with j being the hardest jet and m T is the transverse mass variable. It is found that ∆R j , p T,γγ , m γγ and m T are the most effective variables in distinguishing the signal from the backgrounds as can be seen in Fig. 8. We find that after a proper BDT implementation, the signal over background ratio improves from 4.8×10 −3 (after basic selection) to 0.11. The signal and background yields after imposing an appropriate cut on the BDTD variable are summarised in Table 17.
Here also we find that similar to its precursor, i.e., the purely leptonic scenario, the S/B is much better than most of the channels considered thus far. However, the low rate due to the small branching ratio of h → γγ acts as a hindrance to render this final state useful at present. Going to high energy machines, higher integrated luminosities of around 5000 fb −1 with the 14 TeV collider, performing a combination of integrated luminosities from CMS and ATLAS at the HL-LHC, and lastly a modification to the SM cross-section, will enhance this channel's potential. In summary, the γγW W * final states yield extremely good S/B ratios.

The 4W channel
In this subsection, we focus on the yet-untouched final states ensuing from the di-Higgs production mode, viz., the 4W channel 9 . For completeness, we consider both semi-leptonic and fully leptonic decay modes. We lose cleanliness upon including more and more jets in the final state, i.e., upon considering the semi-leptonic decays. On the other hand, for a fully leptonic final state, the cross-section yield is extremely small. Considering two, three and four leptons, we choose following final states:   Table 17: Signal and background yields for the γγ + jets + / E T channel after the BDT analysis. The various orders for the signal and backgrounds are same as in Table 16. The order for W h + jets ( νγγ + jets) is the same as for Zh + jets ( γγ + jets).
In the following, we discuss the three cases as listed above.

The SS2 final state
Before implementing the multivariate analysis, we require each event to have exactly two leptons carrying the same electric charge and having p T > 25 GeV. Furthermore, we require events with at least two jets with a veto on b-tagged and τ -tagged jets. The W Z (W → ν, Z → ), tt and same-sign W -boson pair production constitute the most dominant backgrounds for this channel. Besides, we have the V h production (V = W ± , Z decays leptonically and Higgs decays to W W * , ZZ * ), ttX (X = W ± , Z, h). The tt channel is a fake background for this process where either jets fake as leptons or charges are misidentified. Save for the same-sign W -boson pair, all the other dibosonic backgrounds are merged up to 3 jets. We must also note that by demanding a veto on the b-tagged jets, we are able to reduce a significant portion of the tt and ttX backgrounds.
In a similar spirit as in all the previous subsections, we embark upon our multivariate analysis by choosing the six following kinematic variables. m ± ± , ∆R i j k , m jj , where i, k = 1, 2 gives four combinations and m jj signifies the invariant mass constructed out of the hardest two jets. We show the four most discriminatory variables in Fig. 9 and list down the final signal, background yields along with the zero-systematics significance in Table 18. We find that upon performing a BDT optimisation, the S/B ratio improves from 2.2 × 10 −4 (after basic selection cuts) to 9.7 × 10 −4 . Unless the production cross-section is increased significantly or we find better techniques to control the S/B, this channel does not have much hope for a standard di-Higgs search. A drastic change in kinematics might change the picture altogether. Figure 9: Normalised distributions of m ± ± , ∆R 2 j 1 , ∆R 1 j 2 and m jj for the signal and the most relevant backgrounds for the SS2 final state.  Table 18: Signal and background yields for the SS2 channel after the BDT optimisation.

The 3 final state
The trilepton analysis is somewhat similar in spirit to its SS2 counterpart. For the p T cuts on the lepton, we relax them somewhat in this analysis. We require p T, 1 > 25 GeV, p T, 2 > 20 GeV and p T, 3 > 15 GeV, in order not to make the basic selection cuts too stringent. The pseudorapidity requirements for the leptons and the various requirements for the jets are as before. Furthermore, in order to remove events with leptons ensuing from the Z-boson, we require |m Z − m | > 20 GeV for leptons having opposite sign and same flavour. The main backgrounds for this channel come from W h, diboson production (mainly W Z) and the fake backgrounds coming from tt. Apart from these, the Zh (Z → , h → W + W − ), ttX (X = W ± , Z, h) and ZZ backgrounds also contribute significantly. All the dibosonic processes are merged up to three jets.
For this installment, we choose the following kinematic variables to train our BDTD algorithm.
where i, j runs from 1 to 3, m eff is the effective mass summing the / E T , the scalar p T of the three leptons and all the jets in the event. Lastly, n jet is the count of the number of jets per event. The four best variables are shown in Fig. 10. The event yields and final significance are shown in Table 19. In this case, the S/B changes from 7.3×10 −4 (after basic selection cuts) to 2.8×10 −3 . We find that there is a slight improvement compared to the SS2 scenario. Finally, we end up with a statistical significance of 0.20.  Table 19: Signal, background yields and final significance for the trilepton channel after applying the most optimised BDT cut. The various orders for the signal and the backgrounds are same as those in Table 18. The order for Zh + jets (ZZ + jets) is the same as that for W h + jets (W Z + jets).

The 4 final state
This brings us to our final non-resonant analysis. For this analysis, we perform a simple cut-based analysis. We require each event to have four isolated leptons. The dominant backgrounds are W h, tth, tt, ZZ and Zh. Besides, we have nonnegligible contributions from ttV (V = W ± , Z). All the dibosonic backgrounds are merged up to three jets save for the ZZ sample which is merged up to one extra jet. The leading and sub-leading leptons are required to have p T > 20 GeV. For the remaining two softer leptons, we demand p T > 10 GeV. Besides, we also employ the |m Z − m i j | > 20 GeV cut in order to reduce backgrounds having a pair of opposite sign same flavour leptons coming from Z-bosons. Furthermore, we apply a cut on the missing transverse energy, viz., / E T > 50 GeV to greatly reduce the 4 background. These cuts are extremely helpful in reducing the backgrounds by a great deal. However, the extremely small signal yield reduces to an even smaller number which is not statistically significant for all practical purposes. In Table 20, we find an S/B of ∼ 2.5 × 10 −4 after imposing the aforementioned cuts. On adding the / E T cut the S/B increases to 7.8×10 −3 . However, upon having such small cross-sections, we do not perform a BDT analysis for this scenario.

Summarising the non-resonant search results
To summarise this long section, we find that the prospects of discovering the SM non-resonant di-Higgs channel at the HL-LHC (14 TeV with 3 ab −1 of integrated luminosity) are bleak. The most promising channel comes in the form of bbγγ yielding an S/B ratio of ∼ 0.19 and a statistical significance of 1.76. The situation for the bbτ + τ − channels is more challenging unless we find an excellent algorithm to reconstruct the di-tau system. The purely leptonic final state of the bbW W * mode shows promise but one will either require data-driven techniques to reduce systematic uncertainties on the backgrounds or even better ways to curb the backgrounds. Both the leptonic and semi-leptonic decay modes for the γγW W * channel yield excellent signal to background ratios. However, the extremely small event yields render these channels unimportant with the planned luminosity upgrade. The 4W channel has three distinct final states with leptons. Upon doing detailed analyses, we find that the signal yields are very small. The S/B improves upon increasing the number of leptons but the signal yields fall rapidly. Upon combining all the statistically significant searches with at least 5 signal events after all the cuts, we end up with a combined significance of 2.08σ at the HL-LHC. We expect that in the event of running the LHC till higher luminosity or upon considering the CMS and ATLAS results to be statistically independent (giving us 6 ab −1 data), one can reach close to 2.95σ (with 6 ab −1 luminosity, we gain by a factor of √ 2) upon combining all the statistically significant channels. We must note that if we consider a flat systematic uncertainty on the background estimation, then upon using the formula S = N S / N S + N B + κ 2 N 2 B , with S, N S , N B and κ being respectively the significance, number of signal and background events after all possible cuts and the systematic uncertainty, we will face a reduction in the quoted statistical significance depending on the value of κ. Even κ = 0.1, 0.2, i.e., a 10%-20% systematic uncertainty, may completely dilute our significance. Hence, we need excellent control over systematics in order for us to observe any hints coming from the di-Higgs channels. A 100 TeV collider has the potential of measuring the di-Higgs channel to a greater degree of accuracy. We also note that, in some channels, an enhancement in the production cross-section by a factor of 3 may help the discovery with the HL-LHC. Lastly, modified kinematics will alter this picture completely and we may see encouraging results with lesser integrated luminosities. In the following section, we discuss various BSM scenarios yielding the same final states as have been discussed in the present section.

Ramifications of varying the Higgs self-coupling
Before discussing the contaminations from various BSM scenarios to the standard double Higgs channels, we address the issue of the variation of the Higgs selfcoupling from its SM expectation. The Higgs self-coupling in the SM is an extremely small number and the HL-LHC study by ATLAS [100] predicts a sensitivity of −0.8 < λ hhh /λ SM < 7.7 upon assuming SM-like couplings for the remaining. In this regard, we must be wary of the differences in the kinematic distributions upon changing λ hhh because it changes the magnitude of the destructive interference with the SM box-diagram as we shall see below. This not only modifies the rate of the double Higgs production, but also alters the kinematics significantly. For the present study, we will consider the following six values of λ hhh /λ SM , viz., -1, 1, 2, 5 and 7. Because we have seen that the bbγγ channel is the most sensitive channel for di-Higgs studies at the HL-LHC, we will restrict the anomalous self-coupling study to only this channel. Hence, referring to section 2.1, we tread the following three steps. First, we consider double Higgs production with each of the aforementioned λ hhh values (one at a time) as our signal and pass them through the cut-based analysis which has been optimised (with the cuts listed in Table 1) to maximise the SM (λ hhh /λ SM = 1) signal. Following this, we pass each of the λ hhh samples through the BDT framework optimised for the SM double Higgs production (see Table 4). Thereafter, we train all the samples with an alternative λ, viz. λ hhh /λ SM = 5. Finally, we train the BDT for each λ hhh point and compute the significance. We list the results in Table 21. The cross-sections are for the process pp → hh → bbγγ as a function of λ hhh /λ SM . The efficiencies are computed as the ratios of the final number of events (after the cut and count or the multivariate analysis) to the number of generated events. Finally, the yields are given for the signal and background samples for an integrated luminosity of 3 ab −1 . The cut-efficiency is shown to be the maximum for the value of λ hhh /λ SM = 2 where incidentally the cross-section is the smallest. We had already seen that going from a simple cut and count analysis to a BDT analysis, rigorously trained to segregate the signal from background, we gain in significance. This already holds true for the first two sub-tables, with an improvement varying between 13%-23%. However, when we train the BDT with the corresponding λ hhh samples, the BDT becomes more tuned to the modified kinematic distributions and in almost all cases, we find an improvement in significance compared to its counterpart where the training was performed with the SM signal sample. We can see the results in the fourth sub -table in Table 21. Also, in order to quantify the difference in distributions for the variation of the Higgs trilinear coupling, we show the normalised distributions of the reconstructed Higgs p T in the di-photon channel (p T,γγ ) upon varying λ hhh /λ SM (see Fig. 11). Finally, we employ the log-likelihood CLs hypothesis test [132][133][134] upon assuming the SM (and also λ hhh /λ SM = 5) to be the null hypothesis. We obtain the following ranges of κ = λ hhh /λ SM : −0.86 < κ < 7.96 CBA for κ = 1 optimisation; SM null hypothesis −0.63 < κ < 8.07 BDT analysis for κ = 1 optimisation; SM null hypothesis −0.81 < κ < 6.06 BDT analysis for κ = 5 optimisation; SM null hypothesis −1.24 < κ < 6.49 BDT analysis for κ = 5 optimisation; κ = 5 null hypothesis.
Note that for κ = 1, we are quite close in reproducing the HL-LHC prediction by ATLAS (i.e., −0.8 < λ hhh /λ SM < 7.7) in both the cut-based (CBA) and BDT optimisation procedures. However, κ is an unknown parameter (as the Higgs trilinear coupling has still not been measured) and hence, in principle, should be varied as well. Upon training with a different value of κ other than 1, viz., κ = λ hhh /λ SM = 5, a shift in the allowed ranges for κ has been obtained, which further depends on the hypothesis chosen. We find a rather stronger upper-limit on the allowed range of the trilinear coupling upon training with the λ hhh /λ SM = 5 sample. To conclude this section, we emphasise the fact that we must be geared to tackle variations of the trilinear couplings from the SM expectations and must be able to segregate them with the help of various kinematic distributions up to a certain uncertainty.  Table 21: Table showing

Contaminations to non-resonant di-Higgs processes
Measuring the trilinear Higgs coupling has been the primary focus for all di-Higgs searches. However, as we have seen in details in the previous section, the SM Higgs pair production cross-section being extremely small, makes it a challenging job to look for its signatures even at the HL-LHC. In the previous section, we found that the combined significance upon assuming zero systematic uncertainties is ∼ 2.1σ. However, up until now, we reserved ourselves from introducing any BSM effects. We saw that the number of signal events (or rather the S/B) is small for most of the final states and hence small contributions from any BSM physics can potentially distort or contaminate the signal. Statistically significant deviations from the expected SM di-Higgs yields may be considered as signatures of new physics. On the one hand, such deviations can be attributed solely to modifications in λ hhh or y t with respect to their SM values. On the other hand, markedly different new physics processes can also be responsible for the modification in the event rate in a particular production mode. Having performed boosted decision tree analyses designed solely to maximise the SM di-Higgs yield, a fair question to ask at this stage is whether any new physics can at all mimic the SM signatures. The answer is twofold. If perchance the primarily discriminatory kinematic variables of the new physics scenario in question, overlap with their SM counterparts to a good degree, then there is a good chance of the new physics mimicking this SM signal. Secondly, even if the overlap is not significant then the largeness of the new physics cross-section may determine the degree of contamination. The purpose of this section is to study some such imposters ensuing from various well-motivated new physics scenarios which may potentially contaminate the non-resonant SM Higgs pair event yields in various final states. We will study the extent of these contaminations upon considering various benchmark scenarios. We will also find correlated channels during our quest of extracting the effects of contamination. The effect of correlation simply means that some search channels for the non-resonant di-Higgs searches will allow for more contaminating new physics scenarios compared to some other search channels. Broadly, the following are the three scenarios which can contaminate the non-resonant Higgs pair production in certain final states: • Double Higgs production, pp → hh(+X) through resonant or non-resonant production modes, • Single Higgs production in association with some other particles, pp → h + X and • Null Higgs scenario, pp → X, yielding some of the final states as has been discussed in section 2, where X is an object or a group of objects not coming from an SM Higgs boson decay. In the following subsections we detail these three broad scenarios citing examples from specific new physics models.

The hh(+X) channels
Several extensions of the SM, primarily with an extended Higgs sector, may significantly enhance the Higgs pair production cross section and may also alter the kinematics of certain observables. More specifically, two Higgs doublet models (2HDM) [30,32] and complex scalar extensions [63,64,88] are some prime examples. In the type-II 2HDM scenarios, which can be embedded in an MSSM, there is a CP -even Higgs, a CP -odd Higgs and two charged Higgs bosons on top of the SM-like Higgs with m h = 125 GeV. The SM-like Higgs pair can be produced from the decay of a heavy CP -even Higgs boson, H. The couplings of the various Higgses in 2HDM scenarios depend mainly on the Higgs mixing parameter, α and the ratio of the two vacuum expectation values (vevs), tan β of the two Higgs doublets. In order to abide by the LHC results and constraints pertaining to the discovered scalar at ∼ 125 GeV, one has to invoke the so-called alignment limit, where the lightest CP -even Higgs automatically aligns itself with the SM-like Higgs, having couplings close to the SM predictions. The allowed masses of the pseudo-scalar (A) and the CP -even heavy Higgs lie in the range of a few hundred GeVs. In the low tan β regime, the rate for the CP -even heavier Higgs decaying to a pair of SM-like Higgs bosons can become significant and may even surpass the SM di-Higgs cross-section [30,32]. The resonant production of a heavy CP -even Higgs can, in principle, contaminate the SM di-Higgs signal thus affecting the measurement of the Higgs self-coupling. In particular, the low tan β region can affect the Higgs trilinear coupling measurement. For large tan β, the H → bb and H → τ τ modes become dominant as the coupling scales as m b (m τ ) × tan β. Hence, we do not concern ourselves with the large tan β regime. We must also note that high tan β-low m A regions are excluded [135].
In order to study the contamination from the process pp → H → hh, we generate the signal samples in Pythia-6 and demand a narrow-width for H, i.e., in the GeV range, less than the detector resolution. The results are shown in Fig. 12 as upper limits on the cross-section pp → H times the branching ratio of H → hh, viz., σ(pp → H → hh), as functions of the heavy Higgs mass, m H . We try to present the results in a somewhat model independent fashion. One can imagine the effects of tan β or any other theory parameter to have been absorbed in the upper limit of the cross-section. The green (blue) region signifies the upper limit on the cross-section required to contaminate the SM yield at 2σ (5σ), where the cross-section upper limits are derived using the inequality where S UL NP is the computed upper limit at N σ on the new physics (NP) scenario upon considering a background which includes the SM di-Higgs contribution as well. The grey region is part of the new physics parameter space which does not contaminate the SM expectations. As we know, the invariant mass of the SM di-Higgs system peaks around 400 GeV and hence because of our robust BDT optimisation, which captures to a very precise degree the shape of the non-resonant SM observables, a heavy Higgs boson of mass m H 400 GeV gets literally treated as a background. Hence, as seen in Fig. 12, one requires larger cross-sections for m H 400 GeV in order to contaminate the SM signal even at the 95% confidence level. We see that the strongest bound on the upper limit on σ(pp → H → hh) comes about from the bbγγ channel. The upper limit varies between 76 fb and 25 fb between m H = 400 GeV and 650 GeV. This is followed by bbτ + τ − . We find the 2σ upper limit on the cross-section varying between 170 fb and 83 fb for the aforementioned mass range. The limit is also considerably strong in the fully leptonic decay of bbW W * , varying between 228 fb and 40 fb for m H varying between 450 GeV and 650 GeV. The upper limits from the W W * γγ channels are fairly strong as well. The 2σ upper limit plateaus between 129 fb and 282 fb for the fully leptonic case. Bounds from the other modes, especially from the 4W modes are much weaker. Hence, we see that the channels where we obtained the best S/ √ B values have the strongest bounds on the upper limits of the cross-section. Thus, for the best optimised modes, one requires lesser cross-sections from the heavy Higgs production in order to contaminate the non-resonant Higgs pair production. We must emphasise once again that our BDT optimisation was done solely for the SM non-resonant Higgs pair production modes and this subsection is only showing the effects of the new physics contamination to the SM signal. In order to search for such a resonance, one needs to redo the optimisation upon treating it as a signal. This will be the subject matter of our forthcoming work. To summarise this part, we find that an order 100 fb of cross-section for a resonant Higgs mass 400 GeV will contaminate the SM di-Higgs expectation to at least 2σ.
Similarly, Higgs pair production in supersymmetric models [38,60,62,101] are also very well motivated. To put things into perspective, in this work we restrict ourselves to MSSM which predicts supersymmetric partner(s) for each SM particle. The theory also requires two Higgs doublets. The decays of some of the supersymmetric scalar particles result in the SM-like Higgs along with their fermionic counterparts. The processes which can contaminate the di-Higgs search channels, other than the heavy Higgs resonance mentioned above, come from the squark (anti-squark) pair production. Although LHC has already imposed stringent bounds on the first and second generation squark masses, viz., ≥ O(TeV), still this particular channel can attain sizeable cross-sections owing to the strong couplings and contribution from each light flavour. We choose a benchmark point (BP1) to study squark pair production (q LqL ,q Lq * L ,q * Lq * L ) followed by subsequent decay of the squark to a light quark and Higgs boson accompanied by χ 0 1 . This yields a final state of hh + / E T + jets. In Table 22, we list three benchmark points which are still allowed by all experimental constraints, particularly the constraints coming from the Higgs mass and couplings measurements. The first of these is relevant for our discussion in this subsection. The common parameters for the three benchmark points are as follows: M A = 1000 GeV, tan β = 10, A t = 2500 GeV, From BP1, we see that the cross-section of hh + X is ∼ 10.8 fb, which is less than a third of the SM expectation. Moreover, we find that the / E T distribution from the squark pair production is significantly different from the signal as well as from the Benchmark Parameters (GeV) Mass (GeV) Processes Branching Points Fraction   Fig. 13. After applying the BDT cuts for the bbγγ analysis, we are left with ∼ 0.60 events, which is much smaller compared to the SM expectation and not statistically significant. Hence in order to minimise the contamination to the bbγγ final state ensuing from an SM di-Higgs production, one may perhaps impose certain exclusive cuts, especially on the / E T distribution. This will help reduce new physics contaminations with large / E T . Moreover, for certain SUSY scenarios, we may have cascade decays giving rise to multiple jets. Hence, the cut N j < 6 can come in handy to reduce such backgrounds and we may also require to optimise this cut further in order to reduce such contamination effects. In other words, removing contamination effects can be tricky and can be somewhat model dependent if we are studying inclusive final states.

The h + X channels
In the previous subsection, the heavy resonance production and the di-Higgs production ensuing from subsequent decays of a pair of (anti-)squarks, potentially contaminate all the SM di-Higgs search channels that are studied in section 2. In this subsection, we will look into two specific candidates which will contaminate some di-Higgs final states and not the others. After the HL-LHC run if one finds excesses in certain di-Higgs like final states and not in the others, then it might be possible to narrow down the new physics possibilities to a greater degree.
In 2HDMs, a resonant production of the pseudoscalar Higgs production, viz., pp → A → Zh followed by Z and h decaying to all possible final states, can, in principle, imitate various final states as shown in Fig. 14. The decay rate of the pseudoscalar, A → Zh is appreciable with M A below the tt threshold and for low values of tan β ( 5). The upper limits on the cross-sections are weaker than those from the resonant scalar production. One of the strongest bounds arise from bbγγ, varying from 330 fb (450 GeV) to around 197 fb (650 GeV). The strongest upper limits, however, comes from the bbτ + τ − search, varying between 292 fb and 186 fb in the aforementioned mass range. For the di-leptonic bbW W * channel, the bound strengthens from 1236 fb at m A = 400 GeV to ∼ 110 fb for m A = 650 GeV. From the final state tailored for the 3 mode coming from the 4W scenario, the 2σ upper limit varies between 555 fb (400 GeV) and around 341 fb (650 GeV). The upper-limits on the cross-section required for contamination from the remaining final states are rather weak. In summary, the A → Zh channel contaminates in a slightly weaker fashion as compared to the H → hh channel. One of the possible reasons is that the reconstructed Z-peak is shifted from the reconstructed Higgs peak as m bb serves as an important discriminatory variable in all the searches involving a b-jet pair. Hence, more cross-section is required here in order to contaminate the SM di-Higgs channels to a similar degree as in the H → hh channel. As an aside, we would like to mention that the process pp → Ah may also potentially contaminate the same final states as for the A → Zh case. We however, do not consider the details of this channel, for brevity.
As an extended scenario, we now shift our focus to supersymmetry. In MSSM, electroweakino pair production often results in mono-Higgs type signals. LHC has come down heavily on such SUSY scenarios constraining much of its parameter space. The bounds on squarks and gluino masses have already surpassed a TeV. In this situation, the observation of a SUSY signature will heavily rely on its electroweak sector, composed of charginos (χ ± i ) and neutralinos (χ 0 j ). In the presence of a decoupled Higgs sector, the chargino-neutralino pair production is mediated through the W -boson propagator, with the W ± χ ∓ χ 0 1 coupling containing terms which depend on both the wino and the higgsino components of the electroweakinos involved. However, it is to be noted that the contributions from the wino components dominate over the contributions from the higgsino terms. ATLAS and CMS have also performed searches for chargino-neutralino pair production in the 3 + / E T and the same-flavour opposite-sign 2 + / E T final states for a non-generic scenario where both χ ± 1 and χ 0 2 are dominantly wino-like and mass degenerate. They have obtained correlated bounds on the masses of LSP and NLSP [136][137][138][139] 10 . We carefully select a benchmark point where the wino mass parameter, M 2 is much smaller compared to the higgsino mass parameter, µ making the lightest chargino and second lightest neutralino, wino-like. A winodominated χ 0 2 and χ ± 1 yields much larger cross-section for the process pp → χ 0 2 χ ± 1 compared to other electroweakino production process, for example, χ 0 2 pair production etc. Hence, we will not consider the latter process although it can, in principle, 10 Much stronger limits have been obtained from the 13 TeV results from separate final states involving τ -leptons [140]. We do not however, consider these limits in the present work.  Table 22 and is marginally outside the projected exclusion obtained by ATLAS for the HL-LHC [141]. In this parameter space χ 0 2 dominantly decays to hχ 0 1 , while χ ± 1 has a 100% branching ratio to W ± χ 0 1 . This essentially produces a W h + / E T final state with a cross-section of ∼ 400 fb, thus generating h + X signatures. Hence, the W h + / E T final state from the chargino-neutralino pair production can modestly contaminate some of the di-Higgs search channels, viz., the bbW W * → bb jj + / E T , γγW W * → γγ jj + / E T , 4W → ± ± jjjj + / E T , 3 jj + / E T . In Table 23, we present the event yields for the benchmark point BP2, in three of the concerned di-Higgs channels, corresponding to the most optimised BDT score obtained for the nonresonant SM di-Higgs searches. We find that the contaminations are large in these channels reminding us that a possible future observation of significant number of events in these channels must be treated carefully. We also mention here that the SM di-Higgs expectations from these channels are insignificant leading to negligible signal over background ratios. Thus, observations of significant numbers of events over and above the SM backgrounds can be potential signatures for new physics.

Null Higgs channels
Before closing this section, we discuss the final category of potential contaminants, viz., the ones with no SM-like Higgs bosons in the production or decay modes. We start by revisiting the classic heavy resonant (pseudo-)scalar production. This (pseudo-)scalar is dominantly produced by the gluon-fusion production mode and in the case where its mass is greater than the tt threshold, it can decay to a pair of top quarks, the branching ratio depending on the H(A)tt Yukawa coupling. This channel can potentially contaminate the bbτ + τ − and bbW W * channels. We find from Fig. 15 that the upper limits on the cross-section times branching ratio (σ(pp → H(A) → tt)) from the relatively clean bbW W * → bb + − + / E T channel, is visibly weak. The upper limits from the semi-leptonic decay mode, viz., bbW W * → bb + / E T +jj gives slightly stronger 2σ upper limits on the contamination cross-section, varying between ∼ 1.2 pb (m H = 500 GeV) and ∼ 0.5 pb (m H = 650 GeV). The upper limits from bbτ + τ − also does not fare well. Hence, the H → tt channel does not contaminate the SM di-Higgs channels to any considerable degree. One of the prime reasons is the fact that the BDT variable m bb is strongly discriminating, peaking at the SM-like Higgs boson mass for the non-resonant Higgs pair production, with the b-quark pair from the tt mode having a distinct feature as shown in Fig. 16. Hence, one will require a very large production cross-section for the heavy resonant scalar in order to contaminate the SM signature significantly.
Another interesting category can be accommodated in various extensions of the SM involving singly charged Higgs bosons. One can consider a scenario where a singly charged Higgs is produced in association with a top quark and a bottom quark, viz., pp →tbH + /tbH − and the charged Higgs either decays to τ ν τ or tb depending on its mass. These channels may adversely contaminate the bbW W * and bbτ + τ − modes. We find from Fig. 17 that the tbtb channel poses the strongest contamination to the bb jj + / E T final state. The 2σ contamination cross-section for this final state varies between 393 fb (m H + = 250 GeV) and 204 fb (m H + = 650 GeV). The limits from the other channels are weaker. We also note in passing that all the aforementioned processes essentially affect the low tan β region of the parameter space.
As a final example, we study the stop pair production, pp →t 1t * 1 which can potentially mimic some of the di-Higgs signatures. The stop pair-production cross-section is fairly large for stop masses of the order of several hundreds of GeVs. With an appropriate choice of parameters listed as BP3 in Table 22,t 1 can have a dominant branching ratio to bχ + 1 , with χ + 1 eventually decaying to W + χ 0 1 . This gives us a final state of 2b + 2W + / E T which potentially affects the hh → bbW W + and hh → bbτ + τ − search channels. We choose BP3 such that the mass difference betweent 1 and χ 0 1 is less than the top mass, ensuring the stop decays ast 1 → W bχ 0 1 . The final number of events at the HL-LHC for the relevant search channel bb jj + / E T is shown in Table. 24. The contamination is found to be of the same order as the SM signal. We also note that the other decay mode of stop quarks, viz., tχ 0 1 , will also give rise to tt + / E T final state, affecting the same channels.
We must stress here that the entire analysis has been performed using boosted decision tree optimisation techniques which has been trained using the SM di-Higgs data  samples. Hence, the BDT cuts are very efficient in segregating any contamination, i.e., non-SM contributions. Now, if a new physics process is still able to contaminate, then it must be very efficient in passing all the cuts. This would mean that it must come with a large production cross section or a considerable overlap with the SM kinematic variables, so as to contaminate the SM signal. In other words, we can impose stringent bounds on the cross-sections for various BSM scenarios discussed above, which can potentially contribute to the di-Higgs signals. The efficiency of the BDT cuts will, of course, depend on the particular channel considered. The bound on some BSM physics can be strong from one channel and may not be so strong from the rest. It is important to note that there might be two completely different aspects of interpreting our results. The first case would be where we are already aware of the presence of new physics (through some other channel). In such situations, we want to ask whether any new physics process might contaminate the di-Higgs signal. If so, we will get an idea of how large the cross section will be for such processes and prepare our strategy. The second one is similar to our present situation, where we would be still looking for new physics. This is a much more complex scenario as we are looking for new physics in various directions. Our purpose in this work is to classify di-Higgs searches in multiple channels in a model independent manner so as to extract the best possible information about potential contaminating channels. In this case, we can, at best, put bounds on the cross-sections coming from new physics scenarios. This will give us an idea if the measurement of the Higgs self-coupling is possible and if yes, then which channel to look out for.
We wish to conclude this section by reiterating our philosophy for the second part of our study with the following observations. In the fortunate case that we discover new physics in the near future, for instance discoveries of heavy Higgs boson(s), superpartners of quarks, to name a few, then the measurement of λ hhh will be affected because of the effects of contamination to the SM channels as have been quantified above. For a possible scenario where we have hints of new physics but these are below the discovery significance, then also care must be taken to study the effects of contamination which can tell us more about the viability of such scenarios. A third possible scenario which we did not look for in this present study is the effects of new physics only modifying λ hhh . For such possibilities, it might happen that we will see no new particles and the shapes of the kinematic distributions involving the Higgs pair production can only shed light on new physics.

Summary and outlook
In the first part of this work, we evaluated the prospects of di-Higgs searches in numerous well motivated final states. Optimised cut-based analyses were performed for the bbγγ and bbτ + τ − states. We followed this up with multivariate analyses using the boosted decision tree (BDT) algorithm for the majority of our search channels. The multivariate analyses yielded improved signal to background ratio (S/B) and the overall statistical significance. The bbγγ final state presented itself as the most promising search channel with a statistical significance of 1.46 (1.76) for the cut-based (multivariate) analysis. The bbτ + τ − channel was looked for in the fully hadronic, semi-leptonic and leptonic sub-states. This channel, even upon having a higher yield as compared to its predecessor, is marred by much larger backgrounds and our limitation to reconstruct the τ invariant mass precisely. However, upon employing the collinear mass variable for reconstructing the Higgs decaying to a pair of τ s, we finally obtain statistical significances of 0.65 (0.74), 0.44 (0.49) and 0.07 (0.08) for the cut-based (multivariate) analyses in the hadronic, semi-leptonic and leptonic modes respectively. The signal to background ratio improves significantly upon using the collinear mass technique. The bbW W * state in the leptonic final state serves as a clean channel with a moderate S/B and a statistical significance of 0.62. This serves as the third most important contribution after the bbγγ and the fully hadronic bbτ + τ − channels. The semi-leptonic final state for bbW W * pales in comparison with a much smaller S/B and a statistical significance of 0.13. Both the leptonic (S/B= 0.40) and semi-leptonic (S/B = 0.11) final states for the W W * γγ channel show great potential for higher-energy and higher-luminosity colliders. The limitation in design-luminosity at the HL-LHC in addition to the smallness of BR(h → γγ) forbid us from utilising these final states while computing the combined significance. We conclude the first part of this work upon considering the SS2 , 3 and 4 final states emerging from the hh → W W * W W * search channel. The tri-leptonic channel yields a statistical significance of 0.20, however, with an insignificant S/B. One would require a manifold increase in the production cross-section in these three channels for them to become noteworthy, even in the future colliders. For all channels with less than 5 signal events, we were unable to define a statistical significance. A combined zero-systematics significance of ∼ 2.1σ was obtained upon combining all the statistically significant signals for the HL-LHC analysis at 14 TeV. The quoted significance values can get severely diluted, once systematic uncertainties are taken into account.
After this we studied the importance of considering varying values of the Higgs trilinear coupling and how it affects our conclusions. We trained the boosted decision trees with the SM case for once and then with each of the λ hhh samples and found that one can have a difference in significance because of the difference in the distributions of certain kinematic variables. We faithfully recover the expected exclusion on the Higgs trilinear coupling for the HL-LHC, as computed by ATLAS, upon using a log-likelihood CLs hypothesis for the λ SM BDT optimisation. Upon changing the training to a different value of λ and also upon choosing a hypothesis different from that of the SM, we obtain stronger upper limits.
In the final chapter of this work, we analysed some new physics scenarios which may potentially contaminate the SM di-Higgs search channels. We used the same multivariate training and cut on the BDT variable for the new physics cases as have been obtained for the SM non-resonant di-Higgs searches, in order to estimate the contaminations. Three major contamination scenarios were studied, viz., hh(+X), hX and X, X being a set of objects not ensuing from the SM-like Higgs, and upper limits on the production cross-section of heavy scalar (H), pseudoscalar (A) and charged Higgs (H ± ) bosons were obtained. In particular, we derived upper limits on σ(pp → H → hh), σ(pp → A → Zh), σ(pp → H → tt) and σ(pp → H + tb → tb(τ ν)tb) for the various search channels. The bbγγ channel emerged as the most sensitive search channel, with results indicating that for m H = 500 GeV, a production cross-section of σ(pp → H → hh) ∼ 36 fb would result in a 2σlevel of contamination to the SM search. This is closely followed by the bbτ + τ − channel, putting an upper limit of 104 fb for the same resonance mass. The limits from the leptonic decay mode of bbW W * also present competitive upper limits with σ(p → H → hh) attaining values of ∼ 98 fb at m H = 500 GeV for a 2σ-level contamination. The upper limits from the remaining decay channels are ∼ 5 − 10 times weaker. In the resonant A → Zh search, the bbγγ mode presents the strongest upper limit on the cross-section at 233 fb with m A = 500 GeV. The bbτ + τ − mode closely follows with a contaminating cross-section of 238 fb for the same mass of the pseudoscalar. The di-leptonic final state for the bbW W * channel also imposes upper limit of the same order. Next, we derived upper limits on σ(pp → H → tt), and the results were found to be significantly weaker than the previous scenarios. The 2σ upper limits derived for the charged Higgs production also exhibit similar results, with the semi-leptonic bbW W * channel offering the best sensitivity with cross-section requirements of the order of 217 fb for m H + = 500 GeV, in H + → tb mode. The epilogue to this story is provided by the contaminations from various SUSY processes. Here, we had chosen three experimentally viable benchmark points, optimised for squark pair production, chargino-neutralino pair production and stop pair production, with subsequent cascade decay modes mimicking various di-Higgs final states. Of particular interest is the contribution from the χ 0 2 −χ ± 1 pair production which may significantly contaminate the SS2 and 3 final states in the hh → 4W channel, and the semi-leptonic decay mode of the bbW W * channel, with event yield much higher than the corresponding SM di-Higgs signal. It would be logical to argue that the presence of such SUSY signatures would lead to a clear and strong contamination in these di-Higgs final state searches paving an interesting and complicated road ahead for the search of Higgs trilinear coupling.
As seen in this work, the prospects of discovering di-Higgs signals for a SMlike scenario is extremely difficult owing to the smallness of the production crosssection and the overwhelmingly large backgrounds. However, many of the search channels considered must motivate the particle physics community to either aim for higher integrated luminosities, beyond 3 ab −1 or to build higher energy colliders, viz., a 28 TeV/33 TeV and ideally 100 TeV machines. Even in our present setup, in all probability, the sensitivities can be improved upon having a better handle over the backgrounds by either minimising the uncertainties due to the Monte-Carlo computation order or by adopting data driven backgrounds. Besides, there might be certain novel discriminatory variables or certain boosted techniques which might help in reducing the backgrounds further. We also learnt from this study that looking for di-Higgs search channels may in principle be masked by new physics effects. For such scenarios our multivariate optimisation tries the best to separate the SM-signal from the new physics effects. However, in certain cases, due to similarities in kinematic distributions with the SM counterparts or due to a large cross-section yield, we may have considerable contamination effects. The techniques outlined in this paper can be easily extended and optimised as searches for the various new physics effects listed above.
optimised cut-based analysis. For the three modes, we find the following to be the most optimal cut choices: • τ h τ h : p T,bb > 100 GeV, τ h τ : p T,bb > 115 GeV and τ τ : p T,bb > 140 GeV • τ h τ h : m T 2 > 110 GeV, τ h τ : m T 2 > 130 GeV and τ τ : m T 2 > 120 GeV  49 background events for the τ h τ h , τ h τ and τ τ cases, respectively. We find a considerable reduction in the backgrounds with respect to the cut-based analysis performed earlier with the m vis τ τ variable. However, the signal yield also falls sharply. Finally, we find S/ √ B values of 0.21, 0.30 and 0.09 for the three aforementioned cases, respectively. We do not use this variable for a detailed study as the sharpness of this variable reduces upon including smearing and other detector effects.