1 Introduction

The thorough investigation of the properties of the Higgs boson discovered by ATLAS and CMS [1, 2] is one of the highest priorities in particle physics for the next two decades. A crucial property is the trilinear Higgs self-coupling, which can be probed by studying Higgs-pair production. At the Large Hadron Collider (LHC), this is considered to be one of the most challenging processes to observe, even with a data set corresponding to an integrated luminosity of 3 ab\(^{-1}\), the target for the proposed high-luminosity LHC (HL-LHC) programme. Several particle-level studies were published even before the Higgs discovery [3, 4] and more have been published since then, assessing the sensitivity of different decay channels such as \(b\overline{b}\) \(\gamma \gamma \), \(b\overline{b}\) \(\tau \tau \), and \(b\overline{b}\) \(WW\) [5–9]. The \(b\overline{b}\) \(b\overline{b}\) final state was examined in Ref. [10], where it was found to have very low sensitivity, and more recently in Ref. [11], where the use of a tighter kinematic selection and jet substructure techniques appeared to give some improved sensitivity, although that study considered only the \(4b\) multi-jet process as background.

In this paper, we extend our previous work on resonant Higgs-pair production in the \(b\overline{b}\) \(b\overline{b}\) final state [12]—which inspired the recent ATLAS analysis [13]—to the non-resonant case, considering all the relevant background processes, namely \(b\overline{b}\) \(b\overline{b}\), \(b\overline{b}\) \(c\overline{c}\), and \(t\overline{t}\). The \(HH\rightarrow b\overline{b}b\overline{b}\) final state benefits from the high branching fraction of Higgs decaying to \(b\overline{b}\) (57.5 % in the standard model (SM) for \(m_H=125.1\,\)GeV [14], leading to about one third of the Higgs pairs decaying to \(b\overline{b}\) \(b\overline{b}\)), but suffers from large backgrounds. However, like the previously studied resonant case [12], the transverse momentum (\(p_{\mathrm T}\)) of the Higgs bosons in the non-resonant process in the SM is relatively high, with the most probable value around 150 GeV [11]. By tailoring the event selection to focus on this high-\(p_{\mathrm T}\) regime, where the two Higgs bosons are essentially back-to-back, one has the benefits outlined in Ref. [12] for the resonant case. Requiring four \(b\)-tagged jets, paired into two high-\(p_{\mathrm T}\) dijet systems, is a very powerful way to reduce the backgrounds. This is particularly true for the dominant multi-jet background, which has a cross section that falls rapidly with increasing jet and dijet \(p_{\mathrm T}\). There is also negligible ambiguity in pairing the four \(b\)-jets to correctly reconstruct the Higgs decays. Finally, due to the high boost, the four jets will have high enough transverse momenta for such events to be selected with high efficiency at the first level triggers of ATLAS and CMS, with efficient high level triggering possible through online \(b\)-tagging [13]. 
We note that triggering will be a major challenge at the HL-LHC, but the substantial detector and trigger upgrade programmes proposed by the two experiments should make it possible to maintain the high trigger efficiencies reported by ATLAS in the 8 TeV run [13] in channels that are essential for key measurements at the HL-LHC, such as the Higgs trilinear self-coupling.

2 Simulation of signal and background processes

Signal and background processes are modeled using simulated Monte Carlo (MC) event samples. The \(HH\rightarrow b\overline{b}b\overline{b}\) sample is produced with a special release [15] of MadGraph 1.5.12 [16], interfaced to Pythia 8.175 [17] for parton showering (PS) and hadronization. This MadGraph release simulates gluon–gluon-fusion Higgs boson pair production using the exact form factors for the top triangle and box loops at leading order (LO), taken from Ref. [18]. The CTEQ6L1 [19] LO parton-density functions (PDFs) are used. The signal cross section is scaled to 11.6 fb [20]. The \(t\overline{t}\) events are generated with Powheg [21, 22] interfaced to Pythia 8.185 and using the CT10 [23] PDF set. \(t\overline{t}\) events with W boson decays to electrons and muons, or where both W bosons decay to light jets, are not simulated, since these decays are suppressed by the requirement for four \(b\)-tagged jets to pass the event selection, as described in Sect. 4. The \(b\overline{b}\) \(b\overline{b}\) and \(b\overline{b}\) \(c\overline{c}\) backgrounds are generated by Sherpa 2.1.1 [24], using the CT10 PDF set. These event samples are scaled to their next-to-leading order (NLO) cross section by applying a \(k\)-factor of 1.5 [25]. Other multi-jet processes (such as \(c\overline{c}\) \(c\overline{c}\) and \(b\overline{b}\) \(jj\)) were also considered, but found to be negligible compared to the above two, once the \(b\)-tagging requirements are imposed. The \(b\overline{b}\) \(b\overline{b}\) and \(b\overline{b}\) \(c\overline{c}\) background samples are filtered at parton level, requiring either: at least four anti-\(k_t\) \(R=0.4\) jets [26] with \(p_{\mathrm T}>30\) GeV and \(|\eta |<2.7\); or at least two Cambridge–Aachen \(R = 1.2\) jets [27] with \(p_{\mathrm T}>150\) GeV and \(|\eta | < 2.7\).
In addition, we have considered the most relevant single-Higgs production channels to give an indication of their contribution in comparison to the signal and the dominant backgrounds listed above. The \(Hb\overline{b}\) background is generated using MadGraph_aMC@NLO 1.5.12 [28] interfaced to Pythia. The \(ZH\) and \(t\overline{t}H\) processes are both generated using Pythia 8.175. The Higgs mass is fixed to 125 GeV. More details can be found in Table 1.

Table 1 Summary of the event generators used to model the signal and background processes. The quoted \(\sigma \times \mathrm {BR}\) in the last column includes the event filtering described in the text, for the \(b\overline{b}\) \(b\overline{b}\), \(b\overline{b}\) \(c\overline{c}\), and \(t\overline{t}\) processes

3 Discussion of the signal topology

Figure 1 shows the \(p_{\mathrm T}\) distribution of the Higgs bosons in signal events. As mentioned above, in a substantial fraction of signal events (36.6 %), both Higgs bosons have \(p_{\mathrm T}>150\) GeV. However, this drops to 16.6 % (3.6 %) when requiring both Higgs bosons to have \(p_{\mathrm T}>200\) GeV (300 GeV).

Fig. 1 The \(p_{\mathrm T}\) distributions of the leading (circles) and subleading (squares) Higgs bosons in signal events

Figure 2 compares the efficiency to reconstruct the Higgs boson, as a function of its \(p_{\mathrm T}\), using two different techniques: (a) combining two anti-\(k_t\) jets with \(R=0.4\) (hereafter denoted akt04 jets); and (b) using a single Cambridge–Aachen jet with \(R=1.2\) (hereafter denoted ca12 jets). In both cases, we use the implementation of the jet clustering algorithms in FastJet [29] and we include all stable particles in the processing except neutrinos. The efficiency is defined as follows. We take all akt04 jets with \(p_{\mathrm T}>40\) GeV, and all ca12 jets with \(p_{\mathrm T}>80\) GeV containing at least two subjets with \(p_{\mathrm T}>40\) GeV (the subjets are formed by reclustering each ca12 jet using the \(k_t\) algorithm [30] with \(R=0.3\)). We then ghost-associate [31] the \(b\)-quarks from the Higgs decay to all the jets and subjets. The efficiency for the akt04 reconstruction is defined as the fraction of Higgs decays contained in two akt04 jets with angular separation \(\Delta R=\sqrt{\Delta \eta ^2+\Delta \phi ^2}<1.5\), where each akt04 jet is associated with a different \(b\)-quark from the Higgs decay. The efficiency for the ca12 reconstruction is defined as the fraction of Higgs decays contained within a single ca12 jet, with the two \(b\)-quarks associated to two different subjets. From Fig. 2, it can be seen that the efficiency of the akt04 approach is higher than that of the ca12 approach for Higgs \(p_{\mathrm T}\) values up to about 400 GeV. This is not unexpected, given the angular separation of the two \(b\)-quarks coming from the Higgs boson decay as a function of the Higgs boson \(p_{\mathrm T}\), as shown in Fig. 3. At lower Higgs \(p_{\mathrm T}\), a ca12 jet often cannot capture all the Higgs decay products within its clustering radius.
Figure 2 also shows that for Higgs boson \(p_{\mathrm T}\) above 500 GeV the efficiency of the akt04 approach falls rapidly, but this is not a \(p_{\mathrm T}\) region of interest for non-resonant Higgs-pair production, as can be seen from Fig. 1.
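The dependence of the two reconstruction techniques on the Higgs \(p_{\mathrm T}\) can be illustrated with the common rule of thumb \(\Delta R\approx 2m/p_{\mathrm T}\) for the opening angle of a two-body decay of a boosted particle. This approximation is not taken from the analysis itself and the function names below are hypothetical; the sketch only indicates roughly where a jet of a given radius starts to contain both \(b\)-quarks.

```python
# Rule-of-thumb estimate, Delta R ~ 2 m / pT (an assumption for illustration,
# not the efficiency calculation used in the text).
M_H = 125.0  # Higgs mass in GeV, as fixed in the simulation


def approx_delta_r(pt_higgs, mass=M_H):
    """Approximate b-quark separation for a Higgs of transverse momentum pt_higgs (GeV)."""
    return 2.0 * mass / pt_higgs


def min_pt_for_radius(radius, mass=M_H):
    """Approximate Higgs pT (GeV) above which the two b-quarks are closer than `radius`."""
    return 2.0 * mass / radius


if __name__ == "__main__":
    for pt in (150.0, 200.0, 300.0, 500.0):
        print(f"pT = {pt:5.0f} GeV -> Delta R ~ {approx_delta_r(pt):.2f}")
    # Rough pT scale where a single R = 1.2 jet can contain both b-quarks
    print(f"R = 1.2: pT above about {min_pt_for_radius(1.2):.0f} GeV")
    # Rough pT scale where two R = 0.4 jets start to merge
    print(f"R = 0.4: pT above about {min_pt_for_radius(0.4):.0f} GeV")
```

Under this approximation the two \(b\)-quarks fall within \(\Delta R=1.2\) only for \(p_{\mathrm T}\) above roughly 210 GeV, and approach \(\Delta R=0.4\) only above roughly 600 GeV, which is qualitatively in line with the behaviour described for Fig. 2.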

Fig. 2 The efficiency for correctly reconstructing the Higgs boson from two anti-\(k_t\) jets with \(R=0.4\) (circles) or from a single Cambridge–Aachen jet with \(R=1.2\) (squares)

Fig. 3 The distance \(\Delta R\) between the two \(b\)-quarks from the Higgs boson decay as a function of the Higgs boson \(p_{\mathrm T}\)

4 Event selection

The event selection proceeds by requiring at least four \(b\)-tagged akt04 jets with \(p_{\mathrm T}>40\) GeV and \(|\eta |<2.5\). In order to emulate the effect of \(b\)-tagging in this particle-level study, we adopt the following procedure: jets are labeled as \(b\)-jets, \(c\)-jets, \(\tau \)-jets or light jets based on the ancestry of the final-state particles clustered into the jet. If a \(b\)-hadron is found in the history of any of the final-state particles, the jet is labeled a \(b\)-jet; otherwise, if a \(c\)-hadron is found, the jet is labeled a \(c\)-jet. If neither a \(b\)-hadron nor a \(c\)-hadron is found, but a \(\tau \)-lepton is found instead, the jet is labeled a \(\tau \)-jet. All other jets are classified as light jets. We then apply \(b\)-tagging efficiency weights inspired by the published ATLAS and CMS \(b\)-tagging performance [32, 33]: 70 % for \(b\)-labeled jets, 20 % for \(c\)-labeled and \(\tau \)-labeled jets (i.e. a rejection factor of 5) and 1 % for light-labeled jets (i.e. a rejection factor of 100). All jets in the event are ordered by \(b\)-tagging weight and subsequently by \(p_{\mathrm T}\). The leading four jets are then used to form dijets, requiring \(p_{\mathrm T}^{\mathrm {dijet}}>150\) GeV, \(85<m_{\mathrm {dijet}}<140\) GeV and \(\Delta R<1.5\) between the two jets of the dijet system. If more than two dijets satisfy the above criteria, the two which are most back-to-back in the plane transverse to the beam line are retained. The two dijets are ordered in \(p_{\mathrm T}^{\mathrm {dijet}}\), and the leading dijet is required to have \(100<m_{\mathrm {dijet}}<140\) GeV, while the subleading one must satisfy \(85<m_{\mathrm {dijet}}<130\) GeV. Finally, in order to reject \(t\overline{t}\) events we use the TMVA framework [34] to train a boosted decision tree (BDT) discriminant, \(X_{tt}\), using four input variables, two from each dijet system, calculated as follows.
We search for a third jet within \(\Delta R<2\) of the jets of the dijet system, and then calculate: (a) the invariant mass of the three-jet system (which would be close to the top mass for a hadronic top quark decay); and (b) the invariant mass of the third jet with the least \(b\)-tagged jet of the dijet system (which is often close to the \(W\) mass in a hadronic top quark decay). Using \(X_{tt}\), the \(t\overline{t}\) background is reduced by a factor of \(\sim \)2.5 for a 10 % reduction in the signal and the multi-jet background.
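The labelling, weighting and dijet-pairing steps of this selection can be sketched as below. This is a minimal illustration rather than the code used for the study. The jet dictionary layout, the function names, the treatment of jets as massless, the restriction to pairings in which the two dijets share no jet, and the use of the product of the four tagging efficiencies as the event weight are all assumptions. The final leading and subleading mass windows and the \(X_{tt}\) veto are omitted.

```python
import math

# b-tagging efficiencies quoted in the text, keyed by flavour label.
TAG_EFF = {"b": 0.70, "c": 0.20, "tau": 0.20, "light": 0.01}
# The three ways of splitting four jets into two disjoint pairs (assumption).
PAIRINGS = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]


def label_jet(has_b_hadron, has_c_hadron, has_tau):
    """Flavour label with the priority b > c > tau > light."""
    if has_b_hadron:
        return "b"
    if has_c_hadron:
        return "c"
    return "tau" if has_tau else "light"


def wrap(dphi):
    """Map an azimuthal difference into [-pi, pi)."""
    return (dphi + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(j1, j2):
    return math.hypot(j1["eta"] - j2["eta"], wrap(j1["phi"] - j2["phi"]))


def dijet(j1, j2):
    """Return (pT, mass, phi) of the two-jet system, treating jets as massless."""
    px = sum(j["pt"] * math.cos(j["phi"]) for j in (j1, j2))
    py = sum(j["pt"] * math.sin(j["phi"]) for j in (j1, j2))
    pz = sum(j["pt"] * math.sinh(j["eta"]) for j in (j1, j2))
    e = sum(j["pt"] * math.cosh(j["eta"]) for j in (j1, j2))
    mass = math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))
    return math.hypot(px, py), mass, math.atan2(py, px)


def passes(j1, j2):
    """Dijet requirements: pT > 150 GeV, 85 < m < 140 GeV, Delta R < 1.5."""
    pt, mass, _ = dijet(j1, j2)
    return pt > 150.0 and 85.0 < mass < 140.0 and delta_r(j1, j2) < 1.5


def select_event(jets):
    """Return (event_weight, pairing) or None if the event fails the selection."""
    good = [j for j in jets if j["pt"] > 40.0 and abs(j["eta"]) < 2.5]
    if len(good) < 4:
        return None
    # Order by tagging weight, then by pT, and keep the leading four jets.
    lead = sorted(good, key=lambda j: (TAG_EFF[j["label"]], j["pt"]), reverse=True)[:4]
    best = None
    for p1, p2 in PAIRINGS:
        a1, a2 = lead[p1[0]], lead[p1[1]]
        b1, b2 = lead[p2[0]], lead[p2[1]]
        if not (passes(a1, a2) and passes(b1, b2)):
            continue
        # Distance from an exactly back-to-back configuration in the transverse plane.
        gap = math.pi - abs(wrap(dijet(a1, a2)[2] - dijet(b1, b2)[2]))
        if best is None or gap < best[0]:
            best = (gap, (p1, p2))
    if best is None:
        return None
    weight = math.prod(TAG_EFF[j["label"]] for j in lead)  # assumed event weight
    return weight, best[1]
```

For an event with four \(b\)-labeled jets, the assumed event weight is \(0.7^4\approx 0.24\), which illustrates why the four-tag requirement costs a sizeable fraction of the signal while suppressing backgrounds with charm, tau or light jets much more strongly.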

After the above selection, the remaining signal cross section is 0.19 fb, corresponding to about 570 events in 3 ab\(^{-1}\). The multi-jet background cross section is 82 fb, dominated by \(b\overline{b}\) \(b\overline{b}\), and the \(t\overline{t}\) cross section is 29 fb, indicating that \(t\overline{t}\) is a sizeable fraction of the total background. The single-Higgs production processes \(H(\rightarrow b\bar{b})\) \(b\overline{b}\), \(t\overline{t}H\), and \(ZH\) have a combined cross section of 0.33 fb, comparable to the signal, with the main contribution coming from \(t\overline{t}H\). Therefore, the signal-to-background (\(s/b\)) ratio at this point is 0.17 % and the expected statistical significance (\(s/\sqrt{b}\)) for 3 ab\(^{-1}\) is 1.0. Clearly, with such a low \(s/b\) ratio, it would be impossible to extract any signal sensitivity reliably.
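The arithmetic behind these figures is straightforward and can be checked with a short script. The cross sections are those quoted above; the function name is hypothetical and the significance is the simple \(s/\sqrt{b}\) estimate with no systematic uncertainties.

```python
import math

LUMI_FB = 3000.0  # 3 ab^-1 expressed in fb^-1


def yields(signal_fb, backgrounds_fb, lumi_fb=LUMI_FB):
    """Return (s, b, s/b, s/sqrt(b)) for cross sections given in fb."""
    s = signal_fb * lumi_fb
    b = sum(backgrounds_fb) * lumi_fb
    return s, b, s / b, s / math.sqrt(b)


# Signal, then multi-jet, ttbar and single-Higgs backgrounds after the selection.
s, b, s_over_b, significance = yields(0.19, [82.0, 29.0, 0.33])
print(f"s = {s:.0f} events, s/b = {100 * s_over_b:.2f} %, s/sqrt(b) = {significance:.1f}")
```

This reproduces the quoted values of about 570 signal events, \(s/b\approx 0.17~\%\) and \(s/\sqrt{b}\approx 1.0\).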

Further to the above selection, any additional kinematic and angular differences between the signal and background can be exploited using the following list of largely uncorrelated variables:

  • the decay angle of the Higgs bosons in the rest frame of the \(4b\) system, \(\varTheta ^*\);

  • the decay angles of the \(b\)-quarks in the rest frame of the Higgs bosons, \(\theta _1\) and \(\theta _2\);

  • the angle between the decay planes of the two Higgs bosons, \(\varPhi \);

  • the angle between one of the above decay planes and the decay plane of the two-Higgs system, \(\varPhi _1\);

  • the two dijet invariant masses, \(m_{12}\) and \(m_{34}\);

  • the invariant mass of the \(4b\) system, \(m_X\);

  • the \(p_{\mathrm T}\) of the \(4b\) system, \(p_{{\mathrm T},X}\); and

  • the rapidity of the \(4b\) system, \(y_X\).

These variables have also been proposed [35] and used [36] in the context of the \(H\rightarrow ZZ^*\rightarrow 4\ell \) analyses at the LHC. Figure 4 shows the distributions of these variables in the signal and background after the above event selection. It can be seen that some of them have little discrimination following the event selection, but others show significant differences between the signal and backgrounds.
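As an illustration of how the last three variables in the list are obtained, the sketch below sums four jet four-vectors and computes \(m_X\), \(p_{{\mathrm T},X}\) and \(y_X\). The function names and the tuple layout \((p_{\mathrm T}, \eta , \phi , m)\) are assumptions; the angular variables additionally require Lorentz boosts into the relevant rest frames and are not shown.

```python
import math


def four_vector(pt, eta, phi, mass=0.0):
    """Convert (pT, eta, phi, m) into (E, px, py, pz)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def system_variables(jets):
    """Return (m_X, pT_X, y_X) for a list of (pT, eta, phi, m) jets."""
    e = px = py = pz = 0.0
    for jet in jets:
        je, jx, jy, jz = four_vector(*jet)
        e, px, py, pz = e + je, px + jx, py + jy, pz + jz
    m_x = math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))
    pt_x = math.hypot(px, py)
    y_x = 0.5 * math.log((e + pz) / (e - pz))
    return m_x, pt_x, y_x


# Toy configuration (made-up numbers): two back-to-back pairs of massless jets.
toy_jets = [
    (100.0, 0.0, 0.0, 0.0),
    (100.0, 0.0, math.pi, 0.0),
    (50.0, 0.0, 0.5 * math.pi, 0.0),
    (50.0, 0.0, -0.5 * math.pi, 0.0),
]
m_x, pt_x, y_x = system_variables(toy_jets)
print(f"m_X = {m_x:.1f} GeV, pT_X = {pt_x:.1f} GeV, y_X = {y_x:.2f}")
```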

Fig. 4 Subfigures a–j show the kinematic and angular variables used to separate the signal and background processes, as described in the text. Subfigure k shows the shape of the \(t\overline{t}\) discriminant, \(X_{tt}\), after the top veto has been applied

We combine the above variables, together with \(X_{tt}\), in a single BDT discriminant, \(\mathcal{D}_{HH}\). The output distributions of this discriminant for signal and background are shown in Fig. 5.

Fig. 5 The BDT discriminant \(\mathcal{D}_{HH}\)

5 Results and discussion

Figure 6 shows \(s/b\) and \(s/\sqrt{b}\) for an integrated luminosity of 3 ab\(^{-1}\) as a function of the signal efficiency, while varying the cut on \(\mathcal{D}_{HH}\). The highest statistical significance achieved is 1.8 with \(s/b\approx 1.3~\%\). The signal cross section remaining at this point is 0.08 fb, corresponding to about 240 events with 3 ab\(^{-1}\). The remaining background cross sections are: \(b\overline{b}\) \(b\overline{b}\), 2.8 fb; \(b\overline{b}\) \(c\overline{c}\), 0.6 fb; \(t\overline{t}\), 2.6 fb; and single-Higgs, 0.05 fb. Figure 6 also shows that it is possible to achieve much higher \(s/b\) values for a rather modest decrease in the statistical significance, which may be an important consideration when systematic uncertainties are also taken into account in the analysis. A summary of all relevant numbers is given in Table 2.
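The scan over the cut on \(\mathcal{D}_{HH}\) can be sketched as follows. The discriminant scores and event weights in the toy example are made-up numbers chosen only to exercise the code, and the function names are hypothetical; the last lines check the quoted working point using the cross sections given above.

```python
import math


def scan_cuts(sig_scores, sig_weights, bkg_scores, bkg_weights, cuts):
    """For each cut, return (cut, s, b, s/b, s/sqrt(b)) for events with score > cut."""
    results = []
    for cut in cuts:
        s = sum(w for x, w in zip(sig_scores, sig_weights) if x > cut)
        b = sum(w for x, w in zip(bkg_scores, bkg_weights) if x > cut)
        if b > 0.0:
            results.append((cut, s, b, s / b, s / math.sqrt(b)))
    return results


def best_cut(results):
    """Pick the scan point with the highest s/sqrt(b)."""
    return max(results, key=lambda r: r[4])


# Toy inputs (hypothetical): three score bins with weighted event yields.
toy = scan_cuts(
    sig_scores=[0.1, 0.5, 0.9], sig_weights=[100.0, 200.0, 270.0],
    bkg_scores=[0.1, 0.5, 0.9], bkg_weights=[300000.0, 30000.0, 20000.0],
    cuts=[0.0, 0.3, 0.7],
)
best = best_cut(toy)
print(f"toy optimum: cut = {best[0]}, s/b = {best[3]:.4f}, s/sqrt(b) = {best[4]:.2f}")

# Quoted working point: 0.08 fb of signal over 2.8 + 0.6 + 2.6 + 0.05 fb of background.
lumi_fb = 3000.0
s_quoted = 0.08 * lumi_fb
b_quoted = (2.8 + 0.6 + 2.6 + 0.05) * lumi_fb
print(f"quoted point: s/b = {100 * s_quoted / b_quoted:.1f} %, "
      f"s/sqrt(b) = {s_quoted / math.sqrt(b_quoted):.1f}")
```

In the toy scan the intermediate cut gives the best \(s/\sqrt{b}\) while the tightest cut gives the best \(s/b\), which mirrors the trade-off between the two figures of merit discussed in the text.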

Table 2 Cross sections of the signal and background processes at various steps in the event selection, and the corresponding \(s/b\) and \(s/\sqrt{b}\). The penultimate row shows the results of the optimal selection, assuming 10 % \(b\)-tagging efficiency for \(c/\tau \)-labeled jets. The last row shows the results for the BDRS analysis described in the text

Fig. 6 \(s/b\) and \(s/\sqrt{b}\) as a function of the relative signal efficiency when varying the cut on \(\mathcal{D}_{HH}\), for an integrated luminosity of 3 ab\(^{-1}\)

These results demonstrate that the \(t\overline{t}\) and \(b\overline{b}\) \(c\overline{c}\) processes together represent more than half of the total background. Most of the remaining \(t\overline{t}\) background consists of events where the decay products of both \(W\) bosons from the top decays include a charm jet or a jet from a hadronic tau decay. This gives additional motivation to improve the charm and tau jet rejection of \(b\)-tagging at the HL-LHC. While the increasing pile-up will make this task challenging, the significantly improved pixel tracking detectors proposed for both ATLAS [37] and CMS are likely to provide the necessary \(b\)-tagging performance improvements. In order to demonstrate the potential benefits to this analysis from an improved \(c/\tau \)-jet rejection, we repeated the above study assuming a \(b\)-tagging efficiency of 10 % for \(c/\tau \)-labeled jets. With this assumption, the highest statistical significance obtained is 2.1 at the optimal cut value for \(\mathcal{D}_{HH}\), with \(s/b\approx 2.4~\%\).

It is worth pointing out that recent theoretical calculations of the SM Higgs-pair production cross section with various improvements [38–40] find it is 20–30 % higher than the NLO value used here. Even if the cross sections of the background processes were increased by a similar factor with more precise calculations, the \(s/\sqrt{b}\) would still be 10–15 % better than the result presented above.
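The 10–15 % figure follows from a simple scaling argument, sketched here under the assumption that the signal and background yields are both multiplied by a common factor \(k\) (a symbol introduced only for this illustration) and that nothing else changes:

\[
\frac{s}{\sqrt{b}} \;\rightarrow\; \frac{k\,s}{\sqrt{k\,b}} = \sqrt{k}\,\frac{s}{\sqrt{b}}, \qquad \sqrt{1.2}\approx 1.10, \qquad \sqrt{1.3}\approx 1.14 .
\]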

In order to have a more direct comparison of the above approach with a selection based on ca12 Higgs reconstruction and jet substructure techniques in both signal and background, we have applied the BDRS [41] analysis described in Ref. [11] to the signal and background samples listed in Table 1. The results of this selection are shown in the last row of Table 2. These results demonstrate that the higher Higgs reconstruction acceptance of the akt04 approach combined with all the available angular and kinematic information adds significant sensitivity to the Higgs-pair production analysis.

As this is a particle-level study, it is expected that experimental resolution effects will somewhat reduce the discriminating power of the variables used in the above event selection. However, it is worth pointing out that our particle-level predictions in Ref. [12] appear to be in broad agreement with the ATLAS result [13] that includes all the experimental resolution effects and background estimation uncertainties. In addition, there is plenty of scope for further optimizing the current analysis. Possible avenues to explore include: fitting the distribution of \(\mathcal{D}_{HH}\) to extract more information from the data; the use of control regions and data-driven techniques for determining the various backgrounds, as in Ref. [13]; the use of kinematic fitting techniques to improve the angular resolution of the four jets and hence the discriminating power of the angular variables described above; or the use of the shape of the \(b\)-tagging discriminant for each jet, to further suppress the non-4\(b\) background events.

6 Conclusions

In SM non-resonant Higgs-pair production at the LHC, the Higgs bosons are mostly produced back-to-back, with relatively large \(p_{\mathrm T}\). Selecting four \(b\)-tagged jets and forming two back-to-back pairs, with \(p_{\mathrm T}^{\mathrm {dijet}}>150\) GeV and \(\Delta R < 1.5\) between the two jets in each pair, leads to a drastic suppression of all background processes (particularly the dominant multi-jet production), while maintaining a good signal yield. Given the \(p_{\mathrm T}\) spectrum of the Higgs bosons, the use of pairs of anti-\(k_t\) jets with \(R=0.4\) appears to be more suitable for reconstructing each Higgs candidate than the use of single Cambridge–Aachen jets with \(R=1.2\).

We further find that exploiting the full kinematic and angular information of the \(4b\) system can provide very substantial additional improvement in the sensitivity for \(HH\rightarrow b\overline{b}b\overline{b}\) and the measurement of the Higgs trilinear self-coupling. Our particle-level study yields a statistical significance of 1.8 (2.1) per experiment for an integrated luminosity of 3 ab\(^{-1}\), assuming a \(b\)-jet tagging efficiency of 70 % and \(c/\tau \)-jet \(b\)-tagging efficiency of 20 % (10 %). While experimental systematic uncertainties will tend to reduce the sensitivity of the measurement, there is still plenty of scope to optimize the analysis further, hence we expect that the sensitivity quoted here should be achievable at the HL-LHC.